Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Win32 Handle extension #442

Merged
merged 10 commits into from
Aug 28, 2024

Conversation

Agrael1
Copy link
Contributor

@Agrael1 Agrael1 commented Aug 25, 2024

No description provided.

@Agrael1 Agrael1 marked this pull request as ready for review August 26, 2024 10:42
@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 26, 2024

Question: Do I add vkGetMemoryWin32HandleKHR into allocator functions, or do I add it as a function argument?
First approach is clean wrt argument count, but if I don't have Win32 platform enabled this is kinda pointless. Second is less appealing, but safe, because it relies solely on user providing function, being static or dynamic

@adam-sawicki-a
Copy link
Contributor

I would add it to the struct with functions, just like all other Vulkan functions.

@adam-sawicki-a
Copy link
Contributor

Is it safe to use an atomic? What if 2 threads want to fetch the handle at the same time? Shouldn't you use a mutex around the whole fetching logic?

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 26, 2024

Is it safe to use an atomic? What if 2 threads want to fetch the handle at the same time? Shouldn't you use a mutex around the whole fetching logic?

No, atomics are safe. I also did the relax ordering, because we have no memory to make barrier onto.
If 2 threads are accessing there are 2 situations that may happen:

  1. already assigned -> easy duplicate

  2. not assigned: -> creates handle (maybe twice, but who cares)
    then there is compare_exchange, which is atomic, and if the underlying vlue changes it returns false.
    One compare_exchange always succeeds, others will return false, and handle will not be reassigned. If there is a driver that shouts error (didn't find any major one) this is also OK, since that means there is handle somewhere already. Any excessive HANDLES are removed. We also provide handles for other processes, that's why there is that AMD fix there.

Actually I could tinker with a lot of atomics in this lib and make it lock free. There are lock free queues there, so maybe one of the next steps :)

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 26, 2024

I actually asked Vulkan if that's OK to call on vkGetMemoryWin32HandleKHR multiple times. They pointed out that this is UB and not an error, meaning there is no harm in it, just we have no clue what's gonna be returned.

@adam-sawicki-a
Copy link
Contributor

Isn't this whole effort about making sure we don't call vkGetMemoryWin32HandleKHR twice for a device memory object? As you posted the validation message in the other ticket:

VUID-VkMemoryGetWin32HandleInfoKHR-handleType-00663
If handleType is defined as an NT handle, vkGetMemoryWin32HandleKHR must be called no more than once for each valid unique combination of memory and handleType

If so, isn't your current code unsafe in this regard when called from multiple threads?

When talking about an UB, we typically understand that anything can happen, including entire app or system crash, so is there really "no harm" when we trigger an an UB?

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 26, 2024

Hmm, you're right, though there was a limitation for me, hence why I chose to use atomics, and that being absence of mutex on dedicated memory. Should I add one?

@adam-sawicki-a
Copy link
Contributor

Yes please, I think you need a mutex for this to be safe.

Another topic to think about: What if an allocation ends up as a dedicated allocation?

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 26, 2024

If it has VkDeviceMemory it is not a problem. There is no mutex in this structure, so I think I will add one

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 26, 2024

Added a mutex. There is a really low chance in current setup that it will be used, so it imposes little overhead

@adam-sawicki-a
Copy link
Contributor

It seems you reformatted the entire file with some automatic formatting tool! Please don't do this. It makes it difficult to see what are the real differences you introduce to the code.

Can you please surround all your code with an #ifdef telling about the availability of the VK_KHR_external_memory_win32 extension, similar to e.g. VMA_EXTERNAL_MEMORY?

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 26, 2024

crap, reflexes

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 26, 2024

Done

include/vk_mem_alloc.h Outdated Show resolved Hide resolved
include/vk_mem_alloc.h Show resolved Hide resolved
include/vk_mem_alloc.h Outdated Show resolved Hide resolved
include/vk_mem_alloc.h Outdated Show resolved Hide resolved
include/vk_mem_alloc.h Outdated Show resolved Hide resolved
include/vk_mem_alloc.h Show resolved Hide resolved
@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

Hopefully I defined correct one. Also, I have a linker error with static function, seems like it is only possible to get the function dynamically

@adam-sawicki-a
Copy link
Contributor

Yes, functions from extensions can only be fetched dynamically.

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

do I set it to null, or try to fetch with dynamic code?

@adam-sawicki-a
Copy link
Contributor

Please do like the other extensions do - for Static don't do anything, for Dynamic try to fetch the function pointer, for ValidateVulkanFunctions - assert if not null if the extension is enabled.

@adam-sawicki-a
Copy link
Contributor

Are you willing to develop some simple test for this new functionality?

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

Sure

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

I wonder, why don't we allow for dedicated allocations to have custom pNext?

@adam-sawicki-a
Copy link
Contributor

I wonder, why don't we allow for dedicated allocations to have custom pNext?

To avoid feature creep and not extend the VmaAllocationCreateInfo further. Please note that dedicated allocations can also be created in a custom pool.

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

What do I specify at memoryTypeIndex on pool?

@adam-sawicki-a
Copy link
Contributor

To create a custom pool you need to find the right memory type and choose it explicitly:
https://gpuopen-librariesandsdks.github.io/VulkanMemoryAllocator/html/custom_memory_pools.html#custom_memory_pools_MemTypeIndex

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

I added code for fetching the extension and creating a pool. Can't launch test, however, it throws at some random point

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

Fixed, working like a charm

@adam-sawicki-a
Copy link
Contributor

Thank you for all this code! I need to analyze it some more, but it looks good overall.

There is one more thing I would like to have. I don't want the VmaAllocation_T structure to grow in size. As we need to add a HANDLE to it, I'm thinking about defining some new structure like VmaAllocationExtraData that would hold the new HANDLE and the old m_pMappedData, while an allocation object (m_DedicatedAllocation member) would contain a pointer to it. For most allocations, the pointer would be null. Only when a dedicated allocation needs to store a non-null mapped pointer and/or a Win32 handle, it would dynamically allocate the structure.

Would you like to implement such thing?

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

Well, I'd take it to the new PR.
I'd also like to do several things:

  • Slice the library into different files for easier development, and creating a generator to concatenate them to a single file.
  • Make use of inline and make lib pure header only
  • Make 2 targets, 1 is static and 2 is header only cmake target
  • Formatting bot + .clang-format

Tell me if you are interested in some of them :)

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

Also VmaDeviceMemoryBlock has the same m_pMappedData and handle, should that also be replaced?

@adam-sawicki-a
Copy link
Contributor

OK, I'll analyze your code soon and hopefully I merge it. I agree the optimization to extract new structure VmaAllocationExtraData is better done as a separate task.

I am sorry but I am not interested in any of the big changes you proposed. The library is now mature and used by many developers. I don't want to do a revolution that could break anyone's workflow. If you want to introduce such big changes to the code, please feel free to maintain your own fork. I am happy to link to it from README.

About the Cmake script especially, I don't want to make more major changes, because this is a never-ending story of pull requests from users with various visions about how it should look like. The more such requests I get, the closer I am to removing the Cmake script completely. I regret adding it in the first place. I agree with the arguments of @ocornut about the value of not having it, as mentioned in ocornut/imgui#7892, section "3.6. Lack Of A Build System Is A Feature". Please also note that VMA ships with the Vulkan SDK as "vk_mem_alloc.h" file only, not the entire repository and no Cmake script.

@adam-sawicki-a
Copy link
Contributor

VmaDeviceMemoryBlock doesn't need to have the mapped pointer extracted to a separate structure because I am not concerned about adding a new member to this class. Only the allocation object should stay lightweight.

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

Ah, ok. I thought about making Wisdom Memory allocator, but that would require me to also get D3D12 MA shredded, which is a ton of work.

Maybe one day.

@adam-sawicki-a
Copy link
Contributor

Do you use D3D12MA library as well? Extracting common code (the core TLSF allocation algorithm) from both libraries is another major refactoring (after splitting the library into multiple source files) that is possible but I don't want to do. If you do it on your fork, that would be great.

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

I have Wisdom library, which is super thin compared to others, but super demanding in terms of Vulkan (works no less than 1.3 and requires heavy extensions). I stumbled upon both allocators, and now they are part of its standard. If there is a wish, I could do the Wisdom memory allocator and do the extraction for both.

@adam-sawicki-a
Copy link
Contributor

It is up to you. From my perspective, it is good if we just have the change to VMA discussed in this PR.

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 27, 2024

Will do. I'd also like to have Wisdom mentioned :)
It is alpha, but soon it will be finished

@adam-sawicki-a adam-sawicki-a merged commit 0d55cf5 into GPUOpen-LibrariesAndSDKs:master Aug 28, 2024
@adam-sawicki-a adam-sawicki-a added feature Adding new feature next release To be done as soon as possible labels Aug 28, 2024
@adam-sawicki-a
Copy link
Contributor

Thank you very much for this extensive and high quality contribution.

You didn't test dedicated allocations. Because of that, you didn't meet the bug that happens because VmaAllocation objects are created using a custom CPU memory pool and don't get their constructors and destructors called, so the handle of a dedicated allocation could have garbage data. This is another argument to keep the handle, together with the mapped pointer, in a new dynamically allocated structure VmaAllocationExtraData. I added it.

I also made some other fixes and written documentation for the new feature. Please let me know if it looks good to you.

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 28, 2024

I need to finish one project, then I will continue. Where can I find new documentation?

Glad I could help!

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 28, 2024

Ah, I see, you have already implemented the data pointer, well, then if something comes, I could make a feature or two

@Agrael1
Copy link
Contributor Author

Agrael1 commented Aug 28, 2024

Wow, documentation looks awesome!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Adding new feature next release To be done as soon as possible
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants