Comment by seanmcdirmid
Comment by seanmcdirmid 9 months ago
> A Blender programmer calculating cloth physics would rather have 4x weaker cores than 1x P-core.
Don’t they really want GPU threads for that? You wouldn’t get by with just weaker cores.
Comment by seanmcdirmid 9 months ago
> A Blender programmer calculating cloth physics would rather have 4x weaker cores than 1x P-core.
Don’t they really want GPU threads for that? You wouldn’t get by with just weaker cores.
NVIDIA's physX has its own cloth physics abstractions: https://docs.nvidia.com/gameworks/content/gameworkslibrary/p..., so I'm sure it is a thing we do on GPUs already, if only for games. These are old demos anyways:
https://www.youtube.com/watch?v=80vKqJSAmIc
I wonder what the difference is between the cloth physics you are talking about and the one NVIDIA has been doing for I think more than a decade now? Is it scale? It sounds like, at least, there are alternatives that do it on the GPU and there are questions if Blender will do it on the GPU:
https://blenderartists.org/t/any-plans-to-make-cloth-simulat...
Cloth / Hair physics in those games were graphics-only physics.
They could collide with any mesh that was inside of the GPU's memory. But those calculations cannot work on any information stored on CPU RAM. Well... not efficiently anyway.
---------
When the Cloth simulator in Blender runs, it generates all kinds of information the CPU needs for other steps. In effect, Blender's cloth physics serves as an input to animation frames, which is all CPU-side information.
Again: i know cloth physics executes on GPUs very well in isolation. But I'd be surprised if BLENDER's specific cloth physics would ever be efficient on a GPU. Because as it turns out, calculations kind of don't matter in the big-picture. There's a lot of other things you need to do after those calculations (animations, key frames, and other such interactions). And if all that information is stored randomly in 100GB of CPU RAM, it'd be very hard to untangle that data and get it to a GPU (and back).
In a Video Game PHYSX setting, you just display the cloth physics to the screen. In Blender, a 3d animation program, you have to do a lot more with all that information and touch many other data-structures.
PCIe is very slow compared to RAM.
Cloth physics in Blender are stored in RAM (as scenes and models can grow very large, too large for a GPU).
Figuring out which verticies for a physics simulation need to be sent to the GPU would be time, effort, and PCIe traffic _NOT_ running the cloth physics.
Furthermore, once all the data is figured out and blocked out in cache, its... cached. Cloth physics only interacts with a small number of close, nearby objects. Yeah, you _could_ calculate this small portion and send it to the GPU, but CPU is really good at just automatically traversing trees and storing the most recently used stuff in L1, L2, and L3 caches automatically (without any need of special code).
All in all, I expect something like Cloth physics (which is a calculation Blender currently does on CPU-only), is best done CPU only. Not because GPUs are bad at this algorithm... but instead because PCIe transfers are too slow and cloth physics is just too easily cached / benefited by various CPU features.
It'd be a lot of effort to translate all that code to GPU and you likely won't get large gains (like Raytracing/Cycles/Rendering gets for GPU Compute).