GPU particles
Posted: Sun Oct 23, 2016 5:56 am
I've been working on my game and was finding particle systems a big drain on performance, until I found a GPU particle example by Jonbro (viewtopic.php?t=81865).
With his permission, I'm releasing an enhanced version with the following features:
Particle collision via distance field calculation.
Control over various parameters like spawn rate and particle type.
Particle stretching with velocity.
Multiple particle systems working alongside each other.
Multiple types of particles possible within a single system.
Multiple particle textures in a single system.
Translation and scaling of the particles within world space.
Delta time instead of lockstep velocity.
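To illustrate the last item: with delta time, each update scales motion by the seconds elapsed since the previous frame, so an effect covers the same distance at any frame rate. Below is a minimal CPU-side sketch in Python. The real system does this in a shader, and the names step_delta and simulate are illustrative assumptions, not taken from the released code.

```python
def step_delta(pos, vel, accel, dt):
    """Advance one particle by dt seconds (semi-implicit Euler)."""
    vx = vel[0] + accel[0] * dt
    vy = vel[1] + accel[1] * dt
    return (pos[0] + vx * dt, pos[1] + vy * dt), (vx, vy)


def simulate(seconds, fps, vel=(10.0, 0.0), accel=(0.0, 0.0)):
    """Run the same effect at a given frame rate and return the final position."""
    pos = (0.0, 0.0)
    dt = 1.0 / fps  # hypothetical constant frame time, for the comparison only
    for _ in range(int(round(seconds * fps))):
        pos, vel = step_delta(pos, vel, accel, dt)
    return pos


if __name__ == "__main__":
    # Both runs should end at roughly x = 10 despite different frame rates.
    print(simulate(1.0, 30))
    print(simulate(1.0, 120))
```

A lockstep update would instead add a fixed amount per frame, so the 120 fps run would travel four times as far as the 30 fps run.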
It requires a video card capable of 32-bit floating point textures. It will run on 16f, but it doesn't look good past a certain resolution. Your mileage may vary on mobiles and laptops. Some people have had problems with this system in the past, but I personally had none across 3 different machines, although I haven't tested it on a modern AMD video card yet. I'm sure there are multiple issues and performance enhancements that could be made to this, but I've run out of time. I had wanted it to automatically recompile particle shaders from strings, constructing each particle system from its parameters for maximum efficiency.
GPU particles aren't as flexible as CPU systems, but they are a great deal faster. 1 million particles, each with individual collision and texture, are no problem. Modern games typically combine light CPU particle systems with GPU particles to make up the numbers.
The actual textures in memory storing all the particles:
The distance field:
This provides the collision normals for determining particle reflections. It's a fairly heavy shader pass, but it doesn't need to run every frame. I have it set to rebuild every 50 ms, with the previous distance field interpolated with camera movement in between, which is sufficient for my game. It's not that bad running every frame, but I'd avoid it where possible.
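As a rough illustration of how a distance field can yield collision normals, here is a CPU-side Python sketch. It assumes the field is a 2D grid of distances to the nearest collider and takes the normal from the gradient by central differences. The grid layout, the function names and the bounce parameter are all assumptions for illustration; the actual shader may compute its normals differently.

```python
import math


def field_normal(field, x, y):
    """Approximate the surface normal at cell (x, y) by central differences.

    field[y][x] holds the distance to the nearest collider, so the gradient
    points away from the collider.
    """
    h = len(field)
    w = len(field[0])
    x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
    y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
    gx = field[y][x1] - field[y][x0]
    gy = field[y1][x] - field[y0][x]
    length = math.hypot(gx, gy)
    if length == 0.0:
        return (0.0, 0.0)  # flat region: no usable normal
    return (gx / length, gy / length)


def reflect(vel, normal, bounce=1.0):
    """Reflect a velocity about a unit normal: v' = v - (1 + bounce)(v.n)n."""
    d = vel[0] * normal[0] + vel[1] * normal[1]
    if d >= 0.0:
        return vel  # already moving away from the surface
    k = (1.0 + bounce) * d
    return (vel[0] - k * normal[0], vel[1] - k * normal[1])


if __name__ == "__main__":
    # Hypothetical field for a flat floor at y = 0: distance grows with y.
    floor = [[float(y)] * 4 for y in range(4)]
    n = field_normal(floor, 1, 1)
    print(n, reflect((3.0, -2.0), n))
```

A bounce value below 1.0 loses energy on each hit, and 0.0 removes the velocity component along the normal entirely.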
Ideally, for performance reasons, you'll want to split up your particle effects instead of using one giant shader for all of them (although one system is much more efficient if you're using a lot of one particle). Each particle effect should have its own shader unless it can reasonably share physics without overloading your shader with comparison statements.
If and when Love develops an asynchronous PBO/getPixel implementation, GPU particles could be used for more complicated purposes, such as managing hundreds of thousands of objects, which the CPU cannot do. I think in its current state you could do liquid, but interacting with it from the CPU is difficult to do in real time.
Known issues:
Collision inaccuracy from either scaling or the 1024x1024 distance field.
RGBA16f could probably still be used for a big speed and compatibility boost, but I'm still not sure how to get it working acceptably.
It's possible for particles to get stuck, especially if the distance field isn't regenerated when scaling the view.
Distance field rendering normals could be improved. The field should actually extend well inside the collider so that particles that overstep the collision barrier can escape.
Particles under a certain velocity don't have the floating point accuracy to actually move. This is generally not much of a problem, since you can scale both the system and the particle itself up or down, although 16f positional textures will make it readily apparent.
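The last issue can be reproduced on the CPU. The Python sketch below rounds a position to 16-bit and 32-bit floats using the standard struct module and checks whether a small per-frame step survives storage. The position 512.0 and step 0.1 are made-up numbers for illustration, and the function names are assumptions, not part of the released code.

```python
import struct


def to_half(x):
    """Round a float to the nearest 16-bit float, as a 16f texture would store it."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_single(x):
    """Round a float to the nearest 32-bit float, as a 32f texture would store it."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def moved(pos, step, quantize):
    """Return True if adding step to the stored position changes the stored value."""
    stored = quantize(pos)
    return quantize(stored + step) != stored


if __name__ == "__main__":
    # Near 512 a half float can only represent multiples of 0.5,
    # so a step of 0.1 is rounded away and the particle never moves.
    print(moved(512.0, 0.1, to_half))
    print(moved(512.0, 0.1, to_single))
```

The gap between representable values grows with the magnitude of the position, which is why slow particles far from the origin are the first to stall.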
This is not a library that I will be maintaining, but I thought it might be useful to the community in its current state. Let me know if you have problems or have an interesting effect to share. Jonbro's original work is licensed under MIT and mine is released into the public domain.
edit: updated with a higher compatibility version