So, I've been playing around with my Sphere SFML and ended up creating a rather fast particle engine by being slow. Sounds interesting, right? It's the tortoise vs. hare approach and I'll show you how it's done.
First things first: we create a particle budget. This is the first trick to the system. We create a fixed-size array filled with dead particles. A dead particle has all the data fields it needs to do work, but is turned off when its action is complete. Usually when a particle fades out we can consider it dead, since you cannot see it anymore.
Here's how the particle budget looks:
var ParticleEngine = (function() {
    var particles = [], budget = 500, idx = 0;

    // Fill the pool once with dead particles; they get reused from here on.
    function Setup(b) {
        budget = b || 500;
        for (var i = 0; i < budget; ++i) { particles[i] = new Particle(); }
    }

    return {
        setup: Setup
    };
})();
Next we define a particle:
function RenderP() {
    this.color.alpha = this.alpha;
    this.image.blitMask(this.x, this.y, this.color);
}

function UpdateP() {
    // Velocities are per frame at 60 FPS, so scale by the real timestep.
    var tstep = StateManager.timestep * 60;
    this.x += this.vx * tstep;
    this.y += this.vy * tstep;
    this.time -= StateManager.timestep;
    // Fade linearly from 255 to 0 over the particle's life.
    this.alpha = (this.time / this.life) * 255;
}

// Turn a dead particle back on, copying its parameters from the emitter.
function SetupP(parent) {
    this.alpha = 255;
    this.image = parent.image;
    this.x = parent.x;
    this.y = parent.y;
    this.vx = parent.speed * _cos(parent.angle);
    this.vy = parent.speed * _sin(parent.angle);
    this.time = parent.life / 1000; // emitter life is in ms, particle time in seconds
    this.life = parent.life / 1000;
    this.fade = parent.fade;
    this.color = parent.color;
}

var _cos = Math.cos, _sin = Math.sin;

// A particle with time <= 0 is dead.
function Particle() { this.time = 0; }
Particle.prototype.update = UpdateP;
Particle.prototype.render = RenderP;
Particle.prototype.setup = SetupP;
Caching functions like the above gives a minor speed increase, though how much seems to vary a lot depending on the engine used. This is just the safest, fastest approach to use. These particles are simple: they only fade out and move in a single direction. But they don't have to be. We can always add more logic to make them more complex. It's really up to you.
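As one illustration of "more logic", here's a rough sketch of a particle that also falls under gravity. The `gravity` field, the `GravityParticle` name and the `TIMESTEP` constant are my own stand-ins for this example (the real code reads `StateManager.timestep` and gets its parameters from the emitter), so take it as a starting point rather than part of the engine.

```javascript
// Hypothetical stand-in for StateManager.timestep (seconds per frame at 60 FPS).
var TIMESTEP = 1 / 60;

function GravityParticle() { this.time = 0; }

GravityParticle.prototype.setup = function(x, y, vx, vy, lifeMs, gravity) {
    this.x = x; this.y = y;
    this.vx = vx; this.vy = vy;
    this.time = this.life = lifeMs / 1000;
    this.gravity = gravity; // assumed field: added to vy once per 60 FPS frame
    this.alpha = 255;
};

GravityParticle.prototype.update = function() {
    var tstep = TIMESTEP * 60;        // roughly 1.0 when running at 60 FPS
    this.vy += this.gravity * tstep;  // the extra logic: accelerate downward
    this.x += this.vx * tstep;
    this.y += this.vy * tstep;
    this.time -= TIMESTEP;
    this.alpha = Math.max(0, (this.time / this.life) * 255);
};

var gp = new GravityParticle();
gp.setup(0, 0, 1, 0, 1000, 0.5);
gp.update();
gp.update();
```

The pool doesn't care what update() does, as long as a particle counts as dead once its time runs out.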
Next we create an emitter:
function Emitter(obj) {
    this.image = obj.image;
    this.x = 0;
    this.y = 0;
    this.angle = 0;
    this.life = obj.life || 1000;
    this.speed = obj.speed || 1;
    this.color = obj.color || CreateColor(255, 255, 255);
    this.fade = obj.fade || false;
    this.rate = (obj.rate || 25) / 1000;
    this.last = 0;
}

// Note: emit() reads particles, idx and budget, so it has to be defined
// where those are in scope (e.g. inside the ParticleEngine closure).
Emitter.prototype.emit = function(x, y, num) {
    this.x = x;
    this.y = y;
    for (var i = 0; i < num; ++i) {
        var p = particles[idx];
        if (p.time <= 0) { p.setup(this); }
        idx = (idx + 1) % budget;
    }
};
I could have done some minor optimizing but I think it's safe since we use very few emitters in a game compared to particles rendered.
Take a look at the emit() method:
Emitter.prototype.emit = function(x, y, num) {
    // Here we set the emitter's x/y, since for neat effects
    // we could have emitters move around the screen.
    this.x = x;
    this.y = y;
    for (var i = 0; i < num; ++i) {
        var p = particles[idx];
        if (p.time <= 0) { p.setup(this); }
        idx = (idx + 1) % budget;
    }
};
Heavy logic ahead, so know your O-notation. Here is the good part: this is where dead particles get turned back on when an emitter emits. Rather than doing an O(n) search for a dead particle on every emit, we blindly step through the array one slot at a time and turn on whatever is dead. All emitters share the same global array and the same cursor, so a burst of num particles always costs exactly num steps, no matter how many emitters there are. A burst also skips over any alive particles it lands on. It turns out that in practice the cursor rarely lands on alive particles unless the buffer is full. Technically we could be more accurate here, but remember that on a full buffer it would repeatedly do O(n) lookups and "busy-wait" for a freshly dead particle, which is incredibly slow. I'd rather it step over alive particles while still decrementing the burst count, so that every burst has a finite end (yes, it reduces accuracy, but in practice it's not bad, especially with very high particle budgets).
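To see the skipping behaviour in isolation, here's a small sketch with a tiny budget. The `emitBurst` helper and the bare `{ time: 0 }` slots are stand-ins I made up for the demo; the walk itself is the same as in emit().

```javascript
// Minimal stand-in pool: each slot only tracks remaining life in seconds.
var budget = 8, idx = 0, particles = [];
for (var i = 0; i < budget; ++i) { particles[i] = { time: 0 }; }

// Same walk as emit(): advance the shared cursor `num` times and revive
// only the slots that are dead. Returns how many were actually revived.
function emitBurst(num, life) {
    var revived = 0;
    for (var i = 0; i < num; ++i) {
        var p = particles[idx];
        if (p.time <= 0) { p.time = life; ++revived; }
        idx = (idx + 1) % budget;
    }
    return revived;
}

var first = emitBurst(6, 1.0);  // pool is all dead, so all 6 are revived
var second = emitBurst(6, 1.0); // 2 dead slots left, then the cursor hits live ones
console.log(first, second, idx); // prints: 6 2 4
```

The second burst asked for 6 and only got 2, but it still finished in exactly 6 steps. That's the accuracy-for-speed trade described above.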
Another neat thing here is the modulus. It makes sure we wrap around the particle buffer freely and efficiently. Otherwise I'd need if statements, which tend to be a tad more costly than a modulus, though again that depends heavily on the JS engine used. What about not wrapping at all (since a wrap only happens on a small fraction of steps)? Well... we could forget about particles past the size limit and instead just skip the rest of the burst and reset the count. But then large particle bursts can look ugly if timed badly, so for a gain in accuracy we use the modulus. Besides, not wrapping still takes if statements and other control logic, and so is slower.
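For reference, here are the two ways of wrapping side by side. This only shows that they step through the same sequence; `nextMod` and `nextIf` are names I made up for the comparison, and which one is actually faster is something you'd have to measure on your own engine.

```javascript
var budget = 5;

// Wrap with the modulus, as emit() does.
function nextMod(idx) { return (idx + 1) % budget; }

// Equivalent wrap using a branch instead.
function nextIf(idx) {
    ++idx;
    if (idx >= budget) { idx = 0; }
    return idx;
}

var a = 0, b = 0, same = true;
for (var step = 0; step < 12; ++step) {
    a = nextMod(a);
    b = nextIf(b);
    if (a !== b) { same = false; }
}
console.log(same, a); // prints: true 2
```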
Finally, the meat of the emitter is the "if (p.time <= 0) { p.setup(this); }" statement. It tells that particular particle to come alive with the parameters fed into it by the emitter in use. This maintains the static particle budget and keeps the particle system from producing any garbage for the GC to collect. If we deleted and recreated particles each time, we would initially have a faster particle engine since we'd be maintaining smaller arrays, but there would be a hidden cost in garbage collection, plus constructor overhead as memory is acquired. At low particle counts the create/destroy approach can be as much as 100 times faster than this one, but it scales horribly past the 500 particle mark. The static method scales beautifully even at 15000+ particles.
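Here's a quick way to convince yourself that nothing new gets allocated once the pool is filled. The `constructed` counter and the stripped-down `Particle` are just instrumentation for this sketch, not part of the engine.

```javascript
var constructed = 0;
function Particle() { this.time = 0; ++constructed; }
Particle.prototype.setup = function(life) { this.time = life; };

var budget = 4, idx = 0, particles = [];
for (var i = 0; i < budget; ++i) { particles[i] = new Particle(); }

var firstSlot = particles[0];

// Simulate 100 frames: each frame kills everything, then emits one particle.
for (var frame = 0; frame < 100; ++frame) {
    for (var j = 0; j < budget; ++j) { particles[j].time = 0; }
    var p = particles[idx];
    if (p.time <= 0) { p.setup(1.0); }
    idx = (idx + 1) % budget;
}

console.log(constructed, particles[0] === firstSlot); // prints: 4 true
```

A hundred emits later the constructor has still only run four times, and slot 0 is the very same object it was at the start.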
So, to continue in the vein that slower is better, let's take a look at why. In a static system we must walk over all the particles every frame, always checking whether each one is alive or dead.
function Update() {
    var l = particles.length;
    for (var i = 0; i < l; ++i) {
        if (particles[i].time > 0) particles[i].update();
    }
}

function Render() {
    var l = particles.length;
    for (var i = 0; i < l; ++i) {
        if (particles[i].time > 0) particles[i].render();
    }
}
I could micro-optimize this, but the speed hit would be much the same. Whether you draw 1 particle or the entire buffer, the script speed hit is nearly the same. So your game's base FPS will be lower with this, but it'll end up steadily drawing 10000 particles, whereas other methods will crap out at 1000 or less even if they draw the first 50 thirty or more times faster than this method.
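A small sketch of what that fixed cost looks like: the loop visits every slot each frame whether or not anything is alive. The `visited` and `updated` counters are only there for the demo, and the slots are bare `{ time: 0 }` stand-ins for real particles.

```javascript
var budget = 1000, particles = [];
for (var i = 0; i < budget; ++i) { particles[i] = { time: 0 }; }

// Mirrors Update(): visits every slot, does work only for live ones.
function update(dt) {
    var visited = 0, updated = 0;
    for (var i = 0, l = particles.length; i < l; ++i) {
        ++visited;
        if (particles[i].time > 0) { particles[i].time -= dt; ++updated; }
    }
    return { visited: visited, updated: updated };
}

var idle = update(1 / 60);                       // nothing alive
for (var k = 0; k < 250; ++k) { particles[k].time = 1.0; }
var busy = update(1 / 60);                       // 250 alive
console.log(idle.visited, idle.updated, busy.visited, busy.updated);
// prints: 1000 0 1000 250
```

The scan is always 1000 checks; only the per-particle work on top of it grows with the number of live particles.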
It just goes to show that sometimes extremely high FPS is not the answer to a fast 60 FPS game. It's rather a steady framerate that can take pressure under larger loads.
Context:
I'm currently creating a space-sim game in Sphere SFML (flexing my math muscles a bit), and the FPS in SSFML in that game went from 3500 using my old particle system to a more stable 1000. At 100 particles the FPS was 200 on the old system and 900 on the new one. At 1000 particles the FPS was 60 on the old system and 850+ on the new one. The FPS only drops as more particles are rendered, and the new system makes up for its lower starting point in that the JS cost is constant, whereas before more JS was executed per particle, blowing up the complexity of the system. Beyond that, HW rendering becomes the next major bottleneck (which is rather good, so no problem there).
I have likewise used this approach for other things too, like a fixed-size spaceship pool, a GUI element pool, etc. It really is a much faster approach, and probably closer to how games were originally coded on low-powered devices like the GameBoy, NES, etc.