Topic: TurboSphere (Read 190593 times)

  • DaVince
  • Administrator
  • Used Sphere for, like, half my life
Re: TurboSphere
Reply #285
Good job, I remember it freezing for half a second on Windows!

On a technical note, I have no clue what mutexes and atomics are, and why they affect the performance like they do. Think you can summarize or link to good articles explaining these?

  • Fat Cerberus
  • Global Moderator
  • Sphere Developer
Re: TurboSphere
Reply #286
Yeah, mutexes are meant to be used for cross-process signaling (e.g. a lot of apps create a mutex when running that their installer checks for and prevents installation if it exists), so they have a good amount of overhead. So it's good that you fixed that.

@DaVince:
Mutexes and such are used to synchronize operations between different threads and/or processes that need to manage a shared resource (for example, rendering and updating code in a game). "Mutex" is short for "mutual exclusion", which is a pretty fair description of what it does, actually. :)
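
For anyone who wants to see the "mutual exclusion" idea in code, here is a minimal C++11 sketch. It is not taken from TurboSphere; `SharedCounter` and `run_counter_demo` are made-up names for illustration. Several threads bump one counter, and the mutex makes sure only one of them touches it at a time.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical shared resource: a counter that several threads increment.
struct SharedCounter {
    std::mutex lock;  // guards 'value'
    long value = 0;

    void increment() {
        // lock_guard locks the mutex here and unlocks it at scope exit.
        std::lock_guard<std::mutex> guard(lock);
        ++value;
    }
};

// Starts 'threads' workers that each increment the counter 'perThread' times,
// waits for all of them, and returns the final count.
long run_counter_demo(int threads, int perThread) {
    SharedCounter counter;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, perThread] {
            for (int i = 0; i < perThread; ++i) counter.increment();
        });
    }
    for (auto& w : workers) w.join();
    return counter.value;
}
```

Without the `lock_guard` line the increments from different threads could overlap and some would be lost; with it, the total always matches `threads * perThread`.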
neoSphere 5.9.2 - neoSphere engine - Cell compiler - SSj debugger
forum thread | on GitHub

  • DaVince
  • Administrator
  • Used Sphere for, like, half my life
Re: TurboSphere
Reply #287
Thanks for the explanation. Sounds useful for the right situation. :)

Re: TurboSphere
Reply #288
One of the biggest 'issues' with mutexes is that locking and unlocking them is slow(ish). Which is why you need to be very efficient with thread synchronization.

Mutexes are great for protecting a shared resource--they symbolize it. If you have locked the mutex, then you have locked every other thread out of the resource and can do whatever you want to it without worrying about other threads meddling with it.
But they aren't so great for signalling (and also not explicitly made to do so in the way I was using them). They work fine for it, but this is where that whole slowish thing comes in. If you profile TurboSphere, you will see that over half of its time is spent locking and unlocking mutexes. But I've managed to reduce that to about a third (even with the surface thread skewing this--most of what it does is wait for surface operations to be queued, just locking and unlocking mutexes with nothing better to do).

I've changed it so that instead of a mutex that controls whether or not the surface thread should do whatever or work on a specific surface, there is an atomic variable (atomic, as in I can guarantee that I can write or read it from multiple threads and nothing horrible will happen). The atomic is just a flag that the surface thread checks, and the main thread sets and unsets.
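
A rough sketch of that kind of flag, assuming C++11 `std::atomic` (the post doesn't say which atomics API TurboSphere actually uses). The names `surface_requested`, `worker_loop` and `run_signal_demo` are invented for the example.

```cpp
#include <atomic>
#include <thread>

// Hypothetical flag: the main thread sets it when it needs the worker's attention.
std::atomic<bool> surface_requested{false};

// Worker side: keeps doing background work until it sees the flag.
// Returns how many units of background work it got through.
long worker_loop() {
    long units = 0;
    while (!surface_requested.load(std::memory_order_acquire)) {
        ++units;                    // stand-in for one queued surface operation
        std::this_thread::yield();  // be polite while polling
    }
    return units;
}

// Main-thread side: start the worker, raise the flag, wait for it to finish.
bool run_signal_demo() {
    surface_requested.store(false);
    long done = -1;
    std::thread worker([&done] { done = worker_loop(); });
    surface_requested.store(true, std::memory_order_release);
    worker.join();  // join() makes the worker's write to 'done' visible here
    return done >= 0;
}
```

The flag only tells the worker what to do next; it does not protect any data. If the worker reads it a moment late, the only cost is one more unit of background work.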

There is one last mutex I think I could replace with an atomic. It would probably improve performance even more (this is half of why blitting surfaces directly to screen is slow in TurboSphere), but replacing it would require some changes to the structure of the surface thread and supporting functions in the main thread. I plan on doing it, but it will take serious work. Threading requires a lot more consideration and concentration than most other things I've done. Which is why there are copious comments in the threading code in TurboSphere (which is probably why in turn it worked almost perfectly the first try!).
  • Last Edit: November 18, 2013, 02:48:49 pm by Flying Jester

  • Radnen
  • Senior Staff
  • Wise Warrior
Re: TurboSphere
Reply #289
I'm always one for getting your design done right before you optimize (so a good design that's initially slow is a good thing). So it's good to see you optimizing right about now (non-trivially of course).

About the surface caching: I do that too! Whenever you modify the surface a changed flag is set to true. What it does is delay the processing of the surface drawing until you actually do one of three things: convert to image, blit to screen, and save to file (which sets the flag back to false).
If you use code to help you code you can use less code to code. Also, I have approximate knowledge of many things.

Sphere-sfml here
Sphere Studio editor here

Re: TurboSphere
Reply #290
I don't delay the processing per se. The surface thread just goes along working on surface operations in the order they come in, at its own pace. When a certain surface is needed, the surface thread goes through all the operations where the requested surface is the destination, and checks the source surface. If the source surface has older pending operations where it is the destination, those operations are performed first, then the original operation on the requested surface.
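
The ordering rule described above can be sketched roughly like this. It is a single-threaded illustration only, with invented names (`SurfaceOp`, `OpQueue`, `require`), and it assumes surfaces are identified by small integers. It covers just the case described in the post (older writes to a source run first); it does not try to handle a newer write to a surface running before an older operation that reads it.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one queued surface operation.
struct SurfaceOp {
    int dest;          // surface being drawn to
    int source;        // surface being read from, or -1 if none
    std::string name;  // label used only for the log below
    bool done;
};

class OpQueue {
public:
    void push(int dest, int source, std::string name) {
        ops_.push_back(SurfaceOp{dest, source, std::move(name), false});
    }

    // Called when 'surface' is actually needed (blit, save, image conversion).
    void require(int surface) { flush(surface, ops_.size()); }

    const std::vector<std::string>& log() const { return log_; }

    std::size_t pendingCount() const {
        std::size_t n = 0;
        for (const SurfaceOp& op : ops_) if (!op.done) ++n;
        return n;
    }

private:
    // Performs every pending op older than index 'limit' whose destination
    // is 'surface', oldest first.
    void flush(int surface, std::size_t limit) {
        for (std::size_t i = 0; i < limit; ++i) {
            if (ops_[i].done || ops_[i].dest != surface) continue;
            // Older pending ops that write to the source have to run first.
            if (ops_[i].source >= 0) flush(ops_[i].source, i);
            log_.push_back(ops_[i].name);  // stand-in for the real drawing work
            ops_[i].done = true;
        }
    }

    std::vector<SurfaceOp> ops_;  // oldest first
    std::vector<std::string> log_;
};
```

For example, queue "fill B", then "blit B onto A", then "fill C", and require surface A: the first two run in that order and the fill on C stays pending.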

But the only time a surface is needed is for blitting, saving, and creating an image. At one point I removed the calls for requiring the surface from all the primitives, but those changes aren't in TurboSphere anymore (I lost my laptop's HDD, and it was over the summer when I had such bad internet that it was a serious ordeal to commit changes).

In TurboSphere it's more like out-of-order, lazy computation than delayed computation.

In my experience, requiring a surface usually flushes out almost all of the pending operations (forcing them to be completed right then and there). It's just how things work in practice. But the system could be used in a way that lets you do a ton of surface work in the background, fully parallel to the main thread. The test game that comes with TurboSphere does this with the Majestic Map Engine (that's what can cause the engine to hang on startup).

Mainly what caching would entail is just adding a texture to the userdata of the surface, and a flag that determines if the surface needs to be pushed to the texture again. Any operation on the surface will unset the flag, and a blit, save, or conversion will first push the new surface data to the texture.
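
As a sketch of what that flag could look like (not the actual TurboSphere code; `Texture` and `CachedSurface` are stand-ins, and the "upload" is just a copy so the example runs without a GPU):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a GPU texture; 'uploads' counts how often data was pushed to it.
struct Texture {
    std::vector<std::uint32_t> pixels;
    int uploads = 0;
};

class CachedSurface {
public:
    CachedSurface(int width, int height)
        : pixels_(static_cast<std::size_t>(width) * height, 0),
          textureValid_(false) {}

    // Any operation on the surface unsets the flag.
    void setPixel(std::size_t index, std::uint32_t color) {
        pixels_[index] = color;
        textureValid_ = false;
    }

    // Blit, save, or conversion asks for the texture; the surface data is
    // only pushed again if something changed since the last push.
    const Texture& texture() {
        if (!textureValid_) {
            texture_.pixels = pixels_;  // stand-in for the real texture upload
            ++texture_.uploads;
            textureValid_ = true;
        }
        return texture_;
    }

private:
    std::vector<std::uint32_t> pixels_;
    Texture texture_;
    bool textureValid_;
};
```

Blitting the same unchanged surface many times then costs one upload instead of one per blit.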
  • Last Edit: November 18, 2013, 03:38:53 pm by Flying Jester

  • Fat Cerberus
  • Global Moderator
  • Sphere Developer
Re: TurboSphere
Reply #291
The thing with mutexes though, at least under Windows, is that they are cross-process. This is a big part of why they are slow. It also means, if one process creates a mutex, another can't use the same name or it'll just get the existing mutex back (which is bad if more than one instance of TS is running).  It's been a long time since I've written any multithreaded code, but I was almost positive there is a process-local alternative to mutexes (semaphores or something? IDR) that is faster and won't trample over other engine instances.

Not sure what you're getting at with the atomics though.  To me that sounds suspiciously like DCLO, which can still cause race conditions under preemptive threading, even with an atomic flag variable.

Re: TurboSphere
Reply #292
I don't think that is a feature of semaphores. Semaphores are like Mutexes, but don't explode if they are modified by a different thread, and can be locked and/or unlocked a number of times. They are evil, complicated, almost never the right solution, and evil.

The SDL2-exposed mutexes definitely aren't inter-process by name, or at least are mangled in some way to avoid it.

The atomics aren't guarding anything. They are sending signals that change how another thread will operate. Even if they were always wrong, TurboSphere would run fine, just with more pauses or slightly slower in general.
The way it's built right now, the worst thing that can happen is that the surface thread will do one extra operation before working on a specific surface.

You can, theoretically, replace almost all mutexes with atomic flags. But if you do replace data-guarding mutexes, there be dragons. I'm just replacing a signalling mutex with an atomic.

...DCLO?
  • Last Edit: November 18, 2013, 06:17:21 pm by Flying Jester

  • Fat Cerberus
  • Global Moderator
  • Sphere Developer
Re: TurboSphere
Reply #293
Okay, phew. I was worried you were using the atomics for synchronization, which as you said, is asking for trouble (on a sidenote, "There be dragons" is priceless and never gets old :)).  And as for this...


...DCLO?


...I was referring to what is known as "double-checked locking optimization" (look THAT one up on Wikipedia, it's there), where you check for a locking condition first, and only if that condition is met do you lock the mutex, otherwise you continue as usual.  Bad enough as is, but I've seen some pretty terrible DCL implementations that simply use a bare variable as a guard, assuming the fact it's atomic is enough (which as we know, it isn't. It really, really isn't).  So yeah, good to know you're not doing that!
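
For reference, here is roughly what a double-checked lock looks like when it is done with real atomics rather than a bare variable. This is a generic C++11 sketch, not anything from TurboSphere; `Resource`, `g_resource`, `g_init_lock`, `g_init_count` and `get_resource` are made-up names. The acquire/release ordering on the pointer is the part the broken versions leave out.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical lazily created object.
struct Resource {
    int value;
};

std::atomic<Resource*> g_resource{nullptr};
std::mutex g_init_lock;
int g_init_count = 0;  // only touched while holding g_init_lock

Resource* get_resource() {
    // First check, without the lock. The acquire load pairs with the release
    // store below, so a non-null pointer implies a fully constructed object.
    Resource* r = g_resource.load(std::memory_order_acquire);
    if (r == nullptr) {
        std::lock_guard<std::mutex> guard(g_init_lock);
        // Second check, with the lock held: another thread may have won.
        r = g_resource.load(std::memory_order_relaxed);
        if (r == nullptr) {
            r = new Resource{42};  // intentionally never freed in this sketch
            ++g_init_count;
            g_resource.store(r, std::memory_order_release);
        }
    }
    return r;
}
```

Since C++11 a function-local `static` gives the same once-only initialization with a lot less ceremony, so hand-written versions like this are rarely needed.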

On a sidenote, all this progress on TurboSphere is making me want to try making an engine in C++ again.  It's been years since I've done that and the best I ever came up with was a crappy software-rendering engine that could only blit and draw rectangles, with some other gimmicky stuff thrown in like a universal map engine (meant to be extensible but I didn't really have the know-how to pull that off) and such.  I will say though that my time-based update system was pretty awesome.  That allowed the game to run at any framerate, even hundreds of thousands of FPS, and still update everything at a constant rate, completely decoupling the updating and rendering routines without even having to throttle updates.  (The downsides being: to achieve that kind of precision I had to use QueryPerformanceCounter, a facility I'm not sure is available in anything other than Windows; and double-precision was required for almost all gameplay variables).  But yeah, ultimately it was a mess; that, combined with how easy it is to program with Sphere and JS, hasn't exactly made me want to go back and try again. Maybe some day...
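
The post doesn't show how that update system worked, so the following is only a guess at the general idea: the common delta-time approach, where every update is scaled by the real time that passed. `Body`, `update`, `simulate` and `seconds_since` are invented names. On the QueryPerformanceCounter point, `std::chrono::steady_clock` is the portable standard C++11 way to get a monotonic timer.

```cpp
#include <chrono>

// Hypothetical gameplay object; doubles keep very small time steps meaningful.
struct Body {
    double position;
    double velocity;  // units per second
};

// Advances the body by however much real time has passed.
void update(Body& body, double dtSeconds) {
    body.position += body.velocity * dtSeconds;
}

// Simulates 'totalSeconds' of game time split into 'frames' equal frames
// and returns the final position.
double simulate(double totalSeconds, int frames) {
    Body body{0.0, 8.0};
    const double dt = totalSeconds / frames;
    for (int i = 0; i < frames; ++i) update(body, dt);
    return body.position;
}

// Portable timing helper: seconds elapsed since 'start'.
double seconds_since(std::chrono::steady_clock::time_point start) {
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Running one second of game time as 4 frames or as 8 frames moves the body the same distance. With frame times that aren't exact in binary the results agree only up to floating-point rounding, which is one reason double precision matters here.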
  • Last Edit: November 18, 2013, 09:55:58 pm by Lord English

Re: TurboSphere
Reply #294
I found my solution to ending up with a mess when I program from my years with TurboSphere. Compartmentalize as absolutely far as possible. Not only does it encourage good considerations for code reuse, but it allows you to refactor or replace most pieces fairly easily in case they get out of hand.

The plugin system and the four supporting libraries (sort of five) were my first experiment with that style, and I'm extremely happy with how it turned out. In fact, I've replaced the largest plugin, the graphics plugin (which I consider to be bordering on too largely scoped) twice now with few problems.

Originally, I started TurboSphere with the intention that I would write C++ just this once, so that I wouldn't have to ever again. It didn't really turn out that way...I actually have grown to like C++ and C, despite their slightly-screwed-upness.

  • Fat Cerberus
  • Global Moderator
  • Sphere Developer
Re: TurboSphere
Reply #295
I don't really mind C++ as a language, even with all its quirks, and unlike a lot of people I don't consider the fact it allows you to shoot yourself in the foot a bad thing, as it means you can do some pretty evil stuff when needed (there is still such a thing as "necessary evil" :) ).  No, the biggest turnoff for me is its awful clunky header file system.  Headers were okay in C (where the only things you could expose were functions, variables, and the occasional struct) but in something as complex as C++ it's horrifying.  Class declarations have to include all private variables (unless you use something like the pImpl idiom, which hurts performance and requires more work to maintain) to avoid segfaults, which has the nasty side-effect of introducing extra dependencies for anything including the header--and that doesn't even begin to scratch the surface of the tip of that iceberg.
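
For readers who haven't run into the pImpl idiom, here is a minimal single-file sketch. `Sprite` is a made-up class; in a real project the first half would sit in the header and the second half in the .cpp file.

```cpp
#include <memory>
#include <string>
#include <utility>

// --- what would live in the header ---
class Sprite {
public:
    explicit Sprite(std::string name);
    ~Sprite();  // defined below, where Impl is a complete type
    Sprite(Sprite&&) noexcept;
    Sprite& operator=(Sprite&&) noexcept;

    const std::string& name() const;
    void move(int dx, int dy);
    int x() const;
    int y() const;

private:
    struct Impl;                  // forward declaration only
    std::unique_ptr<Impl> impl_;  // the header never sees the private members
};

// --- what would live in the .cpp file ---
struct Sprite::Impl {
    std::string name;
    int x;
    int y;
};

Sprite::Sprite(std::string name) : impl_(new Impl{std::move(name), 0, 0}) {}
Sprite::~Sprite() = default;
Sprite::Sprite(Sprite&&) noexcept = default;
Sprite& Sprite::operator=(Sprite&&) noexcept = default;

const std::string& Sprite::name() const { return impl_->name; }
void Sprite::move(int dx, int dy) { impl_->x += dx; impl_->y += dy; }
int Sprite::x() const { return impl_->x; }
int Sprite::y() const { return impl_->y; }
```

The trade-off is the one mentioned above: every access goes through a pointer to a separately allocated object, and each public method needs a forwarding definition.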

.NET does a much better job of managing dependencies.  All you have to do is add the DLL as a reference and (optionally) import the proper namespace(s) and you're good to go.  No messing with headers or import libs or anything stupid like that.

...and yeah, I think I've ranted long enough. :-X

Re: TurboSphere
Reply #296
I don't mind headers too much. They are like a promise that the functions will exist at runtime.
There are ways around declaring private data and functions in the headers without PIMPL, but they are not worth the effort in my opinion.

I've updated the TurboSphere API page with all the functions I feel comfortable saying are in TurboSphere. Some of them are buggy--I don't entirely trust the rawfile and bytearray objects in particular, and I haven't extensively tested joystick support. I also haven't fully tested the key constants since the change to SDL2, but I haven't seen a case of them being wrong yet.

Known bug: Saving to PNG will probably cause slightly annoying hanging in Windows. That's because there is a horrendous mix of MS tools, make, scons, and cmake involved for me to share the same libpng shared library with SDL2 on Windows. It works on Linux for me, but only on one machine (the other also hangs, even on Linux).

Saving to TGAs is my recommendation when file size is of concern, BMP otherwise. I've tested the TGA implementation fairly well, and the files are fully readable in feh, Windows Image Viewer, GIMP, and gPicView. I'm satisfied with that.

Also, GarbageCollect() finally actually does something since 0.3.5. Not much, but it does send a signal to V8 to do some GC'ing.

Re: TurboSphere
Reply #297
The sprite batcher is coming along. It automatically sorts new textures into texture atlases to limit texture changes. It can read new textures from surfaces or images.
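
The post doesn't say how the batcher places textures inside an atlas, so this is only an illustration of one simple scheme ("shelf" packing: fill a row left to right, then start a new row). `Rect` and `ShelfAtlas` are invented names, not TurboSphere types.

```cpp
// Position and size of one texture inside the atlas, in pixels.
struct Rect {
    int x, y, w, h;
};

class ShelfAtlas {
public:
    ShelfAtlas(int width, int height)
        : width_(width), height_(height), cursorX_(0), shelfY_(0), shelfH_(0) {}

    // Tries to reserve a w-by-h region. On success fills 'out' and returns
    // true; on failure leaves the atlas unchanged and returns false.
    bool insert(int w, int h, Rect& out) {
        if (w <= 0 || h <= 0 || w > width_ || h > height_) return false;
        int x = cursorX_, y = shelfY_, shelfH = shelfH_;
        if (x + w > width_) {  // current shelf is full: start a new one below
            y += shelfH;
            x = 0;
            shelfH = 0;
        }
        if (y + h > height_) return false;  // no vertical room left
        out = Rect{x, y, w, h};
        cursorX_ = x + w;
        shelfY_ = y;
        shelfH_ = (h > shelfH) ? h : shelfH;  // shelf is as tall as its tallest item
        return true;
    }

private:
    int width_, height_;
    int cursorX_;  // next free x on the current shelf
    int shelfY_;   // top of the current shelf
    int shelfH_;   // height of the current shelf so far
};
```

When `insert` returns false, a batcher built this way would open another atlas and put the texture there, so everything sharing an atlas can be drawn without a texture change.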

This will be a pretty slick way to draw certain persistent parts of games simply and quickly, and be the graphical backend for the map engine.

  • Radnen
  • Senior Staff
  • Wise Warrior
Re: TurboSphere
Reply #298
Now that, that will put "turbo" in turbosphere. :)

Re: TurboSphere
Reply #299
Yes, it will! The biggest thing I can see about the Sphere API that limits performance is that it pretty much looks like a direct translation of array drawing in OpenGL. Which isn't exactly slow, but it's certainly not as fast as using buffers.

The biggest limit on performance in TurboSphere's graphics plugin right now is switching textures. I might just set it up to use the texture-combining part of the sprite batcher for all images/textures.