Read about this on my blog? (http://flyingjester.site11.com/FJ-blog.html?id=5)
I was recently thinking about how hard it would be to make a JIT compiler. The first question is, how would I actually generate code? As in, actually get arbitrary machine code put into memory at run time to execute?
Turns out it's not that hard.
Note that all these snippets assume you have a Unix-like environment, an amd64 CPU, and are compiling for 64 bits.
Here, I'm making up an array of raw bytes. They are all NOPs (no-operations: the CPU sees one and does nothing), followed by a final 'ret' instruction. As long as we are in 64-bit mode or have a normal calling convention, 'ret' acts just like the proper keyword 'return' for a function that takes no arguments and returns nothing.
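A minimal sketch of such an array, assuming amd64 encodings (0x90 for NOP, 0xC3 for a near RET). The name kNopCode is made up for illustration.

```cpp
#include <cstddef>

// amd64 machine code: eight one-byte NOPs followed by a near RET.
// As a function body, this does nothing and returns to the caller.
static const unsigned char kNopCode[] = {
    0x90, 0x90, 0x90, 0x90,  // nop; nop; nop; nop
    0x90, 0x90, 0x90, 0x90,  // nop; nop; nop; nop
    0xC3                     // ret
};

// Number of bytes that will later be copied into an executable page.
static const std::size_t kNopCodeSize = sizeof(kNopCode);
```

On its own this is just data. Nothing in it is executable until it is copied somewhere the CPU is allowed to run code from.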
The kind of funky thing is that we not only need to mark the memory we want to execute as executable (which makes sense, here it's the call to mprotect()), we can't do that on just any memory. Normally, the data memory a program gets (the stack, the heap, and so on) is read/write, but not executable, and mprotect() only works on whole, page-aligned regions.
In Unix/POSIX, we can ask for a memory page with mmap(). This ensures we get a whole page, and that the address returned meets a bunch of special rules that we aren't too concerned with the details of. The important part is that addresses returned by mmap() can be mprotect'd to whatever access permissions we need.
Conveniently, we can mmap a page for read/write, copy our machine code to it, and then mark it as executable without too much hassle.
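All we have to do then is explain to our C++ compiler that the address lPage can be called like a function (which it kind of is). Interestingly, static_cast won't do this: it refuses to cross the data/function pointer barrier. You need either a C-style cast or reinterpret_cast, and the C++ standard only conditionally supports that conversion (POSIX systems do allow it, since dlsym() depends on it).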
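Put together, the steps above can be sketched roughly like this. It assumes a Unix-like system that provides MAP_ANONYMOUS (Linux and the BSDs do) and an amd64 CPU. The function name run_machine_code is made up for illustration, and the variable name lPage just follows the text.

```cpp
#include <cstddef>
#include <cstring>
#include <sys/mman.h>

// Copies `size` bytes of machine code into a fresh page, marks the page
// executable, calls it as a void() function, then unmaps it.
// Returns false if any of the system calls fail.
bool run_machine_code(const unsigned char *code, std::size_t size) {
    // Ask the kernel for anonymous, private, read/write memory.
    // The mapping is page-aligned and covers whole pages.
    void *lPage = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (lPage == MAP_FAILED) {
        return false;
    }

    // Copy the machine code in while the page is still writable.
    std::memcpy(lPage, code, size);

    // Flip the page from read/write to read/execute.
    if (mprotect(lPage, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(lPage, size);
        return false;
    }

    // Treat the page address as a function taking and returning nothing.
    void (*fn)() = (void (*)())lPage;
    fn();

    munmap(lPage, size);
    return true;
}
```

Calling it with a buffer holding a few 0x90 bytes followed by 0xC3 should simply return true, because the generated "function" does nothing and returns.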
But, that's not really a compiler of any sort, it's just injecting arbitrary code into a program.
Well, the array of chars that is our machine code could be modified. Say we want to make up some machine code that adds two arbitrary numbers, but we don't want to load the numbers, we want them written into the machine code itself once they are known.
It would look something like this:
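A minimal sketch, assuming amd64 and the same mmap()/mprotect() approach as before. The machine code is a template for "mov eax, imm32; add eax, imm32; ret", and the two numbers are written into the immediate fields before the code is made executable. The name jit_add and the choice of 32-bit operands are my own assumptions for the example.

```cpp
#include <cstdint>
#include <cstring>
#include <sys/mman.h>

// Builds a tiny function that returns a + b, with both numbers baked
// into the instruction stream, then runs it. Returns false on failure.
bool jit_add(std::int32_t a, std::int32_t b, std::int32_t &result) {
    unsigned char code[] = {
        0xB8, 0, 0, 0, 0,  // mov eax, imm32   (imm32 patched with a)
        0x05, 0, 0, 0, 0,  // add eax, imm32   (imm32 patched with b)
        0xC3               // ret              (result is returned in eax)
    };

    // amd64 is little-endian, so the in-memory bytes of an int32_t are
    // exactly the encoding the instructions expect.
    std::memcpy(code + 1, &a, sizeof(a));
    std::memcpy(code + 6, &b, sizeof(b));

    void *page = mmap(nullptr, sizeof(code), PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        return false;
    }
    std::memcpy(page, code, sizeof(code));
    if (mprotect(page, sizeof(code), PROT_READ | PROT_EXEC) != 0) {
        munmap(page, sizeof(code));
        return false;
    }

    // The generated code takes no arguments and returns a 32-bit int.
    std::int32_t (*fn)() = (std::int32_t (*)())page;
    result = fn();

    munmap(page, sizeof(code));
    return true;
}
```

With this sketch, jit_add(2, 40, result) leaves 42 in result, and the generated code never loads either number from memory.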
So that's actually much cooler. Now, we are generating machine code on the fly!
But you know what would be even cooler? If we made the code's behaviour even more dynamic. Just modifying data is cool and all, but we could have just coded in addresses and used pointers in our machine code (that also would have been kind of cool, given that now our machine code would have embedded instance-specific addresses...). What if we actually change both instructions and data to generate our code?
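One way to sketch that, again assuming amd64: keep the same template, but patch the opcode byte as well as the immediates. The short-form opcodes for "op eax, imm32" differ only in their first byte (0x05 add, 0x2D sub, 0x25 and, 0x0D or, 0x35 xor), so picking the operation at run time is a one-byte change. The names Op and jit_binary_op are made up for illustration.

```cpp
#include <cstdint>
#include <cstring>
#include <sys/mman.h>

// Opcode bytes for the "op eax, imm32" short forms on x86/amd64.
enum class Op : unsigned char {
    Add = 0x05,
    Sub = 0x2D,
    And = 0x25,
    Or  = 0x0D,
    Xor = 0x35
};

// Generates "mov eax, a; <op> eax, b; ret", runs it, and stores the
// value it returns in `result`. Returns false if a system call fails.
bool jit_binary_op(Op op, std::int32_t a, std::int32_t b,
                   std::int32_t &result) {
    unsigned char code[] = {
        0xB8, 0, 0, 0, 0,                               // mov eax, imm32 (a)
        static_cast<unsigned char>(op), 0, 0, 0, 0,     // <op> eax, imm32 (b)
        0xC3                                            // ret
    };
    std::memcpy(code + 1, &a, sizeof(a));  // patch first operand
    std::memcpy(code + 6, &b, sizeof(b));  // patch second operand

    void *page = mmap(nullptr, sizeof(code), PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) {
        return false;
    }
    std::memcpy(page, code, sizeof(code));
    if (mprotect(page, sizeof(code), PROT_READ | PROT_EXEC) != 0) {
        munmap(page, sizeof(code));
        return false;
    }

    std::int32_t (*fn)() = (std::int32_t (*)())page;
    result = fn();

    munmap(page, sizeof(code));
    return true;
}
```

For example, jit_binary_op(Op::Sub, 50, 8, result) leaves 42 in result, while passing Op::Xor with 6 and 3 leaves 5. Both the instruction and its data were decided at run time.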
Now that's much more like it.
So, what did I learn from this adventure?
Dynamic code generation and execution is frighteningly easy. I didn't expect this to be so simple, or to work so easily.
Of course, this is bordering on the kind of black magic that could destroy any project. It's ridiculous and completely unnecessary. Don't actually do this unless it is the intended product of your program.
...But it's also really fun to do!
It also looks frighteningly cryptic. :P Good luck debugging any non-trivial example.
I do see a good emulator use case for this, I guess.
The more I think about it, the more I see two ordinary cases, not involving run-time code generation, where this would be useful: as a replacement for inline ASM, and as a method for including binary blobs.
Inline ASM has two notations (AT&T and Intel syntax), and lots of slightly ridiculous rules in it--just look up how to avoid register clobbering with GCC's inline asm! Additionally, inline ASM is not supported on some common architecture/compiler combinations, like Visual Studio targeting amd64 Windows.
This approach has the advantage of depending only on the host architecture (can't get away from that with ASM), while also letting you call directly into whatever ASM block you have, since you have its address.