Thursday, September 24, 2009

How small is your instruction set?

[these posts are quite short because I typically have a little baby crying for my attention, see my 'Big T, little t' blog]

Let's make some big assumptions and work from there. While these assumptions are big, they give me a starting point to do some small, incremental work. Which is much easier than big upfront stuff - I am an Agilist if you weren't aware.

Imagine that your opcodes are compiled for you. Imagine that your opcodes are executing inside an image that already contains all the objects you want, for example any constants. Imagine that your dependency injection/capability interface provides you with a 'VM' object that can do things like addition and so on with a simple method invocation.

Given those assumptions, how many different opcodes do you need? Not many!

In fact, all you _need_ to do is be able to send a message. Because your message may have variable numbers of parameters, you may need another opcode to build up those parameters. So that's two. I'll add a third later, but for now that is my minimal VM implementation.

It's kinda like RISC vs CISC. I have 2 instructions, and rely on the JITer to make everything work hella fast. You (squirrelfish for example) have a bunch of instructions so you can interpret your opcodes fast. I'm pretty sure I'll end up down the squirrelfish route, but for now I'm more interested in implementing this minimal instruction set.

Here's what 'add' looks like on a number

function add: to
push this
push to
call vm, add_number, null

where 'null' is the target of the result

which may be used as follows

a.result = (1 + 2) / 3 // I quite like this example...

push 2
call 1, add, temp_1
push 3
call temp_1, divide, temp_2
push temp_2
call a, result, null

boom, as they say, shanka. Now you might wonder where all those variables came from. Well they are local variables/constants within the already compiled function - I did mention that everything was compiled and all the objects we need exist...

This makes for a nice, tiny VM that I can write very quickly and get going very quickly. I love small goals and lots of rewards!

The only real complication on top of the two opcodes are the actual data structures involved.

I'll need a general object, and then function objects and number objects (to start with). I'll need to create, in the 'image' (which will be constructed in the test code) the constants 1,2,3 and their parent, the object a and the function that is being executed. Quite a bit of work, but not as much as if I created an exhaustive instruction set.


Above I mentioned another instruction. Because of the Newspeak/NewtonScript inspiration for this VM, I want to be able to send messages to an unknown recipient. To be perfectly honest, this is almost happening when accessing constants and temps above, but things would get very cyclical if I couldn't do that.

So my new instruction is

find_and_call message_name, target

which will first look inside all closures/activation objects (ie the context where the instruction was defined) and then look at the current object's parents.

No comments:

Post a Comment