Monday, December 28, 2009

How to create a program

Right, so I'm back at square one.

I have fibonacci working, again, but this time with Self-like objects. So how, now, should I create this fibonacci library inside my VM?

The first approach is to create a constructor for the library that, when called, goes through and creates everything the library needs. Not too bad for such a simple example.

A second way would be to have a special creator function that uses Reflection/Mirrors to create all the objects and their methods. This does a better job of separating the construction logic from the actual application.

The final approach is to use serialisation. Here we have one common method for serialising an object graph, and its de-serialising sister. The application that we are creating is built within an IDE of some kind, and then serialised for later use.
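A minimal sketch of what the serialisation route might look like, written in Python rather than in the VM itself. Everything here is an assumption made for illustration: objects are modelled as plain dicts of slots, and `freeze`/`thaw` are hypothetical names, not anything the VM provides. The point is only that one generic pair of functions can handle any object graph, including cycles.

```python
import json


def freeze(root):
    """Flatten a graph of slot dicts into a JSON table; references become row indexes."""
    ids, table = {}, []

    def visit(obj):
        if id(obj) in ids:            # already seen: shared object or cycle
            return ids[id(obj)]
        index = len(table)
        ids[id(obj)] = index
        table.append(None)            # reserve the row before recursing
        row = {}
        for name, value in obj.items():
            if isinstance(value, dict):
                row[name] = {"ref": visit(value)}
            else:
                row[name] = {"val": value}
        table[index] = row
        return index

    visit(root)                       # the root always lands in row 0
    return json.dumps(table)


def thaw(text):
    """Rebuild the graph: create every object first, then wire up the slots."""
    table = json.loads(text)
    objects = [{} for _ in table]
    for obj, row in zip(objects, table):
        for name, cell in row.items():
            obj[name] = objects[cell["ref"]] if "ref" in cell else cell["val"]
    return objects[0]


# Hypothetical usage: a library object and a fib object that point at each other.
library = {"name": "fibonacci"}
fib = {"owner": library, "seed": [0, 1]}
library["fib"] = fib
restored = thaw(freeze(library))
```

Creating all the empty objects before filling in any slots is what lets `thaw` cope with the cycle between the two objects.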

I've been trying to think through how this might work in a prototype scenario. I quite like splitting things into a few layers:
  • The application prototype. This contains the definition of all the objects in the application, many of them set up in a prototypical fashion so that users of the program can have a look and see how the thing might be set up. This is essentially immutable.
  • The application instance. (Implying more than one instance...) This extends the prototype and actually does stuff. The prototype should never be touched by the instance (it is shared). This is what is saved when a user wants to save something... using the same serialisation/de-serialisation method.
  • The application transaction scratch space :) where the application instance does some little bits and pieces on the way to completing a computation, but which is too transient to bother saving in the instance.
All this requires thinking about the usual serialisation stuff; how do you draw the line around what is to be serialised? How do you link to stuff within that boundary? Outside the boundary?
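The three layers can be sketched with Python's standard `collections.ChainMap`, which looks keys up through a chain of mappings but only ever writes to the first one. This is a loose analogy under my own assumptions (slots as dict keys, `open_transaction` and `commit` as hypothetical helper names), not a description of how the VM does it.

```python
from collections import ChainMap
from types import MappingProxyType


def open_transaction(instance, prototype):
    """Return a fresh scratch layer and a view that reads scratch, instance, prototype in order."""
    scratch = {}
    return scratch, ChainMap(scratch, instance, prototype)


def commit(scratch, instance):
    """Fold the transient scratch values into the instance, which is the part that gets saved."""
    instance.update(scratch)
    scratch.clear()


# The prototype is shared and read-only; each instance is an ordinary dict on top of it.
prototype = MappingProxyType({"title": "untitled", "seed": (0, 1)})
instance = {}

scratch, view = open_transaction(instance, prototype)
view["title"] = "fib"        # lands in scratch only
commit(scratch, instance)    # now it is part of the instance
```

Only `instance` would need to be serialised when the user saves; the prototype is referenced rather than copied, which is one possible answer to where the serialisation boundary goes.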

In this model the application prototype is similar to the class hierarchy side of things in Java etc, but with a little more information about how the whole thing is set up - there's a lot less need for Spring like things as most of the construction has already been done.

I also wonder how/if you might apply purely functional trees (as discussed here) to the application prototype to enable multiple versions of the same application running at once, or updates being applied atomically to a currently running app.

private vs public

So my objects have two collections (maps to be precise); the private slots and the public interface.

The private slots can be any object. They could therefore be used for private methods, or anonymous functions which are then used within a method (this reduces the overhead of creating anonymous functions every time we use them), or even constants within the object (eg numbers and strings).

Using constants in this way means we don't need to provide the object with any knowledge of those constants' factories etc except when the object is constructed. This is a benefit to the image/deserialisation approach to application development (more about this later).

While I say 'private', 'protected' may be a better word, as the slots are visible to all children of the current object. They are also 'updatable', but only from the current object's point of view: the original value on the original object does not change.

The public interface is made up of only methods - messages that can be sent to the object by any other object that has a reference.
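A rough Python sketch of the two maps, assuming single inheritance for simplicity. The class name `Obj` and its `_get`, `_set` and `send` methods are all made up for illustration, and Python's underscore convention does not actually enforce privacy; the sketch only shows the lookup and update rules described above.

```python
class Obj:
    """An object with private slots and a public interface of messages."""

    def __init__(self, parent=None, private=None, public=None):
        self._parent = parent
        self._private = dict(private or {})   # any value: constants, helpers, ...
        self._public = dict(public or {})     # methods only

    def _get(self, name):
        # Private slots are visible to children, so walk up the parent chain.
        obj = self
        while obj is not None:
            if name in obj._private:
                return obj._private[name]
            obj = obj._parent
        raise KeyError(name)

    def _set(self, name, value):
        # Updates only ever touch the current object, never the original.
        self._private[name] = value

    def send(self, message, *args):
        # Find the method up the chain, but run it against the receiver.
        obj = self
        while obj is not None:
            if message in obj._public:
                return obj._public[message](self, *args)
            obj = obj._parent
        raise AttributeError(message)


def bump(this):
    this._set("count", this._get("count") + this._get("step"))
    return this._get("count")


counter = Obj(private={"count": 0, "step": 1}, public={"bump": bump})
child = Obj(parent=counter)
child.send("bump")   # the child gets its own 'count'; counter's stays at 0
```

The constant `step` lives in a private slot, so `bump` never needs to know where it came from.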

It is obviously imperative that objects have this split from a security point of view, but it also helps with data/implementation hiding, a very important part of modular development.

The problem with JavaScript (and I think this is what Bracha was getting at) is that everything in the object is public. Now you can use patterns as defined by Crockford, but these are a bit kludgy and a bit nasty in that they create a shit load of objects (a fresh set of closures for each and every object), and hence lead to multiple approaches to object creation.

Friday, December 18, 2009

Selflike

So I'm almost to the point where things work the way I understand Self to work. I have one bug and one conundrum to contemplate.

The bug is that functions may send messages to their context instead of the enclosing object. Basically, if an implicit receiver is used, the correct message will be found, but the object that the message should be applied to is ignored and the current activation is always used. Not a hard fix, but I have to write some tests to get it right.

The conundrum concerns instantiation of an application. There are two options: explicit builder code that is run inside a secure package (read 'object'), or a serialisation approach where objects are specified concretely and the VM reconstructs them from the file.

Since it would be possible to implement the second option in the first, I guess that is what I'll go for, but the mood of the VM is more in line with the objects existing in some freeze-dried state rather than executing a series of opcodes to create everything.

Wednesday, December 16, 2009

What's changed

A list of things that I am changing to get this more Self-like:
  • Implicit receiver is always 'this'.
  • Changed Object to have multiple inheritance
  • Remove 'context' slot from Context (made its way onto Object), turn into named super slot
The big problem at the moment is setters. They work fine for objects, but what about functions and their contexts? A closure needs to be able to update the value of its context, but we don't want just anyone able to do that...

After thinking a little while, it turns out that it is just setters. Getters work fine, and of course messages are all public so it doesn't matter. But what we really want is a way to say 'this thing is settable by the objects I contain' (be they functions or objects). This opens the object up to nefariousness, so let's say that it can only be imbued in things the object created; you can't willy-nilly specify that an Object's nester is any other object.

Ahhh. Sux

KISS

I'm sick of racking my brain over Newspeak's confounding context/inheritance hierarchy*. It's just too nasty to implement, with so many ifs and buts. So fugger it, I'm going back to Self.

Instead of having complex semantics about how to look up receivers, I'm just going to use multiple inheritance. Setters only affect the object that they are executed within ('this') while Getters go up and down the entire hierarchy.

This may sound slow, but once you've compacted the thing with PICs and so on, it really doesn't matter. I guess you could say that about the Newspeak approach as well, but this is a very simple algorithm and so should be quick to implement.

So where does 'nesting' go? Well, it's essentially a 'super' class, isn't it? In fact, apart from the convoluted semantics for receiver look up, that's all context is in Newspeak as well.
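As a sketch of the simplified scheme, here is a small Python model where nesting is just one more named parent slot. The names `SelfObject`, `parent` and `nester` are mine, and resolving clashes by taking the first match in declaration order is an assumption of this sketch, not something the post (or Self) specifies.

```python
class SelfObject:
    """Slots plus any number of named parent slots; 'nester' is just another parent."""

    def __init__(self, slots=None, **parents):
        self.slots = dict(slots or {})
        self.parents = dict(parents)   # e.g. parent=..., nester=...

    def get(self, name):
        # Getters search this object, then every parent, depth-first.
        seen = set()
        stack = [self]
        while stack:
            obj = stack.pop()
            if id(obj) in seen:        # guard against diamonds and cycles
                continue
            seen.add(id(obj))
            if name in obj.slots:
                return obj.slots[name]
            # Push in reverse so the first-declared parent is searched first.
            stack.extend(reversed(list(obj.parents.values())))
        raise KeyError(name)

    def set(self, name, value):
        # Setters only ever affect 'this'.
        self.slots[name] = value


base = SelfObject({"greeting": "hello"})
outer = SelfObject({"limit": 10})
inner = SelfObject({}, parent=base, nester=outer)

inner.get("limit")      # found through the nester
inner.set("limit", 3)   # shadows it locally; outer is untouched
```

A real implementation would put inline caches in front of `get`; the uncached walk is shown here only because it is the whole algorithm.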

What do we lose? Well, it all ends up much less well defined, which is the power and the pain of Self, and maybe there's a security problem, but I believe most of that will be overcome by making sure that you are Dependency Injected. (Self essentially made everything accessible by keeping a slot in Object that leads to the package hierarchy.)

So now we start thinking solely in instantiated objects... not pretending that some things are classes and some are contexts. It's all just objects.

Newspeak may prove better, but for my proof of concept I need something that I understand, so this will be it (Sorry Gilad that my brain is just too small)

*eg, I can't get a super receiver to then call an overriding method on the original object - but I'm sure it's just me.

Sunday, December 6, 2009

when all you have is semantics, is it a surprise that all your problems are with... semantics?


Trying to get the behaviour of message and slot lookup/access/modification correct.

Starting with a 'simple' example of inheritance, I use the above relationships (note: no nested classes yet)

So, Func0 can get at obj1, func0Slot, func0Message as well as obj1.obj1Message and obj1.obj0Message.

Obj1 constructor can get at func0Slot, func0Message, obj1Slot, obj1Message as well as obj0Message. Obj1 can update the value in func0Slot.

Obj0 can get at obj0Slot and obj0Message. Or can it?

Most of my tests have worked, with the relationship between obj1 and func0 being quite well understood and working well. The problem comes in when I get to inheritance from obj0.

While I can find the appropriate message, the behaviour for updating the slot is broken; reading the slot should (initially) return the value in the original obj0 object - but there is no code to search for slots up the hierarchy... This could be fixed by having the message execute within the context of the original object BUT if the message updated the slot, then the original object would be affected rather than the inheriting object. Boo.

So we end up with two context setting schemes. When looking up the context hierarchy, we update the slot where we find it. When looking up the inheritance hierarchy, we update the slot at the bottom of the tree!
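The two schemes side by side, as a hedged Python sketch. `Node`, `set_via_context` and `set_via_inheritance` are hypothetical names, and the relationships used in the example (obj1 created inside func0, and a child inheriting from obj0) are my reading of the text above.

```python
class Node:
    """A bag of slots with a lexical context and an inheritance parent."""

    def __init__(self, slots=None, context=None, parent=None):
        self.slots = dict(slots or {})
        self.context = context   # enclosing activation or object
        self.parent = parent     # object inherited from


def set_via_context(start, name, value):
    """Context scheme: update the slot in whichever enclosing context holds it."""
    node = start
    while node is not None:
        if name in node.slots:
            node.slots[name] = value
            return node
        node = node.context
    raise KeyError(name)


def set_via_inheritance(receiver, name, value):
    """Inheritance scheme: the slot must exist up the chain, but the write lands on the receiver."""
    node = receiver
    while node is not None:
        if name in node.slots:
            receiver.slots[name] = value
            return receiver
        node = node.parent
    raise KeyError(name)


func0 = Node({"func0Slot": 1})
obj1 = Node({"obj1Slot": 2}, context=func0)
set_via_context(obj1, "func0Slot", 5)        # func0's own slot changes

obj0 = Node({"obj0Slot": 7})
child = Node({}, parent=obj0)
set_via_inheritance(child, "obj0Slot", 8)    # obj0 keeps 7, child now holds 8
```

Having two functions makes the awkward question concrete: something, either the compiler or the calling code, has to pick which one applies to a given slot.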

But now the slots in parent objects are available (if you dig around) in child objects. Not ideal. And how do I determine which scheme to use - does the code have to specify?!!

So I'm a bit miffed.

Wednesday, December 2, 2009

...But in the end it helps out

Gilad's post got me thinking about the difference between objects and hashes, and specifically the encapsulation that objects bring to the table.

This then led me to realise that ultimately, in my VM and presumably Newspeak, the slots that are available in a function call are exactly the same as the slots available in an object. And the only things that have direct access to these slots are the sub-contexts (closures) that exist beneath this level.

And so objects and functions come closer and closer as closures save the day. There are two differences I've found so far, and both are additional object behaviour, and both are because of inheritance.

Firstly Objects have parents - otherwise there would be no inheritance at all!

Secondly, according to the Newspeak message lookup scheme, if the receiver hasn't been found within the context, then it should be found up the object hierarchy (though only of the deepest nested object).

Interestingly this fits well with the access rights. If you can only access things within your context, you can't affect parent objects unless they provide a message to you - which is the desired behaviour.

Anyway, I have this almost implemented (surprisingly easy) and will get back to the compiler as soon as I've
  • written tests to make sure that objects created in situ will honour their parents (at the moment they won't because they are actually function contexts...)
  • found a way to enforce contextualisation of objects/functions to only the current context... at the moment you could potentially hack the context that an object is defined within!