Listened to a great interview with Alistair Cockburn last night. I love his 'Agile Software Development' book, and this interview was an inspirational reminder of his approach to agile, and the deep reasoning behind the fabulous agile manifesto.
Give it a listen (tell, don't ask)
Friday, December 31, 2010
Writing Legacy Code
The stuff I'm currently working on was written in a week as an experiment in productivity... it ended up with over 300 bugs, though probably only 200 were in the code itself.
One week of coding, six months ago, who'd have thought that something that 'fresh' could be considered legacy - but it has certainly left a legacy behind.
My last posting at my last job had me writing legacy code. I was stuck in a codebase that felt legacy and required legacy hacks to get it to do what you wanted. I should have thrown my toys out of the pram, but didn't have the cojones. That code was 3 months old when I started hacking on it.
It seems that age doesn't have much to do with the definition of 'legacy code'. If this is the case, what is a good definition of legacy code...
Code without tests
Simplez, yes? If your code has no tests:
- You don't know what it should do
- You can't trust it does what it should do
- It's probably highly coupled
- etc, etc
This means that if you are writing against a codebase that has no tests, you are writing against a legacy codebase. If you are writing code and not writing tests, you are writing legacy code.
(Just for the record, your tests should be automated, and they should be at least unit tests. If you are doing otherwise, you are wasting a lot of time sitting there manually testing every codepath in your codebase)
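For what it's worth, 'automated, at least unit tests' doesn't have to mean much. Here is a minimal sketch using Python's built-in unittest; the function `parse_price` and its behaviour are invented purely for illustration.

```python
import unittest


def parse_price(text):
    """Hypothetical function under test: turn '1.50' into 150 (pence)."""
    pounds, _, pence = text.strip().partition(".")
    # Pad '5' to '50' and an empty string to '00', then keep two digits.
    return int(pounds) * 100 + int(pence.ljust(2, "0")[:2])


class ParsePriceTest(unittest.TestCase):
    def test_whole_pounds(self):
        self.assertEqual(parse_price("2"), 200)

    def test_pounds_and_pence(self):
        self.assertEqual(parse_price("1.50"), 150)

    def test_single_decimal_digit(self):
        self.assertEqual(parse_price("1.5"), 150)
```

Run it with `python -m unittest` and every one of those codepaths gets checked on each change, with nobody sitting there clicking.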
Logging Token Pattern
Ok, Ok, so this is speculative as I haven't tried it yet, but hear me out.
On my last assignment with ThoughtWorks I was working with my good friend Simon Brunning. We were doing some file processing stuff that was apt to fail, if not on a complete file basis, then at least on a line by line basis.
We wanted to know what went right and what went wrong. Originally the code passed around a bunch of lists that would contain working things and failing things and exceptions (yes, the actual object) and so on.
Brunning created a customised object that tracked the progress of the file processing. It printed nice results when it finished, outlining what happened at what stage. It made the code much nicer, and the reporting much nicer.
For some reason, at the same time, I was thinking about the overabundance of log messages in your typical log, and how useless they are to the poor old Ops staff.
These two concepts gelled. Why not log the bejesus out of your process, but only print things to the log if something goes wrong?
An initial cut could be a 'transaction based' logger... 'start' a log stream, and then either 'render' if something goes wrong or 'drop' if it's all ok.
In this way the Ops team get a shite load of logs, but only if something went wrong. Surely a huge improvement? And/But the developer could put in heaps more info than they might otherwise.
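To make the 'start', 'render' and 'drop' idea concrete, here is a rough sketch in Python on top of the standard logging module. The names `TransactionLog` and `process` are my own inventions for illustration, and the integer parsing stands in for whatever the real processing would be.

```python
import logging


class TransactionLog:
    """Buffers log lines and only emits them if something went wrong."""

    def __init__(self, logger):
        self._logger = logger
        self._lines = []

    def log(self, message):
        # Nothing reaches the real log yet; the line is only remembered.
        self._lines.append(message)

    def render(self):
        # Something went wrong: write out everything collected so far.
        for line in self._lines:
            self._logger.error(line)
        self._lines = []

    def drop(self):
        # All OK: throw the detail away.
        self._lines = []


def process(lines, logger):
    """Hypothetical line-by-line processing, one 'transaction' per line."""
    for number, line in enumerate(lines, start=1):
        tx = TransactionLog(logger)  # 'start'
        tx.log(f"line {number}: read {line!r}")
        try:
            value = int(line)
            tx.log(f"line {number}: parsed {value}")
            tx.drop()
        except ValueError as error:
            tx.log(f"line {number}: failed with {error}")
            tx.render()
```

Calling `process(["1", "oops", "3"], logging.getLogger("files"))` would write only the lines about "oops"; the two good lines leave no trace.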
The next level would be to create these custom objects that know how to talk about what they were doing and what went wrong. This would require people to think about stuff, so seems unlikely :)
Both these approaches would have to be thread based (as in: thread local variable), but the second one would have to know what sort of logging/tracking token was being updated within the current Thread. That's not easy, but Felix is up to this kind of thing.
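The thread-local part might look something like the sketch below. Everything here is assumed for illustration: `FileProcessingToken` is a guess at the kind of custom object described above (not Brunning's actual code), and `start`, `current_token` and `finish` are made-up helper names.

```python
import threading

_current = threading.local()


class FileProcessingToken:
    """Hypothetical tracking object that can describe its own progress."""

    def __init__(self, filename):
        self.filename = filename
        self.succeeded = []
        self.failed = []

    def line_ok(self, number):
        self.succeeded.append(number)

    def line_failed(self, number, error):
        self.failed.append((number, error))

    def went_wrong(self):
        return bool(self.failed)

    def report(self):
        summary = f"{self.filename}: {len(self.succeeded)} ok, {len(self.failed)} failed"
        details = [f"  line {n}: {e}" for n, e in self.failed]
        return "\n".join([summary] + details)


def start(token):
    # Attach the token to the current thread only.
    _current.token = token
    return token


def current_token():
    # Other threads see None unless they started their own token.
    return getattr(_current, "token", None)


def finish():
    """Return the report if something went wrong, otherwise nothing."""
    token = current_token()
    _current.token = None
    if token is not None and token.went_wrong():
        return token.report()
    return None
```

Code deep in the call stack can then say `current_token().line_failed(2, "bad number")` without the token being passed around. The catch is the one mentioned above: that code has to know what sort of token it is talking to.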
So, given I haven't tried it, what do you think?
How I MVC
So, I'm working in a .NET company that makes applications. Weird. I've only done webapps for the last 10 years or so.
The nice thing about webapps is that they force you to at least consider separating presentation from content, behaviour from interface and so on (often that consideration is ignored and we just write crud, but one is encouraged by the CSS/HTML/Javascript/Webserver separation to at least think about it).
App side of things? No such luck. Everything just goes everywhere.
Now, thing is, I'm working on some projects that started as prototypes, so it is unfair to judge development practices on them (I hope...). They have no tests. They are not written with tests in mind. They have no 'domain model' and very poor separation of concerns.
My company just gave us some 'downtools' time to work on our own stuff, so I did a (prototype) reimplementation of much of the application I've been working on. I did this so I could see how I could write this app with tests, and in turn using the MVC approach that Smalltalk encourages.
(I also tried for a tell don't ask approach, and achieved this to a degree, but didn't really push it)
By thinking in MVC terms I end up having to have a 'domain model' - it's the model bit :) This is already a boon as I start separating out the model data and behaviour from the human interface that manipulates it. This means I can use a different (eg testing) interface to manipulate it. Whamo, got some tests. Red Green Refactor Happy
Next comes the view. I hate views. The bit the user clicks on is pants to test, and hence I've always tried to keep this untestable, horrible bit of code as thin as possible. Sure I can have some happy path integration tests, but really, they suck and don't cover nearly the ground a good few unit tests will.
This also separates out the manipulation of GUI elements into one place - and I love separation.
If the view is so thin, who does the UI work? Well, the controller (that's all that's left, right?). So it ends up linking the view and the model, translating between them, maintaining any state associated with the UI (and not the domain) and invoking commands on the domain as needed.
Boy that's a lot. Too much.
So I split the Controller up into the, um, Controller and the ControllerModel (actually I called it ViewModel).
The ControllerModel maintains any state that may be needed for the particular view you are dealing with (eg list of pages in a wizard, internal representations of the GUI elements (and hence translations) etc). This makes it a nice, testable chunk of code, and simplifies the Controller.
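As a rough idea of what such a ViewModel could hold, here is a Python sketch for the wizard case (the real thing was .NET; the class name, page names and button labels are all invented for illustration).

```python
class WizardViewModel:
    """UI state only: which pages exist and which one is showing."""

    def __init__(self, pages):
        self._pages = list(pages)
        self._index = 0

    @property
    def current_page(self):
        return self._pages[self._index]

    @property
    def can_go_back(self):
        return self._index > 0

    @property
    def can_go_forward(self):
        return self._index < len(self._pages) - 1

    def go_forward(self):
        if self.can_go_forward:
            self._index += 1

    def go_back(self):
        if self.can_go_back:
            self._index -= 1

    def next_button_label(self):
        # Translation from internal state to what the GUI should display.
        return "Next" if self.can_go_forward else "Finish"
```

No GUI toolkit and no domain objects are involved, so it can be unit tested by simply calling it.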
The Controller has NO state - it only knows about the things it is linked to (view, model, controllermodel, services). While it makes behavioural decisions, these decisions are based on information drawn from elsewhere.
While still hard to test (lots of mocks/stubs!) you end up just testing that 'when event A is received, object B is called'. The tests themselves end up much better worded - explaining behaviour and interactions explicitly, not having to worry about translations and state.
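A 'when event A is received, object B is called' test could look roughly like this, sketched in Python with unittest.mock. The controller, its collaborators and method names such as `on_next_clicked`, `show_page` and `save` are assumptions made up for the example.

```python
from unittest.mock import Mock


class WizardController:
    """Holds no state of its own, only references to its collaborators."""

    def __init__(self, view, view_model, model):
        self._view = view
        self._view_model = view_model
        self._model = model
        # The controller subscribes to the view; the view never knows about it.
        view.on_next_clicked(self.next_clicked)

    def next_clicked(self):
        # The decision is based on information drawn from the view model.
        if self._view_model.can_go_forward:
            self._view_model.go_forward()
            self._view.show_page(self._view_model.current_page)
        else:
            self._model.save()


def test_next_on_last_page_saves_the_model():
    view, view_model, model = Mock(), Mock(), Mock()
    view_model.can_go_forward = False

    WizardController(view, view_model, model).next_clicked()

    model.save.assert_called_once_with()
    view.show_page.assert_not_called()


def test_next_on_earlier_page_shows_the_following_page():
    view, view_model, model = Mock(), Mock(), Mock()
    view_model.can_go_forward = True
    view_model.current_page = "details"

    WizardController(view, view_model, model).next_clicked()

    view_model.go_forward.assert_called_once_with()
    view.show_page.assert_called_once_with("details")
    model.save.assert_not_called()
```

Lots of mocks, as promised, but each test reads as a statement about who gets told what.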
So how do I MVC? I MVCVM (Model-View-Controller-ViewModel).
(A final note... the controller sits at the centre as a svelte mediator, sending commands to all its minions and listening for those minions' events. The minions never call/know about the controller)
Wednesday, August 25, 2010
Separation of concerns
So I got myself stuck a bit. Much of this was due to a lack of tests (boo) and a problem trying to think how to write tests. And much of the problem with the tests was to do with setting up the environment; I tried too early to get things compiling themselves.
I went down the DynamOS in DynamOS route because I had a lot of code where I hard coded functions and objects (eg lists and numbers) and I don't want that, it looks bad. Sorry, I'm not being very clear.
Anyway, where I've ended up is that I realised I need to separate the data from the code; I need to _start_ with the image and test against that. I can write tests against the contents of the image as well. But that's where I have to start, rather than the poxy 'Environment' object I currently have.
And it separates the testing too. I can test the loading of the image completely separately from its execution. And I can test its execution completely separately from the contents of the image itself (sort of).
This makes me happy, and provides a new direction for me to work in.
BTW this was largely inspired by JSqueak, not because it particularly does this, but simply by having a look and thinking about things a bit differently.
Thursday, March 18, 2010
Another name for scripting languages
So, 'scripting language' is an awful name for things like Ruby and Python and so on. These languages are usually dynamic, usually quick to write, and usually aren't compiled.
In fact, one of the big things about this class of languages is that the program exists in a file which is then interpreted in sequence, one might say _serially_. Expressions are executed as they are encountered, Functions are created on the fly, and Objects are 'opened' and updated as the file is consumed.
And so I propose calling this kind of language a 'serial' language.
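A small Python file shows the serial behaviour. This is only an illustration; the names `trace`, `greet` and `Greeter` are made up.

```python
trace = []

trace.append("start of file")


def greet():
    return "hello"


# greet exists only because the 'def' statement above has just been executed.
trace.append(greet())


class Greeter:
    # The class body is ordinary code, run when the interpreter reaches it.
    trace.append("class body is running")

    def greet(self):
        return "hi"


# 'Open' the class again and update it as the file carries on.
Greeter.shout = lambda self: self.greet().upper()

trace.append(Greeter().shout())
```

Move the `trace.append(greet())` line above the `def` and it fails with a NameError, because at that point in the file the function has not been created yet.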
Once I've named it like this it makes me think about other languages, and how to contrast/compare when considered in this way.
'Normal' languages such as C and Java and so on are all compiled. So while expressions are executed as they are encountered, the way they are encountered is determined by a predefined structure of objects and structs and functions. These languages are typically less dynamic, because they don't have to be; there is no requirement that classes be defined on the fly because they can be created at compile time.
The final group of languages on this spectrum are those that exist within an Image, such as Smalltalk and Self. Here the objects and functions are constructed, essentially by hand, and then exist within the image. There is no need to compile or serially create them ever again.
Why is this a spectrum? For one reason: the degree to which the structure (not behaviour) of the program is constructed at runtime. I.e., how much processing happens before the program is ready to actually do what it was written to do. Serial languages create all structure at runtime. Compiled languages create classes at compile time, but require things like DI (eg Spring) to build up the structure of relationships between these constructs. Image based environments have no processing necessary before the program starts; all objects and their relationships have been defined and stored within the Image.
This then leads to another thing to think about: if Image based environments require so much less code to set them up, does this mean that they are better, because 'every line of code is a liability'?
Monday, January 25, 2010
Well that was hard work, but was it worth it?
Not sure... I've replaced the Java-implemented List with a DOS version, and it is immutable as well. It is pretty much a LISP list made of head and tail nodes. No need yet for much support, so it only supplies those two values.
The difficult bit is that list is used to store both opcodes and arguments when constructing a new function/constructor. This means that these list changes had very far reaching effects. But it also means that a very important part of the system is now written in the system, rather than Java.
I have a feeling that more and more of the system will be converted in the coming (days? seems a bit optimistic). The more that's written in the VM opcodes, the smaller the VM will be, the better! Or at least that's the approach for this iteration. Once we have a feel for how small the VM is, we can start optimising some things (like replacing list with a native version...!).
Here is list (it has to be hacked a bit because we can't create new functions until list has actually been defined... so the comments are used to extract the functions to be manually created and applied to the emptyList object)
(constructor emptyList
  // function start
  (function prepend: object
    newListWithHead: $object tail: $this
  )
  // function end
  // function start
  (function newListWithHead: head tail: tail
    listConstructor: $head tail: $tail prototype: $this
  )
  // function end
  // constructor start
  (constructor listConstructor: head tail: tail prototype: listPrototype
    parent: $listPrototype
  )
  // constructor end
  // function start
  (function head
    $head
  )
  // function end
  // function start
  (function tail
    $tail
  )
  // function end
  // function start
  (function at: index
    $head // obviously wrong!
  )
  // function end
)
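For anyone who doesn't read DOS yet, here is roughly the same shape sketched in Python. It is not a translation of the VM's semantics, just the head/tail idea. The class names are mine, and the `at` method is my own guess at what a non-wrong `at:` would do (zero-based, walking down the tails).

```python
class EmptyList:
    """The empty list; plays the role of emptyList above."""

    def is_empty(self):
        return True

    def prepend(self, obj):
        # Never mutates: returns a new node whose tail is this list.
        return Node(obj, self)


class Node(EmptyList):
    """An immutable head/tail pair that inherits prepend from EmptyList."""

    def __init__(self, head, tail):
        self._head = head
        self._tail = tail

    @property
    def head(self):
        return self._head

    @property
    def tail(self):
        return self._tail

    def is_empty(self):
        return False

    def at(self, index):
        # Assumption: walk 'index' tails from here, then return that head.
        node = self
        while index > 0 and not node.is_empty():
            node, index = node.tail, index - 1
        if node.is_empty():
            raise IndexError("list index out of range")
        return node.head
```

Because prepending builds a new node and leaves the old list alone, two lists can share the same tail safely.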
Thursday, January 21, 2010
Happy with that!
Here is the definition of zero, the prototype for all numbers (integer ones at least...)
(constructor zero: vm
  (function addValue: addedValue
    numberFrom: ($vm add: $addedValue to: $value)
  )
  (function plus: number
    addValue: ($number value)
  )
  (function minus: number
    numberFrom: ($vm subtract: ($number value) from: $value)
  )
  (function isLessThan: number
    $vm value: $value isLessThan: ($number value)
  )
  (function value // not so happy with this, but not end of world..
    $value
  )
  (function numberFrom: value
    numberConstructor: value prototype: $this
  )
  (constructor numberConstructor: value prototype: numberPrototype
    parent: $numberPrototype
  )
  $value: .0
)
I really like it. I was initially doing all kinds of stuff creating prototypes, but that's too classical inheritance - zero IS the prototype. I like.
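To show what 'zero IS the prototype' means outside DOS, here is a loose Python sketch of prototype delegation. It is an analogy rather than how the VM works: the `Obj` class, the `send` method and the assumption that `this` is the receiver of the message are all mine.

```python
class Obj:
    """A bare object: a value, a parent, and maybe some functions."""

    def __init__(self, parent, value, functions=None):
        self.parent = parent
        self.value = value
        self.functions = functions or {}

    def send(self, name, *args):
        """Find the function along the parent chain, run it against self."""
        holder = self
        while holder is not None:
            if name in holder.functions:
                return holder.functions[name](self, *args)
            holder = holder.parent
        raise AttributeError(name)


def number_from(this, value):
    # Assumption: as in numberFrom:, the new number's parent is the receiver.
    return Obj(parent=this, value=value)


# There is no Number class: zero holds all the behaviour itself.
zero = Obj(parent=None, value=0, functions={
    "numberFrom": number_from,
    "plus": lambda this, other: this.send("numberFrom", this.value + other.value),
    "minus": lambda this, other: this.send("numberFrom", this.value - other.value),
    "isLessThan": lambda this, other: this.value < other.value,
})

one = zero.send("numberFrom", 1)
three = one.send("plus", one).send("plus", one)
```

Here `one` carries no functions of its own; asking it to `plus` something finds the function on zero further up the chain.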
I think this is easier to read as well. (you didn't see it before, did you :)
Also I've put list onto Activation, but haven't immutabled it yet
I have a compiler going!
And the test for this is 'Number' - so I have essentially bootstrapped the Number library. It's all a bit first iteration; the opcode output is appallingly inefficient. Makes me shudder to think about it, but it works.
I'm in a conundrum (that's all this blog is about really :). Primitive things, such as Undefined, Null, List, Map, Number and String, are essential to nearly all programs. It seems painful to require that everything have these injected. If there was class nesting it wouldn't be such a faff because only the outermost class would need to include the primitive stuff.
Ahhhh.
So what I need is a way of nesting classes :) Back to newspeak. Not so bad though; I can simply add the enclosing class to the list of traits for that class. Call it something imaginative like 'EnclosingType'.
The biggest problem here being that EVERYTHING needs List to be able to create functions (which require a list of arguments). So the compiler will need access to this even if the programmer doesn't think it's necessary... oh dear. So maybe just list gets put onto RootObject.
Actually, how about ArgumentList gets put on - and all it can do is be used as an argument list. Oh hang on! I can add it to Activation, which is used to create functions anyway. That feels better.
Oh, BTW, I've decided that all primitives (as listed above) must be immutable. So the List object will return a new List object when you add something to it.
So what I'm going to do (right now) is
- Add ArgumentList to Activation
- Create 'primitive' object that can be passed as a parameter and be used to access the primitives
- Make current primitives immutable
- Remove primitives from environment, so they must be accessed via the parameter (basically, get the compiled version of fibonacci working)
- Create String primitive
- Implement compiler in OpCodes!