Can anyone justify closure treatment in self? It would seem to be more in the spirit of self to merge closures with methods and perform a purely dynamic message lookup, that is search block locals/parameters, then follow parent slots (outer closures - and now deprecated 'methods'). This would have an interesting side effect of erasing the difference between closures, methods and objects - any of these could be stuck into a slot and sent at least a 'value' message.
A method implementation would become therefore just another object (with local, parameter and parent slots) which is sent a 'value' message by default, but can accept other messages!
Then it is up to the programmer to decide whether to write a 'block' inside a 'method' (an anonymous closure), create a method (a named closure) or send messages to object or methods (that are also objects) interchangeably.
As for the non-local return out of closures, is there a really good reason for it? Why not return to the outer scope (other than perhaps a clumsier syntax)?
Forgive my possible stupidity - I am stuck in x86 world and haven't been able to explore self personally. I am amazed at the clarity of reasoning behind self, with this minor exception. Perhaps someone can clarify this.
Victor,
Can anyone justify closure treatment in self? It would seem to be more in the spirit of self to merge closures with methods and perform a purely dynamic message lookup, that is search block locals/parameters, then follow parent slots (outer closures - and now deprecated 'methods').
At the language definition level (the implementation is an entirely different story) things are pretty close to what you proposed. But methods are considered prototypes which must be cloned in order to get a closure.
At least in theory, a prototype and its clone would be considered the same type or class of object. So it would be correct to say that methods and closures are already merged.
To see why method invocation must generate a clone, just imagine a recursive method or one used in several parallel threads. The different execution contexts would want to store different values in the argument slots and local slots, so we can't get by with just one.
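A rough Python sketch of the cloning argument above (not Self code; `MethodProto`, `invoke` and `factorial` are made-up names for illustration). Each call copies the prototype's slots, so a recursive call cannot overwrite the caller's argument slot.

```python
class MethodProto:
    """Hypothetical method prototype: argument/local slots plus code."""

    def __init__(self, slots, code):
        self.slots = dict(slots)  # slot name -> default value
        self.code = code          # callable taking the activation

    def invoke(self, receiver, **args):
        # Clone the prototype so this call gets private slots.
        activation = MethodProto(self.slots, self.code)
        activation.slots.update(args)
        activation.slots["self"] = receiver
        return self.code(activation)


def factorial_code(act):
    n = act.slots["n"]
    if n <= 1:
        return 1
    # The recursive call clones again; this activation's 'n' is untouched.
    smaller = factorial.invoke(act.slots["self"], n=n - 1)
    return n * smaller


factorial = MethodProto({"n": None}, factorial_code)
```

With a single shared set of slots, the inner call would have stored its own `n` over the outer one before the multiplication ran.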
This would have an interesting side effect of erasing the difference between closures, methods and objects - any of these could be stuck into a slot and sent at least a 'value' message.
Would fetching something from a slot automatically involve sending 'value' to what was found there? If so, blocks would no longer work properly (more below). If not, methods would no longer work properly.
A method implementation would become therefore just another object (with local, parameter and parent slots) which is sent a 'value' message by default, but can accept other messages!
A closure (context object, activation, whatever name you want) can already accept many messages, though the current implementation doesn't deal with method slots inside methods.
Then it is up to the programmer to decide whether to write a 'block' inside a 'method' (an anonymous closure), create a method (a named closure) or send messages to object or methods (that are also objects) interchangeably.
Blocks turn out to be much more complicated than they appear at first. In fact, we need three separate objects to support them:
- block literal: this bundles up all the information in source and gets stored in the method object
- block context: this is the name I gave this in my tinySelf interpreter, but it really isn't a context object at all. It must point to the block literal for later use, point to the currently executing context for later use, and have 'traits block' as a parent so we can send all kinds of interesting messages to it ('value' is the most important, but we need 'whileTrue:', 'millisecondsToRun' and many others)
- block method context: this is created as the result of sending 'value' to the block context. It is cloned from the method found in the block literal but its parent slot is set to point to the context saved in the block context instead of the receiver like for normal methods. This way we have a chain of closures like you wanted.
For a single block literal we might create several block contexts since the method in which it appears might be invoked multiple times. For a single block context we might create several block method contexts since it might receive several 'value' messages.
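The three objects can be sketched in Python roughly as follows. This is only an illustration under simplifying assumptions: the class names are invented, the 'traits block' parent is left out, and the block body is a plain Python callable.

```python
class Activation:
    """A context: some slots plus a parent link to search next."""

    def __init__(self, slots, parent=None):
        self.slots = dict(slots)
        self.parent = parent

    def lookup(self, name):
        ctx = self
        while ctx is not None:
            if name in ctx.slots:
                return ctx.slots[name]
            ctx = ctx.parent
        raise KeyError(name)


class BlockLiteral:
    """Built once from the source: parameter names and the body."""

    def __init__(self, params, body):
        self.params = params
        self.body = body  # callable taking the block method context


class BlockContext:
    """Made each time the enclosing method reaches the literal."""

    def __init__(self, literal, home):
        self.literal = literal  # what to run later
        self.home = home        # the context that was executing

    def value(self, *args):
        # Every 'value' send creates a fresh block method context whose
        # parent is the saved home context, not the receiver.
        slots = dict(zip(self.literal.params, args))
        ctx = Activation(slots, parent=self.home)
        return self.literal.body(ctx)


LITERAL = BlockLiteral(["y"], lambda ctx: ctx.lookup("x") + ctx.lookup("y"))


def enclosing_method(x, ys):
    home = Activation({"x": x})
    block = BlockContext(LITERAL, home)        # one per method invocation
    return [block.value(y) for y in ys]        # one context per 'value'
```

Calling `enclosing_method` twice shares the one `LITERAL` but builds two block contexts, and each `value` send inside builds its own block method context.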
As for the non-local return out of closures, is there a really good reason for it? Why not return to the outer scope (other than perhaps a clumsier syntax)?
Unless you have a "^" before the last expression in a block, it does return to the outer scope.
[ 1 + 2. 3 + 4 ] value
But that is not always what you want. Sometimes you need to "unwind" a deeply nested computation:
.... arg < 0 ifTrue: [ ^ 1 ]. ....
This lets us relax the rules of strict structured programming. Otherwise we would have to do:
.... arg < 0 ifTrue: [ 1 ] False: [ ....]
For many real programs, being unable to escape inner scopes in an unstructured way can become quite painful (see Pascal vs C).
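For readers without a Self system at hand, the "^" inside a block can be imitated in Python with an exception that unwinds to the method that created the block. This is an analogy only, not how Self implements it; `NonLocalReturn`, `if_true` and `method_with_guard` are hypothetical names.

```python
class NonLocalReturn(Exception):
    """Carries a value back to the activation that created the block."""

    def __init__(self, home, value):
        super().__init__()
        self.home = home
        self.value = value


def if_true(condition, block):
    # Stand-in for 'ifTrue:'; it knows nothing about early exits.
    if condition:
        return block()
    return None


def method_with_guard(arg):
    home = object()  # token identifying this activation

    def early_exit():
        # Models [ ^ 1 ]: leave the enclosing method, not just the block.
        raise NonLocalReturn(home, 1)

    try:
        if_true(arg < 0, early_exit)
        return arg * 2  # rest of the method, skipped on early exit
    except NonLocalReturn as nlr:
        if nlr.home is not home:
            raise  # belongs to some outer method; keep unwinding
        return nlr.value
```

Without the non-local exit, the "rest of the method" would have to move into the false branch, which is the nesting shown in the `ifTrue:False:` version above.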
Forgive my possible stupidity - I am stuck in x86 world and haven't been able to explore self personally. I am amazed at the clarity of reasoning behind self, with this minor exception. Perhaps someone can clarify this.
These are very good questions and I hope I understood them correctly and was able to explain some of the issues involved. I have been trying to simplify blocks or unify them with methods since my 1984 Smalltalk design (http://www.lsi.usp.br/~jecel/st84.txt) and would very much like to see a simpler solution for Self.
-- Jecel
Jecel, thank you for a quick response.
Firstly, I have no problem with the cloning of the activation records, etc. That makes total sense. I do have issues with lexical scoping in interpreted environments generally, and especially with pure object-oriented languages. Other than the lexical issues, self is a clear gem.
My problem starts with self curtailing the lifespan of closures to the enclosing method. Although it saves the implementation from dealing with moving activation records off the stack, it cripples the language. I very much like the idea of passing closures around and sticking them into slots.
Also, at the risk of getting off-topic, I need to get to the bottom of the closure/method issue. Once again, apologies for not being up on self. This is the way I envision the ideal OO language (and self is really close...)
1) There is an environment supporting objects.
2) Objects contain slots and a piece of code.
3) Each slot consists of a name and a pointer to another object. The name dynamically binds the referenced object and allows code to refer to it.
4) The code is in some manner human-modifiable and executable by the machine (ignore implementation details here).
5) Through code, the environment can modify both the shape of an object (number and naming of slots) as well as the contents of the slots.
6) Object visibility: The topology of an object, at run-time, defines the visibility of objects other than those referenced by its own slots (dynamic scoping of objects). The name-binding process is equivalent to a search of named slots, starting at the object in question, followed by objects referred to by its slots, etc (somehow avoiding loops).
This seems to roughly describe an object-oriented environment capable of supporting a decent language. Now the details of interpretation I am not solid on right now, but there are a few ways of doing it:
- Find the target object by name - search proceeds as in 6) above.
- Execute its code, returning the target reference.
- Find the 'message' within the dynamic scope of the target object. If present, evaluate it with whatever parameters.
Sending a 'no message' message returns the value (equivalent to the execution of a method, sending 'value' to a block or fetching a data slot). Note that an object can be referred to by more than one name if it's bound to more than one slot.
For example, let's say we have an object foo with slots named A B and C. We can assign an integer 4 to A; a method [A * 2] to B; and a closure myClosure to C. Now, there is no difference between blocks and method, eh? The code is completely decoupled from any bindings until runtime.
This arrangement has several benefits over the smalltalk/self model: you can totally redirect objects to proxies, or do anything at all at the point of lookup and closures are in fact real objects.
bar. (returns the result of evaluating an object at a slot named bar in self or above).
bar increment. (returns the result of evaluation of an object at the slot named 'increment' found at or above an object at the slot named 'bar' in self or above).
bar := 4 (send := 4 to the object at a slot named 'bar' in self or above)...
Sorry about the rambling, but I've been thinking about this for too long the wrong way, and the recent brush with self is beginning to clear my head (but it's not quite clear yet...)
Firstly, I have no problem with the cloning of the activation records, etc. That makes total sense.
Ok. One complaint some people have about the "cloning" is that the methods don't have ':self*' slots but activation records do. I guess these slots (note that they are argument and parent) could be added to methods to make things more uniform while wasting a little space.
But actually, activation records are continuations and not just closures. They have an instruction pointer and a local data stack, neither of which was in the original method object.
In some of my implementations I have dealt with this by slightly redefining the "method story". The method is indeed a prototype for the closure, but when it is cloned a second object is also cloned and installed as its parent:
( | :self*. <pc> <- 0. <stk> <- vector copySize: 16 FillingWith: 0. <sp> <- 0. | )
where <name> indicates a slot name that can't be accessed via normal message sends. This object plus the cloned method would form an activation record.
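In Python terms the pairing might look like the sketch below. The class names are invented, and the 16-element stack simply mirrors the literal above; nothing here reflects an actual Self VM layout.

```python
class Frame:
    """The hidden parent object: receiver plus <pc>, <stk> and <sp>."""

    def __init__(self, receiver, stack_size=16):
        self.receiver = receiver       # plays the role of ':self*'
        self.pc = 0
        self.stk = [0] * stack_size
        self.sp = 0


class MethodClone:
    """Clone of the method's own slots, with the frame as its parent."""

    def __init__(self, method_slots, frame):
        self.slots = dict(method_slots)
        self.parent = frame


def activate(method_slots, receiver):
    # Both objects are created per call; together they form the
    # activation record.
    return MethodClone(method_slots, Frame(receiver))
```

Two activations of the same method then have separate program counters and stacks, while the method prototype itself carries neither.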
I do have issues with lexical scoping in interpreted environments generally, and especially with pure object-oriented languages. Other than the lexical issues, self is a clear gem.
I created a Self-like language, NeoLogo, with dynamic scopes. It is cleaner, but gets the programmers into trouble more easily.
My problem starts with self curtailing the lifespan of closures to the enclosing method. Although it saves the implementation from dealing with moving activation records off the stack, it cripples the language. I very much like the idea of passing closures around and sticking them into slots.
Here is a quote from "Towards a Universal Implementation Substrate for Object-Oriented Languages" by Mario Wolczko, Ole Agesen, David Ungar:
"Life-time of blocks. Self blocks cannot be invoked after their enclosing method returns. Lifting this restriction makes it easier to translate Smalltalk blocks that use this feature, as well as translating other languages with closure-like constructs."
Also, at the risk of getting off-topic, I need to get to the bottom of the closure/method issue. Once again, apologies for not being up on self. This is the way I envision the ideal OO language (and self is really close...)
- There is an environment supporting objects.
- Objects contain slots and a piece of code.
Either one can be missing in a given object, though.
- Each slot consists of a name and a pointer to another object. The name dynamically binds the referenced object and allows code to refer to it.
- The code is in some manner human-modifiable and executable by the machine (ignore implementation details here).
- Through code, the environment can modify both the shape of an object (number and naming of slots) as well as the contents of the slots.
One major decision is how common this will be. If "normal" code doesn't do this, but only the IDE as a result of a direct request by the programmer, then it might be acceptable for these modifications to take up to a few seconds.
- Object visibility: The topology of an object, at run-time, defines the visibility of objects other than those referenced by its own slots (dynamic scoping of objects). The name-binding process is equivalent to a search of named slots, starting at the object in question, followed by objects referred to by its slots, etc (somehow avoiding loops).
You don't want to search all slots, just parents or some equivalent notion. Searches through "normal" slots can happen manually, as you explained below.
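A small Python sketch of that kind of search, following parent links only and guarding against cycles. It is a simplification: it stops at the first match it finds depth-first, and the names `Obj`, `lookup` and `MISSING` are made up for the example.

```python
MISSING = object()  # sentinel meaning "no slot with that name was found"


class Obj:
    def __init__(self, slots=None, parents=None):
        self.slots = dict(slots or {})      # ordinary slots
        self.parents = list(parents or [])  # only these are searched


def lookup(obj, name, seen=None):
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return MISSING          # already visited: break the loop
    seen.add(id(obj))
    if name in obj.slots:
        return obj.slots[name]
    for parent in obj.parents:
        found = lookup(parent, name, seen)
        if found is not MISSING:
            return found
    return MISSING


# Two objects that are each other's parent, plus a non-parent slot.
a = Obj({"helper": Obj({"hidden": 1})})
b = Obj({"x": 42})
a.parents.append(b)
b.parents.append(a)
```

Here `lookup(a, "x")` reaches `b` through the parent link, while `hidden` stays invisible from `a` because `helper` is a normal slot and is never followed.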
I would just like to point out that there are other links in the object topology that we normally don't think about: the literals inside the object's code. I find myself appreciating more and more their importance.
This seems to roughly describe an object-oriented environment capable of supporting a decent language. Now the details of interpretation I am not solid on right now, but there are a few ways of doing it:
- Find the target object by name - search proceeds as in 6) above.
This would be an implicit self send.
- Execute its code, returning the target reference.
- Find the 'message' within the dynamic scope of the target object. If present, evaluate it with whatever parameters.
This would be a normal send.
Sending a 'no message' message returns the value (equivalent to the execution of a method, sending 'value' to a block or fetching a data slot). Note that an object can be referred to by more than one name if it's bound to more than one slot.
Ok, except for assignment your "story" seems to work as an explanation for Self's syntax and semantics.
For example, let's say we have an object foo with slots named A B and C. We can assign an integer 4 to A; a method [A * 2] to B; and a closure myClosure to C. Now, there is no difference between blocks and method, eh? The code is completely decoupled from any bindings until runtime.
It seems to work with your examples, but things would be more interesting if you had a method with arguments and/or local slots in B.
This arrangement has several benefits over the smalltalk/self model: you can totally redirect objects to proxies, or do anything at all at the point of lookup and closures are in fact real objects.
bar. (returns the result of evaluating an object at a slot named bar in self or above).
bar increment. (returns the result of evaluation of an object at the slot named 'increment' found at or above an object at the slot named 'bar' in self or above).
bar := 4 (send := 4 to the object at a slot named 'bar' in self or above)...
If you are trying to do assignment, it won't work this way. By the time object 'bar' gets the ':= 4' message, you no longer have a reference to the object with the 'bar' slot, and that is what you want to change, I think.
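The same point in a few lines of Python, with a dict standing in for the object that owns the 'bar' slot (function names are invented for the illustration). Self itself handles this with an assignment slot, so 'bar: 4' is a message to the holder rather than to the value.

```python
def assign_via_value(holder, name, new_value):
    target = holder[name]   # fetch: all we hold now is the value itself
    target = new_value      # rebinds a local name; the holder is untouched
    return target


def assign_via_holder(holder, name, new_value):
    holder[name] = new_value  # like sending 'bar: 4' to the slot's owner


holder = {"bar": 3}
assign_via_value(holder, "bar", 4)
after_value_send = holder["bar"]    # the slot still holds the old value

assign_via_holder(holder, "bar", 4)
after_holder_send = holder["bar"]   # now the slot has changed
```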
Sorry about the rambling, but I've been thinking about this for too long the wrong way, and the recent brush with self is beginning to clear my head (but it's not quite clear yet...)
If it was easy, someone would have done it already and we wouldn't have things like C# as examples of great advances in language design. I put some ideas I have been working on here:
http://www.merlintec.com:8080/software/
I probably won't be doing the object format or bytecodes as described there. These are just some thought experiments.
-- Jecel
Jecel, thanks for your answers.
Let's postpone the activation record discussion (I am very curious about the double-cloning and installing as own parent thought...) as an implementation issue.
Now, the lifespan of a block, as you pointed out, is limited to the lexically enclosing method. As an implementer, I can understand the desire to keep activation records on the stack for efficiency, and that implies that the record cannot survive the stack frame of the method. However, it does preclude me from passing a block out of a method and sticking it into a slot of a more permanent object. That seems like a deficiency as I often find myself (in smalltalk) in situations where I really wish to replace a method with another one after the state of the object changes (instead of a slower conditional). Am I missing something here - is there another mechanism in self to replace the contents of a slot with another (precompiled or dynamically generated) entity? I guess you can replace the contents of a slot with anything you want, so that would be the way to do it. It just really yanks me that the creation of a block and a method is not syntactically identical.
As for scoping, do I really understand it correctly that a block statically binds to objects of the enclosing method at compile time? It would be just so much cleaner to bind at run time... Ah... It seems like there are at least two distinct uses of a block: one partitions a method (and requires lexical access to method slots); the other is a generic submethod to be passed around (such as a sort block), which requires dynamic access to the run-time method's slots that can be faked by passing parameters. Perhaps these should be treated as different animals, or the programmer be given the choice of lexical/dynamic binding of a block? Of course, dynamic binding works in both cases as the partition-type block is executed within the same method it was compiled in...
To clarify my seminonsensical last message, it seems that there is a more generic way to implement an oo environment within which self can be instantiated, just like smalltalk can be instantiated in self... The object with slots and data/code (data of course is code interpreted by some program or hardware...) seems like about as low as I would be willing to go. From there on, it is mostly the question of when to bind. Bind everything at compile time and you have Forth. Bind objects at compile time and messages at runtime, and you have Smalltalk. Bind everything at runtime, and what do you have?
You are totally right about my missing out on the assignment issues. The big problem here is name binding again - do names bind the slots or objects inside those slots? Is a slot, perhaps, an object capable of assignment? I have to think about it some more as I am quickly approaching middle age and my mind is not working as well as it did (or at least as I remember it doing).
Now, the lifespan of a block, as you pointed out, is limited to the lexically enclosing method. As an implementer, I can understand the desire to keep activation records on the stack for efficiency, and that implies that the record cannot survive the stack frame of the method. However, it does preclude me from passing a block out of a method and sticking it into a slot of a more permanent object.
The quote in my previous message was meant to imply that the main Self people are unhappy with the situation you describe above and that this is very likely to change in the future.
That seems like a deficiency as I often find myself (in smalltalk) in situations where I really wish to replace a method with another one after the state of the object changes (instead of a slower conditional). Am I missing something here - is there another mechanism in self to replace the contents of a slot with another (precompiled or dynamically generated) entity? I guess you can replace the contents of a slot with anything you want, so that would be the way to do it. It just really yanks me that the creation of a block and a method is not syntactically identical.
It is very simple to have many slightly different objects in Self, so many design patterns that use non-LIFO blocks in Smalltalk can easily be implemented with "anonymous objects" in Self. Smalltalk's
.... sortBlock := [ :a :b | a age > b age ].
.... sortBlock value: arg1 value: arg2
.... sortBlock := [ :a :b | a weight > b weight ].
can easily be implemented in Self as
.... sortObj: ( | compare: a And: b = ( a age > b age ) | ).
.... sortObj compare: arg1 And: arg2
.... sortObj: ( | compare: a And: b = ( a weight > b weight ) | ).
It is more awkward in the simple cases, but actually becomes more usable than the Smalltalk alternative when you want the parameter objects to understand more than one message.
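A Python rendering of the "anonymous object" pattern, assuming a hypothetical `Person` record and a second message (`label`) to show why an object can be handier than a bare block. The insertion sort is just the simplest consumer of `compare` and is not taken from any Self library.

```python
from collections import namedtuple

Person = namedtuple("Person", "name age weight")


class ByAge:
    label = "age"  # a second message the sort object understands

    def compare(self, a, b):
        return a.age > b.age


class ByWeight:
    label = "weight"

    def compare(self, a, b):
        return a.weight > b.weight


def sort_with(items, sort_obj):
    """Insertion sort; compare(a, b) being true places a before b."""
    result = []
    for item in items:
        i = 0
        while i < len(result) and not sort_obj.compare(item, result[i]):
            i += 1
        result.insert(i, item)
    return result


people = [Person("ann", 30, 70), Person("bob", 50, 60), Person("cy", 20, 80)]
by_age = [p.name for p in sort_with(people, ByAge())]
by_weight = [p.name for p in sort_with(people, ByWeight())]
```

Swapping `ByAge()` for `ByWeight()` corresponds to storing a different object in the `sortObj` slot.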
As for scoping, do I really understand it correctly that a block statically binds to objects of the enclosing method at compile time?
No - a clone of the block's value method is bound at run time to a clone of the enclosing method. That is a rather different thing from saying that the block object is statically bound to the enclosing method object at compile time (which is true). See my explanation in the previous email about the three different objects needed to implement blocks.
It would be just so much cleaner to bind at run time... Ah... It seems like there are at least two distinct uses of a block: one partitions a method (and requires lexical access to method slots); the other is a generic submethod to be passed around (such as a sort block), which requires dynamic access to the run-time method's slots that can be faked by passing parameters. Perhaps these should be treated as different animals, or the programmer be given the choice of lexical/dynamic binding of a block? Of course, dynamic binding works in both cases as the partition-type block is executed within the same method it was compiled in...
Not if I understood what you meant by "partition-type block". Consider
.... x < 0 ifTrue: [ x: 0 ].
This block is actually executed inside the 'ifTrue:' method in the boolean object 'true', not the current method within the current receiver ('self'). For dynamic scoping to work we need to have some link going from the 'ifTrue:' clone (closure) to the current execution context (where 'x' is defined) *and* 'ifTrue:' itself must *not* define its own 'x' slot. The first problem is also present in lexical scoping (and is less elegantly solved there) but at least we get rid of the second problem.
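The second problem can be shown with a toy model in Python, where each activation is a dict of slots and dynamic scoping means "search the call chain from the innermost activation outward". The function and variable names are invented for the illustration.

```python
def assign_dynamic(call_chain, name, value):
    """Store into the innermost activation that has a slot called name."""
    for frame in reversed(call_chain):
        if name in frame:
            frame[name] = value
            return frame
    raise KeyError(name)


# Case 1: 'ifTrue:' has no slot named x, so the caller's x is found.
caller_1 = {"x": -5}
if_true_clean = {"block": "[ x: 0 ]"}
hit_1 = assign_dynamic([caller_1, if_true_clean], "x", 0)

# Case 2: 'ifTrue:' happens to define its own x and captures the store.
caller_2 = {"x": -5}
if_true_clash = {"block": "[ x: 0 ]", "x": 99}
hit_2 = assign_dynamic([caller_2, if_true_clash], "x", 0)
```

Under lexical scoping the block would carry a direct link to the caller's activation, so the contents of the 'ifTrue:' activation would not matter.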
To clarify my seminonsensical last message, it seems that there is a more generic way to implement an oo environment within which self can be instantiated, just like smalltalk can be instantiated in self...
That is what I am looking for.
The object with slots and data/code (data of course is code interpreted by some program or hardware...) seems like about as low as I would be willing to go. From there on, it is mostly the question of when to bind. Bind everything at compile time and you have Forth. Bind objects at compile time and messages at runtime, and you have Smalltalk. Bind everything at runtime, and what do you have?
I like Forth :-) You lost me in the part about Smalltalk binding objects at compile time.
You are totally right about my missing out on the assignment issues. The big problem here is name binding again - do names bind the slots or objects inside those slots? Is a slot, perhaps, an object capable of assignment? I have to think about it some more as I am quickly approaching middle age and my mind is not working as well as it did (or at least as I remember it doing).
How do you know that your mind doesn't work better now but your memory of how it used to be has been damaged? ;-)
You might enjoy a previous thread I started about assignment and arguments on this list:
http://groups.yahoo.com/group/self-interest/message/1066
-- Jecel
Thanks for all the insights. I have to seriously think about it and try out a few things in Mr. Gliebe's port (Thank you!!!) so I don't sound like a complete fool (too late).
By the way, what I meant by compile-time binding of objects in Smalltalk is that objects that receive messages are either explicitly returned from somewhere (this part is dynamic, of course) or are statically bound to a named instance (or class) variable or a pool. I think I was suggesting an extra object lookup (conceptually anyway as it can be optimized out) if an object is referred to by name, which I realize self in fact does by searching the slot hierarchy. The litmus test for me is whether it's possible at runtime to modify the object's inheritance path and have it function correctly in the new environment (that is, have the messages sent to the newly-bound objects if necessary).
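That litmus test can be mimicked in Python with an assignable parent link. This is a sketch with invented names (`Obj`, `send`, the door example), standing in for the idea of changing an object's inheritance path at run time.

```python
class Obj:
    def __init__(self, parent=None, **slots):
        self.parent = parent  # assignable, so the inheritance path can change
        self.slots = slots

    def send(self, name):
        obj = self
        while obj is not None:
            if name in obj.slots:
                # Run the found method with the original receiver.
                return obj.slots[name](self)
            obj = obj.parent
        raise AttributeError(name)


open_state = Obj(describe=lambda rcv: "open")
closed_state = Obj(describe=lambda rcv: "closed")

door = Obj(parent=open_state)
before = door.send("describe")

door.parent = closed_state   # swap the parent while the program runs
after = door.send("describe")
```

The same message ends up at a different method after the swap, with no conditional inside `door` itself.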
Thanks for hammering the cloning thing into me, it does make a difference.
What was I talking about anyway? I do suffer from CRS (Can't Remember S**t)
self-interest@lists.selflanguage.org