[self-interest] Re: bytecode formats

Fri Jan 15 17:10:58 UTC 1999

Stefan Matthias Aust wrote:
> [...]  You have however to distinguish the slot accessor methods
> needed for SELF from normal methods.

Ooops - you are right. Otherwise this isn't possible:

     frame
       "returns my current frame"
       ^ frame

> >[global objects]
> [...] The typical idiom "Smalltalk
> at: aSymbol" must be supported.

Ooops again! Yes, it would be easier simply to have Smalltalk be a
dictionary and implement globals the traditional way. The dictionary
protocol could be emulated for "global" objects, but it wouldn't be
worth it. Besides, with my idea adding a single global variable
would force every method that uses *any* global variable to be
recompiled.
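
The recompilation problem can be illustrated with a small sketch. The
original proposal is not reproduced in this message, so the slot-index
layout below is only an assumed example of how such a dependency could
arise; `SystemDictionary`, `compile_slot_indices` and the sorted-slot
scheme are hypothetical names and choices, written in Python rather
than Smalltalk so the sketch can be run.

```python
# Hypothetical sketch: two ways a compiled method might refer to a global.

class SystemDictionary(dict):
    """Stand-in for the traditional Smalltalk global dictionary."""

    def at(self, symbol):
        # Mirrors the "Smalltalk at: aSymbol" idiom: lookup by name at run time.
        return self[symbol]


smalltalk = SystemDictionary()
smalltalk["Transcript"] = "a transcript"


def method_using_lookup():
    # Traditional scheme: the method keeps the name, so adding other
    # globals later never invalidates it.
    return smalltalk.at("Transcript")


def compile_slot_indices(names):
    # Assumed alternative: globals live in slots ordered by name and a
    # compiled method remembers only the slot index.
    return {name: index for index, name in enumerate(sorted(names))}


before = compile_slot_indices(["Transcript", "Smalltalk"])
after = compile_slot_indices(["Transcript", "Smalltalk", "Display"])

# Adding "Display" shifts the index of "Transcript", so any method that
# cached the old index would have to be recompiled.
index_changed = before["Transcript"] != after["Transcript"]
```

Under this assumed layout the dictionary-based method keeps working
unchanged, while every cached index has to be recomputed.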

> BTW, I recently read that SELF handles Symbols (unique strings) inside the
> VM. A Smalltalk VM does this typically in Smalltalk.  Would this be a problem?

Self calls them canonical strings, and Smalltalk calls them symbols.
In practice, they are the same thing. Self could handle them the way
Smalltalk does and there would be no problems at all. Note that even
Smalltalk needs some support from the VM for symbols.
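
What makes a string "canonical" can be sketched as an interning table:
equal text always yields the same object, so symbols can be compared by
identity. This is a minimal illustration in Python, not how either the
Self VM or a Smalltalk image implements it; the `Symbol` class and its
`_table` attribute are hypothetical.

```python
class Symbol:
    """Unique string: at most one instance exists per distinct text."""

    _table = {}  # text -> the one canonical Symbol for that text

    def __new__(cls, text):
        existing = cls._table.get(text)
        if existing is None:
            # First request for this text: create and register the symbol.
            existing = super().__new__(cls)
            existing.text = text
            cls._table[text] = existing
        return existing

    def __repr__(self):
        return "#" + self.text


first = Symbol("at:put:")
second = Symbol("".join(["at:", "put:"]))  # equal text built separately
```

Here `first is second` holds, which is the property a message lookup
relies on when it compares selectors.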

> >I think Mario's Smalltalk simply ignores cascades. They can be emulated
> >easily enough with hidden temporary variables:
> 
> As you might have seen, I suggested nearly the same transformation :-)

I like your use of blocks, but it would be more work for the
native/threaded code generator to figure this out.
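
The hidden-temporary emulation of cascades mentioned above can be
sketched as a source-level rewrite. This is only an illustration in
Python that works on strings; `expand_cascade` and the temporary name
`t1` are hypothetical, and a real compiler would do this on its parse
tree.

```python
def expand_cascade(receiver, messages, temp="t1"):
    """Rewrite `receiver m1; m2; ...` as plain sends to a hidden temporary.

    The receiver expression is evaluated once and stored; every message
    of the cascade is then sent to the stored value. The last statement
    supplies the value of the whole cascade.
    """
    statements = [f"{temp} := {receiver}"]
    statements += [f"{temp} {message}" for message in messages]
    return statements


expanded = expand_cascade("Transcript", ["show: 'a'", "show: 'b'", "cr"])
source = ". ".join(expanded)
```

For `Transcript show: 'a'; show: 'b'; cr` this produces
`t1 := Transcript. t1 show: 'a'. t1 show: 'b'. t1 cr`.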

> I think, when we start to discuss variants of the instruction set and its
> encoding, we first need to decide whether this set shall be optimized for
> interpretation or compilation.  Squeak's Jitter can probably perform simple
> macro expansions, but I don't know whether it can perform more complex
> inlining and unrolling operations which would be probably needed to reach
> an acceptable execution speed.

As it is, Jitter can handle none of these things. I was thinking of
a major extension (which would be a lot like the code that would be
removed from the parser, so total complexity would be roughly the
same).

> Therefore, it might be worth considering an instruction set tailored for
> interpretation together with a parser/codegenerator which would even
> flatten if, while and for statements.

Great idea, but the current Squeak bytecodes already do this
pretty well. Maybe just a SELF_SEND and SET_DELEGATEE would be
needed to fully support the Self semantics?

> You proposed a clever and compact encoding. But it's tailored towards
> compiling. An interpreter would have to decode the instruction bit instead
> of using a simple jump table.  You need to maintain a current literal
> pointer. You need to extract the argument count from message arguments
> (probably not that difficult. If the first character is a letter, just
> count the ":". Otherwise the argument count is 1). And your parser needs to
> macro-expand delegation into primitives. Finally, your interpreter must
> interpret a non-local-return as the end of execution even for methods
> because otherwise you cannot deal with instruction streams with sizes other
> than n*16.

Yes, the format is much better for compiling than for interpreting.
I first thought of the non-local-return as an end of execution, but
then I couldn't distinguish between

              [|:x| y: x+1. x-1]

and

              [|:x| y: x+1. ^ x-1]

So I changed things so that a SELF_SEND when the literal pointer is
past the end of the literal vector means end of execution.
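
The rule quoted above for extracting the argument count from a
selector (count the ":" if the first character is a letter, otherwise
the count is 1) can be written down directly. A minimal sketch in
Python; `argument_count` is a hypothetical helper name and the code
assumes a non-empty selector string.

```python
def argument_count(selector):
    """Number of arguments a message with this selector takes."""
    if selector[0].isalpha():
        # Unary ("size") or keyword ("at:put:") selector:
        # one argument per colon, so unary selectors give 0.
        return selector.count(":")
    # Binary selector such as "+" or "<=": always one argument.
    return 1
```

For example, `argument_count("at:put:")` gives 2, `argument_count("size")`
gives 0, and `argument_count("+")` gives 1.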

-- Jecel
