[self-interest] Re: compact encoding

Jecel Assumpcao Jr jecel at lsi.usp.br
Sun Aug 29 20:40:17 UTC 1999


Stefan Matthias Aust wrote:
> [suggestion that non_local_return instruction isn't needed]

Great idea! It is interesting that this was what I did in my
1984 bytecodeless Smalltalk (http://www.lsi.usp.br/~jecel/st84.txt).

Your C code for parsing this encoding shows that it isn't very
complicated.

>[alternate encoding]
> 0 special instruction
>  - normal return
>  - non-local return
>  - push self
>  - push special literal (nil, true, false, -1, 0, 1, 2, 10)
>  and these are useful for interpreters
>  which optimize a few control flow methods...
>  - drop stack
>  - dup stack
>  - conditional branch
>  - unconditional branch
>  - push next block
>  and still 48 instructions free!

Once I thought of making the bytecodes complete enough so you
could actually write primitives in them. It was a pretty neat
idea (and would make porting much easier), but some primitives
(like one to manipulate the MMU in the processor) would still
require special treatment, so I gave up on this.

> 1 push literal
>  - push next literal
>  - push special literal (1..63)

Here are the eleven most popular literals (as counted by references
from bytecodes):
  
  3872   'ifTrue:'
  3388   ','
  2520   'e'
  2466   '='
  2292   'value'
  1714   'copy'
  1692   's'
  1667   'value:'
  1550   'm'
  1518   0
  1429   'fb'

I was very surprised not to see 'i' among these :-)

> 2 stack send
>  - send next literal
>  - send special literal (1..63)
>    (probably things like #traits, #at:, #clone, #copy, #size, etc.)
> 
> 3 implicit self send
>  - send next literal
>  - send special literal (1..63)
>    (same as with 2 send)
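
The four-opcode layout quoted above can be sketched as a small
decoder. This is only an illustration: the bit layout (opcode in
the top two bits, operand in the low six) and the use of operand 0
for "next literal" are assumptions, and the names decode and
describe are made up for this example.

```python
# Sketch only: the bit layout (opcode in the top 2 bits, operand in the
# low 6 bits) is an assumption; the quoted proposal does not spell it out.
OPCODES = ("special", "push literal", "stack send", "implicit self send")


def decode(byte):
    """Split one 8-bit bytecode into (opcode name, 6-bit operand)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("bytecode must fit in 8 bits")
    return OPCODES[byte >> 6], byte & 0x3F


def describe(byte):
    """Render a bytecode in the terms used by the quoted encoding."""
    name, operand = decode(byte)
    if name == "special":
        return "special instruction #%d" % operand
    if operand == 0:
        # Assumption: operand 0 means "take the next literal from the frame".
        return name + ": next literal"
    return "%s: special literal %d" % (name, operand)


if __name__ == "__main__":
    for b in (0x00, 0x40, 0x85, 0xFF):
        print(hex(b), describe(b))
```

With six operand bits, opcode 0 has 64 slots. The quoted list uses
16 of them (3 + 8 special literals + 5), which matches the "48
instructions free" remark.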

Adding up the number of times that the 63 most popular literals
appear in the literal frames (not quite the same as the count
above, but the results are very close) we would be saving a total
of 43903 words with your encoding. Not bad at all!
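
A rough sketch of how such a total could be computed, assuming each
literal-frame entry costs one word and that the literal frames are
available as plain lists (one per method). The function name
words_saved and the sample frames are hypothetical; they are not
the data behind the figure above.

```python
from collections import Counter


def words_saved(literal_frames, specials=63):
    """Count literal-frame slots freed if the most common literals
    become special literals (assumes one word per frame entry)."""
    counts = Counter(lit for frame in literal_frames for lit in frame)
    return sum(n for _, n in counts.most_common(specials))


if __name__ == "__main__":
    # Hypothetical literal frames, one list per method.
    frames = [["ifTrue:", "value", 0], ["ifTrue:", ","], ["copy", "ifTrue:", ","]]
    print(words_saved(frames, specials=2))
```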

> Either of the return instructions end the method's (or block's) execution.
> This means, the shortest possible code is "push self", "normal return".
> 
> The method "clone = (cloneSize: 10)" would be encoded as "implicit self
> send"+#cloneSize:, "push special"+10, "normal return" - still one word
> (assuming that cloneSize: is one of the special literals).  Jecel's encoding
> would need two words - as mine would without the special literal encoding.

I think you got the order of your bytecodes wrong - you first
have to push the arguments on the stack and *then* send the
message.
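
To make the ordering point concrete, here is a toy stack machine.
The instruction names follow the quoted encoding, but the tuple
format, the run function, and the fake_send callback are all
hypothetical and only handle unary and keyword selectors.

```python
# Toy stack machine. Everything except the instruction names is made up
# for illustration; it is not the real interpreter.
def run(code, receiver, send):
    stack = []
    for op, arg in code:
        if op == "push":
            stack.append(arg)
        elif op == "self_send":
            # A keyword selector takes one argument per colon; the
            # arguments must already be on the stack when the send runs.
            argc = arg.count(":")
            if len(stack) < argc:
                raise RuntimeError("arguments for %s were not pushed yet" % arg)
            args = [stack.pop() for _ in range(argc)][::-1]
            stack.append(send(receiver, arg, args))
        elif op == "return":
            return stack.pop()
    raise RuntimeError("method ended without a return")


def fake_send(receiver, selector, args):
    """Stand-in for message lookup: just records what was sent."""
    return (receiver, selector, tuple(args))


# clone = (cloneSize: 10), with the argument pushed before the send.
CLONE = [("push", 10), ("self_send", "cloneSize:"), ("return", None)]

if __name__ == "__main__":
    print(run(CLONE, "self", fake_send))
```

Swapping the first two instructions makes the send run with an
empty stack, which this sketch reports as an error.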

> >[singleton objects could be their own maps]
> 
> I don't think that this saving is worth the additional time.  I don't know
> the original self system, but I'd guess that oddballs aren't that common.

Every single method in a system is a different *type* of object
and has its own map, so there are quite a few "oddballs". I'll
explain in more detail when I answer your other emails tomorrow.

-- Jecel


