[self-interest] bytecodes (was: 4.1.2 compatibility with 4.0)

Tue Oct 31 19:52:36 UTC 2000

Jecel Assumpcao Jr wrote:

> On Tue, 31 Oct 2000, Marko Mikulicic wrote:
>
> > What are the additional bytecodes?
> > Can you briefly describe them, so I don't download the sources (I have a
> > slow connection) ?
>
> If you have a running Self 4.1.2 system, just call up the
> "bytecodeFormat" object and look around.

Currently I have no access to a running Self :-(

>
>
> The opcode is now 4 bits (instead of 3) and the index field is also 4
> (instead of 5):
>
>     0  index - extend the index field of the next bytecode
>     1  literal - push the literal onto the top of stack (tos)
>     2  send - send the message with the literal as selector and tos as
>                      receiver
>     3  implicitSelfSend - send the message to self with the literal as
>                      selector
>     4  extended - see below

>
>     5  readLocal - access local slot
>     6  writeLocal - change value of local slot

Why?
Why can't the compiler figure out what is a local slot and inline the
access, instead of leaving that to the parser?
Is it to speed up the compiler?

>
>     7  lexicalLevel - change what "local" means for previous
>                             instructions

??
What are the possibilities?
Are these some kind of registers for frequently accessed objects, or
are they used for local slots in the method activation?

>
>     8  branchAlways - jump to indicated bytecode (literal must be
>                                   smallInt)
>     9  branchIfTrue - only jump if tos == true
>    10 branchIfFalse - only jump if tos == false
>    11 branchIndexed - tos is an index into a "branch vector"

What is the literal of this bytecode?
Is it a Self vector?
What does this vector contain? smallInts, as in bytecodes 8, 9 and 10?

>
>    12 delegatee - changes the next "send" into a directed resend
>    13 undefined
>    14 undefined
>    15 undefined
>
> If the opcode is "extended", then the index field is the real opcode:
>
>      0  pushSelf - puts the current receiver on the tos
>      1  pop - eliminates the tos

I imagine it can be used when a method has multiple expressions ("some code.
something").
But since methods should be small, I see no advantage. Where is it used?

>
>      2  nonLocalReturn - returns from this block's "home context"
>      3  undirectedResend - like "super" in Smalltalk
>      4  undefined
>          ....
>     15 undefined
>
> I had to look at the VM sources, interpreter.c in particular, to figure
> out the meaning of the bytecodes.
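
The format described above can be sketched as a small decoder. This is
only an illustration under assumptions the description does not confirm:
that the opcode sits in the high nibble and the index in the low nibble,
and that each "index" bytecode supplies four more high-order bits for the
index of the bytecode that follows it. The function name decode is
hypothetical.

```python
# Hypothetical decoder for the 4.1.2 layout listed above.
# ASSUMPTIONS (not confirmed by the post): opcode in the high nibble, index
# in the low nibble, and an "index" bytecode contributes four more
# high-order bits to the index of the following bytecode.

OPCODES = [
    "index", "literal", "send", "implicitSelfSend", "extended",
    "readLocal", "writeLocal", "lexicalLevel",
    "branchAlways", "branchIfTrue", "branchIfFalse", "branchIndexed",
    "delegatee", "undefined", "undefined", "undefined",
]
EXTENDED = ["pushSelf", "pop", "nonLocalReturn", "undirectedResend"]


def decode(code):
    """Return a list of (name, index) pairs for a bytes object."""
    result = []
    prefix = 0  # bits accumulated from preceding "index" bytecodes
    for byte in code:
        opcode = byte >> 4
        index = (prefix << 4) | (byte & 0x0F)
        name = OPCODES[opcode]
        if name == "index":
            prefix = index  # carry the bits over to the next bytecode
            continue
        prefix = 0
        if name == "extended":
            # The index field is the real opcode; it takes no operand.
            name = EXTENDED[index] if index < len(EXTENDED) else "undefined"
            result.append((name, None))
        else:
            result.append((name, index))
    return result
```

For example, under these assumptions the bytes 0x01 0x23 would decode as a
single send whose selector is literal 19, because the prefix contributes
1 * 16 and the send itself contributes 3.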
>
> > Are they used by normal self code or are provided for
> > easier Java/Smalltalk emulation?

Is the Java-in-Self emulator available to the public?
I have read something about it. Is it the basis of the HotSpot Java VM?

>
>
> You can send 'disassemble' to a method mirror to have a nice view of
> its bytecodes. That showed me that the parser uses most of these
> bytecodes. I didn't look too deeply, but the branch bytecodes don't
> seem to be used. Looking at the '_PrimitiveList' it isn't clear if the
> interpreter is used or not (I didn't see anything to enable/disable it
> like the other compilers).
>
> For those who missed it, I had made a proposal which used only 4
> bytecodes (0 = push literal, 1 = send, 2 = selfSend, 3 =
> nonLocalReturn) and used primitives for resends.

Power of simplicity :-)
I think I missed it. Some considerations:

What if the literal index is 2**6 or greater? You don't have an "index"
bytecode.
And what is used for pushSelf? A pushLiteral("self"), which the compiler
translates into push FIRSTARG(%ebx)?
You could also use 2 bytecodes:
0 = pushLiteral, 1 = send. An implicit send of "message" is then
push("self"), send("message"), and "self message" is push("self"),
send("self"), send("message"), with the nonLocalReturn done with primitives.
Or even with 1 bytecode (so no opcode field at all): only sends. Primitives
are used to push literals on the stack, encoding the literal in the
primitive name ("_PL<hey>") :-)
I'm just kidding :-)

I think the point is that bytecode should be compact. With 2 bits for the
code you would have to use 2 bits for the literal index if you want to put
two bytecodes in a byte (nibblecodes?), which is a little inconvenient. So
you limit the code to 2 bits and have 6 bits for the literal index, which
is too much, because indexes that large are rarely used. Without the
pushSelf bytecode and resends your code would be larger than the current
code by about 10% (off the top of my head). I don't really see a reason for
limiting the bytecodes to 4. I think the minimum is 6 (pushSelf,
implicitSend, send, pushLiteral, nonLocalReturn, directedResend; undirected
is simply directed to self, with the directee in tos), but if it is 6 it
may as well be 8, since 6 codes already need 3 bits.
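
The trade-off between opcode bits and index bits can be put in numbers.
The sketch below counts how many bytes one instruction needs for a given
literal index, assuming one byte per bytecode and assuming an "index"-style
prefix is available at every width (the four-bytecode proposal as quoted
has none, which is exactly the question above). The helper name
bytes_needed is hypothetical.

```python
def bytes_needed(index, index_bits):
    """Bytes for one instruction with literal index `index`.

    ASSUMPTION: each byte carries `index_bits` bits of index, and every
    extra `index_bits` of high-order bits costs one prefix byte.
    """
    count = 1
    while index >= (1 << index_bits):
        index >>= index_bits
        count += 1
    return count


def directly_addressable(opcode_bits):
    """Number of literal indexes that fit in one byte without a prefix."""
    return 1 << (8 - opcode_bits)


if __name__ == "__main__":
    for opcode_bits in (2, 3, 4):
        index_bits = 8 - opcode_bits
        print(opcode_bits, "opcode bits:",
              directly_addressable(opcode_bits), "direct indexes,",
              bytes_needed(20, index_bits), "byte(s) for index 20")
```

With 2, 3 and 4 opcode bits a single byte reaches 64, 32 and 16 literals
respectively, so the wider opcode only costs an extra byte for methods
with more than 16 literals.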

I think bytecodes must encode the behaviour of a method in the most
efficient way; they need not follow the philosophy of Self step by step.

I'm interested to see how many "index" bytecodes there are in the 4.1.2
world as opposed to the 4.0 world, and also the mean length of methods in
the two systems. Can anyone try to get this information?
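
The measurement itself would be simple once the raw bytecodes of every
method are available. A minimal sketch, assuming each method is given as a
Python bytes object, that the opcode occupies the high bits of each byte,
and that the "index" opcode is 0 (as in the 4.1.2 list quoted above; for
4.0 this is an assumption to check). How to extract the bytecodes from a
running Self world is not covered here, and index_stats is a hypothetical
name.

```python
def index_stats(methods, opcode_bits, index_opcode=0):
    """Return (number of "index" bytecodes, mean method length in bytes).

    methods      -- iterable of bytes objects, one per method (assumed input)
    opcode_bits  -- 3 for the 4.0 format, 4 for the 4.1.2 format
    index_opcode -- opcode value of the "index" bytecode (assumed to be 0)
    """
    shift = 8 - opcode_bits  # opcode assumed to sit in the high bits
    total_bytes = 0
    index_count = 0
    method_count = 0
    for code in methods:
        method_count += 1
        total_bytes += len(code)
        index_count += sum(1 for byte in code if (byte >> shift) == index_opcode)
    mean_length = total_bytes / method_count if method_count else 0.0
    return index_count, mean_length
```

Running it once over the methods of each world, with opcode_bits set to 3
and 4 respectively, would give the two pairs of figures to compare.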

What are the advantages of the 4.1.2 bytecodes? Is it because of the
interpreter?
I have implemented the VM using 8 bytecodes. Do you think it would be
helpful to use the 4.1.2 scheme?

Marko