I decided to check out if the compact encoding I proposed a while ago (with two bits per instruction mixed in the the literals) was a good idea or not.
So first I got me a list of all the methods in the system (actually, their mirrors):
enumerating all enumerate asList copyFilteredBy: [|:e| e isReflecteeMethod]
This yields a list of 26320 elements in the normal Snapshot. Then, in this list, I evaluated the following expression to see how the coding would effect each method:
| d <- dictionay copyRemoveAll | do: [ | :m. delta <- -4 |
"delta starts out with -4 since we will no longer need a bytecode object for each method, and each bytevector object has 4 words of overhead"
delta: delta - (m codes size /+ 4).
"here we also discount the bytes themselves, coverting them to a rounded up number of words"
delta: delta + (m codes size /+ 15).
"on the other hand, the instructions now add words to the literal vector. But now we pack them 15 to a word"
m byteCodesDo: [ | :bci. :op. :lit | delta: delta + 1.
"we suppose that every single instruction will need exactly one literal. Note that in the case of pushSelf that originally didn't need a literal, now we have changed it to implicitSend 'self' which does need a literal"
op = bytecodeFormat opcodes resendOp ifTrue: [delta: delta+1].
"by converting resends to use primitives, we will need an extra literal for them"
op = bytecodeFormat opcodes return ifTrue: [delta: delta-1].
"return instructions don't need a literal, so we correct our mistake here" ]. delta: delta - m literals size.
"since we want to know how many words are added to the literal vector, we have to discount how many were in there before"
d at: delta Put: (d at: delta IfAbsent: 0) + 1.
"count yet another method with delta amount of memory words added" ]. d
So now we have a 70 element dictionary that has as keys the number of words that my enconding scheme would add to Self and as values the number of methods that will increase by that amount. Some simple expressions will reveal that:
number of methods that increase or stay the same: 959 memory increase due to these methods: 8200 words number of methods that decrease in size: 25361 memory decrease due to these methods: 102380 words
total memory gain due to new encoding: 94180 words
Since that is almost half a megabyte, it might turn out to be a good idea after all.
Another space saving measure (I haven't tested this one to see if it is any good) would be to allow singleton objects to be their own maps. Only when a new object is copied for the first time would we bother to separate it from the map. That might save some memory for all these methods.
-- Jecel
Hi!
Jecel Assumpcao Jr wrote:
I decided to check out if the compact encoding I proposed a while ago (with two bits per instruction mixed in the the literals) was a good idea or not.
I think, this was your encoding:
00 NON_LOCAL_RETURN 01 PUSH_NEXT_LITERAL 10 SEND_NEXT_LITERAL 11 SELF_SEND_NEXT_LITERAL (or end of method marker, if no more literals are available)
Instructions and literals both stored in one 32-bit word array.
I don't remember what your idea was when you've more than 15 instructions, but I think, you could use the following algorithm:
You actually don't need the "00" instruction. Non-local returns only occur in block objects and are always the last instruction. You could encode in the block object whether it's a block with "^" non-local return or a normal block. Now you could use "00" as mark for the end of the instruction stream. If you now reach the 16th instruction and it's no "00", you will use the word following the last used literal as source for the next instructions. Here's some (pseudo) Java-code to illustrate this. Note that instructions are stored "the wrong way":
int ip = 0; // instruction pointer int lp = 0; // literal pointer loop:while (true) { int iw = method.words[ip]; for (int n=0; n<16; n++) { int i = iw & 3; // instruction iw >>>= 2; // always shifts-in 0 and doesn't propagate sign bit switch (i) { case 0: break loop; case 1: int literal = method.worlds[++lp]; //... //... } ip = ++lp; } } if (method.hasNonLocalReturn()) // ... else //...
However, I wonder how your idea compares to this variant (which I wanted to use for my mySelf interpreter):
The idea is to minimize the number of literal references, which need 4 bytes each time. One need to identify the 63 most common literals, selectors and/or primitive selectors. These are directly encoded. Only otherwise a literal vector is used, encoded into the byte vector similar to the above idea.
Each instruction is byte-sized, which also helps to efficiently interpret them, I think. Everything is stored in 32 bit words. Upto four instructions are followed by zero or more (at most four) literal references. Each instruction encodes a type in 2 bits and carries a 6 bit argument.
0 special instruction - normal return - non-local return - push self - push special literal (nil, true, false, -1, 0, 1, 2, 10) and these are useful for interpreters which optimize a few control flow methods... - drop stack - dup stack - conditional branch - unconditional branch - push next block and still 48 instructions free!
1 push literal - push next literal - push special literal (1..63)
2 stack send - send next literal - send special literal (1..63) (probably things like #traits, #at:, #clone, #copy, #size, etc.)
3 implicit self send - send next literal - send special literal (1..63) (same as with 2 send)
Either of the return instructions end the method's (or block's) execution. This means, the shortest possible code is "push self", "normal return".
The method "clone = (cloneSize: 10)" whould be encoded as "implicit self send"+#cloneSize:, "push special"+10, "normal return" - still one word (assuming that cloneSize: is one of the special literals). Jecels encoding would need two words - as mine would without the special literal encoding.
It's difficult for me to guess what would save more space.
In any case, I agree that Jecel's encoding is an improvement over the old system.
Another space saving measure (I haven't tested this one to see if it is any good) would be to allow singleton objects to be their own maps. Only when a new object is copied for the first time would we bother to separate it from the map. That might save some memory for all these methods.
I don't think that this saving is worth the additional time. I don't know the original self system, but I'd guess that oddballs aren't that common.
bye
bye -- Stefan Matthias Aust // Bevor wir fallen, fallen wir lieber auf.
Stefan Matthias Aust wrote:
[suggestion that non_local_return instruction isn't needed]
Great idea! It is interesting that this was what I did in my 1984 bytecodeless Smalltalk (http://www.lsi.usp.br/~jecel/st84.txt).
Your C code for parsing this encoding show that it isn't very complicated.
[alternate encoding] 0 special instruction
- normal return
- non-local return
- push self
- push special literal (nil, true, false, -1, 0, 1, 2, 10)
and these are useful for interpreters which optimize a few control flow methods...
- drop stack
- dup stack
- conditional branch
- unconditional branch
- push next block
and still 48 instructions free!
Once I thought of making the bytecodes complete enough so you could actually write primitives in them. It was a pretty neat idea (and would make porting much easier), but some primitives (like one to manipulate the MMU in the processor) would still require special treatment, so I gave up on this.
1 push literal
- push next literal
- push special literal (1..63)
Here are the ten most popular literals (as counted by references from bytecodes):
3872 'ifTrue:' 3388 ',' 2520 'e' 2466 '=' 2292 'value' 1714 'copy' 1692 's' 1667 'value:' 1550 'm' 1518 0 1429 'fb'
I was very surprised not to see 'i' among these :-)
2 stack send
- send next literal
- send special literal (1..63) (probably things like #traits, #at:, #clone, #copy, #size, etc.)
3 implicit self send
- send next literal
- send special literal (1..63) (same as with 2 send)
Adding up the number of times that the 63 most popular literals appear in the literal frames (not quite the same as the count above, but the results are very close) we would be saving a total of 43903 words with your encoding. Not bad at all!
Either of the return instructions end the method's (or block's) execution. This means, the shortest possible code is "push self", "normal return".
The method "clone = (cloneSize: 10)" whould be encoded as "implicit self send"+#cloneSize:, "push special"+10, "normal return" - still one word (assuming that cloneSize: is one of the special literals). Jecels encoding would need two words - as mine would without the special literal encoding.
I think you got the order of your bytecodes wrong - you first have to push the arguments on the stack and *then* send the message.
[singleton objects could be their own maps]
I don't think that this saving is worth the additional time. I don't know the original self system, but I'd guess that oddballs aren't that common.
Every single method in a system is a differente *type* of object and has its own map, so there are quite a few "oddballs". I'll explain in more detail when I answer your other emails tomorrow.
-- Jecel
self-interest@lists.selflanguage.org