[self-interest] Re: Caching method lookup smalltalk

Wed Sep 1 20:49:44 UTC 1999

Jecel:
>Me:

>> It's a well known optimization for Smalltalk systems, to cache
>> 
>> <class, selector> -> <method>
>> 
>> mappings.  Class is the receiver's class, not the method's implementation
>> class.  This means that more than one key can refer to the same method.
>
>Exactly. But note that the class of "self" is different for
>executions of the same method associated with different keys.
>This suggests compiling different native codes for the same
>source method, which is the idea behind "customization".

Either you didn't understand my example or I don't understand your answer
(probably the latter, I guess).  You're talking about Smalltalk, right?
Why is the class of "self" different, and what keys do you refer to?

>> Can I simply replace "class" with "map object" in Self and use the same
>> caching with the same results?

>Yes, maps can be a part of the key for the cache.

?

>> Now, using #doesNotUnderstand in Smalltalk is not that uncommon and using
>> DNU should be cachable, too.  I'm not sure how to extend the mechanism, but
>> I think, mapping
>> 
>> <myclass, #foobar> -> <Object>>doesNotUnderstand:>
>> 
>> after you noticed that #foobar doesn't exist, should be sufficient. Am I
>> wrong? 
>
>Supposing that myclass overrides #doesNotUnderstand:, it would
>be better to map <myclass, #foobar> to <myclass>>doesNotUnderstand:>,
>right?

Of course.  Here's the example (for Smalltalk):  Suppose we have classes A,
B (a subclass of A) and C (a subclass of B).  Suppose that A implements
method x (which I write as A>>x).  Let a, b, c be instances of A, B and C.

If we have "c x", this generates  <C, #x> -> <A>>x>
If we have "b x", this generates  <B, #x> -> <A>>x>

If we now add x to B, the caches need to be flushed, and the same message
sends now generate different entries.  Is this what you were referring to
above?

"c x" -->  <C, #x> -> <B>>x>
"b x" -->  <B, #x> -> <B>>x>
"a x" -->  <A, #x> -> <A>>x>
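
To make the example above concrete, here is a minimal C sketch of such a
<class, selector> -> <method> cache, including the doesNotUnderstand:
fallback discussed earlier.  All names (Class, lookup, add_method,
flush_cache, the cache size, selectors as C strings) are made up for
illustration and are not taken from any particular Smalltalk VM.  It flushes
the whole cache whenever a method is added, which is the crudest possible
policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct Class Class;
typedef const char *(*Method)(void);      /* a "method" just reports its name */
typedef struct { const char *selector; Method code; } MethodEntry;

struct Class {                            /* hypothetical class layout */
    const char *name;
    Class *superclass;
    MethodEntry methods[4];
    int method_count;
};

typedef struct { Class *cls; const char *selector; Method code; } CacheEntry;

#define CACHE_SIZE 64
static CacheEntry cache[CACHE_SIZE];
static int full_lookups;                  /* counts cache misses */

static void flush_cache(void) {
    for (int i = 0; i < CACHE_SIZE; i++) {
        cache[i].cls = NULL; cache[i].selector = NULL; cache[i].code = NULL;
    }
}

/* Hash the receiver's class and the selector into a cache slot. */
static unsigned slot_for(const Class *cls, const char *selector) {
    unsigned h = (unsigned)((uintptr_t)cls >> 4);
    for (const char *p = selector; *p; p++) h = h * 31u + (unsigned char)*p;
    return h % CACHE_SIZE;
}

/* The normal lookup: walk the superclass chain. */
static Method slow_lookup(Class *cls, const char *selector) {
    for (Class *c = cls; c != NULL; c = c->superclass)
        for (int i = 0; i < c->method_count; i++)
            if (strcmp(c->methods[i].selector, selector) == 0)
                return c->methods[i].code;
    return NULL;
}

/* The key is the receiver's class, not the implementing class. */
static Method lookup(Class *receiver_class, const char *selector) {
    CacheEntry *e = &cache[slot_for(receiver_class, selector)];
    if (e->cls == receiver_class && e->selector != NULL &&
        strcmp(e->selector, selector) == 0)
        return e->code;                   /* cache hit */
    full_lookups++;
    Method m = slow_lookup(receiver_class, selector);
    if (m == NULL)                        /* cache the DNU handler instead */
        m = slow_lookup(receiver_class, "doesNotUnderstand:");
    e->cls = receiver_class; e->selector = selector; e->code = m;
    return m;
}

/* Adding a method may change any cached result, so flush everything. */
static void add_method(Class *cls, const char *selector, Method code) {
    assert(cls->method_count < 4);
    cls->methods[cls->method_count++] = (MethodEntry){ selector, code };
    flush_cache();
}

/* The classes from the example: A implements x, B and C inherit it. */
static const char *A_x(void)   { return "A>>x"; }
static const char *B_x(void)   { return "B>>x"; }
static const char *A_dnu(void) { return "A>>doesNotUnderstand:"; }

static Class A = { "A", NULL, { { "x", A_x }, { "doesNotUnderstand:", A_dnu } }, 2 };
static Class B = { "B", &A, { { NULL, NULL } }, 0 };
static Class C = { "C", &B, { { NULL, NULL } }, 0 };
```

With this, lookup(&C, "x") yields A>>x until add_method(&B, "x", B_x) is
called, after which both B and C resolve to B>>x.  Because the DNU handler
is looked up starting at the receiver's class, an override in a subclass
would be the one that gets cached.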

If B>>x contained a "super x", we would have to pretend to be
superclass(receiver), that is an A, to do the lookup.  The invocation is of
course still made for b.

>The answer is to do away with all of this and use reflection with
>some aggressive optimization (partial evaluation, for example).

Do you have a concrete example of why reflection would help here, and how?

>As I mentioned in my other email, this is very complicated. The
>error ends up in a method in the process object, and it checks
>if the object happens to understand the message 'undefinedSelector:...'

I'd do it the other way round.  First let the VM determine whether there's
the right method.  If yes, call it directly without notifying the process
(or thread object, as I would probably call it).  Otherwise, it's the right
way (at least the way I'd have expected) to notify the process that there's
an object that doesn't understand what it should.

This should improve performance, as the VM is probably the part of the
system that can do the fastest method lookup and invocation.  Anything I
missed?

BTW, what if there's no process object, or that object doesn't understand
the right message?  Kernel panic?  ;-)  Perhaps the right time to reinvent
the "guru meditation"...

>In Self 4.0, all methods are connected via doubly linked lists
>to all slots that were involved in its compilation. If *any*
>of them are changed (eliminated, new one with same name added
>in child, etc.) then it is purged from the compiled code cache.

Probably the only way to do it.  Does anybody have a better idea?  Dave,
perhaps?  Anything you'd change for a new Self system?
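
For readers who want to picture the dependency scheme quoted above, here is
a rough C sketch.  It is not Self 4.0's actual data structure; Slot,
CompiledMethod, Dep, depend_on, purge and slot_changed are names I made up.
Each dependency link sits in a doubly linked list hanging off its slot, so
that purging a compiled method can unlink all of its dependencies in
constant time each.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Dep Dep;
typedef struct Slot { const char *name; Dep *deps; } Slot;
typedef struct CompiledMethod {
    const char *name;
    int valid;               /* 0 once purged from the code cache */
    Dep *deps;               /* all links owned by this method */
} CompiledMethod;

struct Dep {
    Slot *slot;
    CompiledMethod *method;
    Dep *prev, *next;        /* neighbours in slot->deps */
    Dep *next_in_method;     /* next link owned by the same method */
};

/* Record that compiled method m used slot s during compilation. */
static void depend_on(Dep *d, CompiledMethod *m, Slot *s) {
    assert(m->valid);        /* only live code acquires dependencies */
    d->slot = s; d->method = m;
    d->prev = NULL; d->next = s->deps;
    if (s->deps) s->deps->prev = d;
    s->deps = d;
    d->next_in_method = m->deps;
    m->deps = d;
}

/* Invalidate m and unlink it from every slot it depended on. */
static void purge(CompiledMethod *m) {
    for (Dep *d = m->deps; d != NULL; d = d->next_in_method) {
        if (d->prev) d->prev->next = d->next; else d->slot->deps = d->next;
        if (d->next) d->next->prev = d->prev;
    }
    m->deps = NULL;
    m->valid = 0;
}

/* Called when slot s is changed in any way; returns the number purged. */
static int slot_changed(Slot *s) {
    int purged = 0;
    while (s->deps != NULL) { purge(s->deps->method); purged++; }
    return purged;
}
```

The back pointer (prev) is what makes the unlinking in purge cheap; with
singly linked lists each removal would need a walk along the slot's list.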

>PICs are very hard to integrate into an interpreter. Note that
>your cache works at the receiving site (and so mixes the results
>of all callers), while ICs and PICs work at the calling site, and
>so there are more of them and they are more precise.

I know.  This is why inline caches are considered to achieve even better
hit ratios.  Creating a doubly linked list of all PICs could work.
This -- written without further checking -- should do the trick:

if (receiver.map != map_I_expect_here) {
  push(receiver);
  map_I_expect_here = receiver.map;
  method = do_the_normal_lookup(selector);
}
invoke_remembered(method);
// "map_I_expect_here" and "method" are static local vars;
// additional storage is needed for "prev_cache" and "next_cache"

which would normally be written in some kind of assembler.  One problem
could be that the cache-hit case needs a taken jump, which is bad for modern
processors, so the logic should be inverted.  However, in all cases we'd
need a jump over the static variables if they're stored local to the
invocation.  One could set up an external cache array which would then store
(map, method) pairs (prev and next aren't needed in that case), but this
would break locality of the code.  Self-modifying code, which would hide the
vars as constants in assignments to some dummy register, would be yet
another solution.
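
As a cross-check, the fragment above can be spelled out as compilable C.
This is only a sketch under simplifying assumptions: every map has a single
slot x, the per-site "static" variables live in an explicit CallSite
record, and all names except do_the_normal_lookup are made up.  The
prev/next fields chain all call sites together so that they can be flushed
in one pass.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Map Map;
typedef int (*Method)(int);
struct Map { const char *name; Method x; };   /* one-slot map, selector x */
typedef struct { Map *map; int value; } Object;

typedef struct CallSite CallSite;
struct CallSite {
    Map *expected_map;       /* "map_I_expect_here" */
    Method method;           /* the remembered method */
    CallSite *prev, *next;   /* "prev_cache" and "next_cache" */
};

static CallSite *all_sites;  /* head of the list of all inline caches */
static int slow_lookups;     /* counts misses */

static void register_site(CallSite *s) {
    s->prev = NULL; s->next = all_sites;
    if (all_sites) all_sites->prev = s;
    all_sites = s;
}

/* O(1) removal, e.g. when the code containing the site is discarded. */
static void unregister_site(CallSite *s) {
    if (s->prev) s->prev->next = s->next; else all_sites = s->next;
    if (s->next) s->next->prev = s->prev;
    s->prev = s->next = NULL;
}

/* Forget every cached (map, method) pair, e.g. after a slot changed. */
static void flush_all_sites(void) {
    for (CallSite *s = all_sites; s != NULL; s = s->next) {
        s->expected_map = NULL; s->method = NULL;
    }
}

static Method do_the_normal_lookup(Map *map) { slow_lookups++; return map->x; }

/* Send x to receiver through the inline cache of one call site. */
static int send_x(CallSite *site, Object *receiver) {
    if (receiver->map != site->expected_map) {        /* miss */
        site->expected_map = receiver->map;
        site->method = do_the_normal_lookup(receiver->map);
    }
    assert(site->method != NULL);
    return site->method(receiver->value);
}

/* Two example maps with different implementations of x. */
static int twice(int v)  { return 2 * v; }
static int negate(int v) { return -v; }
static Map point_map = { "point", twice };
static Map other_map = { "other", negate };
```

Each site remembers one map only, so a site that alternates between two
receiver maps misses on every send; that is the case a PIC would address.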

bye
--
Stefan Matthias Aust  //  Before we fall, we would rather stand out.