[self-interest] Re: Caching method lookup smalltalk

Tue Aug 31 14:12:37 UTC 1999

Stefan Matthias Aust wrote:
> It's a well known optimization for Smalltalk systems, to cache
> 
> <class, selector> -> <method>
> 
> mappings.  Class is the receiver's class, not the method's implementation
> class.  This means, more than one key could reference to the same method.

Exactly. But note that the class of "self" is different for
executions of the same method associated with different keys.
This suggests compiling different native codes for the same
source method, which is the idea behind "customization".

> Can I simply replace "class" with "map object" in Self and use the same
> caching with the same results?  Map objects are used to share common slot
> layout and behavior of objects.  They're transparent to the user.

Yes, maps can be a part of the key for the cache.

> Now, using #doesNotUnderstand in Smalltalk is not that uncommon and using
> DNU should be cachable, too.  I'm not sure how to extend the mechanism, but
> I think, mapping
> 
> <myclass, #foobar> -> <Object>>doesNotUnderstand:>
> 
> after you noticed that #foobar doesn't exists, should be sufficient. Am I
> wrong? 

Supposing that myclass overrides #doesNotUnderstand:, it would
be better to map <myclass, #foobar> to <myclass>>doesNotUnderstand:>,
right?

> I recently read in comp.lang.smalltalk, that DNUs aren't slower
> than normal method calls anymore in most Smalltalk system so it can be
> optimized.  But how?

Certainly using the cache they way I indicated above would help,
but you still would have an indirection in that the method
myclass>>doesNotUnderstand: would do some kind of dispatching
before getting to the actual code of #foobar, and that would be
slower than the built-in message dispatching.

The answer is to do away with all of this and use reflection with
some agressive optimization (partial evaluation, for example).

> By the way, how does Self notify the user?
> 
> Are 'doesNotUnderstand:' and 'ambiguousMessage:' (to make up two names, I
> don't know the right names at the moment) part of "defaultBehavior"?  What
> if an object doesn't implement or inherit these messages?  Are these
> messages sent to another object?

As I mentioned in my other email, this is very complicated. The
error ends up in a method in the process object, and it checks
if the object happens to understand the message 'undefinedSelector:...'
or 'ambiguousSelector:...' or so on. If it does, then the process
uses the _Perform: primitive to send on of these messages to
the object, letting it handle the error itself. If the object
doesn't implement the corresponding message (doesn't understand
#doesNotUnderstand ;-) then the debugger is invoked. So error
handling works even if the object doesn't inherit from anybody
at all.

> Let's assume it would work so far.  What if the inheritance graph changes?
> Flushing all caches would be a solution (of course) but can we find some
> better way please?

This is the way I would do it in a first implementation.

> If a map gets or loses a parent slot or a parent slot's value is modified,
> we have to flush all cache entries for that map and all maps which inherit
> from that map (further complicated because of the fact that inheritance
> isn't a nice tree but directed graph, not even anti-cyclic).

In Self 4.0, all methods are connected via doubly linked lists
to all slots that were involved in its compilation. If *any*
of them are changed (eliminated, new one with same name added
in child, etc.) then it is purged from the compiled code cache.

I don't like all this overhead, but you might consider some
variation of this idea to optimize flushing your lookup cache.

> Polymorphic inline caches are said to give even better caching results but
> there are more difficult to flush I think, because they are distributed
> over the whole code.  You'd probably have to maintain a linked list of
> these caches.

PICs are very hard to integrate into an interpreter. Note that
your cache works at the receiving site (and so mixes the results
of all callers), while ICs and PICs work at the calling site, and
so there are more of them and they are more precise.

Please look at the papers at http://www.cs.ucsb.edu/oocsb/papers.shtml
(specially the ones about Dynamic Dispatch).

-- Jecel