OO Machines

Wed Feb 5 12:19:51 UTC 1992

>  This work is several years old and I confess to not keeping up with more
>  recent developments.  In fact, I heard a rumor that the researchers at
>  Linn have moved on to other problems.  Perhaps others on self-interest
>  can provide an update.

Linn Smart Computing, the subsidiary of Linn (who make top-end hi-fi)
responsible for the Rekursiv, was folded up, oh, about 18 months ago
I'd guess.  Last I heard, Harland was at the Design Research Centre in
Edinburgh; I don't know what he was working on.

>  > If I remember correctly there was an article in Byte on this project
>  > several years ago.  Seems to me it supported Smalltalk directly in
>  > hardware.
>  
>  It *claimed* to support Smalltalk directly in hardware.  I don't believe
>  they ever got a full Smalltalk system (as opposed to some small examples
>  that didn't deal with I/O or large-scale memory management) running.

Along with about 20 other groups, my group had a Rekursiv board for
evaluation (still do, but we don't use it).  We had a researcher, Ian
Piumarta (ikp at cs.man.ac.uk), implement Scheme on it.  Other groups
were working on Smalltalk, Eiffel, OODBs and other things.  I don't
recall hearing of anything being made to work (apart from Scheme), but
I may be wrong.  Eliot Miranda (eliot at dcs.qmw.ac.uk) was working on
Smalltalk -- are you out there, Eliot?

The hardware is microprogrammable, and provides direct support for a
virtual object memory (paging is transparent to the microprogrammer).
Most of the projects involved writing some microcode for an
instruction set that was deemed "appropriate" for the language, then
compiling to that instruction set.  I believe Eliot was going to use a
threaded (micro)code implementation.  One of Harland's publications (I
forget which) outlines the microcode for a few of the Smalltalk Blue
Book bytecodes.  I don't know if a more complete implementation was
tried by Linn.

I believe the architecture has been to all intents discredited as a
way of building efficient object-oriented systems.  (If there's anyone
out there who believes otherwise, I'd like to hear why.  The lack of
published work is depressing.  My group is as much to blame for this
as anyone, but the funding structure for the evaluation projects was
absurd, making it almost inevitable that many things didn't get
finished properly.  But that's another story.)

One aim of the design was to make the microcoder's task easy, and I
think this was largely achieved (the name derives from the fact that
you can write recursive microcode).  The price paid was that the
architecture is exceedingly complex.  The board we received was a full
triple-height double-extended Eurocard (the largest board you can get
into a SUN rack) bursting with ICs, including four large custom chips.
The amount of logic is immense.  

I believe much of the hardware does not "pay its way" in terms of
performance benefits.  Just as an example, for every object access a
parallel unit does a bounds check to see whether the offset is valid.
For Smalltalk, most (~99%) of accesses are to named instance
variables, which are guaranteed by the compiler to have valid offsets;
the bounds-check hardware is useful only for the at: and at:put:
primitives.  So this piece of hardware will at best improve
performance by a fraction of 1%.
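
To put rough numbers on this, here is a back-of-envelope sketch in
Python.  The ~1% figure for accesses that need a check is the estimate
given above; the share of run time spent on object accesses
(access_share) and the cost of a software check relative to one access
(check_cost) are assumptions picked for illustration, not measurements.

```python
def bounds_check_saving(access_share, checked_fraction, check_cost):
    """Rough fraction of run time saved by checking bounds in parallel hardware.

    access_share:     share of run time spent on object accesses (assumed)
    checked_fraction: share of accesses that need a bounds check (~0.01 above)
    check_cost:       cost of a software check relative to one access (assumed)
    """
    return access_share * checked_fraction * check_cost


# Illustrative values only: 30% of time in accesses, a check as costly as an access.
saving = bounds_check_saving(access_share=0.3, checked_fraction=0.01, check_cost=1.0)
print(f"{saving:.2%} of run time saved")

# Even if the machine did nothing but object accesses, the ceiling is about 1%.
ceiling = bounds_check_saving(access_share=1.0, checked_fraction=0.01, check_cost=1.0)
```

Under this simple model the saving cannot exceed the fraction of
accesses that need checking, unless a software check costs much more
than the access itself.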

Other features of the design which I think were mistaken are:
1. I mentioned that paging is transparent to the microcoder.  That's
   because when an object is accessed that is not in memory, *the
   clock stops* while some external mechanism (in our case a SUN
   workstation) loads the object into the Rekursiv's memory.  This
   means that nothing executes for the duration of a page fault.
2. Paging objects one-at-a-time is great for memory efficiency (ie,
   not much of primary memory is wasted), but is hopeless at keeping
   the page fault rate down.  This, combined with point 1, makes the
   virtual memory hierarchy extremely slow.  As part of our own
   architecture work we did extensive simulations of this
   object-swapping scheme (as used in LOOM also) vs conventional paged
   memory and our own dynamically grouped virtual memory.  The
   object-swapper was worst, except when memory was ridiculously tight
   (eg running Smalltalk in 16Kbytes).
3. The garbage collector was also implemented in the bowels of the
   system, and was not really of modern design.

The bottom line is that a 5MHz Rekursiv gave a 25MHz SUN 3/280 a run
for its money, but was easily outperformed by a 16MHz SPARC.  This
despite the Rekursiv using relatively sophisticated implementation
technology (the gate arrays were state-of-the-art).

Mario Wolczko

   ______      Dept. of Computer Science   Internet:      mario at cs.man.ac.uk
 /~      ~\    The University              uucp:    mcsun!uknet!man.cs!mario
(    __    )   Manchester M13 9PL          JANET:         mario at uk.ac.man.cs
 `-':  :`-'    U.K.                        Tel: +44-61-275 6146  (FAX: 6236)
____;  ;_____________the mushroom project___________________________________