[self-interest] self-ish factoring

Fri Jan 12 18:38:54 UTC 2001

On Thu, 11 Jan 2001, Kyle wrote:
> The histograms of this are interesting.  Nearly 25% are between 10 and 20 
> lines.  That actually seems a bit surprising.  I would have thought that the 
> bulge would be in the 5-10 range.

Actually, there is a very large spike for 10-line methods. That did
seem odd, so I took a closer look at them. There were a lot of repeated
methods - only 65 were unique, while 251 had 6 repetitions each:

    repetitions  methods
              1       65
              2       53
              3       59
              4       43
              5       62
              6      251
              7        5
              8        2
              9        5
             10        4
             12        7
             27        1

Most of these seem to be machine generated instead of written by hand,
though it isn't easy to be sure without further investigation. In any
case, they all have the pattern

     _some_primitive_IfFail:

     [|:e| ('badTypeError' isPrefixOf: e)
        ||  ['deadProxyError' isPrefixOf: e]
            ifFalse: [ ^fb value: e]
               True: [
                      ( reviveIfFail: [|:e| ^ fb value: e])
                          _some_primitive_IfFail:  fb
      ]]

If we eliminate all these from consideration, then methods in Self are
much shorter than they initially seemed.
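
For anyone who wants to try this kind of count on their own dump of
method sources, here is a minimal Python sketch. It is an illustration
only, not necessarily how the numbers above were produced. It assumes
each method body is available as a plain string and that "repeated"
means textually identical after trimming whitespace; the function names
and the toy data are made up for the example.

```python
from collections import Counter


def repetition_histogram(method_bodies):
    """Map 'number of copies' -> 'how many distinct bodies occur that often'."""
    copies_per_body = Counter(body.strip() for body in method_bodies)
    return Counter(copies_per_body.values())


def length_histogram(method_bodies, skip_repeated=False):
    """Map 'line count' -> 'number of methods'.

    With skip_repeated=True, every body that occurs more than once is
    left out, which approximates ignoring machine-generated methods.
    """
    copies_per_body = Counter(body.strip() for body in method_bodies)
    lengths = Counter()
    for body in method_bodies:
        key = body.strip()
        if skip_repeated and copies_per_body[key] > 1:
            continue
        lengths[len(key.splitlines())] += 1
    return lengths


# Toy data (hypothetical): one body appears twice, two appear once.
bodies = ["a\nb", "a\nb", "c", "d\ne\nf"]
by_copies = repetition_histogram(bodies)
all_lengths = length_histogram(bodies)
unique_lengths = length_histogram(bodies, skip_repeated=True)
```

Comparing all_lengths with unique_lengths shows how much of a spike at
a given length is due to repeated bodies.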

> It is not clear whether lines of code or bytes correspond more closely to 
> FORTH-style factoring.  In fact, I wasn't considering methods at all as part 
> of Self's factoring, but objects with all their methods.  Perhaps looking at 
> methods is a better metric?  It is not clear.  FORTH doesn't really support 
> objects directly (though you can extend it to do so easily enough).  It is 
> quite inefficient in most OO languages to define objects with just a few 
> methods (for instance two).  Generally, objects start "acquiring" more 
> methods and grow as time goes on.

I think there is a relation - you use what is easy to do. Making new
words in Forth is trivial, as is making new kinds of objects in Self.

> A better mapping might be objects in OO and vocabularies/wordlists in FORTH?

Probably not. Though words and methods are different things, they don't
feel too different. Vocabularies are very heavyweight things, so Forth
programs tend to use just one or two.

> Graphics are nice, but there is a loss of semantic density.  The proceedings 
> from InterCHI '93 have some stuff on this I think (it was the only one I 
> attended).  There are some things that are simply easier to communicate with 
> special shorthand symbols.  For instance,
> 
> 	a[42]->y(x);
> 
> (In C).  This is very compact.  In just a few bytes I can represent an array 
> operation, a structure field access and a function call with specified 
> argument.  How do you show this graphically?

I was comparing

   ( |  meth1 = ( | x | ....
        | ).
        meth2: z = ( | a <- 1 | ....
        | ).
   | )

with the outliners. Though the outliners take up fewer pixels, they are
cleaner (thin lines instead of characters) and don't get in the way as
much as the textual representation. In text, you want longer methods
since then you have a lower percentage of "overhead" characters. With
outliners, this isn't a problem.

> Actually, your point about OO being somewhat _more_ difficult to understand 
> is interesting and particularly relevant.  I'll have to think on this more.  
> I have long thought that the manner in which the program was presented was 
> much more important to understanding than grammatical issues.  

Here Self has a great advantage over other OO languages in that its
natural representation is a graph of box-like objects. If the meaning
of a program is in this graph, then having it be invisible is a real
obstacle.

> I have seen both extremes.  In one, first time OO programmers (with a 
> procedural background) do as you note.   In the other, they make everything 
> in sight an object and then try to use objects as functions in their main 
> routine.  It generates... interesting code. I have had better luck explaining 
> that OO is a really nice way of writing clean looking ADTs.

The idea of the Flyweight pattern is that it is possible to go too far
in "objectifying" stuff. Should every character in a text be a real
object? Most people would say no, but I will repeat Dave Ungar's motto
here: "anything worth doing is worth overdoing" :-)
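
To make the Flyweight idea concrete, here is a rough Python sketch
under my own assumptions: the names Glyph, GlyphFactory and layout are
hypothetical and not taken from any particular editor or from Self. The
text holds one small shared object per distinct character, and the
position of each occurrence is kept outside the object.

```python
class Glyph:
    """Per-character data shared by every occurrence of that character."""

    __slots__ = ("char",)

    def __init__(self, char):
        self.char = char


class GlyphFactory:
    """Hands out one shared Glyph per distinct character."""

    def __init__(self):
        self._pool = {}

    def get(self, char):
        glyph = self._pool.get(char)
        if glyph is None:
            # First time this character is seen: create and remember it.
            glyph = self._pool[char] = Glyph(char)
        return glyph

    def __len__(self):
        return len(self._pool)


def layout(text, factory):
    """Return (position, shared glyph) pairs; the position is kept outside the glyph."""
    return [(index, factory.get(char)) for index, char in enumerate(text)]


factory = GlyphFactory()
placed = layout("banana", factory)
# Six characters are placed, but only three Glyph objects exist,
# because every "a" and every "n" reuses the same object.
```

So the text still looks like a sequence of character objects, but the
number of real objects grows with the alphabet, not with the length of
the text.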

> FORTH is "clean" because it lets you define what to do very easily.  It gives 
> you no intrinsic tools to describe what you do it with.  That makes FORTH 
> programs all verbs.  There are few nouns in FORTH.  OO programming is about 
> noun verb combinations and seems closer to natural language.

One thing that is important to remember about Forth is that it limits
your ambitions. Sort of like when you just have wood and cloth to build
an aircraft with, you will think differently than if you had composite
materials (I was going to say Titanium, but it seems that Apple is
using all there is ;-)

So I don't expect to see a Kansas written in Forth, and that is ok. The
world needs lots of small stuff.

Anyway, Forth is a pyramid of words while Self is a graph of objects.
It is interesting that there is any comparison between them at all.

-- Jecel