[self-interest] self-ish factoring
kyle_hayes at pacbell.net
Thu Jan 11 04:01:25 UTC 2001
On Monday 08 January 2001 12:44, you wrote:
> for the 37380 methods in the Demo snapshot, we have this distribution
> of source length (31886, or 85%, are less than 1000 bytes long):
> 0-99 10776
> 100-199 4778
> 200-299 3791
> 300-399 4110
> 400-499 2780
> 500-599 1773
> 600-699 1212
> 700-799 1063
> 800-899 887
> 900-1000 716
> 1000-1999 4173
> 2000-2999 593
> 3000-3999 407
> 4000-4999 99
> 5000-5999 28
> 10500 1
> 219700 193
Interesting. Not quite what I was thinking about, but still quite
interesting. These are a bit shorter than I was thinking, but these are
bytes, not method calls.
> Repeating the same thing but counting in number of lines we have
> (20085, 53%, are less than 10 lines long):
> 1 5615
> 2 2101
> 3 2721
> 4 2630
> 5 1728
> 6 1582
> 7 1391
> 8 1323
> 9 994
> 10-19 10441
> 20-29 3274
> 30-39 1860
> 40-49 622
> 50-59 259
> 60-69 268
> 70-79 128
> 80-89 131
> 90-99 91
> 100-109 25
> 120-129 2
> 238 1
> 3449 193
The histograms of this are interesting. About 28% are between 10 and 20
lines. That actually seems a bit surprising. I would have thought that the
bulge would be in the 5-10 range.
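As a rough check of these proportions, here is a minimal C sketch that recomputes them from the quoted line-count histogram. Only the counts come from the quoted message; the array and function names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Method counts per bucket, copied from the quoted line-count histogram. */
static const long lines_1_to_9[] = {5615, 2101, 2721, 2630, 1728,
                                    1582, 1391, 1323, 994};
/* Buckets 10-19, 20-29, ... and the outliers; the first entry is 10-19. */
static const long lines_10_up[] = {10441, 3274, 1860, 622, 259, 268, 128,
                                   131, 91, 25, 2, 1, 193};

#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

/* Add up n bucket counts. */
static long sum_counts(const long *counts, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        total += counts[i];
    }
    return total;
}

/* Share of `part` in `total`, as a percentage. */
static double percent_of(long part, long total) {
    assert(total > 0);
    return 100.0 * (double)part / (double)total;
}

static long methods_under_ten_lines(void) {
    return sum_counts(lines_1_to_9, COUNT_OF(lines_1_to_9));
}

static long methods_total(void) {
    return methods_under_ten_lines()
         + sum_counts(lines_10_up, COUNT_OF(lines_10_up));
}

static double share_under_ten_lines(void) {
    return percent_of(methods_under_ten_lines(), methods_total());
}

static double share_ten_to_nineteen_lines(void) {
    return percent_of(lines_10_up[0], methods_total());
}
```

With the quoted counts, the total comes to 37380 methods, the share under 10 lines to about 53.7%, and the 10-19 bucket alone to about 27.9%.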
> The 238 line method is one that creates a frameMorph full of little
> icons, all "spelled out" in details. The 193 very large methods seem to
> be code to recreate objects for the tutorial.
They can probably be dropped from the data since they are known exceptions.
> When browsing with an outliner, you can directly see the code for all
> methods which are just one line long when they, plus the method name,
> are short enough to fit in the outliner's width. My impression is that
> from one fourth to one half of the methods I see are in that group,
> which seems to agree with the numbers above.
It is not clear whether lines of code or bytes correspond more closely to
FORTH-style factoring. In fact, I wasn't considering methods at all as part
of Self's factoring, but objects with all their methods. Perhaps looking at
methods is a better metric? It is not clear. FORTH doesn't really support
objects directly (though you can extend it to do so easily enough). It is
quite inefficient in most OO languages to define objects with just a few
methods (for instance two). Generally, objects start "acquiring" more
methods and grow as time goes on.
A better mapping might be objects in OO and vocabularies/wordlists in FORTH?
> I have written some very large methods in Self, unfortunately. These
> were either due to a lack of experience or the need to do complex
> object initializations. So I would expect future Self programs to be
> nearly as well factored as Forth programs. Note that it was easier to
> deal with longer methods in the old, text based Selfs but with shorter
> methods (you don't have to open them) in Self 4.
Graphics are nice, but there is a loss of semantic density. The proceedings
from InterCHI '93 have some stuff on this I think (it was the only one I
attended). There are some things that are simply easier to communicate with
special shorthand symbols. For instance,
(In C). This is very compact. In just a few bytes I can represent an array
operation, a structure field access and a function call with specified
argument. How do you show this graphically?
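To make the point concrete, here is a sketch of the kind of C expression described: an array index, a structure field access, and a function call with an argument, all in one short line. Every name in it is hypothetical, and it is not the expression from the original message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type; all names are invented for illustration. */
struct widget {
    int id;
    int (*scale)(int factor);   /* function pointer stored in a field */
};

static int times_two(int factor) {
    return 2 * factor;
}

static struct widget widgets[] = {
    { 1, times_two },
    { 2, times_two },
};

/* Array indexing, field access, and a call with a fixed argument,
 * packed into a single expression. */
static int demo(size_t i) {
    assert(i < sizeof(widgets) / sizeof(widgets[0]));
    return widgets[i].scale(21);
}
```

The whole operation lives in `widgets[i].scale(21)`: about twenty characters for three distinct operations, which is the density a graphical notation would have to match.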
> Forth has an advantage - there is little overhead for defining words
> and none at all for using them. But Self has objects, not just words.
> And this is a very, very important point: a truly object oriented
> program has most of its "smarts" in the way the objects are connected
> to each other, not inside each object. This makes them hard to
> understand by reading the code - the methods are so short and mostly
> seem to be delegating the same messages to other objects instead of
> actually doing anything.
Ideally each object's interactions with the objects it uses for
implementation is clear from the source. Sometimes it isn't. FORTH
definitely has no advantage here. It is far too easy to write write-only
code in FORTH.
Actually, your point about OO being somewhat _more_ difficult to understand
is interesting and particularly relevant. I'll have to think on this more.
I have long thought that the manner in which the program was presented was
much more important to understanding than grammatical issues.
> The other day I was trying to explain this to a person and comparing
> objects with neural networks. People coming from a procedural language
> tend to create one giant object with a few, large methods and several
> dumb objects (nothing but data slots).
And here I thought that first-time C++ programmers always encapsulate
a char in an object :-) Then they start up the steep slope of copy
I have seen both extremes. In one, first-time OO programmers (with a
procedural background) do as you note. In the other, they make everything
in sight an object and then try to use objects as functions in their main
routine. It generates... interesting code. I have had better luck explaining
that OO is a really nice way of writing clean-looking ADTs.
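For readers who have not met the term, an abstract data type pairs a data structure with the small set of operations meant to be used on it. A minimal sketch in plain C follows, using a hypothetical fixed-size integer stack whose names are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 16

/* The data: a hypothetical fixed-size stack of ints. */
struct int_stack {
    int items[STACK_CAPACITY];
    size_t count;
};

/* The operations: the only functions callers are meant to use. */
static void stack_init(struct int_stack *s) {
    s->count = 0;
}

/* Returns 1 on success, 0 if the stack is already full. */
static int stack_push(struct int_stack *s, int value) {
    if (s->count == STACK_CAPACITY) {
        return 0;
    }
    s->items[s->count++] = value;
    return 1;
}

/* Popping an empty stack is treated as a caller bug. */
static int stack_pop(struct int_stack *s) {
    assert(s->count > 0);
    return s->items[--s->count];
}

static size_t stack_size(const struct int_stack *s) {
    return s->count;
}
```

An OO language expresses the same pairing directly, by attaching the operations to the type instead of relying on a naming convention such as the `stack_` prefix used here.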
FORTH is "clean" because it lets you define what to do very easily. It gives
you no intrinsic tools to describe what you do it with. That makes FORTH
programs all verbs. There are few nouns in FORTH. OO programming is about
noun-verb combinations and seems closer to natural language.