[self-interest] oldParser. using ?

Mon Oct 22 23:26:22 UTC 2001

Sorry about the long delay. The local "high tech fair" I was busy with 
(see http://www.merlintec.com) ended just yesterday.

On Monday 15 October 2001 09:11, Marko Mikulicic wrote:
> I made some hacks to implement resend but I have added categorization
> so now the code is more readable. The parser is surprisingly elegant
> and simple and it
> would be a good example of the power of self

That is something that was specifically disclaimed for the previous 
parser :-)

> (although it doesn't use
> many Self-related features; it could
> in fact have been rewritten in Smalltalk without too many problems, but in
> Self it looks more elegant).

That is usually the case. I like to point out to people (I did so a lot 
in the last few days...) that the end result is not very different for 
Self and Smalltalk, but the process of getting there is. If that isn't 
important for them (people who only study programs instead of actually 
writing them), then they don't see what the big deal is.

>  The parser builds only the parse tree. A bytecode generator would

Were you going to write "easy to write"? I would agree with that.

> They use strings, but they separate object annotations from slot
> annotations.
> If I understood the annotation mechanism correctly, the parser takes the
> nested annotations and flattens
> them into a string which is stored in the object's map. I want to know
> how it stores the relationship
> between the slot annotation part and the collection of slot names it
> references.

My understanding is that there is one annotation for the object and 
separate annotations for each of the slots. See this code fragment from 
the VM:

  struct slotDesc {
          stringOop name; // the slot's name
          slotType type;  // type and properties (slotType.h)
          oop data;
          oop annotation; // the VM does not care what is here;
                          // just for the Self world

When an annotation is associated with a bunch of slots, I think each 
slot has its own pointer to it.
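
To make that concrete, here is a minimal C++ sketch of the layout as I 
understand it. It is not VM code: SlotDescSketch, MapSketch, find and 
buildExample are hypothetical names, and plain strings stand in for 
oops. It models the example object quoted below, where slot2 and slot3 
each hold a pointer to the same annotation.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the VM's slot descriptor (the real one holds oops).
struct SlotDescSketch {
    std::string name;
    std::shared_ptr<std::string> annotation;  // null when the slot has none
};

// Hypothetical stand-in for a map: one object annotation plus the slots.
struct MapSketch {
    std::shared_ptr<std::string> objectAnnotation;
    std::vector<SlotDescSketch> slots;

    // Linear lookup by slot name; returns nullptr if there is no such slot.
    const SlotDescSketch* find(const std::string& name) const {
        assert(!name.empty());
        for (const auto& s : slots) {
            if (s.name == name) return &s;
        }
        return nullptr;
    }
};

// Models (| {} = 'Comment: cmt' slot1. {'Comment: slot comment' slot2. slot3}. slot4 |)
MapSketch buildExample() {
    auto objCmt  = std::make_shared<std::string>("Comment: cmt");
    auto slotCmt = std::make_shared<std::string>("Comment: slot comment");
    MapSketch m;
    m.objectAnnotation = objCmt;
    // slot2 and slot3 point at the very same annotation; nothing is copied.
    m.slots = { {"slot1", nullptr}, {"slot2", slotCmt},
                {"slot3", slotCmt}, {"slot4", nullptr} };
    return m;
}
```

In this model the object annotation never mentions the slot names, so 
asking the object for its annotation would only ever give the object's 
own comment.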

> For example:
>
> (| {} = 'Comment: cmt' slot1 . {'Comment: slot comment' slot2. slot3}
> . slot4|)
>
> this object will have an object comment, and the slots slot2 and
> slot3 will have the same comment ('slot comment').
>
> When I send asMirror annotation to this object I get only 'Comment:
> cmt'. Where are the other comments stored ?

Try asking for one of the slots and see if you can't get its annotation.

> I know that I can access the slot annotations through slot reflection,
> but I want to know how this is stored in the object annotation (I
> only read that objects have annotations. Slots don't seem to
> directly have annotations).

I am not sure about this, but I think they do. Which makes maps even 
larger than they were before :-(

> Which separator
> is used? If you use 16r7f then how can the framework-level
> annotation groups (ModuleInfo:, Comment: ...) be distinguished from
> the flattening of nested annotations?
>
>  At least I want to know if the annotations are really strings and if
> they are really stored in the object, and not in a per slot fashion.

It seems that they are stored in a per slot fashion.

> Yes, I know about Mango, but I heard that the Self syntax is not
> well suited for parser generators (not LAR or whatever it is called).

Not really.

> I remember you said that you started, but I also remember you
> said that you don't remember why you didn't finish it, probably
> because the grammar was ambiguous
> (I think especially the keyword send and the resend with the abuse
> of dot notation. Perhaps annotations
> also bring some problems; the progref says the grammar of annotations
> is ambiguous).

I didn't have any problems with annotations. And I think I did ok with 
the resends and stuff, but I never actually ran it through Mango and 
there might be lots of bugs. The real problem was dealing with the 
lexical level since "." is hard to parse. But it is mainly things like 
numbers and strings which are missing and they shouldn't be hard to add.
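
To illustrate the "." problem: the same character can be a decimal 
point, the delimiter in a resend, or a separator between statements or 
slots. The rough C++ sketch below guesses the role from the neighbouring 
characters. It is not taken from Mango or from the VM; DotKind and 
classifyDot are hypothetical names, and radix numbers, exponents, 
strings and comments are deliberately ignored.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class DotKind { DecimalPoint, ResendDelimiter, Separator };

// Classify the '.' at index i of src by looking at the word that ends just
// before it and at the single character that follows it.
DotKind classifyDot(const std::string& src, std::size_t i) {
    assert(i < src.size() && src[i] == '.');
    auto uc = [&](std::size_t k) { return static_cast<unsigned char>(src[k]); };

    // Scan back over letters, digits and underscores to find the word start.
    std::size_t start = i;
    while (start > 0 && (std::isalnum(uc(start - 1)) || src[start - 1] == '_')) {
        --start;
    }
    const bool hasWord = start < i;
    const bool wordIsNumber = hasWord && std::isdigit(uc(start));
    const bool wordIsIdent =
        hasWord && (std::islower(uc(start)) || src[start] == '_');

    const bool hasNext = i + 1 < src.size();
    const bool digitNext = hasNext && std::isdigit(uc(i + 1));
    const bool identNext =
        hasNext && (std::islower(uc(i + 1)) || src[i + 1] == '_');

    if (wordIsNumber && digitNext) return DotKind::DecimalPoint;     // 3.14
    if (wordIsIdent && identNext) return DotKind::ResendDelimiter;   // resend.foo
    return DotKind::Separator;                                       // foo. bar
}
```

Note that only the whitespace after the dot tells "foo. bar" apart from 
"foo.bar" here, which is the kind of decision a grammar-level rule 
cannot easily make and a lexer has to.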

> Also I don't know how big the resulting parser will be and which
> parts of the framework it will use.

It wouldn't be too bad, but certainly larger than a handcrafted one. 
Note that you can add special filters in Mango (such as the typedef one in 
the C example) to handle special cases where otherwise the parser would 
become too large or even impossible to build.

> For bootstrapping, a handcrafted parser is better suited, I think.
>  Also, I don't know how much better Mango is than bison, but maybe there is
> a reason why the parser in the VM was handcrafted and not generated.

Mango is much nicer and uses a "structured BNF" which is simpler to 
read.

> If I finish that parser it would be nice to try to make the VM switch
> to it for evaluating code from its primitives.
> The hardcoded parser will certainly be better at error handling for
> now, but I think it would be great if more code migrated to Self from
> the VM. I don't want to Squeakize Self but at least the parser....

I am trying to have 100% of the VM in Self (the compilers, parser and 
so on), so you have my complete support. I don't think Squeak goes far 
enough (it is written in Slang, a subset of Squeak, and not in the full 
language), and I think they are paying a very heavy price for this.

-- Jecel