Hi!
I was trying to understand the Self grammar to implement a Self parser. I failed. Now I've a couple of questions regarding the ambiguities of the grammar. I'm working with the grammar shown in appendix 2.C of the Self 4.0 language reference.
Currently, I have two problems: a) how can I distinguish a simple expression enclosed in parentheses from an object definition, and b) how do I detect the end of a slot list?
I expanded the BNF rules to deal with message precedence as follows:
    expression      = keyword-message
    keyword-message = binary-message {keyword-send} | resend...
    keyword-send    = small-keyword keyword-message {cap-keyword keyword-message}
    binary-message  = unary-message {binary-send} | resend...
    binary-send     = operator unary-message
    unary-message   = primary {unary-send} | resend...
    unary-send      = identifier
    primary         = [receiver]
    receiver        = identifier | constant
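For readers following along, the layering above can be sketched as a small recursive-descent parser. This is only an illustration under stated assumptions, not the official Self parser: it is written in Python, tokens are assumed to be pre-split on whitespace, OP_CHARS is an assumed subset of the op-chars (the reference lists more), the resend alternatives and the optional (implicit) receiver are left out, and the names Parser and parse as well as the (selector, receiver, args) tuples are made up for this sketch.

```python
OP_CHARS = set("+-*/=<>&%!~")  # assumed subset; not the full list from the reference


def is_operator(t):
    return t is not None and all(c in OP_CHARS for c in t)


def is_identifier(t):
    return t is not None and t[0].islower() and not t.endswith(":")


def is_small_keyword(t):
    return t is not None and t[0].islower() and t.endswith(":")


def is_cap_keyword(t):
    return t is not None and t[0].isupper() and t.endswith(":")


class Parser:
    """Hypothetical parser; nodes are (selector, receiver, args) tuples."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def keyword_message(self):
        # keyword-message = binary-message {keyword-send}
        node = self.binary_message()
        while is_small_keyword(self.peek()):
            # keyword-send = small-keyword keyword-message
            #                {cap-keyword keyword-message}
            selector = self.next()
            args = [self.keyword_message()]
            while is_cap_keyword(self.peek()):
                selector += self.next()
                args.append(self.keyword_message())
            node = (selector, node, args)
        return node

    def binary_message(self):
        # binary-message = unary-message {binary-send}
        node = self.unary_message()
        while is_operator(self.peek()):
            op = self.next()
            node = (op, node, [self.unary_message()])
        return node

    def unary_message(self):
        # unary-message = primary {unary-send}
        node = self.primary()
        while is_identifier(self.peek()):
            node = (self.next(), node, [])
        return node

    def primary(self):
        # receiver = identifier | constant (integers only in this sketch)
        tok = self.next()
        if is_identifier(tok) or tok.isdigit():
            return tok
        raise SyntaxError("unexpected token: " + tok)


def parse(src):
    parser = Parser(src.split())
    node = parser.keyword_message()
    if parser.peek() is not None:
        raise SyntaxError("trailing input: " + parser.peek())
    return node
```

With these assumptions, parse("a foo + b bar") groups the unary sends before the binary send, and parse("d at: 1 Put: x size") collects both keyword parts into a single "at:Put:" send.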
Now, to allow expressions in parentheses, we need to add a rule:
receiver = "(" expression ")"
However, there is now no way to distinguish this from the regular-object definition, which is one alternative of constant, because "( 1 + 2 )" could also be a valid method object. Did I miss something here?
The second problem is with operators. Appendix 2.B says about operators:
    op-char  = ... '-' | '^' | '|' | ...
    operator = op-char {op-char}
which in particular means that ^ and | are valid operators and, furthermore, that for example '=-1' is ambiguous. Let's ignore _that_ problem for now. My trouble is with ^ and | as operators.
Currently, my parser will fail on (| a = 1 |) because it thinks that there's a slot "a" to which a binary expression is assigned of which the second part is missing. It doesn't detect the end of the slot list.
My parser also fails on this, which might be a legal Self expression according to the grammar:
    (| | x = ( ... ) |)
If "|" is a valid operator, it must be possible to use it in a binary-slot definition.
Here's another problem with '^': ^12 can be parsed both as "self ^ 12" and "return 12".
My solution would be to ban ^ and | as valid operators, but what does the Self spec say, and what does the official Self parser do? I need help.
Thanks in advance, bye
--
Stefan Matthias Aust // Are you ready to discover the twilight zone?
------------------------------------------------------------------------ E-group home: http://www.eGroups.com/list/self-interest Free Web-based e-mail groups by eGroups.com
> Currently, I've two problems: a) How can I distinguish a simple expression enclosed in parenthesis from an object definition - and b) how to detect the end of a slot list?
It's a bad habit to quote oneself, but I'd like to add that b) isn't an issue anymore. I overlooked paragraph 2.4.5, which explicitly removes | and ^ from the set of valid operators.
(Paragraph 2.4 also defines that "2=-1" shall be parsed as "2 =- 1" (longest sequence of matching non-whitespace characters) and not as "2 = -1".)
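To make both points concrete, here is a minimal Python sketch of a lexer that takes the longest run of op-chars as one operator and treats a lone "|" or "^" as reserved punctuation rather than an operator. It is an illustration only: OP_CHARS is an assumed subset of the op-chars (the appendix lists more), the token kind names are invented, negative number literals are not handled, and treating exactly the one-character runs "|" and "^" as reserved is my reading of paragraph 2.4.5, not a statement of what the official parser does.

```python
OP_CHARS = set("+-*/=<>&%!~^|")  # assumed subset; the appendix lists more op-chars
RESERVED = {"^", "|"}            # lone ^ and | are not treated as operators here


def tokenize(src):
    """Split src into (kind, text) pairs using longest-match scanning."""
    tokens = []
    i = 0
    while i < len(src):
        ch = src[i]
        if ch.isspace():
            i += 1
            continue
        j = i + 1
        if ch in OP_CHARS:
            # Longest run of op-chars forms a single token.
            while j < len(src) and src[j] in OP_CHARS:
                j += 1
            kind = "reserved" if src[i:j] in RESERVED else "op"
        elif ch.isdigit():
            while j < len(src) and src[j].isdigit():
                j += 1
            kind = "int"
        elif ch.isalpha() or ch == "_":
            while j < len(src) and (src[j].isalnum() or src[j] == "_"):
                j += 1
            kind = "ident"
        elif ch in "().":
            kind = "punct"
        else:
            raise ValueError("unexpected character: " + repr(ch))
        tokens.append((kind, src[i:j]))
        i = j
    return tokens
```

Under these assumptions, tokenize("2=-1") yields the operator "=-" between the two integers, and in tokenize("(| a = 1 |)") the closing "|" comes out as a reserved token, so a parser can see the end of the slot list instead of waiting for the right-hand side of a binary expression.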
Thanks for reading, bye
--
Stefan Matthias Aust // Are you ready to discover the twilight zone?
Hi Stefan,
I'm working on a parser too. So far, I've only implemented the lexer portion. Your recent email shows that you have discovered how to handle that part. That '^' and '|' are invalid operators is not obvious. As for the longest-match rule, that will work, but personally I think that any of the '=-' forms (or the '=' operator) should be illegal. C did this for the same reasons: '=- 1' vs. '= -1' is hard to read, while the standard '-=' is much easier to read.
Thanks for posting your questions (and answers). The info has saved me _a_lot_ of time.
Dru Nelson
Redwood City, California
self-interest@lists.selflanguage.org