unification (was Re: mailing list)

Tue Jul 12 03:12:39 UTC 1994

   Date: Mon, 11 Jul 1994 11:24:43 -0400
   From: tlb at endor.harvard.edu

   > Two objects would be unified by unifying all their slots. There are
   > some interesting questions here, since slots are identified by name
[deleted]

   I think this is less useful than you might expect in designing
   programs. A very nice feature of objects is that for a given
   appearance of an object, it may have varying formats for the contained
   data, for implementation reasons. You still want two objects to unify
   if they represent the same thing, but have different stuff in their
   slots.

I see your point, but if the slots have different stuff in them, how
do you define representing the *same thing*. This involves semantics,
and OO languages are not usually very strong in semantics (at least
not declarative semantics). It also depends a lot on what you mean by
the "same thing" - I think there is much room for mis-understanding
here. If two objects belong to the same class, then they are certainly
the same *kind* of thing, but if they are distinguishable on the basis
of their contents then they are clearly not both representing one
thing (speaking with my declarative hat on).

In the unification scheme I described (later in the posting), I
suggested that two slots which were already initialized should unify
iff their values are *equal*. The obvious way to test for equality is
to use the appropriate equality-testing methods of the objects in the
slots. If these methods have been defined correctly, then they will
automatically take account of the different implementation formats. If
an object of class A does not know how to compare itself to an object
of class B, then either they cannot represent the same thing, or the
designers of the classes overlooked something. (I am not sure how this
should be stated for a classless language.)

This view seems to conflict with the declarative view I gave above,
and I think in this light the declarative view is too restrictive. The
conflict stems from the use of *distinguishable* which is too strong,
unless you define the semantics so that distinguishable means unequal
as measured by the objects' equality method. Doing this brings the two
points of view together, but if anything approaching declarative
semantics is to be obtained the equality tests have to be based on
rigorous axioms (i.e. the sort of axioms used for ADTs).

   > For logical and efficiency reasons it would be nice if the
   > representation of values was unique i.e. two slots could have the same
   > value if and only if they referred to the *same* object. That means it

   Then any object will only unify with itself. You don't ever need to
   check its slots. You have to be able to unify two distict graphs of
   objects based on them having the same appearance.

That is why it is efficient! Note that I said two slots could have the
same *value* iff they referred to the same object. What about a slot
that has no value (yet) ?

Of course an object should only unify with itself, otherwise it would
be "unifying" with a different object! The point is that we don't know
the exact identity of an partially-initialized object (c.f. Prolog
variable) until it becomes fully instantiated i.e. until all its slots
have been initialized. If unification is used in place of assignment,
then further instantiation has to occur via unification, so there is
no way to end up with two copies of one object.

You *do* need to check the slots, because some of them might be
uninstantiated. If I start with the Nil object (or your
UninstantiatedLogicVariable), that object will turn out to be
identical to whatever object I choose to unify it with. From the point
of view of propositional calculus, which is what most logical
languages use, the two objects are simply the same thing. Classical
logic is static and timeless: the fact that they were initially
represented somewhat differently is a matter of implementation, not of
logic.

Of course I can also have an object with several slots, some of which
are instantiated, and some of which contain
UninstantiatedLogicVariable. This object can unify only with objects
that have the same values in shared slots that are instantiated. When
such unifications occur they may bring further instantiation of slots
in either object.

   I actually built a little system to do a weak form of unification in
   Smalltalk. You just imbue your classes with a 'unifyWith:'
   method. Numbers and Arrays and Symbols are easy. I
   added an UninstantiatedLogicVariable class, which turns into a
   delegate for an object when it gets instantiated. I defined reasonable
   semantics for the application objects I cared about.

This is very interesting. I am unsure whether Smalltalk allows
instance variables to be declared without being initialized
immediately. It would be nice if all uninitialized variables behaved
as UninstantiatedLogicVariable. If Smalltalk does provide a
representation for uninitialized variables, would that be the best way
to represent UninstantiatedLogicVariable s ?

   However, I balked at the problem of ensuring termination with
   non-disjoint graphs. There are known solutions to this from the Prolog
   community, but I didn't have the gumption to implement them (in fact
   most commercial prologs have only a weak version of guaranteed
   termination).

I think you are talking about the so-called "occur check": a variable
should not be unified with a term containing that variable. It is
almost impossible to do it in linear time, so it usually doesn't get
done at all!

I am interested in the way you think about these problems in terms of
graphs. I suppose you are seeing unification in terms of graph
matching ?

   Example in a typical prolog:

   ?- A = x(B), B=x(C), C=x(A), L=x(M), M=x(L), A=L.

   produces an answer, while in fact the correct answer is arguably 'no',
   because you cannot unify a 2-cycle with a 3-cycle. In my
   implementation, unification would never have terminated.

The query above does not have a Herbrand model. There is a
non-Herbrand solution where all the variables are equal to the
infinite term x(x(x(........ , but logic programmers usually
concentrate on Herbrand models. Non-Herbrand solutions of this kind
violate Clark's Equality Theory and cause all sorts of theoretical
problems. Computers can't generally handle infinite terms anyway.

   If you just want to use this sort of unification for pattern matching,
   then it is probably perfectly adequate. The only tricky parts are
   arranging for the instantiated UninstantiatedLogicVariable to handle
   delegation (much easier with St V become: semantics), and defining
   unification for your application classes.

I imagine (mostly guessing) that the delegation would be easier in
Self too. Why not define unification by member-wise unification of
slots or instance variables ?

   In fact, I also wrote a nondeterministic execution system based on
   continuations (which aren't too hard to implement in Smalltalk - I
   don't know about Self). I then arranged to trail the binding of
   UninstantiatedLogicVariable instantiations, and got a complete little
   logic engine. I kept finding surprises in the semantics, though. Also,
   it turns out that you need cuts, which I didn't implement.

Great, I thought non-determinism could be rather hard. Obviously you
need to use something like Solutions/3 to avoid the non-determinism
leaking into Smalltalk objects.

Why do you want to use cuts, though ? Logically speaking they are a
Bad Thing, because they destroy soundness. We have a Prolog here,
called NU-Prolog, which uses delaying of insufficiently instantiated
goals to avoid almost all need for cut.

   Most of all, I found it was much harder to use than I thought. I like
   and appreciate Prolog, but the Smalltalk class library just doesn't
   really work with logic variables.

   e.g. Does a Dictionary with 3->4 unify with a dictionary with 1->8 to
   produce a dictionary with both? How about OrderedCollections, one of
   which is the prefix of the other? But then collection1 size changes!
   You get the idea.

Prolog does not have ordered collections or dictionaries, but it does
have lists. Two lists unify iff they have the same number of elements
and the elements individually unify. Shouldn't unification of other
collections work the same way ? I am not sure about dictionaries
though, because position might be less important. Maybe dictionaries
should unify iff all the elements with equal keys unify.

Regarding things changing after they have been unified, bear in mind
my comments about unique representation. Before they are unified the
objects have separate representations, but they better not have
separate representations after unification! So if the size of
collection1 changes after it has been unified with collection2, there
is only *one* object which is referred to by two slots.

I am only a Smalltalk beginner, and I am using GNU Smalltalk 1.1.1
because we don't have a proper Smalltalk here. However I would be very
interested if you could share your unification code with me.

Tim.