programming and knowledge representation

Tue Jan 16 15:12:18 UTC 2001

Dear members,

the group has lately addressed the unification of languages (vs. the
"Babel effect"), metaprogramming and other philosophical concepts. So I
felt that it might be a good place to discuss some thoughts about
programming and, more generally, about knowledge representation.

Please forgive me if I use whole phrases that were already written
here (or elsewhere) without quoting them properly - in the end,
probably nothing I have to say is really new, except maybe/hopefully
the overall picture. But I can say that I definitely got inspired and
encouraged by the Self project, the Merlin project (Jecel's, not
Sun's), the Tunes project and of course by everything I have read from
those of you in this group (and a couple of others) who are devoted to
thinking about programming.

A first helpful step in dealing with the "Babel effect" problem could
be to change the view of what programming is. I see programming
(perhaps a bit more generally than usual) as gaining and writing down
knowledge. You could associate "gaining knowledge" with
"designing/architecting programs".

I no longer see much use in partitioning knowledge into data and
programs/algorithms. An example: any program code that contains
constants contains "plain" data. Any database that contains some kind
of rules (I am thinking of string fields whose values contain code)
contains programs.
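
A small sketch of this point, in Python. Everything in it is made up
for illustration (the constant, the `rows` table, the `apply_rule`
helper); it only shows data sitting inside code and code sitting
inside data.

```python
# Code that carries data: the constant 60 is plain data inside a program.
def seconds(minutes):
    return minutes * 60


# Data that carries code: a hypothetical "database" row whose string
# field holds a rule written in some expression language (here Python).
rows = [
    {"name": "bulk_discount",
     "rule": "price * 9 // 10 if quantity >= 10 else price"},
]


def apply_rule(row, **variables):
    """Evaluate the rule stored in the row against the given variables.

    eval() is used only to make the point; builtins are hidden, but this
    is not a safe way to run untrusted rules.
    """
    return eval(row["rule"], {"__builtins__": {}}, dict(variables))
```

Calling `apply_rule(rows[0], price=1000, quantity=10)` runs the
"program" that was stored as a field value.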

So what am I aiming at?

-
Knowledge can be represented as a directed graph whose nodes can be
seen as extremely fine-grained objects - possibly "containing" only
one value of a POD type like String or Number. Maybe there also has to
be a Bitarray datatype, but 'only' for efficiency in storing
"multimedia data" streams.
All the other commonly named properties of objects, like attributes,
methods, inheritance parents (each also representable by slots) and
types, are realized by named edges. An edge can be named because edges
can themselves be connected - to other edges or to objects. The names
are just objects (string-value nodes).
This special graph (I would like to call it a hypergraph, but that
term is already defined differently - so I will just refer to it as
the graph) can be seen as an "amalgamated structure" of semantic
networks, object networks and hypermedia with respect to what it can
represent. I see it as a kind of "mother of representations" or *the*
model (as in model-view).
One motivation for this representation structure is that I have found
the relations between objects to be at least as important as the
objects themselves.
For example, deleting an object is almost always really the unlinking
of two objects (which a historizing mechanism can and should take care
of).
-
The basic language concepts (message passing, delegation to prototypes
and classes, ...) can be plugged into (and perhaps sometimes even out
of) the environment (that means they get assigned to the
"environment" object).
I am not sure whether this is partly even state of the art already in
Smalltalk/Self/other environments.
-
To represent code of the conventional textual programming languages,
each language concept can be assigned an arbitrary syntax (carefully,
of course, without violating inherent constraints like "no two
concepts may be assigned the same syntax within the same environment")
or even a set of alternative syntaxes. For example, I would prefer to
write down mathematical expressions in code in the common mathematical
notation (probably enhanced to be computer-interpretable).
As far as I know there are functional languages that have this feature
of syntax adaptation. I am not sure at all (because I have never
designed a language) whether this "separating of concepts" works at
all. I just have a good feeling that it should anyway ;-)
-
Each language concept is assigned to (and gets processed by) an
interpreting machine (part of a compiler, interpreter or VM) - this
machine works solely with the underlying graph (seeing it as the
traditional AST structure).
-
All the other common representations (e.g. textual program code, but
also "more visual" representations like UML, in fact everything that
is known as hypermedia) can be generated (temporarily) as *views* on
the graph by transforming adapters.
-
A persistence mechanism (mapping between different "layers" of
storage) deals only with graphs (or rather queried portions of the
"whole" graph). That means that the underlying graph/OO database
(virtual memory management? - I get confused about what the different
"parts" of the concepts behind these terms really are) just works with
graphs.
-
Historizing information is embedded in the graph, and processing and
using it is a crucial part of the KlDE (knowledge development
environment), as it seems very important to be able to track the
evolution of knowledge (especially the evolution of the linking).
-
Like historizing, personalization (think of ownership, authorship and
privacy) is a basic concept that has to be supported by the system.
-
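
To make the graph idea from the list above a bit more concrete, here
is a minimal Python sketch. All names in it (`Node`, `Edge`, `Graph`,
`link`, `unlink`, `slots`) are my own assumptions, not part of any
existing system. Nodes hold a single value, an edge is named by
attaching a string-value node to the edge itself, and "deleting" only
marks an edge as unlinked so that the history stays in the graph.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(eq=False)
class Node:
    """Fine-grained object: holds at most one plain value."""
    value: Union[str, int, float, bytes, None] = None


@dataclass(eq=False)
class Edge:
    """Directed link; both ends may be a node or another edge."""
    source: Union[Node, "Edge"]
    target: Union[Node, "Edge"]
    created: int = 0
    removed: Optional[int] = None  # None means "still linked"


class Graph:
    def __init__(self):
        self.edges = []
        self.clock = 0  # logical time, used for historizing

    def _tick(self):
        self.clock += 1
        return self.clock

    def link(self, source, target, name=None):
        edge = Edge(source, target, created=self._tick())
        self.edges.append(edge)
        if name is not None:
            # Naming an edge = connecting the edge to a string-value node.
            self.edges.append(Edge(edge, Node(name), created=edge.created))
        return edge

    def unlink(self, edge):
        # Nothing is destroyed; the edge just gets an end time.
        edge.removed = self._tick()

    def name_of(self, edge):
        for e in self.edges:
            if e.source is edge and isinstance(e.target, Node):
                return e.target.value
        return None

    def slots(self, node, at=None):
        """Named edges leaving `node` as {name: value}, as of time `at`."""
        at = self.clock if at is None else at
        result = {}
        for e in self.edges:
            alive = e.created <= at and (e.removed is None or e.removed > at)
            if e.source is node and alive and isinstance(e.target, Node):
                result[self.name_of(e)] = e.target.value
        return result


g = Graph()
point = Node()
edge_x = g.link(point, Node(3), "x")
edge_y = g.link(point, Node(4), "y")
g.unlink(edge_y)  # "delete y" is really an unlinking
```

After this, `g.slots(point)` no longer shows "y", while
`g.slots(point, at=2)` still does, because the unlinking is only
recorded as part of the history.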

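The separation of concepts, syntaxes and views described in the list
above can also be sketched in a few lines of Python. This is only an
illustration under my own assumptions: a nested tuple stands in for a
portion of the graph, and the `CONCEPTS` table with its two syntaxes
is invented for the example. The interpreting function works on the
structure only, while the two adapters generate textual views of it.

```python
import operator

# Hypothetical concept table: semantics plus two alternative syntaxes.
CONCEPTS = {
    "add": {"apply": operator.add, "infix": "+", "prefix": "plus"},
    "mul": {"apply": operator.mul, "infix": "*", "prefix": "times"},
}

# The "model": (concept, left, right) or a plain number.
expr = ("add", 1, ("mul", 2, 3))


def evaluate(node):
    """Interpreting machine: looks at the structure, never at text."""
    if not isinstance(node, tuple):
        return node
    concept, left, right = node
    return CONCEPTS[concept]["apply"](evaluate(left), evaluate(right))


def infix_view(node):
    """Adapter that renders the structure in mathematical infix style."""
    if not isinstance(node, tuple):
        return str(node)
    concept, left, right = node
    symbol = CONCEPTS[concept]["infix"]
    return f"({infix_view(left)} {symbol} {infix_view(right)})"


def prefix_view(node):
    """Adapter that renders the same structure in a Lisp-like style."""
    if not isinstance(node, tuple):
        return str(node)
    concept, left, right = node
    word = CONCEPTS[concept]["prefix"]
    return f"({word} {prefix_view(left)} {prefix_view(right)})"
```

Here `infix_view(expr)` gives "(1 + (2 * 3))" and `prefix_view(expr)`
gives "(plus 1 (times 2 3))", while `evaluate(expr)` yields 7 in both
cases, since the syntax is only a view.
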
In the end, this system could be the foundation of something like
"open knowledge" (in contrast to "open software") development - aiming
at breaking down the border walls between heterogeneous knowledge
representation systems (possibly just yet another holy grail dream
:-/ ...).

I surely hope these ideas are not too far away from your interests and
would love to discuss them with you.

Regards,
Thilo.