[Self-interest] Proposal for a modernized Self dialect

Mon Dec 13 11:28:58 UTC 2021

Hello everyone!

I am a student from Germany who discovered Self a few years ago, and
since then I've been fascinated by the language.

I just read the recent entries about the future of Self on this mailing
list ([1][2]), and, if I may, would like to add my two cents on this
topic. This is a long email with a lot of daydreaming, so I hope that
I'm not going a bit too far with my ideas. Still, I'd love to hear your
opinion on this.

When I first found out about Self back in 2017 I was a teenager,
rummaging through old computer science videos on YouTube. I already had
some programming experience back then, but the "popular" languages like
C++, Java and Python were starting to bore me, so I decided to have a
look at some older languages.

Just by chance, the algorithm led me to the videos now featured on
Self's website, and I was immediately amazed by the idea behind it. The
UI and inheritance system were something I had never seen before, so
out of curiosity I cloned the repository with the Self VM and decided to
give it a go.

However, after tinkering with it for a while, I found myself asking why
Self never really took off - it is a great system with a lot of
potential after all - and whether there are ways to revive it. To me, it
seems like Self solved, decades ago, a problem that we're facing more
and more these days: the complexity of modern software development.

Just to give you an example: If you wanted to learn how to create an
Android app, you'd have to learn at least one programming language
(Kotlin, Java, ...), install the huge and slow IDE, familiarize yourself
with the layouting, compiling, packaging and debugging processes, set up
the emulator, install libraries, etc. - in short: Any aspiring
programmer would most likely be put off by such an experience. The same
applies to programs like the Unity Engine, web development frameworks,
and a lot more.

In my experience with the Morphic UI and other parts of the system,
Self provides a completely different approach to development - instead
of sitting in front of "dead" code (that has to be rebuilt and compiled
before it even gets a chance to run) Self invites you into a world of
"live" objects that you can watch and change in real time. That change
in perspective was amazing, and after a while I really started to miss
it in other languages.

So I too was wondering how Self could be improved and maybe even gain
enough traction to become a widely used language. It took me a while to
come up with a few ideas that I personally think are an improvement to
the language, and I feel like it's time to share them with you.

First of all, I'd like to address the problems that I see with the way
we write code these days:

In my experience, a large share of the code we write today converts
data structures into strings and back. Be it JSON, HTTP, HTML, XML,
binary files or some other format, many objects contain code that
converts them into byte sequences (and back) in order to read(...) or
write(...) them to permanent storage, a network connection, or somewhere
else. Not only does this force most software to import a huge number of
external libraries (which, in turn, suffer from the same problem) to
convert the data, but turning data structures into strings also causes
trouble when a process wants to access an object on another computer:
how does the garbage collector of a language like Python or Java know
when an object should be freed if the reference is only an agreement
between two ends of a byte-based connection, as opposed to a real
pointer? In one way or another, the two processes have to communicate,
often by importing a library that does this job. This not only increases
the program's size and complexity, it may also introduce a whole new
class of vulnerabilities and weaknesses to the code. And if a process
decides to connect to objects on two other computers on the net,
communication becomes an even greater issue for the program.

In contrast, Self has always had the idea of live objects "floating"
around, where assigning a new value to a slot should be as simple as
dragging an arrow from one object to another. If this philosophy could
be extended to slots referencing objects on the network (or on the
disk), and if synchronization and data transfer were no longer the
programmer's responsibility, then the complexity of most programs could
decrease by orders of magnitude, making them smaller, faster and more
reliable than their current counterparts.

I'd therefore suggest the implementation of a mechanism that allowed
Self objects to be stored, referenced and accessed from everywhere
(without destroying the illusion of all objects being "live" and in the
same world): a remote computer, the hard disk, a database, even a
microcontroller could provide Self objects to other Self VMs connected
to it.
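
To make the idea of location-transparent slots a bit more concrete,
here is a minimal sketch in Python rather than Self. It is only an
illustration under stated assumptions: the names ObjectStore and
RemoteRef are hypothetical, the "remote" side is just an in-memory
dictionary standing in for a disk or another VM, and real concerns
such as distributed garbage collection, latency and failures are
ignored entirely.

```python
class ObjectStore:
    """Hypothetical stand-in for another VM, a disk, or a database.

    It maps object ids to dictionaries of slots.
    """

    def __init__(self):
        self._objects = {}
        self._next_id = 0

    def put(self, slots):
        """Store a new object and hand back a reference to it."""
        oid = self._next_id
        self._next_id += 1
        self._objects[oid] = dict(slots)
        return RemoteRef(self, oid)

    def read_slot(self, oid, name):
        return self._objects[oid][name]

    def write_slot(self, oid, name, value):
        self._objects[oid][name] = value


class RemoteRef:
    """Looks like a local object; slot accesses are forwarded to the store."""

    def __init__(self, store, oid):
        # Bypass our own __setattr__ for the two bookkeeping fields.
        object.__setattr__(self, "_store", store)
        object.__setattr__(self, "_oid", oid)

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for forwarded slots.
        try:
            return self._store.read_slot(self._oid, name)
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self._store.write_slot(self._oid, name, value)


disk = ObjectStore()
point = disk.put({"x": 3, "y": 4})
alias = RemoteRef(disk, point._oid)  # a second reference to the same object

point.x = 10                         # reads like an ordinary slot assignment
print(alias.x, alias.y)              # the change is visible through the alias
```

The calling code never serializes anything itself; whether a slot lives
in the same process or elsewhere is decided by the reference, not by
the programmer.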

Of course, implementing such a mechanism inside of a Self VM could
certainly turn out to be a very complicated task, but I am vaguely aware
that a few networking experiments using Self have been conducted in the
past, and I think that a network-spanning Self world (maybe even an
internet-spanning Self world) might be an interesting project and a fine
addition to the system.

I have also noticed the "ui3D" directory in the Self repository. Having
a three-dimensional world where objects can be inspected and
reprogrammed would be a feature that is not very widespread yet -
although projects like JanusVR [3] have been trying to implement it for
a while now. Combined with VR headsets and the ability to reference
objects via the network, this UI could open doors to new worlds.

Another topic that I think is important is security. Currently, all Self
objects are just loosely referencing each other - which makes Self
vulnerable to various kinds of attacks. One thing that immediately comes
to my mind is a problem that JavaScript suffers from: Prototype
Pollution. So a better security mechanism is probably needed. I came
across [4], which is a paper elaborating on adding the concept of a
Security Kernel to a language like Scheme. I really liked the idea and
thought about a few ways to add similar features to Self. However, the
paper leaves open an interesting point: Under section 4.3.5 it lists the
problem that objects which are stored inside of files cannot be guarded
by the security mechanism. This brings me back to my earlier statement
that strings are not a good way to represent objects - neither for
intermediate nor for long-term storage - and that the Self VM should
hide, as well as it can, the fact that some objects are "frozen" in a
file, in order to provide a certain degree of security (and ease of
access, of course).

Many people that I have introduced to Self have been put off by its lack
of types. "But how can I be sure that my code won't crash randomly due
to a 'message not understood' error?" is a question that I often get
when showing Self to others. So I think that Self needs two things:
First of all, it needs a simple, reliable, extensible and - most
importantly - optional way to define and introduce types and interfaces.
The TypeScript language has taken a few steps in this direction, and I'm
sure that a similar mechanism could not only provide a "handrail for
Self programmers" that doesn’t interfere with its prototype-based
inheritance, but it might also allow the compiler to optimize the
generated code a bit better. And secondly, Self's object environment and
UI should be geared a lot more towards what modern IDEs do: Code
suggestions, interactive highlighting of errors, automated refactoring,
and more. Some of these concepts already exist in the current
implementation, but only in a reactive way, i.e. the user has to
manually trigger certain actions and does not get a lot of automatic
suggestions by the environment.
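
As an illustration of what "optional" types and interfaces can look
like in a dynamic language, here is a small Python sketch using the
standard typing.Protocol. This is not a proposal for Self syntax; the
names HasArea, Square, Untyped and total_area are made up for the
example. The point is only that an interface can be checked
structurally ("does the object understand this message?") without
touching the object's inheritance.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasArea(Protocol):
    """An interface: anything that understands the message `area`."""

    def area(self) -> float: ...


class Square:
    # Note: Square does not inherit from HasArea or declare any types.
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Untyped:
    pass


def total_area(shapes: list[HasArea]) -> float:
    # The annotation is optional; the code runs the same without it,
    # but a static checker or an IDE can use it for suggestions.
    return sum(s.area() for s in shapes)


print(isinstance(Square(2), HasArea))   # structural check at run time
print(isinstance(Untyped(), HasArea))
print(total_area([Square(2), Square(3)]))
```

A run-time check like this only looks at whether the method exists,
not at its signature; stricter guarantees would need a static checker.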

And last but not least, I think that the inheritance and cloning system
needs an update. After looking at the implementation of Self's "copy"
method for morphs I am convinced that writing a copy algorithm for every
data structure in the system does not scale well and is a repetitive and
error-prone task that could (and should) be done by the VM itself. For
that, the VM needs information about which object corresponds to which
data structure, e.g. the "link" instances of a linked list should belong
to the list head that created them, such that copying the list head
traverses all referenced objects, recursively copying the ones that were
explicitly associated with the list head. If applied to objects and
their parents, this algorithm could even work as a replacement for
Self's current copy-down solution (which I consider to be extremely
cumbersome to use), and the fact that subparts of a data structure
belong to its main object enables the implementation of certain security
features.
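
The ownership-based copy described above can be sketched roughly as
follows. This is a Python approximation under simplifying assumptions,
not Self's or Eco's actual mechanism: the Obj class, its "owned" set of
slot names and the function copy_with_ownership are all hypothetical.
Objects reached through owned slots are copied recursively, everything
else is shared, and references that point back into the copied
structure are redirected to the copies.

```python
class Obj:
    """A bag of slots, plus the names of the slots whose targets it owns."""

    def __init__(self, slots, owned=()):
        self.slots = dict(slots)
        self.owned = set(owned)


def copy_with_ownership(root):
    """Copy `root` and everything it (transitively) owns; share the rest."""
    memo = {}  # id(original) -> clone

    def clone_owned(obj):
        # Start with a shallow copy, then descend into owned slots only.
        clone = Obj(obj.slots, obj.owned)
        memo[id(obj)] = clone
        for name in obj.owned:
            child = obj.slots.get(name)
            if isinstance(child, Obj) and id(child) not in memo:
                clone_owned(child)
        return clone

    clone_owned(root)

    # Second pass: any slot that still points at a copied original is
    # redirected to the corresponding clone (owned or not).
    for clone in list(memo.values()):
        for name, value in clone.slots.items():
            if id(value) in memo:
                clone.slots[name] = memo[id(value)]
    return memo[id(root)]


# A linked list: the head owns its links, but not the payload they carry.
payload = Obj({"text": "hello"})
second = Obj({"value": payload, "next": None}, owned={"next"})
first = Obj({"value": payload, "next": second}, owned={"next"})
head = Obj({"first": first, "last": second}, owned={"first"})

dup = copy_with_ownership(head)
print(dup.slots["first"] is not first)                         # links copied
print(dup.slots["last"] is dup.slots["first"].slots["next"])   # structure kept
print(dup.slots["first"].slots["value"] is payload)            # payload shared
```

With ownership recorded once, per slot, no data structure needs its own
hand-written copy method.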

Especially if the Self world grows larger and many implementations of
the same concept appear (e.g. points using polar vs. points using
cartesian representations), there needs to be a feature for easily
switching between different representations of the same concept in order
to provide a sense of generality. There are many cases where the original
programmer didn't consider different representations of related
concepts, and adding a conversion method later on doesn’t seem to be an
elegant solution if there happen to be dozens of differing
implementations of the same concept. Otherwise a point using the slot
names "x1" and "x2" will crash the code that’s using "x" and "y" to
address its coordinates - and vice versa. Temporarily converting the
interface of a "Point" to a "Vector2D" or a "PolarPoint" should be
trivial, and converting an I/O stream for bytes into a UTF-8 character
stream for output (by fetching the output part and wrapping it in a
UTF8Writer object) should also be something that the language can do
almost automatically without the need of an "asUTF8Writer" method. So a
way of disguising or switching out the interface of an object is a
feature that is really needed.
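
A very small sketch of "disguising" an object's interface, again in
Python and with hypothetical names (View, PointX1X2, length_squared):
a view presents an existing object under different slot names without
copying it, so code written against "x" and "y" can work with a point
that uses "x1" and "x2". It only handles pure renaming; a polar view
would additionally need computed slots, which this sketch leaves out.

```python
class View:
    """Presents `target` under different slot names without copying it."""

    def __init__(self, target, mapping):
        # mapping: name seen through the view -> slot name on the target
        object.__setattr__(self, "_target", target)
        object.__setattr__(self, "_mapping", dict(mapping))

    def __getattr__(self, name):
        if name not in self._mapping:
            raise AttributeError(name)
        return getattr(self._target, self._mapping[name])

    def __setattr__(self, name, value):
        if name not in self._mapping:
            raise AttributeError(name)
        setattr(self._target, self._mapping[name], value)


class PointX1X2:
    def __init__(self, x1, x2):
        self.x1 = x1
        self.x2 = x2


def length_squared(p):
    # Written against the "x"/"y" interface only.
    return p.x * p.x + p.y * p.y


p = PointX1X2(3, 4)
as_xy = View(p, {"x": "x1", "y": "x2"})

print(length_squared(as_xy))  # works although p has no "x" or "y" slot
as_xy.x = 6                   # writes go through to the original object
print(p.x1)
```

The view is temporary and cheap to create, so neither of the two point
implementations needs a dedicated conversion method for the other.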

I understand that all of the ideas I mentioned above are quite ambitious
and complex, and implementing such a system might be a huge undertaking.

However, I have spent a lot of my spare time working on a dialect of
Self (called Eco) that has the goal of accomplishing all of these
concepts, and even though I had to change a few syntactic and semantic
features of Self and many optimizations are still missing from my
current VM, the results already look fairly promising. For now, I'd just
like to call it a personal project of mine, but I have the hope that in
the future it could serve as a new and practical kind of environment for
programmers and users alike.

Hopefully I have piqued your interest with this email, and I'm looking
forward to your responses. I am especially interested in looking into
the technical details of such a system with you, and I'd like to hear
your opinion on the improvements I'm suggesting.

Also, this is my first time sending a message to a mailing list, so
please forgive me if I messed something up ;-)

All the best,
Eric Nijakowski

  [1]:
http://lists.selflanguage.org/pipermail/self-interest/2021-August/004843.html
  [2]:
http://lists.selflanguage.org/pipermail/self-interest/2021-August/004860.html
  [3]: https://www.janusvr.com/
  [4]: https://dspace.mit.edu/bitstream/handle/1721.1/5944/AIM-1564.pdf