A Bold Vision

Letter on a clay tablet sent by the high-priest Lu’enna to the king of Lagash. It tells the king of his son's death in combat. ~2400 BC

Image via Wikimedia Commons

We live in a time full of technological wonders. Humanity has built computers that can perform 30 quadrillion operations per second and devices that can send a selfie from the top of the Eiffel Tower to a village on a remote island in Papua New Guinea within a few seconds. We have self-driving cars, and we have chat bots that can fool people into believing they are talking to a human. Almost the entire knowledge of humanity is at our fingertips in any place at any time.

Yet we do not process most information much differently from how the Sumerians did 5000 years ago, after they invented written language. Of course, instead of clay tablets we use computers. We have much more storage space and are able to distribute information much faster. Computers are able to present information much more nicely than clay tablets could. Linking and searching allow us to browse information much more efficiently. But regarding textual information, today's computers are not that different from clay tablets. In essence, a computer understands a text as much as a clay tablet would, namely not at all. All the gadgets around textual information (links, pictures, comments, fancy designs, like buttons, tag clouds and even automatic translations) do not change the fact that our devices have no idea what the content means.

Vision

What’s the fuss about? Why is it important whether a computer understands content or not? After all, all that counts is that we, the human users, understand it, right? Well, not exactly. If computers understood information, they would be much better at helping us to understand and process that information. If, for example, a computer understood a news story, it could collect background information about the persons and entities involved. It could try to put that story into a larger context and automatically present related information to us. It could help us to interpret and analyze the story, even considering different viewpoints.
If it knows who wrote and published the story, it might also tell us how trustworthy it is.

If a computer could understand general information, it could also understand us, and what we want it to do, much better. Computers could behave more like human assistants, becoming better and more powerful tools. The focus would shift from application to task: we would select the information and define what needs to be done with it, and the computer would figure out how to do it.

Once we have computers that understand us, there is no reason why they should not be able to understand each other in the same way. They could autonomously process tasks that involve complex chains of communication. We’d have machines that execute daily tasks automatically, communicating autonomously with other machines. In the words of Sir Tim Berners-Lee:

“[…] our daily lives will be handled by machines talking to machines, leaving humans to provide the inspiration and intuition.”

Something deeper, more profound would happen as well. As it turns out, the same technology that makes computers understand information fundamentally transforms the information itself. It strips information of the protocol and structure we humans require; information becomes pure. It starts to behave differently from what we are used to. Pieces of information merge almost magically and no longer require any order to make sense. They can be stored randomly in universal repositories without the need for any structure. Different pieces of information from different sources extend each other, automatically creating a bigger picture. The more information we join, the bigger the picture gets. Every single bit of information is fundamentally connected to every other bit of information, in a similar way as every little piece that makes up our physical universe is connected with every other one.
It will be the transformation of the internet from a place where information is scattered, waiting to be collected by anyone who can find it, into a place where information just is, automatically available to anyone on request. It will be the birth of the information universe.

Technology

With today’s technology, a computer needs a lot of computational resources and a vast translational knowledge base to understand natural language text even rudimentarily. This is why digital assistants run in the cloud and not on users’ devices. Text is a very inefficient data source for computers. For effective information processing, computers need information to be expressed in a way that is much closer to the hardware than plain text is.

We need a semantic, structureless data format that is as close to the hardware as possible. On top of that data format, we need to build a universal computer language that is as flexible and expressive as natural language. This language acts as an intermediary between computers and humans. Computers “think” and speak it natively; humans speak it with the help of a translator. This is the main insight Project Samarai is based on.

Universal data

Project Samarai proposes a data format called Universal Data. It is closely related to RDF, the technology that drives the semantic web, but is more hardware-oriented, a feature that we hope makes it future-proof. Universal data is fully semantic, content-neutral and supports structure-free information storage; it meets all the requirements for the underlying technology of a universal computer language.

One of the nice things about working with universal data is that we no longer need to worry about where to save information. Well, at least much less. Universal repositories, the databases for universal data, can hold any type of information without any obvious order; there is no need for any structure. Structure and order are created when needed, depending on our task and requirements.
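To make the idea of a structure-free repository a little more concrete, here is a minimal sketch in Python. It does not show the Universal Data format itself, which is not specified in this text. It assumes RDF-style (subject, predicate, object) triples as a stand-in, and all identifiers (`repo_a`, `merge`, `describe`, the statement names) are hypothetical and chosen only for illustration. The sketch shows two things: merging repositories is a plain set union, and structure is built only when a task asks for it.

```python
from collections import defaultdict

# Assumption: a statement is an RDF-style (subject, predicate, object) triple.
# A repository is an unordered set of such statements; no schema, no order.
repo_a = {
    ("lu_enna", "role", "high-priest"),
    ("lu_enna", "wrote", "tablet_1"),
}
repo_b = {
    ("tablet_1", "addressed_to", "king_of_lagash"),
    ("lu_enna", "role", "high-priest"),  # same statement as in repo_a
}


def merge(*repos):
    """Merge repositories by set union; identical statements collapse."""
    merged = set()
    for repo in repos:
        merged |= repo
    return merged


def describe(repo, subject):
    """Create structure on demand: group one subject's statements by predicate."""
    view = defaultdict(set)
    for s, p, o in repo:
        if s == subject:
            view[p].add(o)
    return dict(view)


merged = merge(repo_a, repo_b)
print(len(merged))                  # 3 distinct statements
print(describe(merged, "lu_enna"))  # a per-subject view, built only now
```

Because a set has no order, merging `repo_b` into `repo_a` gives the same result as merging `repo_a` into `repo_b`, which is the property the text describes as information that makes sense with or without order.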
We can start to organize our data in ways that are much more natural. For example, we can have different repositories for information that we want to keep long term and for information that we only need for a certain time, a bit like our brain has a long-term and a short-term memory.

We believe that universal data has the potential to change our fundamental conception of information itself. Universal data expressions are not words that need to be read in a defined sequence in order to make sense; they are a chaotic collection of information bits that always make sense, with or without order. Unlike text, universal data expressions are never stand-alone; they are automatically and inseparably connected with any related information that is available in our information universe. Also unlike text, universal data expressions merge automatically and universally. If you cut every single word out of two books and threw all the snippets into a big bowl, you’d end up with a meaningless mess. If you did the same with two books written in a language that is based on universal data, you would end up with a merger of both books: no information would be lost, and related pieces of information would automatically extend each other. Universal data exists outside of structure and order; it just IS. To us, this is mind-boggling.

Havel

Havel is our proposal for a universal computer language. It is a so-called Semantic Graph Language, a computer language that is based on universal data. Havel is unlike any other language. It is inspired by universal human logic and therefore allows expressions that resemble human thoughts. Information can be deep, complex, multi-dimensional, conditional and context-dependent. Havel can express simple records but also complex situations consisting of thousands of involved entities, events, relations and interactions. Havel supports self-describing information; a Havel expression can, like a natural language text, explain itself.
Havel is independent of any natural language, and Havel expressions are self-translating into natural language. Multi-level encryption, encryption key management and data access control are integral parts. Havel also allows for automatic assessment of the trustworthiness of information.

Havel is primarily a universal information modelling language. This means that it is a universal tool that allows us to express arbitrary information, just like natural language, with the difference that computers understand Havel.

Of course, the level to which a computer understands Havel expressions is relative. After all, a computer is not conscious; it does not have a mind or feelings. It does not consciously know, for example, what an address really is. But it does know what it can do with an address record when it recognizes one, and it can collect and find such records when required. Today, when we say a computer understands, we mean that it can recognize the type of information it is dealing with and knows how to interpret it and what can be done with it. It is conceivable, though, that future computer systems will reach a complexity that resembles consciousness. If that happens, we suspect that a semantic graph language or a very similar technology will be driving it.

With Havel, we are introducing the somewhat novel concept of interpretable information. In its raw state, even simple records can be relatively complex in Havel. The interpretation process reduces this complexity to a requested level and consolidates semantics; it determines the contextual meaning of information. This approach not only makes almost unlimited complexity manageable, it also allows for very subtle semantics in user context and for multi-semantic expressions that can change their meaning depending on viewpoint.

No human will ever speak Havel directly. It expresses information using long lists of huge numbers, a format that is not intuitive to humans. In order to “speak” Havel we need a translator.
At the beginning, this translator will be a graphical user interface. Creating Havel expressions will be similar to drawing mind maps, using applications called Modelling Environments. These applications will also help us to manage and visualize information. With time, we will have translators that understand natural language, spoken or written, and we will be able to communicate with our devices naturally, at least whenever it is convenient or more effective.

Because Havel is universal, it can also express processing instructions. It is therefore also a programming language, a feature whose importance cannot be overstated. At the beginning, anyone who can “speak” Havel will be able to create simple functional modules. A few years later, describing a functionality will be equal to programming it. This needs to be considered especially in connection with protocol-free communication, the ability of computers to communicate without a fixed protocol, which is an effect of semantic graph languages. The possibilities for communities and businesses that can set up custom collaboration platforms are limitless. Anyone with an idea of how to collaborate can not only set up her own platform but also adapt it whenever requirements change.

Havel will evolve over time. At the beginning, it will be an innovative replacement for existing technologies. Communities that share information and functionality will form, and commercial providers will sell high-quality content and advanced functionality. With time, Havel will be extended, and new applications will be invented that take advantage of Havel’s unique features. New forms of communication and collaboration will drive communities and businesses, and private citizens will find new ways to manage and share personal data. All this will be achieved with technologies that either already exist today or have already been conceived.

Future

Will semantic graph languages like Havel make computers more intelligent? In the short term, certainly not.
Computers will seem more intelligent to us because they will be better, more cognitive tools. But fundamentally, on their own, semantic graph languages do not create real intelligence. They are just a smarter way to store and process digital information.

In the medium term, we might see new artificial intelligence technologies that build on Havel. Joining semantic graph languages with artificial neural networks is expected to produce surprising results; we will certainly see a new generation of Havel interpreters, and maybe we will also get closer to artificial consciousness.

Havel itself might evolve into something very different. Once a certain threshold of implemented features and available functional modules is reached, we might see higher-level languages emerge. For them, Havel will just be the underlying framework; they will have independent concepts, grammars and purposes, all based on Havel but not recognizable as such from the outside. Such languages could become more powerful and expressive than our own natural languages. They could become the basis for advanced information technologies that are difficult to even imagine today.

Project Samarai and the Community

Project Samarai (Verein Samarai) is a non-profit association under Swiss law, based in Zurich. We believe that semantic graph languages and Havel have enormous potential and that they can have a lasting impact on education, science, business and communities. We also believe that such a technology, because of its potential impact on society, should be free and open, and that it should be developed and owned by the global community for the global community, as independent as possible from corporate and governmental interests. This is especially important considering that it could otherwise be pushed in a direction that does not necessarily lie in the best interest of the public.
Havel and all related projects are open source; the source code will be released under an appropriate license and development will be opened to the community in due time.

Project Samarai wants to introduce several community-driven projects once the technology is ready. WIS is the working name for a community-driven content platform that will provide ontologies, dictionaries and general knowledge; it is intended to be the official, independent and neutral basis for all other Havel content. Samarai Share will be a platform for sharing functionality and content. While its focus will be on free content, paid offers will be possible, and sales profits could be used to run the platform and to help finance Havel development. Samarai Two is a community-driven modelling environment with a focus on organization and productivity. It is intended to be a free alternative to commercial ERP systems for communities and small companies.
