A New Kind of Computing
Bernard A. Hodson
5/11/2005 12:00 AM EDT
The industry today is plagued by a variety of problems, including insecure operating systems, viruses, worms, spam, theft of identity, intrusion into personal systems, wireless data interception, satellite data interception, hackers, and so on. The costs to industry from spam alone are high, and viruses have played havoc with business activity, even putting some companies out of business. Security threats to individuals, companies, and countries are increasing. It is high time that we addressed potential solutions and acted upon those that offer the most promise.
This article describes one possible solution and outlines a programming paradigm that could be developed as a standard. The solution has already been used successfully on several levels of computers, from mainframes and microcomputers to 8-bit RISC chips for smart cards and embedded systems. The proposed paradigm is a standard that can apply to all levels of programming activity, with considerable flexibility for customization. It has the potential to eliminate most of the problems mentioned earlier and is small enough that the entire system could be encrypted for each computer and server. It would simplify or eliminate all operating systems.
The paradigm utilizes an expandable language, which can be converted to a byte string on any computer. The byte string is completely independent of the target computer for the application. Using the rules established for the paradigm, a simple compiler can process any application written in terms of the expandable language. In fact, the rules are so simple that we can develop an application without the need for a compiler, generating a byte string for acceptance by the run system. The end of the article explains how readers can obtain a copy of the simple compiler and a basic expandable language, to try that phase out.
The run system processes the generated string of byte codes. To do this, it uses a double numeric system, which uniquely identifies every element needed within an application. This technique makes the virtual processor (VP) run system very small: from three or four thousand bytes for 8-bit RISC chips running typical smart card and embedded systems applications, to seven or eight thousand bytes for a microcomputer with simple graphics, to somewhat more for the processing of video images and other more demanding applications. Because the numeric coding system uniquely identifies every activity that the system will carry out, applications run fast. Because the numeric codes are unique, new capability can be added without affecting what has already been developed.
The compiler for the paradigm described here is itself very small and can be placed, if desired, at the front of the run system, taking just a few hundred more bytes (as the compiler and run system share routines). In that situation, the language elements, rather than the byte codes, are presented to the system, which generates the byte codes first and then runs them. This mode is particularly useful for safety-critical applications (where the compiler and the application have to be tested whenever a change to either occurs). For the remainder of this article, let us assume that the compile is complete and the system has been presented with a string of byte codes. Figure 1 shows the compile operation.
All elements of the run system are static. The only variable part of the paradigm is the generated byte stream, which will vary from application to application.
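The split between a static run system and a variable byte stream can be illustrated with a minimal Python sketch. This is not the run system described in this article: the module codes (1, 2, and 255), the handler names, and the one-byte operand layout are all assumptions made purely for illustration.

```python
# Hypothetical sketch: a static table of modules driven by a variable byte stream.
# The codes and operand layout are invented here, not taken from the paradigm.

def mod_push(vp, operands):
    """Push the single operand byte onto the VP's value stack."""
    vp["stack"].append(operands[0])

def mod_add(vp, operands):
    """Pop two values and push their sum."""
    b = vp["stack"].pop()
    a = vp["stack"].pop()
    vp["stack"].append(a + b)

# Static part: code -> (handler, number of operand bytes).
# It is the same for every application.
MODULES = {
    1: (mod_push, 1),
    2: (mod_add, 0),
}
END = 255  # assumed end-of-stream code

def run(byte_stream):
    """Walk the byte stream, dispatching each code to its module."""
    vp = {"stack": []}
    pc = 0
    while pc < len(byte_stream):
        code = byte_stream[pc]
        if code == END:
            break
        if code not in MODULES:
            # An unknown code stops the run instead of executing anything.
            raise ValueError(f"unknown byte code {code} at offset {pc}")
        handler, n_operands = MODULES[code]
        handler(vp, byte_stream[pc + 1 : pc + 1 + n_operands])
        pc += 1 + n_operands
    return vp["stack"]

# Variable part: the byte stream for one application.
result = run(bytes([1, 20, 1, 22, 2, 255]))
print(result)  # [42]
```

In a layout like this, a new module is added by assigning it an unused code in the table, which leaves existing byte streams untouched.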
Some typical examples of language elements are:
looping 1 1 100 adr grt
screen ^hello world to^ name
bitmap
arith alpha = beta + gamma / delta + 13
The language elements shown are "looping," "screen," "bitmap," and "arith," each of which will have a numeric code associated with it, such as 3, 5, +, or *.
For the element "looping" the numbers 1 1 100 represent looping parameters going from 1 by steps of 1 to 100. The symbols adr and grt represent transfer points for the true or false result of the operation. Such a language statement may result in the byte code sequence 2(1(1(d25.
The (1 indicates that a number has been converted to its binary equivalent, in this case a 1. The 25 indicates that the second and fifth named language statements are the transfer points, chosen according to the result of the looping arithmetic (this is done automatically during the compile process). Other language elements give alternate forms of loop control.
The element "screen" might result in the byte code sequence 511, indicating that the first literal is to be placed on the screen, followed by the first variable, which would likely contain the name of a person receiving the message. It has been ascertained that few applications contain more than 256 variable names. While this is the limit in the initial system, extension to just less than 65536 variable names can be accomplished without disturbing what has been developed previously.
The element "bitmap" would have the single byte code +, which would trigger a sequence of activity in the run system asking for the name of the bitmap image that should be produced.
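To make the "looping" and "screen" examples concrete, here is a hedged Python sketch of how a compiler might emit the byte strings quoted above. The byte layout is inferred, not confirmed: it assumes that each number following a ( is stored as a single byte holding the value itself (which would explain why 100 prints as d, the character with code 100), and that transfer points and variable indices are small raw byte values. The function names are hypothetical.

```python
# Hypothetical encoder for two of the elements above; the byte layout is an assumption.

NUM = ord("(")  # marker: the next byte holds a number in binary

def encode_looping(start, step, stop, true_target, false_target):
    """Encode 'looping start step stop adr grt': opcode, three numbers, two targets."""
    for n in (start, step, stop, true_target, false_target):
        if not 0 <= n <= 255:
            raise ValueError("this sketch stores each number in one byte")
    return bytes([ord("2"), NUM, start, NUM, step, NUM, stop,
                  true_target, false_target])

def encode_screen(literal_index, variable_index):
    """Encode 'screen ^literal^ variable': opcode, literal index, variable index."""
    return bytes([ord("5"), literal_index, variable_index])

def render(code):
    """Show printable bytes as characters and all other bytes as decimal values."""
    return "".join(chr(b) if 32 <= b < 127 else str(b) for b in code)

loop_code = encode_looping(1, 1, 100, 2, 5)
screen_code = encode_screen(1, 1)
print(render(loop_code))    # 2(1(1(d25
print(render(screen_code))  # 511
```

The rendering step only mimics how the sequences are printed in this article; the run system itself would work on the raw bytes.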
The final language element "arith" might generate the byte codes
where the fourth variable receives the result of taking the seventh variable, adding the sixth divided by the third, and then adding 13.
In this case, the % indicates that what follows is a floating point number whose length is 5 with the positive sign and whose value is 13.0. Again, the relative numbers used are a function of the compiler, and the programmer doesn't have to be aware of the coded sequences.
An initial reading may suggest that the structure is complicated. However, it is this simple: the very tiny compiler does the numeric conversions and the run system processes them. The programmer does not need to know the coding system. The run system, from that byte code stream, does exactly what is required.
One important observation is that a spurious byte code introduced nefariously would likely cause the application to abort. For applications that are more critical, we could add a check sum at the end of the byte codes, giving the total value of all bytes, and more or less guaranteeing security from hackers and virus activity. This would be checked at the beginning of an application.
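A check sum of this kind can be sketched in a few lines of Python. The article specifies only "the total value of all bytes"; the 16-bit width, the big-endian placement at the end of the stream, and the function names are assumptions.

```python
# Hypothetical check sum: total of all bytes, kept to 16 bits (the width is assumed).

def append_check_sum(byte_codes):
    """Return the byte codes followed by a two-byte total of their values."""
    total = sum(byte_codes) & 0xFFFF
    return bytes(byte_codes) + total.to_bytes(2, "big")

def verify_check_sum(stream):
    """Recompute the total over the body and compare it with the stored value."""
    if len(stream) < 2:
        return False
    body = stream[:-2]
    stored = int.from_bytes(stream[-2:], "big")
    return (sum(body) & 0xFFFF) == stored

good = append_check_sum(b"2(\x01(\x01(d\x02\x05")
tampered = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit in the first byte

print(verify_check_sum(good))      # True
print(verify_check_sum(tampered))  # False
```

A plain total does not notice two bytes being swapped, or one byte raised while another is lowered by the same amount, so it is best read as a first line of defence rather than a guarantee.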
Most of the modules require only a few bytes of machine code, the only exceptions being modules such as bitmap and the software floating point routines for add, subtract, multiply, divide and test floating point numbers (which are similar to but more accurate than the IEEE format). The technique of numeric coding for the static part of the system is what makes such a small VP size possible. The coding also enables the VP to go directly to both the module required and its associated parameters.
Most of the modules are concerned with data moves from direct or indirect addresses, and with binary arithmetic and logic routines. These are all that were necessary from a review of compiler-generated code from many applications in a business environment.
The application programmer does not need to know the internal structure of the elements, which is of concern only to the very small number of people involved with system expansion. The general form of the elements is shown in Figure 2.
Each element, whether language or internal, is given a mnemonic associated with its function, such as Name1 in the figure.
A, B,... indicates an identifiable mnemonic calling for a VP module with appropriate parameters, such as "screen" or "looping."
m, n... indicates a branch operation depending on whether the result of the previous activity was "true" or "false."
Name2, name7... refers to other internal element strings that will be used. The element called must not be a language element. This feature also contributes significantly to the very small size of the system, as several layers of internal elements may be addressed before the system returns to the VP module following the call to another internal module.
endit is a special function indicating the end of an element. It need not be placed physically at the end of the element, but it should be the element's logical termination point.
The various items A, m, and namex are assigned by system developers, the system itself being designed to process the byte codes generated by the compile operation. As mentioned earlier, the byte codes can be generated on any system with an appropriate compiler, or even be generated manually.
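The way an element can call other internal elements before control returns to the next VP module can be sketched as follows. The step kinds ("module", "element", "endit"), the element contents, and the tuple representation are all hypothetical, and the true/false branches (m, n) are left out to keep the sketch short.

```python
# Hypothetical model of elements; step kinds and contents are invented for illustration.

ELEMENTS = {
    "name1": [("module", "A"), ("element", "name2"), ("module", "B"), ("endit",)],
    "name2": [("module", "C"), ("endit",)],
}

# Stand-ins for VP modules: each simply reports its own name.
VP_MODULES = {"A": lambda: "A", "B": lambda: "B", "C": lambda: "C"}

def run_element(name):
    """Run an element, following calls into other internal elements via a stack."""
    trace = []
    stack = [(name, 0)]  # (element name, index of the next step)
    while stack:
        current, index = stack.pop()
        step = ELEMENTS[current][index]
        if step[0] == "endit":
            continue  # logical end: resume the calling element, if any
        stack.append((current, index + 1))  # where to resume afterwards
        if step[0] == "module":
            trace.append(VP_MODULES[step[1]]())
        elif step[0] == "element":
            stack.append((step[1], 0))  # descend into the internal element
    return trace

print(run_element("name1"))  # ['A', 'C', 'B']
```

Because the stack records only an element name and a position, several layers of internal elements can be nested at very little cost in memory.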
The numbers between 32768 and 65500 refer to the use of modules within the VP. One part of the number indicates the specific module to be used, while the balance of the number uniquely identifies the location of the parameters to be used. It should be stressed once again that these numbers are allocated during the compile stage from the language statements of the application, and need not be known by an application developer.
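As an illustration only, such a number could be split into a module part and a parameter part with simple bit operations. The article does not say where the split falls; the 5-bit module field and 10-bit parameter field below are assumptions, as are the function names.

```python
# Hypothetical split of a module reference; the field widths are assumptions.

MODULE_BITS = 5   # assumed: up to 32 VP modules
PARAM_BITS = 10   # assumed: up to 1024 parameter locations
LOW, HIGH = 32768, 65500  # range given in the article for module references

def encode(module, param_location):
    """Pack a module number and a parameter location into one 16-bit number."""
    if not 0 <= module < (1 << MODULE_BITS):
        raise ValueError("module number out of range")
    if not 0 <= param_location < (1 << PARAM_BITS):
        raise ValueError("parameter location out of range")
    number = 0x8000 | (module << PARAM_BITS) | param_location
    if not LOW <= number <= HIGH:
        raise ValueError("number falls outside the module reference range")
    return number

def decode(number):
    """Recover (module, parameter location) from a module reference."""
    if not LOW <= number <= HIGH:
        raise ValueError("not a VP module reference")
    module = (number >> PARAM_BITS) & ((1 << MODULE_BITS) - 1)
    param_location = number & ((1 << PARAM_BITS) - 1)
    return module, param_location

ref = encode(3, 17)
print(ref)          # 35857
print(decode(ref))  # (3, 17)
```

With a fixed split like this, the VP can reach both the module and its parameters from a single number without any searching.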
Running multiple applications with the same software does require some control of the application names, to avoid duplication and ambiguity, but this is relatively simple to accomplish. Each application can also be assigned priority level one, two, or three, if desired. In this scheme, all priority ones are processed once, then a single priority two; the cycle repeats until every priority two has been processed once, at which point a single priority three is processed. This round-robin priority ensures that all applications see some light of day during the course of ongoing operations. It is achieved by a simple "roll-in roll-out" process for the variable data within each application, including the stack that controls the multi-layer operation of the internal and language elements.
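One reading of this round-robin rule can be sketched as a Python generator. The function name and the list-based representation of applications are hypothetical, and the roll-in roll-out of variable data is not modelled.

```python
# Hypothetical sketch of the three-level round robin described above.
from itertools import islice

def schedule(p1, p2, p3):
    """Yield application names in round-robin priority order, indefinitely."""
    if not (p1 or p2 or p3):
        return  # nothing to schedule
    i2 = i3 = 0
    while True:
        yield from p1                  # every priority one runs once
        if p2:
            yield p2[i2]               # then a single priority two
            i2 += 1
        if i2 >= len(p2):              # all priority twos have had a turn
            i2 = 0
            if p3:
                yield p3[i3]           # so a single priority three runs
                i3 = (i3 + 1) % len(p3)

order = list(islice(schedule(["a", "b"], ["x", "y"], ["z"]), 7))
print(order)  # ['a', 'b', 'x', 'a', 'b', 'y', 'z']
```

With two priority ones, two priority twos, and one priority three, the priority three application runs once in every seven slots, so it is delayed but never starved.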
Another useful feature is that the language elements can be written in any natural language, and the multiple applications do not all have to use the same one. Even within a single application, more than one natural language can be used by means of synonyms. Such synonyms add slightly, but not significantly, to VP processing.
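The synonym idea amounts to one extra table lookup, as this small Python sketch suggests. The codes for "looping," "screen," and "bitmap" follow the byte code examples given earlier in this article; the synonym words and the function name are hypothetical.

```python
# Hypothetical synonym table: several words resolve to one element code.

ELEMENT_CODES = {"looping": "2", "screen": "5", "bitmap": "+"}

# Invented synonyms mapping words from other languages to the canonical element.
SYNONYMS = {"boucle": "looping", "ecran": "screen", "pantalla": "screen"}

def element_code(word):
    """Resolve a language element, or a synonym for it, to its code."""
    canonical = SYNONYMS.get(word.lower(), word.lower())
    if canonical not in ELEMENT_CODES:
        raise KeyError(f"unknown language element: {word}")
    return ELEMENT_CODES[canonical]

print(element_code("screen"))    # 5
print(element_code("pantalla"))  # 5
print(element_code("boucle"))    # 2
```

Since every synonym resolves to the same code, the generated byte string is identical whichever language the application was written in.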
For those with less sensitive needs, various check digits can be incorporated for both the VP and the element segments, with these check sums verified at each run if necessary, or at random intervals. It is unlikely that any intrusion into the elements or the VP would go undetected.
Earlier versions of these concepts were applied successfully on mainframes, on midsize computers and microcomputers, and on microcontrollers with RISC chips.
It will take several years for these concepts to become dominant in the industry but dominate it they certainly will. In the first instance, they should be adapted to microcontrollers for smart cards and embedded systems, which constitute over 90% of all installed computers, but which are not dominated by monopolistic software vendors, and where only limited interaction is required between processors.
This will necessitate establishing simple networks based on the concepts (communications and control of high speed networks were part of earlier development), in particular with the use of smart cards for a variety of purposes such as health, financial transactions, personal identification and the like. This would place the concepts in the chips on the cards, in the card readers and in the servers controlling the network.
At the same time, the numeric approach should be introduced to the embedded systems arena by specific industries such as automotive or aerospace.
Once the numeric approach is introduced at the microcontroller level, it could then move to the larger systems, at first integrating with their various operating systems but then replacing them as they become redundant. Again, this would best be done industry by industry (servers, graphics, video, etc.), but after successful implementation in the microcontroller world most industries will, by that time, be ready to move.
For this to be done in a controlled fashion by the PC+ industrial groups, which tend to favour proprietary software, it would be useful to establish a working group to oversee the orderly development of the numeric approach.
One of the reasons for moving to the numeric approach is the need to get away from the multitude of problems described at the outset. The numeric approach offers:
- A very small VP which is static, with no opportunity to introduce spurious code if check sums are included.
- A highly efficient VP that can be encrypted if necessary.
- A static set of elements that numerically describe suites of applications in a form where each number within an element points directly to the VP process required.
- A static set of numeric elements that can be verified through check sums.
- Even in the unlikely event that a spurious number were introduced into a numeric element without affecting the check sum, applications would abort, because of the critical relationship that exists between the numbers within the element structure.
The result is a single, very small piece of software with which all applications can be built and which can act as its own operating system. Placing it on a chip creates a universal Turing machine.



