Embedded software stuck at C
Rick Merritt
9/27/2007 3:35 PM EDT
"Eighty-five percent of all embedded developers use C or C++. Any other language is a non-starter," said David Kleidermacher, chief technology officer of Green Hills Software. "I don't have much hope a new parallel language will get a foothold," he added.
"In the embedded space it took a long time to move from assembly language to C, and it was quite a struggle," said Tomas Evensen, chief technology officer of Wind River Systems. "So for the next ten years in embedded there won't be a lot of new languages," he added.
Michel Genard, vice president of marketing for Virtutech, agreed. "The apps moving to multi-core will be existing apps, so C will likely be the implementation language," he said.
Embedded software developers will avoid the pain of moving to parallel languages by simply running separate programs or modules in parallel, at a high level, on multi-core CPUs, he said. Such loosely coupled programs will not require much synchronization.
Nevertheless, "the biggest challenge for embedded software developers will be in figuring out how to partition their applications," Evensen said.
"It will be almost impossible for any new language to get a foothold," agreed Eric Heikkila, a project director at market watcher Venture Development Corp., who moderated the panel.
"The inability of C/C++ code to parallelize coupled with its ubiquity throughout the embedded market is a major issue for multi-core going forward," Heikkila wrote in a follow up email to EE Times. "Any alternative parallel programming languages certainly won't materialize in the embedded market, but instead will more likely gain momentum in a more mainstream computing market before making its way into embedded applications," he added.
On the other hand, embedded systems need better standards on several fronts to pave the way for multi-core chips, Kleidermacher said.
"Communicating with accelerators is a nightmare because every vendor has their own API," he said, indicating the need for an over-arching standards effort particularly between graphics processors and DSPs.
"That's not happening right now. Some chips may be too applications specific, but I think we could find more commonality here," he said.
In other areas, such as inter-processor communications, there are also too many proposed standards, including MPI and the TIPC effort from the Multicore Association. Kleidermacher called on companies such as ARM to get more engaged with the Multicore Association.
"Right now there are no good standards and a lot of ad hoc solutions for communicating between cores and OSes," said Evensen.
Kleidermacher also called on the Linux community to adopt Posix as a baseline for supporting standard communications.
"Linus Torvalds recently came out against Posix compliance for Linux, and that was a big mistake," he said. "It's one thing to say a standard needs to be modified, and Linus can do that, but you don't just give up on it," he said.
The software issues need to get sorted out soon. VDC estimates sales of multi-core embedded processors will rise from $372 million in 2007 to $1.33 billion in 2009. So far, engineers are giving embedded software a low grade of just 2.06 out of five in terms of its readiness for multi-core.
"When people first port their app to a multi-core processor they are surprised they don't get a two or four-fold improvement, but something more like a 1.2x improvement," said Evensen.
Genard of Virtutech said many multi-core environments exhibit non-deterministic behaviors and effects he called "Heisen-bugs" that make coding difficult.
"By the very fact you try to debug or instrument a device you may actually change its behaviors," he said.




Eric Verhulst
10/5/2007 2:04 PM EDT
I received this EE Times article through ACM while here at Embedded Systems Week in Salzburg. The topic of the article is echoed here as well, and let me tell you, people seem to have a short memory, because the answer was already given in the late '70s.
There will never be a real 'parallel' language. What people actually mean is a compiler that turns a sequential program into a parallel one. In the best case, we will have something like the parallelising Fortran compilers. These compilers look for loops and then split them over multiple processors. The issue is that one can never extract more parallelism than was originally put into the program. For a lot of scientific programs, or even for some graphics applications, there is some potential, but for most applications the potential is very limited. Even then, a lot more parallelism could be found if a parallelising compiler were not an exercise in reverse engineering. The original problems often have a lot of real-world parallelism. E.g. fluid dynamics code starts from a model in which millions of small "voxels" and their interactions are integrated to obtain sequential code. In the process, the "parallel" information gets lost.
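The loop splitting described above can be sketched as follows. This is a minimal illustration, not how any particular parallelising compiler works; the function names and the sum-of-squares loop are invented for the example. Note that CPython threads do not speed up a CPU-bound loop like this one, so the sketch shows only the decomposition.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_of_squares(values):
    """Sequential loop: every iteration is independent of the others."""
    total = 0
    for v in values:
        total += v * v
    return total


def split_into_chunks(values, parts):
    """Cut the iteration space into `parts` nearly equal contiguous slices."""
    size, extra = divmod(len(values), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def parallel_sum_of_squares(values, workers=4):
    """Run the same loop body on each slice, then combine the partial sums."""
    chunks = split_into_chunks(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

The split works here only because the iterations are independent. A loop in which each iteration needs the result of the previous one cannot be cut into slices this way, which is the sense in which no more parallelism can be extracted than the program contains.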
In the embedded world we fortunately already have RTOS code using multi-tasking. This is often quite natural, as the real world is composed of concurrent (a better word for parallel) entities that interact. Hence real code is composed of concurrent entities that interact. This is the logical model. In real-time applications one also has to add the time dimension, and that is what makes it all seemingly harder than programming in C or Java on the PC. Real code is concurrent, interacting and has time properties. There are also other properties, like resource usage, but we can ignore these for the moment. No single programming language can capture that automatically. It has to be part of the programming paradigm. It also has to be supported adequately in the hardware.
To start, multi-core architectures are already common in the embedded world. It's almost the default architecture. Your mobile phone likely has a RISC core, a DSP and maybe a couple of vectorising accelerators. I work with a company that makes smart sensors (Melexis). They put a 16-bit microcontroller with just 32 KB of program code together with a 4-bit controller handling I/O.
So what else is needed? Fast context switching and low-latency I/O or data moving going on in parallel with the CPU. As Lothar Thiele put it again at the conference: software = computing + communication + resource management. Hence, another computing paradigm is not going to solve it, as it leaves out two important aspects. Note that time here is a resource as well, but all three aspects are orthogonal and should remain so.
Concurrency then becomes natural if one programs explicitly concurrently from the beginning. That shouldn't be an issue, as good software engineering (not the same as writing code) is modeling in the first place: once the design has been verified to be correct, writing the concurrent code is trivial. It can even be done by another program. No human is needed.
At the beginning of this letter, I alluded to the '70s, and of course I was referring to the INMOS transputer, which was based on Hoare's CSP process algebra. It worked extremely well. The transputer did context switches in a single microsecond at 20 MHz. Even a single-line instruction could be scheduled as a process. Unfortunately, serious marketing mistakes killed the transputer and its programming model, although a small community of converts is still alive and kicking.
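For readers who have not seen the CSP style, here is a rough sketch of the idea in Python: independent processes that share no state and interact only by passing messages over channels. It is an approximation under stated assumptions, not transputer or occam code. A one-slot `queue.Queue` stands in for a channel (a true CSP channel is an unbuffered rendezvous), threads stand in for processes, and the `DONE` sentinel and all function names are invented for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a stream (an assumption of this sketch)


def producer(out_ch, items):
    """Process 1: emit each item, then signal completion."""
    for item in items:
        out_ch.put(item)  # blocks while the one-slot channel is full
    out_ch.put(DONE)


def squarer(in_ch, out_ch):
    """Process 2: transform each message; it owns no shared state."""
    while True:
        item = in_ch.get()  # blocks until a message arrives, no polling
        if item is DONE:
            out_ch.put(DONE)
            return
        out_ch.put(item * item)


def run_pipeline(items):
    """Wire the processes together with channels and collect the output."""
    a, b = queue.Queue(maxsize=1), queue.Queue(maxsize=1)
    procs = [
        threading.Thread(target=producer, args=(a, items)),
        threading.Thread(target=squarer, args=(a, b)),
    ]
    for p in procs:
        p.start()
    results = []
    while True:
        item = b.get()
        if item is DONE:
            break
        results.append(item)
    for p in procs:
        p.join()
    return results
```

Each process blocks on its channel rather than polling, so the structure of the program mirrors the structure of the communication, which is the property the CSP model is built around.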
How do modern processors compare? I should really say, modern processors and the software running on top of them. Mostly very badly. On top of that, people use them as references. The example that we all use subconsciously is the PC. We think Intel+Windows, but any other processor with Linux doesn't do much better.
E.g. Windows uses a 15 ms timeslicing scheduler whether it runs on a 100 MHz Pentium or the latest 3 GHz machine. A lot of applications communicate by polling. Even on a "single core" this looks like a lot of waste. It is only justified by the fact that the interaction with our PC happens at 25 Hz and that it is a good enough solution for heavy gaming graphics. But a true priority-driven preemptive scheduler would have gotten a lot more out of the machine.
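The difference between the two scheduling policies can be illustrated with a toy tick-based simulation. This is a sketch under stated assumptions and not a model of the actual Windows scheduler: the task set, the function names and the tuple layout are hypothetical, one tick stands for one millisecond, and every task is assumed to need at least one tick of work.

```python
from collections import deque


def priority_preemptive(tasks):
    """tasks: (name, priority, release_tick, work_ticks); lower number = more urgent.

    Each tick runs the most urgent released, unfinished task.
    Returns {name: finish_tick}.
    """
    remaining = {name: work for name, _, _, work in tasks}
    finish, tick = {}, 0
    while len(finish) < len(tasks):
        ready = [(prio, name) for name, prio, release, _ in tasks
                 if release <= tick and name not in finish]
        if ready:
            _, name = min(ready)
            remaining[name] -= 1
            if remaining[name] == 0:
                finish[name] = tick + 1
        tick += 1
    return finish


def time_sliced(tasks, slice_ticks):
    """Round-robin with a fixed slice; priorities are ignored.

    Returns {name: finish_tick}.
    """
    remaining = {name: work for name, _, _, work in tasks}
    pending = sorted(tasks, key=lambda t: t[2])  # ordered by release tick
    ready, finish, tick, i = deque(), {}, 0, 0
    while len(finish) < len(tasks):
        while i < len(pending) and pending[i][2] <= tick:
            ready.append(pending[i][0])
            i += 1
        if not ready:
            tick += 1
            continue
        name = ready.popleft()
        run = min(slice_ticks, remaining[name])
        tick += run
        remaining[name] -= run
        # Tasks released during this slice queue up ahead of the current task.
        while i < len(pending) and pending[i][2] <= tick:
            ready.append(pending[i][0])
            i += 1
        if remaining[name] == 0:
            finish[name] = tick
        else:
            ready.append(name)
    return finish


if __name__ == "__main__":
    # Hypothetical workload: a long background job and a short urgent one.
    tasks = [("background", 5, 0, 40), ("urgent", 1, 3, 2)]
    print(priority_preemptive(tasks))
    print(time_sliced(tasks, slice_ticks=15))
```

With this hypothetical task set, the urgent task finishes at tick 5 under the priority-driven scheduler and at tick 17 under 15-tick slicing, while the background job finishes at tick 42 in both cases.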
This was recently made very clear to us when developing a virtual prototype for an SoC, in which LabVIEW was combined with a CPU register-level simulator, the two communicating over shared memory. When running it on a dual-core PC (no source code changes), the simulation speed went up by a factor of about 200. So, even on a single-core PC we are not getting the bang for the buck we deserve.
So, what's the final message here? If we want to exploit the power of multi-core architectures - and they are the natural architectures - we have to abandon the pure von Neumann architecture that is still reflected in the programming languages we use. We have to raise the level of abstraction and use sound system and software modeling methodologies rather than programming straight away in C, C++ or Java. Which construction engineer would build bridges by laying the stones himself without even making a plan? Of course, this means we need to change the way (software) engineers are educated and trained. There is not much new to be invented, as we just have to pick up the thread of the transputer and its CSP model. That shouldn't be hard, as RTOSes have been using the concepts in an ad hoc way for decades (but not always very efficiently). The major effort should now focus on much more rigor and formalism to achieve a "correct by design" methodology. Even in this domain, a lot of the building blocks are present, but they need to be "productised" and people need to be trained. The major obstacle to using the potential of multi-core architectures is actually in the mind. People, and this applies to engineers as well, are mostly good at repeating what they already know. Now they need to learn a new "language" (read: a way of thinking in concurrency), and that is hard, although learning a language comes naturally when we are young.
Eric Verhulst
Eric.Verhulst @ OpenLicenseSociety.org
www.openLicenseSociety.org