Design Article
Wireless, embedded devices find common ground in mesh network
Cliff Bowman, Software Engineer, Ember Corp., Boston, Mass
4/18/2003 9:33 AM EDT
For embedded designers considering wireless communications as a way to connect distributed devices, interest in mesh networks is strong, especially where efficiency in power, memory, and processing is of particular concern.
It is easy to understand their rising popularity in networked embedded designs. A mesh's redundant message paths provide the sort of inherent, fail-safe reliability that industrial control and military applications demand. Because the nodes in a mesh are also relay points, network infrastructure grows with the network; consequently, mesh networks support incremental installation and require minimal up-front investment. Deploying a mesh network is usually easier than deploying networks of other topologies, especially when propagation varies widely over a geographic area or over time. Furthermore, a mesh network has the unique ability to take advantage of "good" variations in propagation once it is deployed.
Beyond their general advantages, mesh networks hold particular appeal for the embedded developer. For example, a typical embedded requirement is achieving maximum energy efficiency for battery-powered applications. A mesh architecture helps to meet this challenge by reducing total transmitter output power for a given network span, thereby lowering the overall power drain attributable to r^n path loss, where r is the distance between transmitter and receiver. A limited computational platform is another common embedded constraint; fortunately, there are elegant, loop-free, memory- and processor-efficient routing algorithms available for mesh networks, making it possible to implement large-scale networks on modest processors.
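To make the power argument concrete, the sketch below compares the total transmit power needed to cover a span in one long hop with the power needed when the same span is split into several equal hops. It assumes an idealized model in which required transmit power grows as the hop distance raised to the path-loss exponent; the function name, the equal hop spacing, and the exponent values are illustrative assumptions rather than figures from this article, and receive and processing energy are ignored.

```python
def relative_total_power(hops: int, exponent: float) -> float:
    """Total transmit power for `hops` equal hops, relative to one direct hop.

    Assumes required transmit power grows as distance**exponent
    (idealized model; receive and processing energy are ignored).
    """
    if hops < 1:
        raise ValueError("hops must be at least 1")
    # Each hop spans 1/hops of the distance and so needs (1/hops)**exponent
    # of the single-hop power; `hops` such transmissions are required.
    return hops * (1.0 / hops) ** exponent


for n in (2.0, 3.0, 4.0):
    for k in (1, 2, 4):
        print(f"exponent={n}, hops={k}: {relative_total_power(k, n):.4f}")
```

Under this model the total falls by a factor of hops^(n-1), so the benefit of relaying grows with the path-loss exponent.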
Unfortunately, newcomers are sometimes discouraged when they first consider moving their systems to a wireless mesh. Comparing performance specifications, these designers find that their existing messaging models don't fit with the mesh devices they can buy. Devices offering throughput comparable to that of familiar wired systems are either unavailable or too expensive; cost-competitive devices, on the other hand, seem hopelessly slow. Of course, there is something missing in these cursory observations: the realization that the design of many embedded protocols is tightly linked to a wired medium.
Two properties that are useful in distinguishing among communications networks are topology and medium access. When comparing mesh networks to the more commonly used bus and star networks, these properties must always be kept in mind, because, in every case, network topology determines the path by which messages can travel.
In the bus topology, every node can hear every other node. Unfortunately, routing on a shared bus isn't quite as simple as it first appears. If two nodes were to transmit at the same time, their messages would appear simultaneously on the bus and the information would be garbled. To minimize this danger, some type of medium access control must be used. Some systems limit themselves to query/response messaging (e.g., Modbus), where a "master" node owns the bus and a "slave" may transmit only when the master sends it a query. Other systems use scheduling schemes (e.g., 802.11 ad hoc mode), where each node has a "window" in which it is permitted to transmit. The original implementation of Ethernet used a third strategy called "Carrier Sense Multiple Access" (CSMA), which relies on carrier-sense hardware in each node to detect whether another node is already transmitting.
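The carrier-sense idea can be illustrated with a toy simulation. The sketch below is a simplified slotted model, not Ethernet's actual algorithm: time is divided into slots, each node waits a random number of idle slots, defers whenever it senses the channel busy, and retries after a fresh random wait if it collides with another sender. All names and parameters (simulate_csma, frame_slots, max_backoff, the seed) are assumptions made for illustration.

```python
import random


def simulate_csma(num_nodes, frame_slots=3, max_backoff=8, seed=1, max_slots=10_000):
    """Toy slotted carrier-sense simulation; each node has one frame to send.

    Returns (start_slot, node) for every frame that got through uncorrupted.
    """
    rng = random.Random(seed)
    # Remaining idle slots each node waits before it tries to transmit.
    backoff = {node: rng.randrange(max_backoff) for node in range(num_nodes)}
    delivered = []
    busy_until = 0  # first slot in which the channel is idle again
    for slot in range(max_slots):
        if not backoff:
            break  # every frame has been delivered
        if slot < busy_until:
            continue  # carrier sensed: everyone defers and freezes its counter
        ready = [node for node, wait in backoff.items() if wait == 0]
        if not ready:
            for node in backoff:
                backoff[node] -= 1  # idle slot: all waiting nodes count down
        elif len(ready) == 1:
            delivered.append((slot, ready[0]))
            del backoff[ready[0]]
            busy_until = slot + frame_slots
        else:
            # Two or more nodes started in the same slot: the frames collide,
            # and each sender picks a fresh random wait before retrying.
            for node in ready:
                backoff[node] = rng.randrange(1, max_backoff + 1)
            busy_until = slot + frame_slots
    return delivered


print(simulate_csma(num_nodes=5))
```

Note that no node acts as a coordinator: every node follows the same local rule, which is why the scheme keeps working when any single node fails.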
Because the message travels directly from source to destination, relay failure is not an issue for bus networks. Instead, the vulnerability of bus systems lies in their medium access strategies or in the integrity of the bus itself.
The vulnerability of a star topology is obvious: if the master node should fail, all communications will stop. In shared-medium systems like Bluetooth, member nodes can select another master and communications will resume after some delay. Where this is not an option (for example, if the single hub of a wireless LAN fails), the network cannot recover. In addition, if the path between the master and a node is blocked, that node will no longer participate in the network.
Many paths
In a mesh network, there are multiple paths that a message from A to F can take. For example, the message could go either A-B-F or A-E-F. (In fact, there are many other paths, but these two are the simplest.) In a well-connected mesh network, the failure of a single node ("B" for example) affects communications only for that node; any messages that might have gone through the failed node are automatically rerouted.
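The rerouting behavior can be sketched with a fewest-hop search over the mesh. The six-node link table below is hypothetical, chosen only so that both the A-B-F and A-E-F paths exist, and breadth-first search stands in for whatever routing algorithm a real mesh product uses.

```python
from collections import deque

# Hypothetical six-node mesh; the links are assumed for illustration only.
LINKS = {
    "A": {"B", "E"},
    "B": {"A", "C", "F"},
    "C": {"B", "D"},
    "D": {"C", "F"},
    "E": {"A", "F"},
    "F": {"B", "D", "E"},
}


def shortest_path(links, source, dest, failed=frozenset()):
    """Breadth-first search for a fewest-hop path that avoids failed nodes."""
    if source in failed or dest in failed:
        return None
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == dest:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(links[node]):
            if neighbor not in failed and neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # destination unreachable


print(shortest_path(LINKS, "A", "F"))                # ['A', 'B', 'F']
print(shortest_path(LINKS, "A", "F", failed={"B"}))  # ['A', 'E', 'F']
```

With node B marked as failed, traffic between A and F simply takes the other two-hop path; only messages addressed to B itself are lost.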
It is interesting to note that link failure - cutting the wire or blocking an RF path - should not have much effect on a mesh network. Because of the mesh's redundant routes, traffic can navigate around the broken link. As a result, no node will be "cut off" by link failure in a well-connected network.
The nodes in a wireless mesh network typically use a shared RF channel, so there must be some means to manage medium access. The usual strategy is CSMA, and supporting hardware is often built into each radio to make this easier. As we observed earlier, CSMA is a distributed strategy, so the network's medium access scheme is protected against the failure of a single node.
This robustness of the medium access strategy is similar to what we observed with Ethernet LANs, but there is an important difference. Wired Ethernet LANs are usually bus networks, so only one node can transmit at a time. In wireless mesh networks, nodes relay for one another, so we can use low-power transmitters. By reducing transmit power to the point that transmissions reach only a few nearby nodes, we avoid tying up the channel for nodes that are far away. This gives rise to a phenomenon called "spatial multiplexing," in which multiple messages can travel simultaneously in different parts of the network. For example, in Figure 3, traffic can pass between "A" and "D" at the same time that "C" and "F" exchange their own messages. As we shall see below, the use of spatial multiplexing increases the effective data capacity of the network.
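A simple range check shows why lower transmit power allows simultaneous traffic. The sketch below assumes a deliberately crude interference model (a transmission is disturbed only if the other link's sender is within radio range of its receiver), and the node coordinates and range values are invented for illustration; they are not taken from the article's figure.

```python
import math

# Hypothetical node coordinates in meters; assumed for illustration only.
POSITIONS = {"A": (0, 0), "D": (10, 0), "C": (100, 0), "F": (110, 0)}


def in_range(a, b, radio_range):
    """True if nodes a and b are within radio range of each other."""
    return math.dist(POSITIONS[a], POSITIONS[b]) <= radio_range


def can_share_channel(link1, link2, radio_range):
    """True if two (sender, receiver) links can be active at the same time.

    Simplified model: a link is disturbed only when the other link's
    sender is within radio range of its receiver.
    """
    (s1, r1), (s2, r2) = link1, link2
    return not in_range(s2, r1, radio_range) and not in_range(s1, r2, radio_range)


print(can_share_channel(("A", "D"), ("C", "F"), radio_range=20))   # True
print(can_share_channel(("A", "D"), ("C", "F"), radio_range=150))  # False
```

With a 20 m range the two exchanges do not interfere and can share the channel; with a 150 m range every transmission reaches the whole network and the links must take turns.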
A number of commercially available mesh-networking suites are based on the emerging IEEE 802.15.4 standard. Routing and other network behavior of these products is governed by gradient routing (GRAd), an algorithm currently being considered for adoption by the ZigBee Alliance.
Efficient solutions
The precise ratio between network capacity and channel rate in mesh networks will always depend on the implementation, but factors similar to those just mentioned are probably universal, suggesting that the ratio between capacity and channel rate will always be small. In the design of mesh-based solutions, we must keep this observation in mind and strive to make the most of the available network capacity. To this end, we offer a few design principles that have proven invaluable in the design of efficient mesh solutions:
- Principle 1: Distribute rather than centralize both the tasks and messaging in the network. Centralizing tasks has a number of undesirable effects. First, it means that the bulk of network traffic will tend to focus on the node controlling the task - either by originating or by ending there.
If control is distributed to several points in a mesh network and if these points are far away from one another, traffic around each point flows independently. This means that multiple messages can be handled simultaneously and the effective network capacity is multiplied.
Another reason for distributing tasks is that messages don't need to travel as far. Multiple control points mean that the average distance from message source to destination is shorter. Distributing tasks also increases the inherent robustness of the system. If there is only one control point, the entire system goes down when that point fails. In distributed systems, some degree of functionality is retained even when individual components malfunction. Since reliability is often a key motivator for the selection of mesh networks, distributing tasks offers a natural design advantage.
- Principle 2: Push data from the source. Use exception-based messaging. Exception-based messaging can drastically reduce network traffic by eliminating superfluous messages. Instead of exchanging messages according to some arbitrary and typically worst-case schedule, nodes communicate only when they have something interesting to say. Furthermore, the query messages that usually initiate scheduled exchanges are completely eliminated. Exception-based messaging can reduce both the number of message exchanges and the number of messages per exchange.
- Principle 3: Rely on the MAC. Query/response messaging and token passing are unnecessary and degrade efficiency. Although this design principle is really just a corollary to exception-based messaging, it is worth stating explicitly. There is a strong temptation to re-use familiar messaging models, but non-CSMA strategies rarely perform well in mesh network designs. In particular, embedded protocols that are designed specifically for bus architectures tend to use query/response messaging or token passing to manage bus access.
- Principle 4: Eliminate unnecessary detours. Let sensors talk to actuators.
This is another corollary, but it too is worth stating explicitly. When we distribute tasks we seek to push decision-making down to the lowest possible level. One effective strategy is placing actuator control logic into the sensor. For binary or limited-state actuators, it is as simple as adding a table of threshold values, so a lot of decision-making that's currently programmed into PLCs can be distributed to sensors without difficulty.
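As a rough illustration of Principles 2 and 4 working together, the sketch below places a small threshold table in the sensor and has it emit an actuator command only when the commanded state changes. The table contents, actuator names, trip levels, and the actuator_commands function are all hypothetical; in a real device the returned commands would be sent over the mesh directly to the actuators.

```python
# Hypothetical threshold table stored in the sensor. Each row is:
# (actuator name, trip level, command at or above the level, command below it).
THRESHOLDS = [
    ("fan", 30.0, "ON", "OFF"),
    ("alarm", 45.0, "ON", "OFF"),
]


def actuator_commands(reading, last_commands):
    """Return only the commands that changed since the previous reading.

    `last_commands` is updated in place so the sensor remembers what it
    last told each actuator.
    """
    changed = {}
    for actuator, trip_level, at_or_above, below in THRESHOLDS:
        command = at_or_above if reading >= trip_level else below
        if last_commands.get(actuator) != command:
            changed[actuator] = command
            last_commands[actuator] = command
    return changed


state = {}
print(actuator_commands(25.0, state))  # {'fan': 'OFF', 'alarm': 'OFF'}
print(actuator_commands(32.0, state))  # {'fan': 'ON'}
print(actuator_commands(33.0, state))  # {}
```

The third reading produces no traffic at all, and no PLC sits between the sensor and the fan; a practical design would likely add hysteresis around each trip level to avoid chatter.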
This article was excerpted from ESC paper 527, titled "Architecting Communications in a Wireless Mesh Network."



