Design Article
Bridging the System to RTL Continuum
Mitch Dale
5/19/2005 12:00 AM EDT
The rapidly evolving semiconductor industry has always relied on innovation to sustain advancement. From the creative forces behind the myriad of consumer electronic products to the technology improvement behind sub-micron silicon running at gigahertz speed, innovation makes electronic systems possible.
Gartner Dataquest (2004) states that more than half of IC designs are Systems-on-Chip (SoCs), meaning they contain some type of processor and memory subsystem. With the adoption of IP from internal or external sources, additional importance is placed on system-level design and integration. Designers are being pushed to work at higher levels of abstraction while at the same time meeting strict power and performance requirements. In addition to these pressures, design teams face the familiar challenge of getting their SoC working within a tight project schedule.
It is clear that the semiconductor industry must adopt system-level design and verification methodologies. However, before design teams can move forward, they need tools and technologies that support an RTL-to-system-level transition.
The system to RTL continuum has two axes of abstraction: sequential and data. System-level design demands engineers move up in both directions (Figure 1).
Sequential abstraction ranges from detailed timed description to algorithmic un-timed description. This range can be demonstrated with the example of PCI. As the ubiquitous PC bus for the last decade, PCI has well-defined functionality. PCI64 is a PCI implementation across a parallel bus infrastructure, whereas PCI Express implements the same functionality with an underlying high-speed serial protocol. At the system level, operations like read and write are functionally equivalent. Yet PCI Express has a completely different sequential architecture. In this analogy, PCI64 and PCI Express are detailed timed descriptions, the lowest level of sequential abstraction. At a higher level of sequential abstraction, PCI transactions have only input/begin and output/end specifying the temporal relation of each command. At the highest level of abstraction, PCI functionality is modeled as simple data movement or commands with no notion of time.
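The range of sequential abstraction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 8-bit payload, and the cycle counts are hypothetical, and nothing here models the actual PCI or PCI Express protocols. It shows the same write operation described with no notion of time, over a parallel path, and over a serial path.

```python
def write_untimed(memory, addr, data):
    """Highest abstraction: plain data movement, no notion of time."""
    memory[addr] = data & 0xFF

def write_parallel(memory, addr, data):
    """Timed sketch: all 8 data bits cross a parallel bus in one cycle."""
    memory[addr] = data & 0xFF
    return 1  # cycles consumed (assumed)

def write_serial(memory, addr, data):
    """Timed sketch: the same byte crosses a 1-bit link, LSB first."""
    shift_register = 0
    cycles = 0
    for bit_index in range(8):
        bit = (data >> bit_index) & 1          # one bit on the link per cycle
        shift_register |= bit << bit_index     # receiver reassembles the byte
        cycles += 1
    memory[addr] = shift_register
    return cycles  # 8 cycles for 8 bits

mem_a, mem_b, mem_c = {}, {}, {}
write_untimed(mem_a, 0x10, 0xA5)
parallel_cycles = write_parallel(mem_b, 0x10, 0xA5)
serial_cycles = write_serial(mem_c, 0x10, 0xA5)
print(mem_a == mem_b == mem_c)          # True: same functional result
print(parallel_cycles, serial_cycles)   # 1 8: different sequential behavior
```

All three models leave the memory in the same state; only the timing of how the data gets there differs.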
Data abstraction shields designers from fine-grained detail and allows higher levels of conceptualization. For software engineers, the notion of collecting, abstracting and passing aggregated information is part of basic programming. These concepts are just as useful to hardware engineers in addressing complexity. Hardware designers exploiting data abstraction manage "words" and "data structures" instead of "bits" and "buses". VHDL, SystemC and SystemVerilog all contain type and object semantics to encapsulate data and promote higher levels of data abstraction.
The ability to move within the range of sequential and data abstraction creates the system to RTL continuum. Higher levels of abstraction allow design alternatives to be formulated and qualified quickly. The exploration of sequential architectures provides detailed information on system characteristics. Navigating the continuum is iterative. As algorithms are turned into RTL implementation, a series of informed decisions leads to an optimal design.
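As a rough illustration of data abstraction, the Python sketch below treats a bus write as a data structure and shows the flat bit-level word it corresponds to. The `BusWrite` type, its field names, and the 16-bit/8-bit widths are assumptions made for the example, not anything defined in this article or in a particular HDL.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BusWrite:
    """Hypothetical write command viewed as a data structure."""
    address: int  # assumed 16 bits wide
    data: int     # assumed 8 bits wide

def pack(cmd: BusWrite) -> int:
    """Lower the structure to a flat 24-bit word, address in the high bits."""
    return ((cmd.address & 0xFFFF) << 8) | (cmd.data & 0xFF)

def unpack(word: int) -> BusWrite:
    """Recover the structure from the flat bit-level word."""
    return BusWrite(address=(word >> 8) & 0xFFFF, data=word & 0xFF)

cmd = BusWrite(address=0x1234, data=0xAB)
word = pack(cmd)
print(hex(word))            # 0x1234ab
print(unpack(word) == cmd)  # True
```

Code that works with `BusWrite` never mentions bit positions; only `pack` and `unpack` know the bit-level layout.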
There is evidence that a sequential shift has begun. Many engineers, regardless of whether they are designing with VHDL, Verilog, SystemVerilog, C, SystemC, or even using behavioral synthesis, transform the sequential behavior of their designs. Why? Because without modifying the sequential implementation there is no way to meet the power, performance, and area budget of their design. Some common techniques to improve operational characteristics are re-timing, resource sharing, and pipelining, all of which modify sequential behavior.
In re-timing, long combinational logic paths between state elements are rebalanced to reduce path delay. Microprocessor teams frequently make this change to meet timing constraints or maximize clock speed. As evident in Figure 3, re-timing transformations are sequential in nature because the values held in the re-timed state elements differ from those in the original.
Figure 3: Retiming
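The effect of re-timing on state values can be sketched with a small cycle-by-cycle model in Python. The function `(a + b) * 2`, the single register, and the reset value of 0 are assumptions chosen for the example; they are not taken from Figure 3.

```python
def original(inputs):
    """Register placed after all the logic: state holds (a + b) * 2."""
    reg, outputs, states = 0, [], []
    for a, b in inputs:
        outputs.append(reg)      # output is read straight from the register
        states.append(reg)
        reg = (a + b) * 2        # long combinational path feeds the register
    return outputs, states

def retimed(inputs):
    """Register moved back across the multiply: state holds a + b."""
    reg, outputs, states = 0, [], []
    for a, b in inputs:
        outputs.append(reg * 2)  # part of the logic now follows the register
        states.append(reg)
        reg = a + b              # shorter path feeds the register
    return outputs, states

stimulus = [(1, 2), (3, 4), (5, 6)]
out_a, state_a = original(stimulus)
out_b, state_b = retimed(stimulus)
print(out_a, out_b)      # [0, 6, 14] [0, 6, 14]
print(state_a, state_b)  # [0, 6, 14] [0, 3, 7]
```

The outputs match cycle for cycle, but the register contents do not, which is why a comparison that maps state elements one to one cannot relate the two versions.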
Another example of a sequential change is resource sharing or duplication. In Figure 4, the algorithm A + B + C may be implemented with or without resource sharing. Resource sharing reduces the number of adders in this design at the cost of additional cycles of latency. Likewise, logic resources can be duplicated to increase performance at the cost of area.
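The trade-off can be made concrete with a minimal Python sketch of the A + B + C example. The returned adder and cycle counts are assumptions of this simplified model (one addition per adder per cycle), not measurements of any real design.

```python
def unshared(a, b, c):
    """Two adders chained combinationally: result ready in one cycle."""
    result = (a + b) + c
    return result, {"adders": 2, "cycles": 1}

def shared(a, b, c):
    """One adder reused through an accumulator register over two cycles."""
    acc = a + b        # cycle 1: the adder computes A + B
    acc = acc + c      # cycle 2: the same adder adds C to the stored sum
    return acc, {"adders": 1, "cycles": 2}

print(unshared(1, 2, 3))  # (6, {'adders': 2, 'cycles': 1})
print(shared(1, 2, 3))    # (6, {'adders': 1, 'cycles': 2})
```

Both versions return the same sum; the shared version halves the adder count and doubles the cycle count.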
A technique to improve data throughput is pipelining. Figure 5 shows two data path sequences. In the pipelined version, the "calc" function is broken into two phases so the data path can run in parallel. In this case, the sequential behavior and temporal relationship of the output are altered. Given that the designs are functionally equivalent, this sequential modification is a tradeoff between latency, throughput and area.
A sequential shift in design methodology is necessary to effectively design at the system level. System-level design has the greatest effect on performance, power and area in SoC designs. To meet system-level requirements, designers will make micro-architectural optimizations that modify sequential behavior. Today, sequential changes are performed manually. Hardware engineers need a new generation of tools to enable a sequential shift in design methodology.
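A small Python model can illustrate the throughput gain from splitting "calc" into two phases. The bodies of `phase1` and `phase2` and the one-cycle-per-phase timing are hypothetical choices for the sketch; Figure 5 does not specify them.

```python
def phase1(x):
    return x + 1   # hypothetical first half of "calc"

def phase2(x):
    return x * 3   # hypothetical second half of "calc"

def run_unpipelined(items):
    """Each item occupies the whole data path for two cycles."""
    results, cycles = [], 0
    for x in items:
        partial = phase1(x)
        cycles += 1
        results.append(phase2(partial))
        cycles += 1
    return results, cycles

def run_pipelined(items):
    """Two-stage pipeline: a new item enters every cycle."""
    results, cycles = [], 0
    stage_reg = None                       # register between the two phases
    for x in list(items) + [None]:         # one extra cycle drains the pipe
        if stage_reg is not None:
            results.append(phase2(stage_reg))
        stage_reg = phase1(x) if x is not None else None
        cycles += 1
    return results, cycles

print(run_unpipelined([1, 2, 3, 4]))  # ([6, 9, 12, 15], 8)
print(run_pipelined([1, 2, 3, 4]))    # ([6, 9, 12, 15], 5)
```

The results are identical, but the cycle on which each result appears changes, and the pipelined version needs the extra stage register.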
The cornerstone of functional verification is software simulation. While RTL-based software simulation remains a vital tool, it alone is not sufficient to verify complex SoC designs. It simply takes too much time to create and run the multitude of tests required to fully validate a design. Regressions take several days or weeks to complete. Thus the process of exploring multiple sequential implementations could add weeks or months to a project. Consequently, verification remains the largest barrier to evaluating design alternatives and meeting system-level specifications.
To make matters worse, sequential changes can invalidate existing test benches. When this happens, test benches require inspection and adjustment. In the resource sharing example (Figure 4), additional cycles in the output are introduced when sharing is implemented. The new output has a clock correlation discrepancy with respect to the other implementation. In this case, the regressions would fail despite equivalent functionality. Depending on the situation, this type of failure could ripple from block-level regressions into the sub-system or even system level.
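The clock correlation discrepancy described above can be reproduced with a short Python sketch. The per-cycle trace format (with `None` marking a cycle that carries no valid output) and both checker functions are assumptions made for illustration, not a description of any particular test bench.

```python
def two_adder_trace(samples):
    """Per-cycle output trace: one valid result every cycle."""
    return [a + b + c for a, b, c in samples]

def shared_adder_trace(samples):
    """One adder: each result takes two cycles, so an idle cycle precedes it."""
    trace = []
    for a, b, c in samples:
        trace.append(None)           # cycle 1: A + B in progress, nothing valid
        trace.append((a + b) + c)    # cycle 2: result is valid
    return trace

def cycle_exact_match(reference, candidate):
    """Checker that expects the same value on the same clock cycle."""
    return reference == candidate

def valid_data_match(reference, candidate):
    """Checker that compares only the sequence of valid results."""
    def strip_idle(trace):
        return [value for value in trace if value is not None]
    return strip_idle(reference) == strip_idle(candidate)

samples = [(1, 2, 3), (4, 5, 6)]
ref = two_adder_trace(samples)       # [6, 15]
dut = shared_adder_trace(samples)    # [None, 6, None, 15]
print(cycle_exact_match(ref, dut))   # False: regression fails
print(valid_data_match(ref, dut))    # True: functionality is unchanged
```

A test bench written like `cycle_exact_match` has to be reworked after the sequential change even though the design still computes the same results.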
A verification methodology based on successive refinement from system-level models to RTL implementation has several benefits. System-level tests run thousands of times faster than RTL simulation. Verifying system-level interactions gives designers confidence in their architectures and algorithms. As models are refined, previous iterations can be used as a reference. When combined with formal methods, system-level verification efforts are leveraged throughout the design process. This allows quick detection of side effects and ensures the implementation remains consistent with the original intent.
SLEC verifies that micro-architectural optimizations for power, timing, and area do not introduce functional side effects. Since formal methods yield high coverage, sequential equivalence checking gives designers confidence to make changes late in the design process. Instead of relying on test benches or properties, sequential equivalence checking uses a golden RTL model or system-level reference design written in Verilog, VHDL, SystemC or C/C++. Unlike combinational equivalence checkers, SLEC proves functional equivalence across levels of sequential and data abstraction.
As a result, sequential equivalence checking boosts verification productivity. SLEC gives designers the flexibility and confidence to make micro-architectural changes to achieve challenging timing, power and performance goals.
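The property that sequential equivalence checking establishes can be illustrated on a toy design by exhaustive enumeration in Python. This is only a sketch of the idea: a formal tool reasons symbolically rather than enumerating inputs, and nothing below reflects how SLEC is implemented. The 3-bit operand width and both models are assumptions chosen to keep the search small.

```python
from itertools import product

WIDTH = 3                    # assumed operand width, kept tiny on purpose
MASK = (1 << WIDTH) - 1

def reference(a, b, c):
    """Un-timed system-level model: plain arithmetic, wrapped to WIDTH bits."""
    return (a + b + c) & MASK

def implementation(a, b, c):
    """Multi-cycle model: one WIDTH-bit adder reused across two cycles."""
    acc = (a + b) & MASK     # cycle 1: partial sum stored in a WIDTH-bit register
    acc = (acc + c) & MASK   # cycle 2: same adder adds the third operand
    return acc

def first_mismatch():
    """Return the first input triple where the models disagree, else None."""
    for a, b, c in product(range(1 << WIDTH), repeat=3):
        if reference(a, b, c) != implementation(a, b, c):
            return (a, b, c)
    return None

print(first_mismatch())  # None: the models agree on all 512 input triples
```

The check compares the implementation against a reference model directly, with no test bench stimulus or property set to maintain. Enumeration like this stops being practical beyond very small widths, which is where formal methods come in.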



