Design Article

FPGAs Boost Performance Levels of Wideband Digital Receivers

Rodger Hosking

10/8/2003 12:00 AM EDT

Few component technologies have evolved as rapidly in the last few years as FPGAs (field programmable gate arrays). In this highly competitive market, each new generation of devices delivers faster speeds, improved density, larger memory resources, and more flexible interfaces. Totally new resources, such as dedicated hardware multiplier blocks and complete processor cores, also appear.

Specifically, hardware multipliers have afforded FPGAs a strategic entry into DSP applications like software radio, where they are now challenging both ASICs and programmable DSPs. Initially competing for specialized architectures for digital receivers, the latest FPGAs can now outperform ASICs for the data processing demands of the new wideband communications standards. However, coaxing these new devices to handle higher sampling rates requires careful allocation and deployment of FPGA resources.

Digital Receiver Basics
Digital receivers, sometimes called digital downconverters or digital drop receivers, are the fundamental building block of the software radio industry. They revolutionized the communications industry soon after the first monolithic silicon devices were introduced at the beginning of the 1990s.

Digital receivers accept digitized samples of IF or RF signals typically derived from a radio antenna. They utilize digital signal processing techniques to translate a desired signal at a certain frequency down to DC and then remove all other signals by low-pass filtering.

The three essential elements of the digital receiver shown in Figure 1 are the local oscillator, the mixer and the filter—terms appropriately derived from their discrete analog circuitry counterparts in a traditional superhet radio. The local oscillator consists of a phase accumulator (an adder and a register) and a lookup table to generate digital quadrature sine and cosine signals.


Figure 1:  Basic digital receiver block diagram

The accumulator is clocked at the A/D converter's sample clock frequency so that the local oscillator output sample rate matches the A/D sample rate. Frequency control is achieved by programming the phase increment for each clock.
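The phase-accumulator local oscillator described above can be sketched in a few lines of Python. This is a minimal illustration, not the circuit used in any particular device: the 32-bit accumulator width, the 10-bit table address, and the names `phase_increment` and `local_oscillator` are all assumptions chosen for the example.

```python
import math

ACC_BITS = 32                 # phase accumulator width (assumed for illustration)
LUT_BITS = 10                 # lookup table address width (assumed)
LUT_SIZE = 1 << LUT_BITS
ACC_MASK = (1 << ACC_BITS) - 1

# One full cycle of cosine and sine, addressed by the top accumulator bits.
COS_LUT = [math.cos(2 * math.pi * k / LUT_SIZE) for k in range(LUT_SIZE)]
SIN_LUT = [math.sin(2 * math.pi * k / LUT_SIZE) for k in range(LUT_SIZE)]


def phase_increment(f_out_hz, f_clk_hz):
    """Tuning word: the fraction of a cycle advanced on every sample clock."""
    return round(f_out_hz / f_clk_hz * (1 << ACC_BITS)) & ACC_MASK


def local_oscillator(increment, n_samples, start_phase=0):
    """Yield (cos, sin) pairs, one pair per sample clock."""
    acc = start_phase
    for _ in range(n_samples):
        addr = acc >> (ACC_BITS - LUT_BITS)   # top bits address the table
        yield COS_LUT[addr], SIN_LUT[addr]
        acc = (acc + increment) & ACC_MASK    # adder plus register, wraps at 2**32
```

With these assumed widths, tuning to 25 MHz at a 100 MHz clock gives a phase increment of exactly one quarter of the accumulator range, so the oscillator steps through the four cardinal points of the unit circle.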

The complex mixer consists of two digital multipliers that accept digital samples from the A/D converter and the local oscillator. They produce a complex representation of the input signal, which has been translated down by the frequency setting of the local oscillator. By appropriately tuning the local oscillator, any frequency band of interest can be centered at zero Hz.

The complex FIR low-pass filter accepts I and Q samples from the mixer. By judicious choices for coefficient values and the number of taps, it can implement a wide range of transfer functions, each with specific passband flatness, shape factor and stopband attenuation to reject unwanted signals outside the band of interest.

At the filter output, a decimation stage drops all but one of every N samples, consistent with the bandwidth reduction of the filter. This produces a complex baseband output suitable for subsequent signal processing tasks such as demodulation, decoding or storage. By suitable reordering and sign changing of the I and Q output components, a real representation of the signal is also available. A useful definition of the decimation factor is the ratio between the input sampling rate and the output bandwidth.
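The mixer, filter and decimator chain can be modeled in software as follows. This is a floating-point sketch for intuition only. The function name `downconvert` and its parameters are hypothetical, and the filter is evaluated only at the output samples that survive decimation, which gives the same result as filtering every sample and then discarding the rest.

```python
import cmath
import math


def downconvert(samples, f_tune_hz, f_s_hz, taps, decimation):
    """Translate f_tune_hz to 0 Hz, low-pass filter, and keep 1 of every N samples."""
    # Complex mixer: multiply each real input sample by the quadrature local oscillator.
    mixed = [
        x * cmath.exp(-2j * math.pi * f_tune_hz * n / f_s_hz)
        for n, x in enumerate(samples)
    ]
    out = []
    # FIR filter evaluated only at the retained (decimated) output instants.
    for n in range(len(taps) - 1, len(mixed), decimation):
        out.append(sum(taps[k] * mixed[n - k] for k in range(len(taps))))
    return out
```

As a quick check, a unit-amplitude cosine at the tuned frequency should come out as a constant complex value of 0.5, because mixing splits the real tone into a DC term and an image at twice the tuned frequency, and the filter removes the image.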

Digital Receiver Types
Digital receivers are divided into two classes, named for the relative range of output signal bandwidths: wideband and narrowband. Digital receivers with a minimum decimation factor of 32 or more generally fall into the narrowband category and are well suited to extracting voice signals with bandwidths of several kilohertz from digitized input signals with bandwidths of several tens of megahertz. In these applications decimation factors can be 10,000 or higher.

Because the complexity of the FIR low-pass filter is proportional to the decimation factor (and inversely proportional to the bandwidth), ASIC implementations of narrowband receivers usually rely on an initial CIC filter stage to perform high decimation factors without requiring hardware multipliers. Since the CIC filter produces a sloping frequency response in its passband, its output is delivered to a CFIR (compensating FIR filter), which restores an overall flat passband response. Finally, a PFIR (programmable FIR) filter is used to achieve the desired final frequency response.
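To see why a CIC stage needs no multipliers, consider this minimal integer model of a CIC decimator. The function name `cic_decimate`, the differential delay of one, and the use of unbounded Python integers are assumptions of the sketch. Hardware would use fixed-width wraparound arithmetic, and the CFIR and PFIR stages are not modeled.

```python
def cic_decimate(samples, rate, stages):
    """CIC decimator: integrators at the input rate, combs at the output rate."""
    integrators = [0] * stages
    comb_delays = [0] * stages
    out = []
    for n, x in enumerate(samples):
        # Integrator cascade: one addition per stage per input sample.
        v = x
        for i in range(stages):
            integrators[i] += v
            v = integrators[i]
        # Keep one of every `rate` samples, then run the comb cascade.
        if (n + 1) % rate == 0:
            for i in range(stages):
                # Comb: subtract the previous value seen by this stage.
                v, comb_delays[i] = v - comb_delays[i], v
            out.append(v)
    return out
```

Only additions and subtractions appear. For a constant input of 1 with a rate of 4 and three stages, the output settles at 4 cubed, or 64, which is the DC gain that later stages have to scale out.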

Market demand for narrowband applications such as wireless base stations has inspired several ASIC receiver chip offerings. However, with the migration to new wideband-CDMA wireless modulation schemes, narrowband receivers are falling short of the mark. Required bandwidths of 5, 10, and even 20 MHz are now mandated by technologies entering mainstream applications. Unfortunately, because of the CIC front end, most ASIC narrowband receivers impose a minimum overall decimation of 32. With a 100 MHz input sampling rate, this yields a useable output bandwidth of only 2.5 MHz.
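The 2.5 MHz figure can be reproduced with a short calculation. It is consistent with treating 80% of the decimated output rate as usable bandwidth. That 80% fraction is an assumption here, borrowed from the passband figure quoted for the GC1012B filter, and `usable_bandwidth_hz` is a hypothetical helper.

```python
def usable_bandwidth_hz(f_s_hz, decimation, passband_fraction=0.8):
    """Usable output bandwidth, assuming a fixed fraction of the output rate is flat."""
    return passband_fraction * f_s_hz / decimation


def max_decimation_for(bandwidth_hz, f_s_hz, passband_fraction=0.8):
    """Largest integer decimation that still delivers the requested bandwidth."""
    return int(passband_fraction * f_s_hz / bandwidth_hz)
```

Under the same assumption, a 5 MHz channel sampled at 100 MHz needs a decimation factor of 16 or less, which is below the minimum of 32 that the CIC front end imposes.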

Wideband Receivers
In order to achieve lower decimation factors, wideband receivers rely on the classical FIR filter implementation, just like the block diagram in Figure 1. However, wideband receivers require substantially more hardware than their narrowband counterparts. This is because they cannot rely on CIC filters for decimation in the first stages where sampling rates are the highest. The desired filter response can only be achieved by adding enough filter taps for undecimated input samples, and each tap of the FIR filter requires a multiply and an add operation. Since hardware multipliers consume a significant portion of silicon, they must be deployed judiciously.

To better understand these issues, it will help us to look at the Graychip GC1012B as an example of a popular ASIC wideband receiver. It accepts A/D samples at rates up to 100 MHz and offers programmable decimation factors of 2, 4, 8, 16, 32, and 64. For a fixed filter characteristic with a flat passband over 80% of the Nyquist bandwidth and a stopband attenuation of 75 dB, the FIR filter requires 40 taps for a decimation factor of two. Since the filter is complex, this design, using a brute force approach, would require 80 hardware multipliers operating in parallel.

Even after incorporating some architectural efficiency, the number of multipliers is still substantial. In contrast, a narrowband receiver with a CIC decimation filter takes ample advantage of time-sharing of the FIR multipliers to reduce their number by at least a factor of eight.

FPGA Implementation
Since wideband digital receivers are multiplier-intensive, the new generation of FPGA devices featuring dozens of dedicated hardware multipliers is an attractive platform. A design task was launched to create a general-purpose wideband receiver, similar to the Graychip GC1012B, but with enhanced performance. The FPGA design would accept data from a new monolithic 12-bit A/D converter operating at sampling rates up to 200 MHz, twice the maximum input rate supported by the GC1012B. Dynamic range performance over 80% of the Nyquist bandwidth should increase from 75 dB to 100 dB. The third major advantage would be the added capability of downloading custom FIR filter coefficients to meet some of the new, tougher wideband frequency templates.

The Xilinx Virtex-II FPGA family was chosen because of its generous mix of block memory, system gates and 18 x 18 hardware multipliers. Intellectual property (IP) cores are available for all the basic building blocks shown in Figure 1. These include a complete direct digital synthesizer (DDS) for the local oscillator and configurable FIR filter designs. The mixer is nothing more than two of the hardware multipliers.

In implementing the design, the first problem was the 200 MHz input clock requirement. The available speed-grade FPGAs offered a maximum clock of 125 MHz for the multipliers and the DDS section. The solution was to split the DDS and mixer into two identical sections, each running at 100 MHz. The output of the A/D converter is then demultiplexed into two streams to match this rate, as shown in Figure 2.


Figure 2:  FPGA-based wideband receiver implementation

Each DDS must deliver output sine and cosine samples at 100 MHz advancing by the same phase step each clock cycle. However, the output phase of one DDS must be offset by one half of this phase step to match the alternating sample sequence from the A/D converter. To accomplish this, an extra adder stage is required ahead of the sine/cosine lookup table for one DDS engine. The net result is that together, the two DDS engines generate alternate samples of an idealized 200 MHz DDS local oscillator. This arrangement preserves phase-continuous frequency switching for complex FSK or sweep sequences.
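The interleaving argument can be checked numerically. In the sketch below, phase is expressed in cycles, the 30 MHz tuning frequency is an arbitrary example value, and the function `dds` is a floating-point stand-in for the accumulator and lookup table rather than the IP core itself.

```python
import cmath
import math


def dds(phase_step, start_phase, n):
    """Complex oscillator samples; phase values are in cycles (1.0 = 360 degrees)."""
    return [
        cmath.exp(2j * math.pi * ((start_phase + k * phase_step) % 1.0))
        for k in range(n)
    ]


f_lo, f_s = 30e6, 200e6          # example tuning frequency and full sample rate
step_200 = f_lo / f_s            # phase step of the idealized 200 MHz oscillator
step_100 = 2 * step_200          # each 100 MHz engine advances twice as far per clock

ideal = dds(step_200, 0.0, 16)
even = dds(step_100, 0.0, 8)                 # serves A/D samples 0, 2, 4, ...
odd = dds(step_100, step_100 / 2, 8)         # offset by half of the 100 MHz phase step
interleaved = [s for pair in zip(even, odd) for s in pair]
```

Comparing `interleaved` with `ideal` sample by sample shows the two sequences agree to within floating-point error, which is the property the extra adder stage is there to guarantee.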

The FIR filter is also split into two complex FIR filters, one for each mixer output. Each filter section receives half the coefficients and calculates the taps assigned to the alternate sample stream it receives. The two filter outputs are added in an output combining stage to produce the final complex output.
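One way to read this split is as a two-phase polyphase decomposition of a decimate-by-two FIR filter. The sketch below, using real integer data and hypothetical function names, shows that two half-length filters fed with alternate samples and summed produce the same output as the full filter followed by decimation. Whether the actual design orders its coefficients this way is an assumption.

```python
def fir_at(h, x, n):
    """Direct-form FIR output at index n; samples outside x are treated as zero."""
    return sum(h[k] * x[n - k] for k in range(len(h)) if 0 <= n - k < len(x))


def decimate_by_2_direct(h, x):
    """Full-rate filter evaluated at every second input sample."""
    return [fir_at(h, x, 2 * m) for m in range(len(x) // 2)]


def decimate_by_2_split(h, x):
    """Two half-rate filters, one per demultiplexed stream, then a combining adder."""
    x_even, x_odd = x[0::2], x[1::2]     # the two demultiplexed sample streams
    h_even, h_odd = h[0::2], h[1::2]     # each section gets half the coefficients
    out = []
    for m in range(len(x) // 2):
        section_a = fir_at(h_even, x_even, m)
        section_b = fir_at(h_odd, x_odd, m - 1)   # odd stream lags by one half-rate sample
        out.append(section_a + section_b)
    return out
```

Each section runs at half the input rate, which is what lets the filter keep up with a 200 MHz sample stream using logic clocked at 100 MHz.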

The 100 dB out-of-band rejection specification of the filter requires 56 taps for a decimate-by-two design. If all these were implemented with dedicated multipliers, 112 multipliers would be required to handle the complex signals. Fortunately, two strategies bring this number down to a more reasonable count.

By taking advantage of symmetrical filter coefficients, two input samples can be added before the multiplication, saving a factor of two. Also, since the output rate is one-half the input rate and since the multiplier operates at the input clock rate, one multiplier can be shared to calculate two taps. This further reduces the total number of dedicated hardware multipliers to 28.
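Both savings can be illustrated briefly. The first two functions show the pre-add trick for a symmetric filter, and `multiplier_count` simply reproduces the arithmetic from the text. All names are hypothetical, and the example is a sketch of the idea rather than the FPGA implementation.

```python
def fir_direct(h, window):
    """One multiply per tap."""
    return sum(c * s for c, s in zip(h, window))


def fir_folded(h, window):
    """Symmetric FIR: add the two samples that share a coefficient, then multiply once."""
    n = len(h)
    half = n // 2
    acc = sum(h[k] * (window[k] + window[n - 1 - k]) for k in range(half))
    if n % 2:                       # odd length: the center tap has no partner
        acc += h[half] * window[half]
    return acc


def multiplier_count(taps, complex_rails=2, symmetry=2, taps_per_multiplier=2):
    """Hardware multipliers needed after applying the two reduction factors."""
    return taps * complex_rails // (symmetry * taps_per_multiplier)
```

For the 56-tap filter, `multiplier_count(56, symmetry=1, taps_per_multiplier=1)` gives the brute-force figure of 112, and `multiplier_count(56)` gives 28.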

To handle the other required decimation factors of 4, 8, 16, 32, and 64 with equivalent filter performance, the number of filter taps approximately doubles with each step. Since the output rate is also reduced by decimation, the extra time between output samples allows the multipliers to be time-shared to compute these additional taps. This way, the 28 hardware multipliers can handle all the decimation factors involved in the design of a wideband receiver.
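A simple piece of bookkeeping shows why the multiplier count can stay fixed. The model below assumes the tap count doubles exactly at each step (the text says approximately) and that each multiplier handles two taps per output at decimate-by-two, as stated earlier. The constants and function names are illustrative.

```python
BASE_TAPS = 56          # taps for the decimate-by-two filter
BASE_DECIMATION = 2
MULTIPLIERS = 28        # fixed pool of hardware multipliers


def taps_for(decimation):
    """Filter length, assuming taps scale in proportion to the decimation factor."""
    return BASE_TAPS * decimation // BASE_DECIMATION


def taps_per_multiplier(decimation):
    """Taps each multiplier must compute per output sample."""
    # Complex data doubles the products; coefficient symmetry halves them again.
    products_per_output = taps_for(decimation) * 2 // 2
    return products_per_output // MULTIPLIERS


for d in (2, 4, 8, 16, 32, 64):
    print(d, taps_for(d), taps_per_multiplier(d))
```

In this model the work per multiplier grows in step with the decimation factor, and so does the time available between output samples, so the load per unit time stays constant.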

One conventional approach for implementing the delay line for the FIR is to use registers within the logic slices. For the decimate-by-64 mode, however, the number of filter taps is 1792, which would make for extremely inefficient utilization of the slices. Instead, the delay line is constructed from block RAM plus suitable addressing engines. As input samples enter the RAM, they are stored in a circular block with the newest sample replacing the oldest sample. The size of the block is adjusted to the number of taps for each decimation factor. Since this RAM is dual-ported, an output-addressing engine can efficiently pick the pairs of samples required to take advantage of the symmetrical filter coefficients.
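The circular delay line and its pair-fetching read side can be modeled as below. The class `CircularDelayLine` and its method names are hypothetical, and a Python list stands in for the block RAM. The real addressing engines are not described in enough detail to reproduce, so this is only a behavioral sketch.

```python
class CircularDelayLine:
    """FIR delay line held in a fixed-size memory with a wrapping write pointer."""

    def __init__(self, n_taps):
        self.n = n_taps
        self.ram = [0] * n_taps
        self.write_addr = 0          # points at the oldest sample, overwritten next

    def push(self, sample):
        """Write port: the newest sample replaces the oldest."""
        self.ram[self.write_addr] = sample
        self.write_addr = (self.write_addr + 1) % self.n

    def tap(self, k):
        """Read port: the sample delayed by k clocks (k = 0 is the newest)."""
        return self.ram[(self.write_addr - 1 - k) % self.n]

    def symmetric_pairs(self):
        """Sample pairs that share a coefficient in a symmetric filter."""
        return [(self.tap(k), self.tap(self.n - 1 - k)) for k in range(self.n // 2)]
```

No data ever moves inside the memory. Only the addresses change, which is why the same structure scales from 56 taps to 1792 by resizing the block.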

Since all the math is performed with fixed-point engines, great care must be taken in scaling, rounding and defining word lengths. Although designed to work with a 12-bit A/D converter, provisions are made for 16-bit input samples to support other sources that can take full advantage of the dynamic range of the receiver. The mixer multipliers also accept 18-bit sine/cosine samples from the DDS and the outputs are rounded to 17 bits using a bias-free algorithm. When two of these 17-bit samples from the delay RAM are added, the 18-bit result matches the input of the tap multiplier. The filter accumulators are 42 bits wide to avoid overflow for intermediate results even though the final sum of products requires far fewer bits.
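The text does not say which bias-free rounding algorithm is used. One common choice is round-half-to-even (convergent rounding), sketched here as an assumption, along with a helper that checks the claim that the sum of two 17-bit samples fits the 18-bit multiplier input. Both function names are hypothetical.

```python
def round_half_even(value, shift):
    """Drop `shift` low bits (shift >= 1) from a signed integer, ties go to even."""
    floor = value >> shift                   # arithmetic shift rounds toward minus infinity
    remainder = value - (floor << shift)     # always in the range 0 .. 2**shift - 1
    half = 1 << (shift - 1)
    if remainder > half or (remainder == half and (floor & 1)):
        floor += 1
    return floor


def fits_signed(value, bits):
    """True if value is representable as a two's complement number of this width."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1
```

Rounding every value from 0 to 7 down by two bits with this function leaves the total unchanged after rescaling, whereas always rounding ties upward would add a small positive offset that accumulates as a DC error at the filter output.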

Summary
For heavily dedicated applications with only one decimation factor and fixed filter coefficients, many of the programmable features of this general-purpose design can be eliminated.

In general, FPGA-based digital receivers offer unprecedented flexibility in filter characteristics, dynamic range, sampling rates, and frequency switching features. They can support the demands of the wideband communications standards emerging now, as well as those that may follow.


About the Author
Rodger Hosking is Vice President and Co-Founder of Pentek. Rodger is currently responsible for new product definition, marketing and sales activities, and strategic alliances with third-party hardware and software vendors. He is an accredited speaker with over 25 years of experience in the electronics industry.

Mr. Hosking served as Engineering Manager at Wavetek and Rockland Systems. He designed the first commercial direct digital frequency synthesizer in 1971 and holds patents in frequency synthesis and FFT spectrum analysis techniques.

This article was first published in The Pentek Pipeline, Fall 2002, Vol. 11 No. 3.




