Digital Filtering.

Objectives
After completing this module, you will be able to:
- State the various filters supported in System Generator
- Describe the FIR filter implementation and understand how to take advantage of certain filter parameters
- Describe the integration of the FDATool block in System Generator

Outline
- Introduction
- FIR Filters
  - DA Filters
  - MAC FIR Filters
- CIC Filters
- IIR Filters
- Generating Coefficients
  - FDATool and XFLDATool
  - Exporting Coefficients
- Simulink Tips and Tricks
  - Plotting Functions
  - Spectrum Scope
  - White Noise

Introduction
- Digital filters are among the most common functions found in DSP systems
- System Generator supports the following blocks for digital filtering:
  - FIR block
  - CIC block
- The digital filtering technique to use depends on several factors:
  - Sample rate
  - Sample width
  - Coefficient profile
  - Clock rate

DA FIR Filter Block
- Implements single-channel and multi-channel, single-rate and multi-rate finite impulse response (FIR) digital filters
- In multi-channel mode, a single FIR filter is time-division multiplexed, so resource utilization stays virtually the same at the expense of a reduced sample rate
- Provides optional ports for coefficient reloading
(Figure: the block shown with the Reloading option and with the Fixed option.)

The DA FIR filter provides optional ports for coefficient reloading. When a reload sequence is initiated, the filter stops accepting new input data samples and begins accepting new filter coefficients. Once all of the new coefficients have been written, the filter processes the coefficients and initializes the necessary internal data structures. The time required for the filter to reload is a function of the filter length and type. After the reload sequence has completed, the filter comes back online and continues to accept new input data samples. For more information about the reload sequence and filter reload time, refer to the FIR core datasheet. When the coefficient reloading option is used, the coefficient structure cannot be set to "Inferred from coefficients".

DA FIR Filter
Supported features:
- Parameterizable coefficient width
- Two to 1024 taps
- One to eight channels
- Polyphase interpolation/decimation
- Optimization for symmetric, half-band, and interpolated filters
- Parallel, serial, and multi-clock-per-output implementations

An interpolated FIR (IFIR) filter has an architecture similar to a conventional FIR filter, but with each unit delay replaced by k units of delay, that is, k-1 additional delays between taps. k is referred to as the zero-packing factor. The effect is functionally equivalent to inserting k-1 zeros between the coefficients of a prototype filter coefficient set. Thus, with a zero-packing factor of 3, two zeros are added between each pair of coefficients, so two extra units of delay are inserted. A factor of up to 8 is possible. Interpolated filters are useful for realizing efficient implementations of both narrow-band and wide-band filters.

In a polyphase interpolator, each of the polyphase segments operates at the low input sample rate (compared to the high output sample rate), enabling serial techniques to be exploited. The output of the filter has to run at the fastest sample rate (input sample rate × interpolation factor). The polyphase interpolator supports single-channel operation only.

Similarly, decimation discards M-1 out of every M samples. This is achieved through polyphase filters. Each of the polyphase segments operates at the low output sample rate (compared to the high input sample rate), enabling serial DA techniques to be exploited for each segment. The subfilters, all operating in parallel, are employed in the filter architecture. The polyphase decimator supports single-channel operation only.

The Hardware Over-Sampling Rate indicates the number of hardware clocks per sample. For example, a hardware over-sampling rate of 4 means that four clock cycles are used per sample. This affects the hardware implementation only and has no effect on simulation. In multi-channel mode, the number of channels multiplies the implicit over-sampling factor. Thus a 2-channel filter with a hardware over-sampling rate of 4 uses eight clocks per sample period.
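The zero-packing and over-sampling arithmetic described above can be illustrated with a small sketch. The function names `zero_pack` and `clocks_per_sample_period` are hypothetical helpers written for this note, not part of System Generator or the FIR core.

```python
def zero_pack(prototype, k):
    """Insert k-1 zeros between consecutive prototype coefficients (IFIR)."""
    if k < 1:
        raise ValueError("zero-packing factor k must be >= 1")
    packed = []
    for i, c in enumerate(prototype):
        packed.append(c)
        if i < len(prototype) - 1:
            packed.extend([0] * (k - 1))  # k-1 zeros between neighbours
    return packed


def clocks_per_sample_period(hw_oversampling, channels=1):
    """In multi-channel mode the channel count multiplies the over-sampling."""
    return hw_oversampling * channels


print(zero_pack([1, 2, 3], 3))                  # [1, 0, 0, 2, 0, 0, 3]
print(clocks_per_sample_period(4, channels=2))  # 8
```

With k = 3 each prototype coefficient is followed by two zeros, which matches the "two zeros between each pair of coefficients" case in the notes.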

Twos Complement Serial Multiply
One bit at a time:
- For every multiplier bit except the last, the sign-extended multiplicand (-127 in this example) is added to the partial sum when that multiplier bit is 1
- For the last multiplier bit (the sign bit), the multiplicand is subtracted instead, which handles negative multipliers
- For each multiplier bit, one bit of the product is determined and output
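The add-then-subtract rule above can be checked with a short software model. This is only a sketch under the assumption that the multiplier is presented LSB first; hardware would use a scaling (right-shifting) accumulator, whereas this model applies the bit weight with a left shift, which gives the same product. The name `serial_multiply` is hypothetical.

```python
def serial_multiply(multiplicand, multiplier, bits):
    """Bit-serial two's complement multiply; multiplier is `bits` wide."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= multiplier <= hi:
        raise ValueError("multiplier does not fit in the given width")
    pattern = multiplier & ((1 << bits) - 1)  # two's complement bit pattern
    acc = 0
    for i in range(bits):
        if (pattern >> i) & 1:
            if i == bits - 1:
                acc -= multiplicand << i  # sign bit carries negative weight
            else:
                acc += multiplicand << i
    return acc


print(serial_multiply(-127, 5, 8))   # -635
print(serial_multiply(-127, -3, 8))  # 381
```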

SDA 1-Tap FIR Filter
(Figure: the N-bit-wide sample data X0 passes through a parallel-to-serial converter; its 1-bit output A0 addresses the partial product ROM, which feeds a scaling accumulator built from an adder/subtractor (+/-) and a register (Z^-1). The LUT contains two locations: 0 and C0.)

Distributed arithmetic is based on saving partial products in memories. Because the coefficients are known ahead of time, it is possible to pre-calculate the result of a multiplication. This example is a 1-tap FIR filter. The result of the multiplication by one data bit is either 0 × coefficient or 1 × coefficient. Hence the LUT, used in ROM mode, is initialized with 0 at location 0 and C0 at location 1. The following slides take this further for 2 taps.

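Distributed Arithmetic for a 2-Tap Filter
(Serial-Data / Tap-Parallel Multiply)
- Partial products of equal weight are added together before being summed with the partial products of the next higher weight
- Create a look-up table of the summed partial products

Worked example with C0 = (-7), C1 = (6), X0 = (7), X1 = (5):
  Direct form: C0 × X0 + C1 × X1 = (-49) + (30) = (-19)
  Bit 0 (weight 1): both data bits are 1, so (C0 + C1) × 1 = (-1)
  Bit 1 (weight 2): only the X0 bit is 1, so C0 × 2 = (-14)
  Bit 2 (weight 4): both data bits are 1, so (C0 + C1) × 4 = (-4)
  Bit 3 (sign bit): both data bits are 0, so the term is (0)
  Sum, with sign extension of each partial product: (-1) + (-14) + (-4) + (0) = (-19)

This basically involves changing the order of the computations: calculate the partial products formed by multiplying bit 0 of the data by the first coefficient and by the second coefficient, add them together, and repeat for each higher bit weight.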

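SDA 2-Tap FIR Filter
(Figure: the N-bit-wide sample data X0 and X1 each pass through a parallel-to-serial converter; their 1-bit outputs A0 and A1 address the partial product ROM, which feeds the scaling accumulator (+/-, Z^-1).)

This is the 2-tap version, showing the partial products at the ROM output. The LUT contains all possible sums of the partial products:
  Address (A1 A0) = 00: 0
  Address (A1 A0) = 01: C0
  Address (A1 A0) = 10: C1
  Address (A1 A0) = 11: C0 + C1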

SDA 4-Tap FIR Filter
(Figure: N-bit-wide sample data X0 to X3 produce the 1-bit addresses A0 to A3; each tap has a partial product ROM holding 0000...0 and its coefficient C0 to C3, and the ROM outputs are summed by adders before the scaling accumulator (+/-, Z^-1).)

This is the 4-tap version, drawn with four ROMs, each using two locations to store the 0 value and its coefficient. Because the LUT has four inputs, the four ROMs and their adders are instead pre-programmed into a single 16x1 ROM, with the four address bits provided by the outputs of the parallel-to-serial converters.
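As a rough software model of the look-up-table idea above, the sketch below builds the table of summed partial products and then accumulates one table output per data bit, subtracting on the sign bit. It assumes two's complement samples processed LSB first, with address bit t taken from tap t; the names `build_da_lut` and `da_inner_product` are hypothetical and do not correspond to any Xilinx API.

```python
def build_da_lut(coeffs):
    """LUT[address] = sum of the coefficients whose address bit is 1."""
    taps = len(coeffs)
    return [
        sum(c for t, c in enumerate(coeffs) if (addr >> t) & 1)
        for addr in range(1 << taps)
    ]


def da_inner_product(coeffs, samples, bits):
    """Serial distributed-arithmetic sum of products for `bits`-wide samples."""
    lut = build_da_lut(coeffs)
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        addr = 0
        for t, x in enumerate(samples):
            addr |= (((x & mask) >> b) & 1) << t  # bit b of tap t
        partial = lut[addr]
        if b == bits - 1:
            acc -= partial << b  # sign bit: subtract
        else:
            acc += partial << b
    return acc


print(build_da_lut([-7, 6]))                   # [0, -7, 6, -1]
print(da_inner_product([-7, 6], [7, 5], 4))    # -19
print(len(build_da_lut([1, -2, 3, 4])))        # 16
```

The 2-tap call reproduces the (-19) result of the worked example, and a 4-tap coefficient set yields the 16-entry table that fits a 4-input LUT.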

SDA 8-Tap FIR Filter
(Figure: N-bit-wide sample data X0 to X7. Taps X0 to X3 drive address bits A0 to A3 of one partial product ROM and taps X4 to X7 drive A0 to A3 of a second; a pre-adder sums the two ROM outputs ahead of the scaling accumulator (+/-, Z^-1). Each 4-input LUT contains all possible sums of the partial products.)

Because the FPGA look-up tables have four inputs, taps are grouped by four in order to efficiently address the LUTs preloaded with partial products. Based on the block diagram, you can imagine that there may be an advantage in using a multiple of 4 taps to make full use of the distributed memory. More design tricks are covered in the DSP Implementation Techniques course.

Xilinx DA FIR Performance
(Figure, two panels plotted against filter length in taps. Left: sample rate (MSPS) for a single MAC and for serial FPGA DA FIR filters with B = 8, 12, and 16. Right: performance (MMACs/s) for a dual MAC and for serial FPGA DA FIR filters with B = 8, 12, and 16. fclk = 200 MHz for both processor and FPGA; B = data sample precision for the FPGA.)

As the number of taps increases, the sample rate of a MAC-based filter falls in inverse proportion to the number of taps, whereas a serial DA-based FIR filter keeps a constant sample rate independent of the number of taps. For the DA FIR filter, the sample rate depends on the sample size: as B increases, the sample rate decreases. Note that the hardware resources are a function of both the sample size and the number of taps. In the right-hand figure, performance is given in terms of mega MACs per slice.

Exploiting Filter Symmetry
- The impulse response often possesses symmetry (positive or negative symmetry)
- Symmetry is exploited to produce an efficient FPGA implementation
- Uses half the number of multipliers, giving a large size reduction
- The number of clock cycles increases by 1 because of the pre-adder
(Figure: symmetric FIR implementation for an odd number of coefficients. The input x(n) feeds a chain of eight Z^-1 delays; pre-adders sum the mirrored tap pairs, the results are multiplied by coefficients a0 to a4, and the products are summed to form y(n).)

The impulse response of many filters possesses significant symmetry. This symmetry can be exploited to minimize arithmetic requirements and produce area-efficient filter realizations. Instead of implementing the filter using the architecture shown earlier, a more efficient signal flow graph can be used. In general, the former approach requires N multiplications and (N-1) additions. In contrast, the symmetric architecture requires only (N/2) multiplications and approximately N additions. This significant reduction in computation workload can be exploited to generate efficient filter hardware implementations.

The reduction in size comes at the expense of an added clock cycle. The extra clock cycle is due to the bit growth of the pre-adder before multiplication, which needs to be included in the calculation.

The filter compiler interface allows the filter symmetry to be specified. When the impulse response does exhibit symmetry, the filter logic requirements can be significantly reduced in comparison to an implementation that does not exploit the impulse response structure. For example, a 100-tap non-symmetric filter with 12-bit data samples and 12-bit coefficients consumes 519 Virtex™ logic slices. In contrast, a 100-tap symmetric filter is realized with 354 slices. This represents approximately a 30 percent savings in area.
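The saving from the pre-adder can be seen in a few lines of code. The sketch below compares a direct sum of products with a folded version that adds mirrored samples before each multiply; it assumes positive symmetry and uses hypothetical function names.

```python
def fir_direct(coeffs, window):
    """N multiplications: one per tap."""
    return sum(c * x for c, x in zip(coeffs, window))


def fir_symmetric(half_coeffs, window):
    """About N/2 multiplications: pre-add mirrored samples, then multiply.

    half_coeffs holds the first ceil(N/2) coefficients of a symmetric
    impulse response; window holds the N most recent samples.
    """
    n = len(window)
    acc = 0
    for i in range(n // 2):
        acc += half_coeffs[i] * (window[i] + window[n - 1 - i])  # pre-adder
    if n % 2:
        acc += half_coeffs[n // 2] * window[n // 2]  # center tap, odd N
    return acc


coeffs = [1, 2, 3, 4, 5, 4, 3, 2, 1]   # 9 taps, a0..a4 mirrored
window = [3, -1, 4, 1, -5, 9, 2, -6, 5]
print(fir_direct(coeffs, window))          # 27
print(fir_symmetric(coeffs[:5], window))   # 27
```

The pre-adder output is one bit wider than the samples, which is the bit growth the notes associate with the extra clock cycle in a serial implementation.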

Half-Band FIR
- Odd number of coefficients (every other coefficient is zero)
- Half-band implementation (for an odd number of coefficients)
(Figure: coefficient value plotted against coefficient index for an 11-tap half-band filter, and the corresponding structure in which x(n) feeds a chain of ten Z^-1 delays and only the taps a0, a2, a4, a5, a6, a8, and a10 are summed to form y(n).)

Quite commonly, FIR filters possess what is known as a half-band structure. The magnitude frequency response of the half-band filter is symmetrical about the quarter sample frequency, π/2 radians, where the sample rate is normalized to 2π radians/sec. The passband and stopband frequencies are positioned such that the passband and stopband ripples are equal. These properties are reflected in the filter impulse response. It can be shown that approximately half of the filter coefficients are zero for an odd number of taps. This is illustrated in the diagrams for an 11-tap half-band filter. The same structure can be utilized to generate an efficient DA FIR FPGA implementation. The Half-Band filter selection in the core is intended for this purpose. It is available in the Coefficient Structure field of the FIR block's dialog box. The user must supply the complete list of filter coefficients, including the zero-valued samples, when using the half-band filter.

Multi-channel FIR
- Up to 8 channels
- Output sample rate = FIR sample rate / number of channels
- The sample rate is reduced as more channels are processed
(Figure: N-bit sample data from channels Ch 1 to Ch n is interleaved into one FIR filter that is K taps long with K sums; each tap, with coefficients C0, C1, C2, ..., holds the samples Chan 1 - X0, Chan 2 - X0, ..., Chan n - X0, and the summed output is separated back into Ch 1 to Ch n. For a single channel, output sample rate = FIR sample rate.)

There is often a requirement to filter multiple signal streams using a common filter coefficient set. An example of this requirement is in systems that process complex data. The most straightforward method of addressing this problem is to employ a separate filter for each data stream. An alternative approach is to construct a multi-channel filter that shares common resources between the signal sets. The objective is, of course, to produce an implementation that is more FPGA resource-efficient than employing a number of separate filters.

A multi-channel filter implementation is very efficient in terms of the logic resources utilized. A filter with two or more channels can be realized using virtually the same amount of logic as a single-channel version of the same filter. The tradeoff that needs to be addressed when employing multi-channel filters is sample rate versus logic requirements. As the number of channels is increased, the sample rate for an individual input stream decreases, because the same number of clock cycles is shared among more channels. For example, four channels have a sample rate of clock-rate/(4 × number of taps), compared to clock-rate/(number of taps) for a single channel. The logic area remains approximately constant, as only a single filter is implemented.

Interpolated FIR Applications
- Interpolated filter and image rejection filter
- Interpolated FIR implementation for narrow-band filters
- Type 'InterpolatedFIR' to view the example
(Figure: x(n) passes through the interpolated filter M(z^k) and then the image rejection filter I(z) to produce y(n). In the interpolated filter structure, x(n) feeds a chain of z^-k delays whose taps are weighted by the coefficients a0, a1, a2, a3, ... and summed to form y(n).)

The interpolated FIR (IFIR) structure is shown in the figure. A detailed description of the architecture can be found in: C. H. Dick, "Implementing Area Optimized Narrow-Band FIR Filters Using Xilinx FPGAs," SPIE International Symposium on Voice, Video and Data Communications - Configurable Computing: Technology and Applications Stream, Boston, Massachusetts USA, 1-6 Nov.

This is a clever technique for implementing a narrow-band filter, where the bandwidth of the frequency you are interested in is small compared to the sample rate. In this case, instead of forcing zeros on the data side as in polyphase interpolation, we force zeros in the coefficient set. Also note that the filter is a single-rate filter and that an image rejection filter is also required. The combined resource utilization of these two filters is a great reduction over a single non-interpolated filter. There is a System Generator example of an interpolated FIR filter in the lab 4 directory. It is called interpolatedFIR.mdl.

Trade Clock Cycles for Logic Area
(Figure: an 8-bit sample, b7 to b0, processed at rates from 20 Ms/s (serial DA) to 160 Ms/s (parallel DA, multiple bits per clock cycle).)
- Hardware over-sampling = 8: the sample is serialized and processed 1 bit per clock cycle, so 8 clock cycles are required to process the whole sample
- Hardware over-sampling = 4: the sample is serialized and processed 2 bits per clock cycle, so 4 clock cycles are required to process the whole sample
- Hardware over-sampling = 2: the sample is serialized and processed 4 bits per clock cycle
- Hardware over-sampling = 1: the sample is processed in parallel, 8 bits per clock cycle

Processing the data serially, one bit at a time, can result in slow computation rates. When the input variables are B bits in length, B clock cycles are required to complete an inner-product calculation. Additional speed may be obtained in several ways. One approach is to partition the input words into L subwords and process these subwords in parallel. This method requires L times as many memory look-up tables, and so comes at the cost of a linear increase in storage requirements. Maximum speed is achieved by factoring the input variables into single-bit subwords. With this factoring, a new output sample is computed on each clock cycle. This factoring results in a fully parallel DA FIR (PDAFIR) architecture.

Filter Throughput
The filter sample rate is a function of:
- Clock frequency, fclk
- Input data sample width, B
- Hardware over-sampling rate
- Coefficient symmetry and number of channels

One-channel FIR filter:
- Over-sampling = fclk / fs
- Fully serial implementation, non-symmetric: hardware over-sampling rate = B
- Fully serial implementation, symmetric: hardware over-sampling rate = B + 1

Calculating the required filter throughput up front helps in deciding which implementation should be used.
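These relationships are simple enough to tabulate in a few lines. The sketch below is an illustration only; the function names are hypothetical, and the channel factor follows the earlier note that the number of channels multiplies the over-sampling factor.

```python
def hardware_oversampling(fclk_hz, fs_hz):
    """Over-sampling = fclk / fs (clock cycles available per sample)."""
    return fclk_hz / fs_hz


def serial_clocks_per_sample(data_bits, symmetric=False):
    """Fully serial DA FIR: B clocks, or B + 1 when symmetry is exploited."""
    return data_bits + 1 if symmetric else data_bits


def max_serial_sample_rate(fclk_hz, data_bits, symmetric=False, channels=1):
    """Highest per-channel sample rate of a fully serial implementation."""
    return fclk_hz / (serial_clocks_per_sample(data_bits, symmetric) * channels)


print(hardware_oversampling(200e6, 50e6))              # 4.0
print(serial_clocks_per_sample(8))                     # 8
print(serial_clocks_per_sample(8, symmetric=True))     # 9
print(max_serial_sample_rate(200e6, 8))                # 25000000.0
```

If the required over-sampling (fclk / fs) is smaller than the number of clocks a fully serial filter needs, a more parallel implementation has to be chosen.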

Questions
1. How many clock cycles per input are required for a fully parallel 12-bit data, 20-tap symmetric filter?
2. The requirement for a filter is to run at 25 MSPS. A 100-MHz system clock is available on the board. What should the hardware over-sampling rate parameter be set to for 8-bit data?
3. How many clock cycles per input are necessary to process serially an 11-bit data, 31-tap symmetric filter?

MAC Filter
The output of the filter at time n is
$y_n = \sum_{i=0}^{N-1} x_{n-i} h_i$
- Can be implemented using a single multiply-and-accumulate (MAC) engine
- Can also be implemented using N MACs (parallel technique), or
- Using between 1 and N MACs (MAC farm technique)
- Samples can be stored in distributed RAM, block RAM, or SRL16E
- Embedded multipliers may be considered for this architecture
- There is no high-level block in System Generator
  - A customizable core is available
  - A reference design is available in the Reference Blockset
(Figure: MAC engine implementation. The sample memory (92 × 8) is a cyclic RAM buffer with depth = number of taps and width = sample size; the coefficient memory is 92 × 12. A full multiplier (sample width × maximum coefficient width) feeds an accumulator whose width depends on the number of taps, and a simple register captures the final result. Bus widths marked in the figure: 8, 12, 20, 26.)

The MAC FIR engine is composed of five main sections: input buffer, data storage, coefficient storage, a multiply-accumulate unit, and control logic.

Input buffer: When a multichannel filter is defined, an input FIFO is automatically added so that input samples can be supplied on sequential clocks, which is required for cascading multichannel filters. An input FIFO is also added for polyphase decimation filters so that the filter can process the incoming data in the most efficient manner.

Data storage: Under user control, either distributed SelectRAM (distributed memory) or, when available, block SelectRAM (block memory) can be selected to store the filter's data samples. Block memory is the default storage type. Selecting distributed SelectRAM memory may be advantageous in two cases: the device block memory resources are not available for use by the filter, or the number of data values is small and the distributed RAM resources required would therefore be minimal.

Coefficient storage: The same criteria as for data storage apply.

Multiply-and-accumulate engine: The multiplier is automatically selected as either a LUT-based multiplier or, when available, a dedicated embedded multiplier (e.g., for Virtex-II Pro™ or Spartan™-3 devices). Full-precision results from the multiplier are summed in the accumulator. The accumulator has sufficient precision to ensure that no overflow can occur during the sum-of-products operation. The final summation is presented on the output port and can optionally be registered so that the output value remains static during successive computation periods.

Control logic: The control logic manages the flow of data in and out of the input data buffer, the flow of data in and out of the data storage memory, the reading of coefficients, and the control signals to the MAC.
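A behavioral model helps make the single-MAC organization concrete. The sketch below is an assumption-laden illustration, not the core's implementation: the class name `MacFir` is hypothetical, the sample memory is modeled as a cyclic buffer with depth equal to the number of taps, and each pass of the inner loop stands for one multiply-accumulate clock.

```python
class MacFir:
    """Single-MAC FIR model: one multiply-accumulate per inner-loop pass."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)              # coefficient storage
        self.samples = [0] * len(self.coeffs)   # cyclic sample memory
        self.write_ptr = 0

    def step(self, x):
        taps = len(self.coeffs)
        self.samples[self.write_ptr] = x        # newest sample overwrites oldest
        acc = 0                                  # accumulator cleared per output
        addr = self.write_ptr
        for h in self.coeffs:                    # h0*x[n] + h1*x[n-1] + ...
            acc += h * self.samples[addr]
            addr = (addr - 1) % taps             # walk back through history
        self.write_ptr = (self.write_ptr + 1) % taps
        return acc                               # captured final result


fir = MacFir([1, 2, 3])
print([fir.step(x) for x in [1, 0, 0, 0]])  # [1, 2, 3, 0]
```

Feeding an impulse returns the coefficients in order, which is a quick sanity check for any FIR model.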

MAC FIR Core
- The Xilinx MAC FIR core implements a highly configurable, high-performance, and area-efficient FIR filter
- Single-rate filters, polyphase decimators, and polyphase interpolators are supported
- Multiple data channel operation is supported for all filter types
- Symmetry in the coefficient set is exploited for single-MAC implementations to increase overall performance and minimize resource utilization
- Data paths provide full-precision arithmetic to avoid overflow

The MAC FIR core supports 1 to 32 channels. Coefficient symmetry is exploited for higher performance and a more compact implementation. The core is capable of using multiple MACs to implement a filter. It uses one or more time-shared multiply-accumulate (MAC) functional units to service the N sum-of-product calculations in the filter, and automatically determines the minimum number of MAC engines required to meet the user-specified throughput. The number of multiplications can be reduced by first summing terms in the regressor vector that ultimately engage the same filter coefficient value. This reduces the computation workload by almost a factor of two.

The MAC FIR v3.0 core automatically generates an implementation that meets the user-defined throughput requirements based on the system clock rate, sample rate, number of taps and channels, and rate change. The core inserts one or more MAC engines to meet the overall throughput requirements. The number of MAC engines required for a filter is determined by computing the number of clocks available to process each input sample (A) and then dividing the number of multiplies required to perform the computation (B) by A.

The number of clocks available to process each input sample is:
  A = ceiling((System Clock Rate × Decimating Rate Change) / (Input Sample Rate × Number of Channels × Interpolating Rate Change))
The number of multiplies is:
  B = Number of Taps (for single-rate and decimating filters)
  B = Number of Taps / Interpolating Rate Change (for interpolating filters)
Thus:
  Number of MAC Engines = ceiling(B / A)
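The A, B, and ceiling(B / A) calculation above translates directly into code. The sketch below simply evaluates those formulas as written in the notes; the function and parameter names are hypothetical, and the clock and sample rates in the example are illustrative values, not figures from the core documentation.

```python
import math


def clocks_per_input(clk_hz, fs_in_hz, channels=1, decim=1, interp=1):
    """A: clocks available to process each input sample."""
    return math.ceil((clk_hz * decim) / (fs_in_hz * channels * interp))


def multiplies_required(taps, interp=1):
    """B: taps for single-rate and decimating filters, taps/interp otherwise."""
    return taps / interp


def mac_engines(taps, clk_hz, fs_in_hz, channels=1, decim=1, interp=1):
    a = clocks_per_input(clk_hz, fs_in_hz, channels, decim, interp)
    b = multiplies_required(taps, interp)
    return math.ceil(b / a)


# Illustrative numbers: 92 taps, 100 MHz clock, 5 MHz input sample rate.
print(clocks_per_input(100_000_000, 5_000_000))              # 20
print(mac_engines(92, 100_000_000, 5_000_000))               # 5
print(mac_engines(92, 100_000_000, 5_000_000, channels=2))   # 10
```

Doubling the channel count halves the clocks available per sample, so the number of engines roughly doubles for the same filter.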

Transpose FIR Structure
(Figure: the normal FIR structure and the transpose FIR structure side by side. In the normal form, s(n) feeds a delay line whose taps are multiplied by k0, k1, k2, k3 and summed to give y(n). In the transpose form, s(n) is multiplied by all coefficients at once, drawn in the order k3, k2, k1, k0, and the products enter a registered adder chain that starts from '0' and ends at y(n). Note: coefficient order reversed.)
- The normal FIR filter structure can be implemented using the MAC core in a parallel technique
  - In this case the samples are stored in registers
  - There may be implementation issues as the number of taps increases, due to the adder tree
- The transpose FIR filter structure uses an adder chain
  - There is no high-level block in System Generator; it is implemented using basic elements

As mentioned, in order to deal with higher sample rates the MAC FIR can be implemented in a parallel format, as shown in the figure, with one multiplier per tap. The problem occurs as the number of taps increases: the adder tree becomes larger, which can affect the layout and hence the maximum speed at which the filter can run. The transpose FIR structure performs exactly the same function mathematically as the normal FIR filter structure, but reorganizes the order of processing into a new structure that can be much better suited to hardware implementation.
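The claim that the two structures are mathematically identical can be checked with a small model. The sketch below is an illustration with hypothetical function names: the direct form keeps a delay line of samples and sums all products at once, while the transposed form multiplies the current sample by every coefficient and pushes the products through a registered adder chain that starts from zero.

```python
def fir_direct_form(coeffs, samples):
    """Normal FIR: sample delay line followed by a sum of all products."""
    delay = [0] * len(coeffs)
    out = []
    for x in samples:
        delay = [x] + delay[:-1]
        out.append(sum(k * d for k, d in zip(coeffs, delay)))
    return out


def fir_transposed(coeffs, samples):
    """Transpose FIR: products enter a chain of adders and registers."""
    n = len(coeffs)
    regs = [0] * (n - 1)              # registers between the chain adders
    out = []
    for x in samples:
        products = [k * x for k in coeffs]
        out.append(products[0] + (regs[0] if regs else 0))
        new_regs = []
        for i in range(n - 1):
            carry = regs[i + 1] if i + 1 < n - 1 else 0   # chain starts at '0'
            new_regs.append(products[i + 1] + carry)
        regs = new_regs
    return out


coeffs = [1, 2, 3, 4]
data = [1, 0, 0, 0, 0]
print(fir_direct_form(coeffs, data))  # [1, 2, 3, 4, 0]
print(fir_transposed(coeffs, data))   # [1, 2, 3, 4, 0]
```

In the transposed form every adder has only two inputs and is followed by a register, which is why it avoids the wide adder tree of the parallel normal form.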

Filter Selection
(Figure: chart of the achievable sample rate, from 20 kHz to 500 MHz, for each technique.)
Serial processing techniques:
- 1 MAC engine: sample rate = clock rate / number of taps, or clock rate / (½ × number of taps) with symmetry; samples held in external RAM, block RAM, or distributed RAM
- SDA: sample rate = clock rate / (sample bits (+1))
- SDA semi-parallel (multi-bit processing): sample rate = clock rate × BAAT / (sample bits (+1)), where BAAT is the number of bits processed at a time
Parallel techniques:
- Full parallel FIR: sample rate = clock rate

This foil is a summary of the various techniques that can be used to meet our filter requirements. By understanding the dynamics of each technique, it is easier to select which one to use at the overlap points of the spectrum. We also see that the blue sector is achieved by exploiting multiple channels, which we will see in the next module.

CIC Filter Block
- Cascaded integrator-comb (CIC) filters are multi-rate filters used for realizing large sample rate changes in digital systems
- Both decimation and interpolation structures are supported
- CIC filters contain no multipliers; they consist only of adders, subtractors, and registers
- They are typically employed in applications that have a large excess sample rate; that is, the system sample rate is much larger than the bandwidth occupied by the signal
- Frequently used in digital down-converters and digital up-converters

CIC Decimator
- The CIC decimator consists of a cascade of integrators followed by a resampling switch and a cascade of differentiators
- The integrator section consists of N ideal integrator stages operating at the high sampling rate fs
- The comb section operates at the slower rate fs/R, where R is the integer rate change factor
- The differential delay M in the differentiator chain may be defined by the user to be either 1 or 2
- To ensure high system clock frequencies, the CIC decimator is actually implemented using a pipelined architecture
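A behavioral sketch of the decimator structure described above follows. It is a non-pipelined model with a hypothetical function name, and it uses Python's unbounded integers, whereas hardware relies on registers wide enough for the filter gain.

```python
def cic_decimate(samples, rate, stages, diff_delay=1):
    """N integrators at fs, keep every R-th sample, then N combs at fs/R."""
    integrators = [0] * stages
    comb_delays = [[0] * diff_delay for _ in range(stages)]
    out = []
    for n, x in enumerate(samples):
        v = x
        for s in range(stages):              # integrator cascade, high rate
            integrators[s] += v
            v = integrators[s]
        if (n + 1) % rate == 0:              # resampling switch
            for s in range(stages):          # comb cascade, low rate
                delayed = comb_delays[s].pop(0)
                comb_delays[s].append(v)
                v = v - delayed              # y = v[m] - v[m - M]
            out.append(v)
    return out


print(cic_decimate([1] * 12, rate=4, stages=1))  # [4, 4, 4]
print(cic_decimate([1] * 16, rate=4, stages=2))  # [10, 16, 16, 16]
```

For a constant input the output settles at (R × M)^N, here 4 for one stage and 16 for two, which is the DC gain that the register widths must accommodate.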

CIC Interpolator
- The CIC interpolator consists of a cascade of differentiators followed by a cascade of integrators
- Data is presented to the filter at the rate fs/R, where it is processed by the differentiators
- The rate expander increases the rate by a factor R by inserting R-1 zero-valued samples between consecutive samples of the comb section output
- The up-sampled and filtered data stream is presented to the output at the sample rate fs

CIC Block Parameters
- Filter Type: Interpolator or Decimator
- Sample Rate Change: 8 to (inclusive)
- Number of Stages: 1 to 8 (inclusive)
- Differential Delay: 1 or 2

IIR Filters
- Lower order than an FIR filter, i.e., fewer taps
- Characterized by having an infinite impulse response
- Computing the present output uses previously computed values of the output signal as well as the input signal
- Built using basic blocks (multipliers, registers, adders)
- The basic algorithm assumes no additional sample delays in the feedback path
- All input-related calculations have bit widths defined by the input samples
- The feedback path must support the largest value expected, which can result in large bit widths
- A second-order filter consists only of the coefficients a1, a2, a3, b1, and b2
(Figure: filter structure from Sin to Sout with coefficients a1 to a5 on the input side and b1 to b4 on the feedback side.)

IIR Filter Example
- A 2nd-order IIR filter can be built using five multipliers, five registers, and an adder tree
  - Using constant multipliers implemented in LUTs
  - Using embedded multipliers
- The multipliers and adder tree are of full resolution, so a quantization block is needed at the output of the adder tree to control the output width
- Type sysgenIIR_DFormI in the C:\training\dsp_flow\labs\lab4 directory to view the example
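A second-order section with five coefficients can be modeled as below. This is a sketch under stated assumptions: a1, a2, a3 are taken as the input-side (feed-forward) coefficients and b1, b2 as the feedback coefficients, and the feedback terms are added rather than subtracted. The slides do not state the sign convention, so treat that choice, and the function name, as hypothetical. The output quantization step mentioned above is not modeled.

```python
def iir_direct_form_1(a, b, samples):
    """y[n] = a1*x[n] + a2*x[n-1] + a3*x[n-2] + b1*y[n-1] + b2*y[n-2]."""
    a1, a2, a3 = a
    b1, b2 = b
    x1 = x2 = y1 = y2 = 0            # delayed input and output samples
    out = []
    for x in samples:
        y = a1 * x + a2 * x1 + a3 * x2 + b1 * y1 + b2 * y2  # five multiplies
        x2, x1 = x1, x               # shift the input delay line
        y2, y1 = y1, y               # shift the output (feedback) delay line
        out.append(y)
    return out


# Impulse response of a one-pole example: it never reaches exactly zero.
print(iir_direct_form_1((1, 0, 0), (0.5, 0), [1, 0, 0, 0]))
# [1.0, 0.5, 0.25, 0.125]
```

Because each output feeds back into later outputs, the impulse response keeps decaying indefinitely, which is why the feedback path needs more bit width than the input-side calculations.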

Xilinx FDATool Block
- The Xilinx FDATool (Filter Design and Analysis Tool) block provides an interface to the FDATool software available as part of the MATLAB Signal Processing Toolbox
- FDATool provides a powerful means of defining digital filters with a graphical user interface
- The block will not function properly, and should not be used, if the Signal Processing Toolbox is not installed
- The block provides a means of defining an FDATool object and storing it as part of a System Generator model
- It does not use any hardware resources

FDATool Block Usage
- Copy an FDATool block into a subsystem where a filter is defined
- Double-clicking the icon in your Simulink model opens an FDATool session and its graphical user interface
- The filter is stored in the internal data structure of the FDATool block
- The coefficients can be extracted using MATLAB helper functions
  - xlfda_numerator('FDATool') returns the numerator
  - xlfda_denominator('FDATool') returns the denominator

FDATool Session
This picture shows the GUI displayed when FDATool is invoked through the Xilinx FDATool block rather than from the MATLAB command window. When invoked from the Xilinx FDATool block, it does not have quantization capabilities. With the FDATool block parameters, you can transform one filter into another, either a filter of the same type with a different frequency response or a different type of filter. After the transformation, the analysis indicates whether or not the transformed filter is stable. You can also realize the filter with a single click on the "Realize" button among the tool buttons on the left.

Effect of Quantizing
After you import a filter into FDATool or use FDATool to design a filter, the options on the Quantize Filter panel let you quantize the filter and investigate the effects of various quantization settings. Quantized filters have properties that define how they quantize the data you filter. Use the Set Quantization Parameters dialog in FDATool to set the properties. Using the options in this dialog, FDATool lets you perform a number of tasks:
- Create a quantized filter from a reference filter after either importing the reference filter from your workspace or using FDATool to design the reference filter
- Create a quantized filter that has the default structure (Direct form II transposed) and other property values you select
- Change the quantization property values for a quantized filter after you design the filter or import it from your workspace

The quantization properties are:
- Convert coefficient to: determines how the coefficient quantizer handles filter coefficients. When you quantize a filter, the properties of this quantizer govern the quantization
- Convert input to: specifies how data input to the filter is quantized
- Convert output to: specifies how data output by the filter is quantized
- Convert multiplicand to: specifies how filter multiplicands are quantized. Multiplicands are the inputs to multiply operations
- Convert product to: determines how to quantize the results of multiply operations
- Convert sum to: determines how to quantize the results of arithmetic sums in the filter

The responses of the reference filter and the quantized filter are overlaid. Note that the quantization capability is available when FDATool is invoked from the MATLAB command window. It is not available when FDATool is invoked through the Xilinx FDATool block.

Exporting Coefficients
- Once the filter is designed, the coefficients can be exported using File > Export from the FDATool block GUI
- They can be exported to:
  - Workspace (provide a variable name)
  - Text file (provide a text file name)
  - M-file (provide an M-file name)

Saving Coefficients
- If the FIR filter block (or any other Xilinx block) uses variables from the workspace, it is desirable to load those variables (e.g., filter coefficients) from a file and have them loaded every time the model is opened
- To do this, create a MATLAB .m file that contains the coefficients in a vector. For example, the file Load_coef.m contains: coef = [ ];
- Use set_param to set the Simulink model parameter "PreLoadFcn": set_param('design_name', 'PreLoadFcn', 'Load_coef')
- The PreLoadFcn runs the script, which creates the variable and places it in the workspace
- The MATLAB path must include the location of the .m file for MATLAB to find the script

The following steps should be taken:
1. Create a MATLAB .m file that contains the coefficients in a vector. The example below is the contents of a file called Load_coef.m.
   coef = [ ];
2. Set the Simulink model parameter "PreLoadFcn" to run Load_coef.m and load the variable into the workspace. When the PreLoadFcn parameter is set in a Simulink model, the value of the parameter is executed every time the model is opened. To set this parameter, type the set_param command in the MATLAB console. SET_PARAM('OBJ','PARAMETER1',VALUE1,'PARAMETER2',VALUE2,...), where 'OBJ' is a system or block path name, sets the specified parameters to the specified values. Case is ignored for parameter names. Value strings are case sensitive. Any parameters that correspond to dialog box entries have string values. Hence the syntax may be:
   set_param('fir_design', 'PreLoadFcn', 'Load_coef')
3. In the FIR filter block in the Simulink model, specify the variable coef in the filter coefficient field. The block then uses the values stored in coef, which were loaded by the "PreLoadFcn".

Saving Coefficients
- Alternatively, create an M-file with the coefficients defined in it
- Right-click anywhere on the design sheet and select Model Properties to open a form
- Enter the filename (without the .m extension) in the "model pre-load function" field of the "Callbacks" tab and click OK

The Product
You are a DSP designer at Cyberdyne Systems. Your company is investigating the use of digital filters instead of analog filters in its security tag detectors, in an attempt to improve performance and reduce the cost of the overall system. This will enable the company to further penetrate the growing security market. The specification of the single-channel, single-rate filter is given below:

Band Pass Filter
- Sampling Frequency (Fs) = 1.5 MHz
- Fstop 1 = 270 kHz
- Fpass 1 = 300 kHz
- Fpass 2 = 450 kHz
- Fstop 2 = 480 kHz
- Attenuation on both sides of the passband = 54 dB
- Pass band ripple = 1

Cyberdyne has chosen to go with FPGAs because of their flexibility, time to market, and performance advantages over DSP processors. Your HDL design experience is limited, so System Generator for DSP appears to be an excellent solution for implementing the filter in an FPGA, as you are already familiar with The MathWorks products.

Stage 1: The Prototype
Your manager, Miles Booth, has requested that you create a prototype of the filter to be implemented on the company's Virtex-II Pro™ prototype board, which is almost complete. The prototype must be finished as quickly as possible for the imminent Aggressive Security convention, the industry's largest convention of the year, which must not be missed.
Band-pass filter
Sampling frequency (Fs) = 1.5 MHz
Fstop 1 = 270 kHz
Fpass 1 = 300 kHz
Fpass 2 = 450 kHz
Fstop 2 = 480 kHz
Attenuation on both sides of the passband = 54 dB
Passband ripple = 1
Data bit width = 8 bits
Coefficient bit width = 12 bits

Lab 3: Design a FIR Filter
In this lab, you are asked to design a bandpass FIR filter. Use the FDATool to create the coefficients and pass them to the FIR block. Use the DA FIR filter token to simulate and implement this filter.
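For readers who prefer the command line, the same kind of coefficient set can be sketched with the Signal Processing Toolbox functions firpmord and firpm instead of the FDATool GUI. This is only an illustration under stated assumptions: the sampling frequency is taken as 1.5 MHz, the passband ripple of 1 is read as 1 dB, an equiripple design is chosen, and the 12-bit coefficients are assumed to carry 11 fractional bits. None of these choices is confirmed by the slides.

```matlab
% Specification from the slides; the ripple unit (dB) is an assumption.
Fs    = 1.5e6;                       % sampling frequency in Hz
edges = [270e3 300e3 450e3 480e3];   % Fstop1 Fpass1 Fpass2 Fstop2 in Hz
amps  = [0 1 0];                     % desired gain: stop, pass, stop
Rp    = 1;                           % passband ripple, assumed to be in dB
As    = 54;                          % stopband attenuation in dB

% Convert the dB figures to the linear deviations firpmord expects.
dpass = (10^(Rp/20) - 1) / (10^(Rp/20) + 1);
dstop = 10^(-As/20);

% Estimate the filter order, then design the equiripple filter.
[n, fo, ao, w] = firpmord(edges, amps, [dstop dpass dstop], Fs);
b = firpm(n, fo, ao, w);

% Round to 12-bit signed values with 11 fractional bits
% (assumed scaling; the slides state only the bit width).
coef = round(b * 2^11) / 2^11;

% Overlay the ideal and quantized magnitude responses.
[H, f] = freqz(b, 1, 2048, Fs);
Hq     = freqz(coef, 1, 2048, Fs);
plot(f, 20*log10(abs(H)), f, 20*log10(abs(Hq))); grid on;
xlabel('Frequency (Hz)'); ylabel('Magnitude (dB)');
```

firpmord returns only an estimate of the order, so if the quantized response misses the 54 dB target, increase n by a few taps and repeat the design.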

Basic MATLAB Plotting Functions
Commonly used general and plotting functions for System Generator designs:
plot(x, y) - general plotting function
stem(x) - useful for viewing an impulse response
hold on - allows more than one plot on the same figure
hold off - switches off hold on
grid - displays a grid on the plot
fft(x, 1024) - performs a 1024-point FFT on data x
abs(x) - finds the magnitude of a number
angle(x) - finds the phase of a number
Use help for further information on usage.
Figures: stem plot of a low-pass filter's impulse response (top); plot of the magnitude frequency response of the same low-pass filter (bottom).
The above commands are commonly used when analyzing DSP designs created in System Generator, especially when visualizing data that has been sent to the workspace from Simulink. The top plot was created with the following command:
>> stem(x)
The bottom plot was created from the same data with the following commands:
>> X = fft(x, 1024);
>> F = -24000:48000/1023:24000;
>> plot(F, 20*log10(abs(fftshift(X))))
>> grid
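The plotting commands can be tried without a Simulink model using a made-up impulse response. The sketch below is an assumption-laden stand-in: the 31-tap windowed-sinc low-pass filter, its 6 kHz cutoff, and the 48 kHz sample rate are all chosen for illustration. In practice x would be data exported to the workspace from the design.

```matlab
% Hypothetical data: a 31-tap windowed-sinc low-pass impulse response.
Fs  = 48000;                 % sample rate in Hz (assumed)
Fc  = 6000;                  % cutoff frequency in Hz (assumed)
n   = -15:15;                % tap indices centred on zero
arg = 2 * Fc / Fs * n;       % argument of the sinc function
h   = ones(size(n));         % sinc equals 1 at n = 0
nz  = (n ~= 0);
h(nz) = sin(pi * arg(nz)) ./ (pi * arg(nz));
w   = 0.54 + 0.46 * cos(2 * pi * n / 30);   % Hamming window, centred form
x   = h .* w;
x   = x / sum(x);            % normalise to unity gain at DC

Nfft = 1024;
X = fft(x, Nfft);            % zero-padded 1024-point FFT
F = (0:Nfft/2) * Fs / Nfft;  % one-sided frequency axis, 0 to Fs/2

subplot(2, 1, 1); stem(x); grid on;
title('Impulse response');
subplot(2, 1, 2); plot(F, 20*log10(abs(X(1:Nfft/2 + 1)))); grid on;
xlabel('Frequency (Hz)'); ylabel('Magnitude (dB)');
```

Plotting only the first half of the FFT output against a 0 to Fs/2 axis avoids the need for fftshift, which is an alternative to the two-sided axis used in the slide.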

Using the Spectrum Scope
The Spectrum Scope is extremely useful for performing a frequency analysis on your design and can be found in the DSP Blockset => DSP Sinks library.
As no System Generator design uses frame-based data, the input must be buffered (under the Scope properties). The size of the buffer determines the resolution of the FFT performed.
Use overlapping to avoid the discontinuities of using finite data.
Use the "Axis properties" to control the axes scale and units.
Typically, the Spectrum Scope should be used when analyzing the frequency response of a system over a long period of time with more realistic, continuous data inputs. If you are trying to analyze the impulse response of a filter, it is better to export the output to the workspace and visualize it there, because of the windowing functions and normalization applied by the Spectrum Scope.
Line colors can also be set through the "Line properties" option (not shown on the slide). For example, [1 0 0]|[0 1 0] specifies the colors for two channels as RGB values.
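The link between buffer size and FFT resolution is simple arithmetic, sketched below. The sample rate is borrowed from the lab specification, and the buffer size and overlap are example values. The sketch also assumes the FFT length equals the buffer size.

```matlab
Fs      = 1.5e6;   % sample rate in Hz (from the lab specification)
bufSize = 1024;    % assumed buffer size, taken as the FFT length
overlap = 512;     % assumed buffer overlap in samples

binWidth    = Fs / bufSize;        % frequency resolution in Hz per bin
newPerFrame = bufSize - overlap;   % fresh samples consumed per FFT frame
frameRate   = Fs / newPerFrame;    % spectrum updates per second

fprintf('Bin width: %.2f Hz\n', binWidth);
fprintf('Spectrum updates: %.2f per second\n', frameRate);
```

Doubling the buffer size halves the bin width, at the cost of needing twice as many samples before each spectrum can be drawn.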

Using the Spectrum Scope
The example takes two chirp signals (sine waves of varying frequency), adds them together, and views the result on the Spectrum Scope.
Note: Be aware of the window that the scope uses, especially when analyzing small data sets. The default is Hamming, and it can only be changed under the mask: right-click the block, select Look Under Mask, double-click the Short-Time FFT block, and select the desired window type. It is beyond the scope of this class to discuss the effects and benefits of different windowing techniques.
A legend, line colors, and line styles can all be controlled from the line properties menu or from the axis and channels menus in the Spectrum Scope's window.
Type 'SpectrumScope' to view the example.

Using White Noise
An excellent block to complement the Spectrum Scope, especially when testing a filter's spectral response, is the Gaussian White Noise block (Communications Blockset => Comms Sources). This block outputs a signal over all frequencies below the Nyquist frequency, which makes it useful for viewing filter cutoffs. The Random Number block from the Simulink Sources library has the same effect.
Make sure you do not output vectors or frame-based data, as System Generator designs do not accept them.
The Gaussian Noise Generator block generates discrete-time white Gaussian noise. You must specify the initial seed in the simulation. The Mean Value and the Variance must be scalars for System Generator usage.
Type 'WhiteNoise' to view the example.
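The idea of using white noise to reveal a filter's cutoff can be reproduced at the MATLAB prompt. The sketch below is illustrative only: the 8-tap moving-average filter, the 48 kHz sample rate, and the frame counts are assumptions, and averaging FFT frames only roughly mimics what the Spectrum Scope displays.

```matlab
Fs     = 48000;              % sample rate in Hz (assumed)
N      = 256;                % FFT length per frame
frames = 200;                % number of frames to average
b      = ones(1, 8) / 8;     % hypothetical 8-tap moving-average low-pass

noise = randn(1, N * frames);    % one scalar Gaussian sample per time step
y     = filter(b, 1, noise);     % pass the noise through the filter

% Average the power spectra of successive frames.
Y   = fft(reshape(y, N, frames));   % one FFT per column
Pyy = mean(abs(Y).^2, 2) / N;       % averaged periodogram
F   = (0:N/2) * Fs / N;             % one-sided frequency axis

plot(F, 10*log10(Pyy(1:N/2 + 1))); grid on;
xlabel('Frequency (Hz)'); ylabel('Power (dB)');
```

Because the input spectrum is flat, the averaged output spectrum traces the squared magnitude response of the filter, so its cutoff and nulls show up directly in the plot.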
