Using the “fdatool” of MATLAB, the filter coefficients obtained (truncated to the fourth decimal place) are
b = [0.2374, 0, -0.2374]
a = [1, 0, 0.5]
These coefficients are scaled by a factor of 4 and rounded to give the integer filter coefficients
b = [1, 0, -1]
a = [4, 0, 2]
Linear Difference equation representation: - From the filter coefficients we can write the transfer function as
H(z) = [1 + 0·z^{-1} - 1·z^{-2}] / [4 + 0·z^{-1} + 2·z^{-2}]
Converting the above equation to linear difference equation form, we have
4y(n) + 2y(n-2) = x(n) - x(n-2)
y(n) = 1/4 [x(n) - x(n-2) - 2y(n-2)]
For n = 0, y(0) = 1/4 [x(0)] .......... (1)
For n = 1, y(1) = 1/4 [x(1)] .......... (2)
For n = 2, y(2) = 1/4 [x(2)] - 1/4 [x(0) + 2y(0)] .......... (3)
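As a quick numerical check of the recursion, the sketch below evaluates y(n) = 1/4[x(n) - x(n-2) - 2y(n-2)] in Python with zero initial conditions. It is only a desktop model, not the PIC code; the function name and the sample values are made up for illustration.

```python
def filter_output(x):
    """Evaluate y(n) = 1/4 [x(n) - x(n-2) - 2 y(n-2)] with zero initial conditions."""
    y = []
    for n in range(len(x)):
        x_n2 = x[n - 2] if n >= 2 else 0.0   # x(n-2), zero before the first sample
        y_n2 = y[n - 2] if n >= 2 else 0.0   # y(n-2), zero before the first output
        y.append((x[n] - x_n2 - 2.0 * y_n2) / 4.0)
    return y


samples = [8.0, 4.0, 2.0, 6.0]   # arbitrary illustrative input
out = filter_output(samples)
print(out)   # [2.0, 1.0, -2.5, 0.0]
```

With this input the third output, y(2), is negative even though every input sample is positive.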
Hardware Implementation And Analysis:
The hardware setup used in the previous study of filter implementation remains the same. Only the software has to be changed to implement this filter.
The earlier discussion of the effect of sampling frequency, the noise rejection ability of the filter and so on holds for this filter as well. Repeating that analysis would add nothing new, so instead let us look at the peculiar problem we faced in implementing this filter and the workaround used to solve it.
Look closely at equation (3). Whenever the value of 1/4[x(2)] is less than 1/4[x(0) + 2y(0)], the result y(2) is negative. How can the PIC know that this is negative data? It treats every value as a positive binary number and processes it accordingly. How, then, do we get out of this trap?
A second problem shows up as well. After the software has looped a few times and present values have been assigned to previous values, y(0) itself may be negative. In that case 1/4[x(0) + (-2y(0))] may also be negative. Now there is one more negative number to handle, which the PIC is unaware of.
To get past these obstacles, we use a flag bit to indicate the sign of each output y0, y1 and y2 (after which, of course, the program loops). If the number is negative the flag bit is set; otherwise it is cleared.
Now suppose that y(0) is negative. Then 2y(0) is a negative number, so instead of adding 2y(0) to x(0) (refer to equation (3)) we should subtract the magnitude of 2y(0) from x(0). This result may also be negative. To indicate the sign of this temporary result, one more flag bit, ‘tempflag’, is used. Depending on the state of ‘tempflag’, 1/4[x(0) + 2y(0)] is either added to or subtracted from 1/4[x(2)]. In the case of subtraction the result may once again be negative. To eliminate this negative value, a fixed DC level is added to each output before it is finally sent to the PIC output port. The fixed DC value is chosen to be greater than or equal to the magnitude of the largest negative result.
To check whether a result is negative or positive, the numbers are simply subtracted and the borrow is checked. If a borrow occurs, the result is treated as negative (by setting the corresponding flag bit); otherwise it is treated as positive (by clearing the corresponding flag bit). When the present output value is assigned to the previous output value, the corresponding flag is copied along with it.
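The flag-based scheme described above can be modelled on a desktop machine as follows. This is a Python sketch of the idea, not the PIC assembly: the function names, the 8-bit register width and the DC offset value of 64 are all assumptions chosen for illustration.

```python
def sub_magnitude(a, b):
    """Subtract two unsigned 8-bit magnitudes.
    Returns (|a - b|, flag); flag is 1 when a borrow occurs (a < b)."""
    diff = (a - b) & 0xFF            # 8-bit wrap-around, as in an 8-bit register
    borrow = 1 if a < b else 0
    if borrow:
        diff = (-diff) & 0xFF        # two's complement gives back the magnitude
    return diff, borrow


def signed_add(mag_a, flag_a, mag_b, flag_b):
    """Add two sign-and-magnitude numbers (flag 1 means negative)."""
    if flag_a == flag_b:
        return (mag_a + mag_b) & 0xFF, flag_a   # overflow is not handled in this sketch
    if flag_a == 0:
        return sub_magnitude(mag_a, mag_b)      # (+a) + (-b) = a - b
    return sub_magnitude(mag_b, mag_a)          # (-a) + (+b) = b - a


DC_OFFSET = 64   # hypothetical fixed DC level added before output


def to_port(mag, flag):
    """Shift a signed result into the positive range for the output port."""
    return DC_OFFSET - mag if flag else DC_OFFSET + mag


# Equation (3) with a negative previous output, y(0) = -5
x0, x2 = 40, 20
y0_mag, y0_flag = 5, 1
temp_mag, tempflag = signed_add(x0, 0, 2 * y0_mag, y0_flag)      # x(0) + 2y(0)
# y(2) = 1/4 x(2) - 1/4 temp: subtracting temp means adding it with its sign flipped
y2_mag, y2_flag = signed_add(x2 >> 2, 0, temp_mag >> 2, tempflag ^ 1)
print(y2_mag, y2_flag, to_port(y2_mag, y2_flag))   # 2 1 62
```

Here y(2) comes out as -2 (magnitude 2 with the flag set), and the DC offset turns it into the positive port value 62.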
FLOWCHARTS:
[Flowcharts: main program; subroutines ‘SUB1’, ‘SUB2’, ‘ADD1’, ‘ADD2’ and ‘DELAY’]
Filter parameters:
Fs = 10 kHz, i.e. sampling time = 1/10 kHz = 100 µs
Fc = 2 kHz
Apass = 1 dB
Order n = 2
SOFTWARE SIMULATION: - The above filter is designed and simulated using “fdatool”, “simulink” and “dspfwiz” as explained in the earlier section.
HARDWARE IMPLEMENTATION: - To implement the filter in hardware we need to have a transfer function and its corresponding linear difference equation representation of the filter.
From the “fdatool”, the numerator and denominator coefficients obtained are as follows:
Numerator b = [0.21797, 0.43596, 0.21798]
Denominator a = [1.0000, -0.35135, 0.32966]
Since the PIC 16F877 microcontroller does not directly support floating-point arithmetic, the numerator and denominator coefficients are converted to integer values by scaling by a factor of 4 and rounding to the nearest integer, which yields the filter coefficients
b = [1, 2, 1]
a = [4, -1, 1].
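The integer coefficients quoted above can be reproduced by multiplying each coefficient by 4 and rounding to the nearest integer (plain truncation of the scaled values would give different numbers). The short Python check below is a sketch for illustration only; the variable names are ours.

```python
b = [0.21797, 0.43596, 0.21798]    # numerator from fdatool
a = [1.0000, -0.35135, 0.32966]    # denominator from fdatool
SCALE = 4

b_int = [round(SCALE * c) for c in b]
a_int = [round(SCALE * c) for c in a]
print(b_int, a_int)   # [1, 2, 1] [4, -1, 1]

# DC gain (z = 1) before and after quantization, to see how much the response moves
dc_gain_float = sum(b) / sum(a)
dc_gain_int = sum(b_int) / sum(a_int)
print(dc_gain_float, dc_gain_int)
```

Comparing the two DC gains gives a first indication of how far the quantized filter has moved from the designed one.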
Therefore, the transfer function of the filter is
H(z) = [1 + 2z^{-1} + z^{-2}] / [4 - z^{-1} + z^{-2}]
Converting the transfer function to a linear difference equation, we can write
4y(n) - y(n-1) + y(n-2) = x(n) + 2x(n-1) + x(n-2)
or
y(n) = 1/4 [x(n) + 2x(n-1) + x(n-2) + y(n-1) - y(n-2)]
For n = 0, y(0) = 1/4 [x(0)]
[Assuming the initial conditions x(-1), x(-2), y(-1) and y(-2) are zero.]
For n = 1, y(1) = 1/4 [x(1) + 2x(0) + y(0)]
For n = 2, y(2) = 1/4 [x(2) + 2x(1) + x(0) + y(1) - y(0)]
where x(0), x(1) and x(2) represent sampled values of the input signal and y(0), y(1) and y(2) represent the filtered output values.
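The difference equation can be exercised with a constant (DC) input to confirm that the integer coefficients give unity gain at DC, since (1 + 2 + 1)/(4 - 1 + 1) = 1. The Python sketch below is a floating-point desktop model with zero initial conditions, not the PIC routine; the function name and the input level of 4 are assumptions for illustration.

```python
def lowpass2(x):
    """y(n) = 1/4 [x(n) + 2x(n-1) + x(n-2) + y(n-1) - y(n-2)], zero initial conditions."""
    y = []
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0.0
        x2 = x[n - 2] if n >= 2 else 0.0
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y.append((x[n] + 2.0 * x1 + x2 + y1 - y2) / 4.0)
    return y


step = [4.0] * 60          # constant input of 4 units
out = lowpass2(step)
print(out[:3])             # [1.0, 3.25, 4.5625]
print(out[-1])             # settles close to 4.0
```

The first three outputs follow the expressions for n = 0, 1 and 2, and the output settles at the input level, as expected of a lowpass filter with unity DC gain.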
FLOWCHART OF THE SOFTWARE: -
[Flowcharts: software main loop; delay subroutine]
HARDWARE SETUP: - The hardware setup for this filter implementation is unchanged from the one shown in the earlier blog post. The only change is the design of the RC lowpass filter.
Design:
Fc = Fs/2 = 5 kHz; let R = 1.5 kΩ.
Therefore,
C = 1/(2π·Fc·R) = 1/(2π × 5 kHz × 1.5 kΩ) = 21.22 nF
Therefore,
C selected = 22 nF
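The capacitor calculation is easy to verify, and the same formula shows where the cutoff lands once the standard 22 nF value is used. This is a small Python check; the variable names are ours.

```python
import math

fc = 5e3        # desired cutoff, Fs/2, in Hz
R = 1.5e3       # chosen resistor, in ohms

C = 1.0 / (2.0 * math.pi * fc * R)
print(f"C = {C * 1e9:.2f} nF")                      # C = 21.22 nF

C_selected = 22e-9                                   # nearest standard value
fc_actual = 1.0 / (2.0 * math.pi * R * C_selected)
print(f"cutoff with 22 nF = {fc_actual:.0f} Hz")     # about 4.8 kHz
```

Rounding the capacitor up to 22 nF lowers the cutoff slightly below 5 kHz.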
FREQUENCY RESPONSE: - The observed frequency response of the filter is listed below.
Input voltage Vi = 0.5 V
Input frequency (Hz)    Output voltage (V)
100                     5
200                     5
400                     5
600                     5
800                     5
1K                      5
1.2K                    5
1.4K                    5
1.5K                    4.6
1.6K                    4.5
1.8K                    4.0
1.9K                    3.8
2.0K                    3.5
2.1K                    3.2
2.2K                    3.0
2.4K                    2.5
2.6K                    2.0
2.8K                    1.5
3K                      1.0
A graphical representation of the above filter response is shown in Figure (1).
Figure (1)
The magnitude at the 3 dB frequency is 5 × 0.707 = 3.535 V. Note that the observed cutoff frequency is 2 kHz, the same as the designed value. (The frequency response list above shows 3.5 V at the cutoff frequency. The actual value is 3.53 V, but the reading accuracy of a CRO is limited by the voltage range setting and by reading error, so we have recorded the value as 3.5 V. The difference comes from the limited accuracy of the measurement, not from the filter missing its exact cutoff frequency.)
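The 3 dB reading can also be located programmatically from the measured list. The Python sketch below uses only the points around the roll-off and takes the 5 V passband reading as the reference; the variable names are ours.

```python
import math

# (frequency in Hz, output in volts) from the measured list, around the roll-off
measured = [(1400, 5.0), (1500, 4.6), (1600, 4.5), (1800, 4.0), (1900, 3.8),
            (2000, 3.5), (2100, 3.2), (2200, 3.0)]
V_PASS = 5.0                              # passband output level
level_3db = V_PASS / math.sqrt(2)         # 3.535 V

for f, v in measured:
    gain_db = 20 * math.log10(v / V_PASS)
    print(f"{f} Hz: {v} V, {gain_db:.2f} dB")

# first measured frequency at which the output has dropped to the 3 dB level or below
cutoff = next(f for f, v in measured if v <= level_3db)
print(cutoff)   # 2000
```

With the readings as recorded, the first point at or below the 3 dB level is 2 kHz.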
SAMPLING FREQUENCY OF THE FILTER: - The sampling frequency of the designed filter is 10 kHz, which implies a sampling time of 1/10 kHz = 100 µs.
Hence the looping part of the software must complete within 100 µs; it actually takes just 83 µs, so a delay is used to make up the 100 µs sampling time. At the very beginning of the filter response there may be a transient, because the sampling time and the initial conditions are not yet satisfied.
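The length of the padding delay follows from simple arithmetic. The sketch below takes the timing figures as microseconds (1/10 kHz = 100 µs) and assumes the 4 MHz oscillator mentioned elsewhere in this post; on this PIC family one instruction cycle is four oscillator periods. It is written in Python, and the variable names are ours.

```python
F_OSC = 4_000_000        # oscillator frequency in Hz (assumed 4 MHz)
FS = 10_000              # sampling frequency in Hz
LOOP_US = 83             # measured time of the looping part, in microseconds

instr_rate = F_OSC // 4                      # instruction cycles per second
cycle_ns = 1_000_000_000 // instr_rate       # one instruction cycle, in ns
sample_ns = 1_000_000_000 // FS              # one sampling period, in ns

delay_ns = sample_ns - LOOP_US * 1000        # time the delay routine must fill
delay_cycles = delay_ns // cycle_ns

print(cycle_ns, sample_ns, delay_ns, delay_cycles)   # 1000 100000 17000 17
```

Under these assumptions the delay routine has to consume 17 µs, that is, 17 instruction cycles.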
The earlier discussion of the effect of varying the sampling frequency and of aliasing also holds here.
NOISE REJECTION ABILITY OF THE FILTER: - To study this, the hardware setup is constructed as it was for the first-order Butterworth lowpass filter implementation.
We now compare the noise rejection capability of the first-order and second-order filters. There is no doubt that the second-order filter does better at eliminating noise and thus makes the signal ‘smoother’; the question is the trade-off between the performance of the two filters.
Second-order filters have a sharper cutoff than first-order filters, which is observed in this filter as well. As far as noise is concerned, both filters perform well.
LIMITATIONS: -
PIC microcontrollers impose many limitations on the implementation of digital filters. The reader may recall that PIC MCUs are 8-bit controllers with a “Harvard architecture”. Those who wish to study digital filter response using PIC MCUs as a tool should take note of the following obstacles:
Processing of A/D converted data: - The PIC 16F877 has a built-in 10-bit ADC. The A/D converted data are stored in two 8-bit registers called ADRESL and ADRESH. The lower byte is stored in ADRESL and the remaining two higher bits are stored in ADRESH. The problem is that processing this 10-bit A/D data requires 16-bit arithmetic, which is cumbersome to implement on an 8-bit PIC MCU.
To overcome this problem, care is taken when applying the input signal so that the A/D converted data does not exceed 8 bits or, if it does, exceeds it by one bit only. All the A/D converted bits are high when the input is 5 V. When the input voltage is around 2.5 V, only the lower bits of the A/D converted data are high. This has been tested by directly displaying the A/D converted data.
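How the two result registers combine into one 10-bit number, and when the result still fits in 8 bits, can be sketched as follows. The sketch is Python rather than PIC code and assumes the right-justified result format described above (lower byte in ADRESL, top two bits in ADRESH); the function names are ours.

```python
def adc_value(adresh, adresl):
    """Combine the two result registers into one 10-bit number (right-justified)."""
    return ((adresh & 0x03) << 8) | (adresl & 0xFF)


def fits_in_8_bits(adresh, adresl):
    """True when the two upper bits are clear, so ADRESL alone holds the result."""
    return (adresh & 0x03) == 0


print(adc_value(0x03, 0xFF))                                  # 1023, all bits high
print(adc_value(0x00, 0xC8), fits_in_8_bits(0x00, 0xC8))      # 200 True
print(adc_value(0x01, 0x2C), fits_in_8_bits(0x01, 0x2C))      # 300 False
```

As long as the upper two bits stay clear, the filter routine can work on ADRESL alone with 8-bit arithmetic.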
Now referring to the DC shifter circuit in Figure (2.6b), the DC shift provided by the circuit is
= -(1 kΩ/6 kΩ) × (VEE)
= -(1/6) × (-12 V)
= 2 V
But the maximum input voltage level that can be applied to get 8-bit A/D converted data is 2.5 V.
Therefore,
the voltage swing that can be applied = 2.5 V - 2 V
= 0.5 V
Thus the input signal at the PIC port pins can vary from 1.5 V to 2.5 V, with 2 V being the DC value. Since the DC shifter already provides a 2 V DC shift, the maximum input voltage that can be applied to the DC shifter is 1 V. This is why we have applied only a 0.5 V input signal. If the input level to the DC shifter circuit exceeds 1 V, the A/D converted data exceeds 8 bits. The software still processes it and the filter still works according to the design, but the signal loses its original shape.
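The numbers in this calculation can be reproduced with a few lines of Python. The resistor names R1 and R2 are our own labels for the 1 kΩ and 6 kΩ resistors, and the 2.5 V limit is the figure quoted above.

```python
R1 = 1e3          # ohms (label assumed)
R2 = 6e3          # ohms (label assumed)
VEE = -12.0       # negative supply, volts
V_MAX = 2.5       # maximum level quoted above for the A/D input

dc_shift = -(R1 * VEE) / R2          # 2.0 V
swing = V_MAX - dc_shift             # 0.5 V on either side of the DC level
v_low, v_high = dc_shift - swing, dc_shift + swing

print(dc_shift, swing, (v_low, v_high))   # 2.0 0.5 (1.5, 2.5)
```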
Sampling frequency limitation: - The time taken to complete a single pass of the final looping part of the software determines the sampling time. The first-order Butterworth lowpass filter designed in our study used a sampling frequency of 25 kHz, which implies a sampling time of 40 µs. The software routine for any other filter, of the same or higher order, takes more than 40 µs for its looping part. Hence, with a 4 MHz clock for the PIC MCU, the maximum possible sampling frequency is 25 kHz. Thus, to get a fairly good filtered and reconstructed signal, the maximum input signal frequency that can be applied is around 10 kHz. If a higher clock frequency such as 20 MHz is used, A/D conversion is faster. This saves time, and hence the sampling frequency can be somewhat higher.
Floating-point arithmetic and negative results: - Another major drawback of the PIC MCU is that it does not support floating-point arithmetic, which is crucial when dealing with digital filters. For this reason the floating-point filter coefficients are scaled and rounded to integers. When doing this, care is taken that the filter response does not deviate considerably from the designed response.
The scaled and rounded coefficients have not much affected the characteristics of the lowpass filter. But for highpass and bandpass/bandreject filters, negative numbers come into the picture in addition to the floating-point problem; processing these negative numbers and converting them with the D/A converter is another difficult job. Floating-point numbers can be handled somehow, but working with negative numbers is certainly tedious.
The floating-point problem can be overcome by using the ‘PICLITE’ compiler, which is supported by MPLAB. With ‘PICLITE’ enabled, the program can be written in C as well as in assembly. The C code is converted to assembly code by the tool itself, and the resulting code can be programmed into the PIC MCU.
The design of peripheral and supporting circuits such as sense amplifiers, address decoders, precharge circuits and I/O control circuits is very important for the proper functioning of an SRAM. These supporting circuits access the memory cell through the BL and BLbar lines. Address decoders select a particular cell for a read or write operation. Address decoding delay accounts for the largest part of the memory access time, in addition to the delay due to the bit line capacitances of the memory cell itself. Read and write circuits provide an interface between the internal memory cells and the external hardware, enabling proper data transfer between them. Before any layout is designed for these blocks, they have to be tested for functionality and worst-case conditions to ensure an error-free design.
2 Sense amplifiers
Since SRAM cells provide true differential outputs, any differential sense amplifier configuration can be applied directly to SRAM design. One such configuration is shown in Figure (1). The sense amplifier is a latch formed by cross-coupling two CMOS inverters. The sense enable (SE) signal turns the sense amplifier ON or OFF. BL and BLbar become the I/O terminals of the amplifier. During a read operation, if the cell has stored 1, a small positive voltage develops between BL and BLbar with VBL > VBLbar. The amplifier then raises VBL to VDD and pulls VBLbar to 0 V. This output is then directed to the chip I/O pin by the column decoder.
Figure (1) sense amplifier
The sense amplifier performs the following functions:
→ Amplification: small bit line swings are resolved by the sense amplifier. This reduces power dissipation.
→ Reduction in delay: by accelerating the bit line transitions, the sense amplifier boosts the driving capability of the SRAM cell.
→ Reduction in power dissipation: this is achieved by reducing the signal swing on the bit lines, which eliminates the need to fully charge or discharge the bit line capacitance.
Simulation: The SPICE simulation results for the sense amplifier schematic of Figure (1) are shown in Figure (2).
Figure (2) sense amplifier SPICE simulation waveform
Initially the sense enable (SE) signal is deactivated. The BL and BLbar inputs are precharged and equalized to the metastable point of the inverters. Starting a read operation causes one of the bit lines to drop. Once a sufficient differential voltage is established, the SE signal is activated. The cross-coupled inverters of the amplifier then settle to a stable operating point as a result of the positive feedback.
Sharing a single sense amplifier between multiple columns can save area as well as power. Pulsing the SE signal only for the short evaluation period also reduces the static power of the amplifier.
Normal W/L ratios are selected for the NMOS and PMOS transistors. The PMOS transistors have a W/L ratio of 6.66, which for 0.18 µm technology means a gate width of 1.2 µm. For the NMOS transistors the ratio is 3.33, that is, a gate width of 0.6 µm.
Simulation results are shown in Figure (2). Here the sense amplifier is simply a differential amplifier. Node Y of the amplifier is driven with a pulse waveform. When SE is activated, the differential configuration makes BLbar the complement of BL, as shown by the circled area in the simulation waveform. Further analysis is carried out along with the SRAM cell and precharge circuits.
3 Precharge and Equalization Circuit
The precharge and equalization circuit is shown in Figure (3)
Figure (3) Precharge circuit and simulation setup
When precharge enable (PE) goes high prior to a read operation, all three transistors conduct. M1 and M2 precharge BL and BLbar to VDD/2. M3 speeds up this process by equalizing the initial voltages on the two lines. This equalization is critical to the proper operation of the sense amplifier, which can erroneously interpret any voltage difference present between BL and BLbar before the read operation begins.
Read operation sequence:
1. When the precharge enable (PE) signal is made high, both BL and BLbar precharge to VDD/2. PE is then made low, which leaves BL and BLbar floating for a short interval.
2. When the word line is activated, a voltage difference is established between BL and BLbar. If the cell has stored 1, then VBL > VBLbar. If the cell has stored 0, then VBL < VBLbar.
3. Now the sense enable (SE) signal is activated. This turns ON the sense amplifier. The positive feedback structure of the sense amplifier reaches a stable condition within a short time.
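The three steps of the read sequence can be captured in a small behavioural model. This Python sketch is not a circuit simulation: the supply voltage of 1.8 V, the 0.1 V bit line drop and the function name are assumptions for illustration, and the sense amplifier is idealized as driving the higher line to VDD and the lower line to 0 V.

```python
VDD = 1.8   # assumed supply voltage, volts


def read_cell(stored_bit, delta_v=0.1):
    """Idealized read: precharge, develop a small differential, then sense."""
    # Step 1: precharge and equalize both bit lines to VDD/2, then let them float
    bl = blbar = VDD / 2

    # Step 2: word line on; the bit line on the side storing 0 drops slightly
    if stored_bit:
        blbar -= delta_v
    else:
        bl -= delta_v

    # Step 3: sense enable; positive feedback drives the lines to the rails
    if bl > blbar:
        bl, blbar = VDD, 0.0
    else:
        bl, blbar = 0.0, VDD
    return bl, blbar


print(read_cell(1))   # (1.8, 0.0)
print(read_cell(0))   # (0.0, 1.8)
```

If the two lines were not equalized before step 2, the comparison in step 3 could be decided by the leftover offset rather than by the cell, which is why equalization matters.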
4 Half VDD generator
Half VDD sensing scheme has two advantages: it improves noise immunity and it has lower power consumption.
Figure (4) Half VDD generator
The basic half VDD generator consists of a bias circuit and a driver circuit, as shown in Figure (4). The (W/L) ratios of the bias circuit transistors are set so that the voltage at node B is VDD/2. The voltage at node A is therefore VDD/2 + VTN (VTN is the threshold voltage of the NMOS transistor) and the voltage at node C is VDD/2 - |VTP| (VTP is the threshold voltage of the PMOS transistor). The output voltage of the driver is stabilized at VDD/2. The static current of the driver circuit is very low because the driver transistors are only weakly ON. The driver stage is in a push-pull configuration. The (W/L) ratios of the driver transistors are made larger so that any unexpected change at the output node is suppressed quickly by turning one of the transistors ON strongly.
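The node voltages explain both the low static current and the quick correction of the output. The Python sketch below assumes that the NMOS driver has its gate at node A and the PMOS driver has its gate at node C, with both sources at the output; this wiring, the 1.8 V supply and the 0.45 V thresholds are assumptions for illustration, not values taken from the schematic.

```python
VDD = 1.8      # assumed supply, volts
VTN = 0.45     # assumed NMOS threshold, volts
VTP = -0.45    # assumed PMOS threshold, volts


def driver_overdrive(v_out):
    """Gate overdrive of the two driver transistors for a given output voltage.
    A positive value means that transistor conducts."""
    v_a = VDD / 2 + VTN                  # node A
    v_c = VDD / 2 - abs(VTP)             # node C
    n_over = (v_a - v_out) - VTN         # NMOS pulls the output up
    p_over = (v_out - v_c) - abs(VTP)    # PMOS pulls the output down
    return n_over, p_over


print(driver_overdrive(0.9))   # both close to 0: drivers at the edge of conduction
print(driver_overdrive(0.8))   # output too low: NMOS overdrive positive
print(driver_overdrive(1.0))   # output too high: PMOS overdrive positive
```

At VDD/2 both drivers sit right at threshold, so very little current flows; any deviation turns on only the transistor that corrects it.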
5 Address decoders
An address decoder is required to select one of 2^M rows or columns in response to an M-bit address input. A simple NOR-based matrix structure fulfills this requirement. A 3x8 decoder used to decode 8 memory blocks is shown in Figure (6). A PMOS transistor is attached to each line. When there is no read or write operation, the PEbar signal is kept high. Because of this arrangement the decoder circuit does not dissipate static power. NOR-based decoders use fewer devices than a conventional decoder implementation, whose layout is also more time-consuming and cumbersome than that of the NOR-based implementation.
In the row decoder, the PMOS transistors are activated by the precharge control signal PEbar prior to the address decoding process. All word lines (WL) are pulled high to VDD during precharge. The column (or block) decoders have to provide the discharge path from the precharged bit line to the sense amplifier during a read operation. The same lines should be able to drive the bit line to write either 0 or 1 to the SRAM cell. The read and write access time of the memory is primarily limited by the propagation delay of the decoder. The floor plan of the decoder should be studied carefully before the layout of the row and column decoders is implemented. Decoder outputs run throughout the memory array, making long interconnections which are the main sources of delay and higher power consumption.
In general, NOR-based decoders improve the speed of operation and achieve power efficiency. The larger the PMOS transistor, the faster the precharging and hence the faster the decoder. For 0.18 µm technology, the gate width of all NMOS transistors in both the row and column decoders is selected as 0.6 µm. For the PMOS transistors the gate width is 1.2 µm.
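The logic of a NOR-based decoder can be checked with a short behavioural model. In the Python sketch below each output is a 3-input NOR of address literals, true or complemented, chosen so that all three inputs are 0 only for the matching address. The function names are ours and the model ignores precharge timing.

```python
def nor(*inputs):
    """NOR gate: output is 1 only when every input is 0."""
    return int(not any(inputs))


def decode_3to8(a2, a1, a0):
    """Return the eight decoder outputs C0..C7 for a 3-bit address."""
    outputs = []
    for k in range(8):
        literals = []
        for position, bit in ((2, a2), (1, a1), (0, a0)):
            wanted = (k >> position) & 1
            # feed the true input where a 0 is wanted, the complement where a 1 is wanted
            literals.append(bit ^ wanted)
        outputs.append(nor(*literals))
    return outputs


print(decode_3to8(0, 0, 0))   # [1, 0, 0, 0, 0, 0, 0, 0]
print(decode_3to8(1, 0, 1))   # [0, 0, 0, 0, 0, 1, 0, 0]
```

Exactly one output is high for every address, which is the property the simulation waveforms are expected to show once the inputs have settled.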
5.1 Column decoder
In this SRAM design each block is connected as one column. Each block consists of 8 sub-columns and 128 rows. The BL and BLbar lines of each sub-column have column enable transistors, which are enabled or disabled by the output of the 3x8 decoder.
Figure (6) 3x8 column decoder
At present, buffer drivers for the decoder outputs are not considered. However, because of the large capacitance of the column and row connections (more evident in the row decoder), a buffer circuit may be necessary before the signal reaches the column control transistors of each sub-column.
Figure (7) 3x8 decoder SPICE simulation waveform
The SPICE simulation waveform is shown in Figure (7). Inputs A0 to A2 and their complements are applied as required by the NOR logic. (In the waveform all signals are named in lower case.) The decoder outputs C0 to C7 are highlighted by circles. False triggering of the decoder outputs occurs because of the rise and fall times of the address line signals. This can be counteracted by proper control of the address inputs and the DEbar signal.
5.2 Row decoder
The 7x128 row decoder schematic is an extension of the 3x8 decoder. The discussion of capacitance and false triggering holds here as well. The corresponding SPICE simulation waveform is shown in Figure (8).
Address inputs A3, A6 and A9 are shown in the waveform. Simulation waveforms of only six of the 128 outputs are shown. (In the simulation waveform signals are named in lower case.) They are R0, R1, R63, R64, R126 and R127, and are highlighted by circles and arrows. For A3 to A9 = 0, R0 is selected, and for A3 to A9 = 127, R127 is selected.
6 I/O control circuits
I/O control circuits are an integral part of the memory circuit. They interface the internal memory cells with the external world. Generally the internal operation of the cell runs at a lower voltage than the external power supply of the chip. In such cases I/O circuits become essential to resolve compatibility issues. This section presents the read and write circuits and the buffer design for the SRAM.
6.1 Read buffer
The gate-level and transistor-level schematics are shown in Figure (9). The corresponding truth table of the circuit is listed in Table (1). The read enable (RE) signal is a common input to the two NAND gates, while DL and DLbar are the other inputs. A push-pull configuration of transistors finally drives the DIO line, which is available externally on the chip. The basic NAND gate design strategy is used to size the transistors. All the transistors of the NAND gates have a common W/L ratio. The PMOS transistors M8 and M10 of the inverters have twice the width of M9 and M11.
Transistors M10 and M11 form the driver circuit, which interfaces to the DIO line of the chip. The power supply to this driver is taken directly from the external power supply of the chip so that the logic levels are compatible with the external interface unit.
6.2 Write circuit
The write circuit should be able to force the BL and BLbar lines to change state according to the input data by charging the large bit line capacitances very quickly. Hence the write circuit is designed with NOR gates to provide higher current driving capability. The gate-level and transistor-level schematics are shown in Figure (10). The circuit resembles the read circuit with the NAND gates replaced by NOR gates. The write enable (WE) signal controls the write operation. The output of each NOR gate drives an NMOS transistor with a higher W/L ratio. These two transistors drive the DL and DLbar lines and hence the BL and BLbar lines. For the NMOS transistors of the NOR gates a W/L ratio of 3.33 (i.e. W = 0.6 µm) is selected, and for the PMOS transistors a W/L ratio of 12 (i.e. W = 7.2 µm) is selected. The W/L ratio of the driving transistors M9 and M10 is selected to be 6.66, which gives a gate width of 1.2 µm.
Table (2) write circuit truth table
6.3 Write buffer
Figure (11) write buffer-gate and transistor level
The write buffer shown in Figure (11) is essential to interface the DIO line to the write circuitry. The external DIO line is fed to the first inverter stage of the buffer. The buffer draws power from the internal power supply line VDD, so the output of its second stage is compatible with the internal logic levels of the chip.
7 Complete SRAM chip schematic
As we have seen earlier, the complete SRAM has a total of 8 blocks, and in each block the cells are arranged in a 128x8 matrix. Consider Figure (12), in which one block of memory is shown. The row select lines R0 to R127 are driven by the row decoder outputs. Since the whole block is treated as one column, so that read and write operate in parallel, a single column line activates the select transistors of all its sub-columns. Thus column decoder output C0 drives the first block, C1 drives the second block and so on, until C7 drives the 8th block. The row decoder outputs R0 to R127 are connected to all memory blocks.
A total of 8 read and write circuits is sufficient for read and write operation. As shown in Figure (12), the read and write circuit I/Os are connected to the BL and BLbar lines of the sub-columns. The read and write circuit connected to the first sub-column of the first memory block also connects to the first sub-column of the second memory block, the third memory block and so on. Similarly, the second read/write circuit is connected to the second sub-column of all the blocks, and this arrangement continues for all other sub-columns. All 8 sets of read and write circuits are active at the same time to access the memory location in whichever row of whichever memory block the address decoders select.
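The organization described above implies a simple split of the address into a block number and a row number. The Python sketch below assumes a 10-bit address in which A0 to A2 feed the column (block) decoder and A3 to A9 feed the row decoder, with A0 as the least significant bit; the bit ordering and the function name are assumptions for illustration.

```python
BLOCKS = 8          # selected by the 3x8 column decoder
ROWS = 128          # selected by the 7x128 row decoder
BITS_PER_WORD = 8   # one sub-column per read/write circuit


def split_address(addr):
    """Split a 10-bit address into (block, row)."""
    block = addr & 0x7            # A0..A2 -> C0..C7
    row = (addr >> 3) & 0x7F      # A3..A9 -> R0..R127
    return block, row


total_bits = BLOCKS * ROWS * BITS_PER_WORD
print(total_bits)                 # 8192
print(split_address(0))           # (0, 0)
print(split_address(1023))        # (7, 127)
print(split_address(0b0000001_010))   # (2, 1)
```

Each address therefore selects one row in one block, and the 8 read/write circuits transfer the 8 bits of that row in parallel.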
Figure (12) schematic of memory block
The difference in access time between this parallel architecture and an architecture in which individual memory bits are accessible has yet to be studied. Nonetheless, in both architectures the delay contributed by the address decoders plays a vital role. The overall switched capacitance can be reduced by dividing the word line into several sub-word lines that are enabled only when addressed. Similarly, the bit line capacitance for every read or write operation can be reduced by partitioning the memory.
8 Conclusions
Supporting circuits such as the sense amplifier, address decoders and I/O circuits have been designed and analyzed with the help of SPICE simulation waveforms. The performance of each individual circuit is found to be satisfactory, and its performance with the SRAM memory cell has been reported in the previous chapter. Quantitative analysis of all these circuits has confirmed their functionality. The range of differential voltage over which the sense amplifier can resolve the original logic levels, and the time required to sense this difference, remain to be studied. Similarly, the capacitance, and hence the delay, that the decoder circuits add when decoding the input has to be analyzed. These studies will help in designing an accurate layout of the supporting circuits and thereby ease the integration of these modules into the SRAM memory layout.