15 January 2008

New Devices-FinFET and SOI MOSFET

Scaling of conventional MOSFET devices deeper into the nanometer regime is threatened by short channel effects. New devices that can overcome these effects are therefore under research and development. The FinFET and the fully depleted (FD) SOI MOSFET are two such devices that promise further scaling. Both reduce short channel effects and parasitic capacitance.


At the submicron level the conventional MOSFET suffers from several issues, such as short channel effects and threshold voltage variation. The FinFET has been proposed to overcome the short channel effects. The structure of the FinFET is shown in Figure (1).

Figure (1) Structure of FinFET [2] [3]

The silicon-on-insulator (SOI) process is used to fabricate the FinFET. This process ensures the ultra-thin dimensions required of the device regions. In a FinFET, the electrical potential throughout the channel is controlled by the gate voltage. This is possible because the gate electrode lies close to the current conduction path between source and drain. These characteristics of the FinFET minimize the short channel effects.

Advantages of the FinFET over its bulk-Si counterpart are as follows:

  1. Conventional MOSFET manufacturing processes can also be used to fabricate the FinFET.
  2. The FinFET provides better area efficiency than the planar MOSFET.
  3. Carrier mobility can be improved by using the FinFET process in conjunction with the strained silicon process.

FinFET device structure

The silicon-on-insulator (SOI) process is used to manufacture the FinFET. A single polysilicon layer is deposited over a fin, so that the polysilicon straddles the fin structure to form perfectly aligned gates. The fin itself acts as the channel, and it terminates in the source on one side and the drain on the other. In a conventional MOSFET, the polysilicon gate is formed over the Si substrate and controls the channel from above only. Straddling the Si fin with the polysilicon gate gives more effective gate control than in the conventional MOSFET.

Because the gate straddles the fin, the channel length is set by the extent of the gate along the fin, in the source-to-drain direction, while the fin width sets the thickness of the channel body. As there are effectively two gates, one on each sidewall of the fin, the channel width is equivalent to twice the height of the fin, i.e. w = 2*h.

A term called “fin pitch” defines the spacing between two adjacent fins. The height of the fin plays the role that the channel width plays in a planar MOSFET. If p is the fin pitch, a fin height of p/2 is required to attain the same area efficiency as the planar device, since the two sidewalls then provide a channel width of 2*(p/2) = p. Practical experiments have shown that the fin height can be greater than p/2 for a fin pitch of p; thus the FinFET achieves better area efficiency than the planar MOSFET.
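The area argument above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function names, the 100 nm pitch and the 80 nm fin height are assumed values, not figures from the references. It compares the channel width that a double-gate fin delivers within one fin pitch with the width a planar MOSFET gets from the same footprint.

```python
def finfet_channel_width(fin_height_nm, n_fins=1):
    """Effective channel width of a double-gate FinFET: W = 2 * h per fin."""
    return 2.0 * fin_height_nm * n_fins


def width_gain_over_planar(fin_height_nm, fin_pitch_nm):
    """Ratio of FinFET channel width to planar channel width for one fin pitch.

    A planar device laid out in one fin pitch has a channel width equal to
    that pitch, so the ratio is 2 * h / pitch.
    """
    return finfet_channel_width(fin_height_nm) / fin_pitch_nm


# Assumed, illustrative dimensions (not taken from the references).
pitch_nm = 100.0
break_even_height_nm = pitch_nm / 2   # h = pitch / 2 gives equal area efficiency
taller_fin_nm = 80.0                  # any h > pitch / 2 beats the planar device

print(width_gain_over_planar(break_even_height_nm, pitch_nm))  # 1.0
print(width_gain_over_planar(taller_fin_nm, pitch_nm))         # 1.6
```

A ratio above 1.0 means the fin provides more channel width than a planar device occupying the same pitch.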

FinFET process technology

SOI technology is used for the fabrication of the FinFET. In SOI technology, an insulator, SiO2, isolates the thin silicon device layer from the bulk substrate. An extremely shallow junction is formed because the insulator limits the junction depth. Dielectric isolation and elimination of the latch-up problem are the advantages of the SOI process.

The FinFET fabrication process steps are shown in Figure (2). Si3N4 and SiO2 are deposited on a thin SOI layer. Electron beam lithography is used to form the silicon fin. The channel dimensions are determined by the accuracy with which the fin is patterned. Polysilicon doped with pentavalent (n-type) impurities, followed by an oxide layer, is deposited over the silicon fin. The source and drain regions are then separated and insulator spacers are formed. The spacer is etched until the silicon fin is reached. Finally, the gate is formed by depositing the gate layer.

Figure (2) FinFET process flow steps [2]

Silicidation is performed to reduce the high source and drain resistance that results from the very thin source and drain layers.

Fully Depleted SOI MOSFET

Structure of FD SOI MOSFET is as shown in the Figure (3).

Figure (3) Cross section of an n (p)-channel thin-film SOI MOSFET [1]

Fabrication Process

The FD SOI MOSFET is fabricated using a standard fully depleted SOI CMOS process with an N+ polysilicon gate. P-type SIMOX substrates with a resistivity of 20 Ω·cm are used as the starting material. Oxidation and oxide strip reduce the film thickness to 100 nm. The LOCOS process is then carried out. A 30 nm thick gate oxide is grown, and boron is implanted to adjust the threshold voltages. Polysilicon is then deposited and doped N-type. Arsenic is implanted to form the source and drain. Oxide deposition and reactive ion etching then form a 150 nm thick spacer. Next, the silicidation process is carried out: a 30 nm thick titanium layer is deposited and annealed, which reduces the sheet resistance. A nitride or oxide layer is deposited, and contact holes are opened to access the device. Formation of a passivation layer completes the process.

Advantages of FD SOI MOSFET over bulk CMOS MOSFET

The advantages of the FD SOI MOSFET over the bulk MOSFET are as follows:

1. Reduced parasitic capacitances and leakage current are achieved due to dielectric isolation.

2. A sharper subthreshold slope, a lower body effect and less mobility degradation from the vertical field are achieved with fully depleted operation of thin-film SOI MOSFETs.

3. Drive capability is increased for low-voltage designs.

4. The SOI CMOS process for the FD SOI MOSFET is simpler than its bulk counterpart. It also gives minimized threshold voltage roll-off, reliable ultra-shallow junctions and complete elimination of the latch-up problem.


[1] Fully-Depleted SOI CMOS Technology for Low-Voltage Low-Power Mixed Digital/Analog/Microwave Circuits, D. Flandre, J. P. Colinge, J. Chen, D. De Ceuster, J. P. Eggermont, L. Ferreira, B. Gentinne, P. G. A. Jespers and A. Viviani, Microelectronics Laboratory, Belgium

[2] FinFET: A nanometer MOSFET structure, David John

[3] Design and Fabrication of Tri-Gated FinFET, Mohammed R. Rahman, 22nd Annual Microelectronic Engineering Conference, May 2004

14 January 2008

What is the difference between a latch and a flip-flop?

  • Both latches and flip-flops are circuit elements whose output depends not only on the present inputs, but also on previous inputs and outputs.
  • Both are hence referred to as "sequential" elements.
  • In electronics, a latch is a kind of bistable multivibrator, an electronic circuit which has two stable states and thereby can store one bit of information. Today the word is mainly used for simple transparent storage elements, while slightly more advanced non-transparent (or clocked) devices are described as flip-flops. Informally, as this distinction is quite new, the two words are sometimes used interchangeably. [wiki]
  • In digital circuits, a flip-flop is a kind of bistable multivibrator, an electronic circuit which has two stable states and thereby is capable of serving as one bit of memory. Today, the term flip-flop has come to generally denote non-transparent (clocked or edge-triggered) devices, while the simpler transparent ones are often referred to as latches. [wiki]
  • A flip-flop is controlled by (usually) one or two control signals and/or a gate or clock signal.
  • Latches are level sensitive, i.e. the output follows the input while the clock (enable) signal is high; as long as the clock is logic 1, the output changes whenever the input changes.
  • Flip-flops are edge sensitive, i.e. a flip-flop stores the input only on a rising or falling edge of the clock.
  • A positive-level latch is transparent while its enable is high, and it latches the last input value present before the enable goes to '0' (i.e. before the clock goes to its low level).
  • A positive-edge flop updates its output only when the clock input changes from '0' to '1' ('1' to '0' for a negative-edge flop).

  • Latches are faster, flip flops are slower.
  • A latch is sensitive to glitches on its enable pin and passes input glitches through while transparent, whereas a flip-flop samples its input only at the clock edge and is therefore far less affected by glitches.
  • Latches take fewer gates (and less power) to implement than flip-flops.
  • A D flip-flop is typically built from two latches in a master-slave configuration.
  • A latch may be clocked or clockless, but a flip-flop is always clocked.
  • For a transparent latch, the D-to-Q propagation delay is generally what matters, while for a flop the clock-to-Q delay and the setup and hold times are very important.
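The level-sensitive versus edge-sensitive behaviour, and the master-slave construction of a flip-flop from two latches, can be illustrated with a small behavioural model. This is a minimal sketch rather than a timing-accurate simulation: the class names, the `simulate` helper and the sample waveforms are all made up for illustration, and each list element represents one sampling instant.

```python
class DLatch:
    """Positive-level-sensitive D latch: transparent while enable is 1."""

    def __init__(self):
        self.q = 0

    def update(self, d, enable):
        if enable:          # transparent: output follows the input
            self.q = d
        return self.q       # opaque: output holds the last captured value


class DFlipFlop:
    """Positive-edge-triggered D flip-flop built from two latches (master-slave)."""

    def __init__(self):
        self.master = DLatch()   # transparent while clk is 0
        self.slave = DLatch()    # transparent while clk is 1

    def update(self, d, clk):
        self.master.update(d, enable=not clk)
        return self.slave.update(self.master.q, enable=clk)


def simulate(clk_wave, d_wave):
    """Apply the same clock and data samples to a latch and a flip-flop."""
    latch, flop = DLatch(), DFlipFlop()
    latch_q, flop_q = [], []
    for clk, d in zip(clk_wave, d_wave):
        latch_q.append(latch.update(d, clk))
        flop_q.append(flop.update(d, clk))
    return latch_q, flop_q


clk = [0, 1, 1, 1, 0, 1, 1]
d   = [1, 1, 0, 1, 0, 0, 1]
latch_q, flop_q = simulate(clk, d)
print(latch_q)  # [0, 1, 0, 1, 1, 0, 1]  follows d whenever clk is 1
print(flop_q)   # [0, 1, 1, 1, 1, 0, 0]  changes only at a 0 -> 1 clock transition
```

While the clock is high, the latch output tracks every change on `d`, whereas the flip-flop output keeps the value that was present just before the rising edge.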

Synthesis perspective: Pros and Cons of Latches and Flip Flops

  • In synthesis of HDL code, inappropriate coding can infer latches instead of flip-flops, e.g. incomplete "if" and "case" statements. This should be avoided, as latches are more prone to glitches.
  • A latch takes less area; a flip-flop takes more area (as a flip-flop is made up of latches).
  • Latches facilitate time borrowing (cycle stealing), whereas flip-flops allow fully synchronous logic.
  • Latches are not friendly with DFT tools, since the enable signal to a latch is not a regular clock that is fed to the rest of the logic. Minimize inferring of latches if your design has to be made testable. To ensure testability, OR the "enable" and "scan_enable" signals and feed the output of the OR gate to the enable port of the latch. [ref]
  • Most EDA software tools have difficulty with latches. Static timing analyzers typically make assumptions about latch transparency. If one assumes the latch is transparent (i.e. triggered by the active time of the clock, not just by the clock edge), then the tool may find a false timing path through the input data pin. If one assumes the latch is not transparent, then the tool may miss a critical path.
  • If the target technology supports a latch cell, then race condition problems are minimized. If the target technology does not support a latch, then the synthesis tool will build it from basic gates, which is prone to race conditions. You then need to add redundant logic to overcome this problem. However, during optimization the synthesis tool can remove this redundant logic, which creates endless problems for the design team.
  • Due to the transparency issue, latches are difficult to test. For scan testing, they are often replaced by a latch-flip-flop compatible with the scan-test shift register. Under these conditions, a flip-flop would actually be less expensive than a latch. A good article on the problems with latches was published in EE Times some time ago.

  • Flip flops are friendly with DFT tools. Scan insertion for synchronous logic is hassle free.

01 January 2008

Asynchronous FIFO-Clock Generation Using DCM

Clock Generation using Digital Clock Manager (DCM)

Note: Diagram numbers are continued from the previous post.

The Digital Clock Manager (DCM) wizard is used to generate the clocks required by the FIFO. A DCM can implement a clock delay-locked loop, a digital frequency synthesizer and a digital phase shifter.

Below mentioned DCM ports are used in this design:

  • CLKIN: The CLKIN pin is the clock input to the DCM and is always enabled. CLKIN provides the source clock to the DCM. In this design the 50 MHz clock of the Spartan 3 development board is used as the input to CLKIN.
  • RST (RST_IN): The RST pin is the reset input to the DCM. If RST is not enabled, it is tied to GND.
  • CLKDV: The divided output of the DCM is available at the CLKDV pin. The option selected in the ‘Divide by Value’ list determines the frequency of the CLKDV output clock. In this FIFO design the input clock is divided by 5 to get a 10 MHz write clock. The 50 MHz input clock is taken directly as the read clock.
  • CLKFX: The CLKFX output pin provides the output of the DCM's fully digital, dedicated frequency synthesizer. The output frequency is the input clock frequency multiplied by M/D, where M is the multiplier (numerator) and D is the divisor (denominator). M and D are chosen to give a 100 MHz clock. This output could be divided by 2 using another DCM to get a 50 MHz read clock; instead, for the sake of simplicity, the design uses the input clock itself directly as the read clock.
  • CLK0: The output frequency is the same as that of the CLKIN input. This output is also used for on-chip or off-chip synchronization.
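The clock frequencies produced by the ports listed above follow from simple ratios, which the sketch below works through. The helper names are hypothetical, and the pair M = 2, D = 1 is an assumption: the post only says that M and D were calculated for 100 MHz, so any pair with M/D = 2 would give the same result.

```python
def clkdv_mhz(clkin_mhz, divide_by):
    """CLKDV output frequency: the input clock divided by the 'Divide by Value'."""
    return clkin_mhz / divide_by


def clkfx_mhz(clkin_mhz, m, d):
    """CLKFX output frequency: the input clock scaled by M / D."""
    return clkin_mhz * m / d


CLKIN_MHZ = 50.0                                  # board clock used in this design

write_clk = clkdv_mhz(CLKIN_MHZ, divide_by=5)     # write clock from CLKDV
fx_clk = clkfx_mhz(CLKIN_MHZ, m=2, d=1)           # assumed M and D (M/D = 2)
read_clk = CLKIN_MHZ                              # input clock used directly

print(write_clk, fx_clk, read_clk)                # 10.0 100.0 50.0
```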

Figure (15) RTL schematic of DCM module

The RTL schematic of the DCM module is shown in the Figure (15). The test bench simulated waveform (post-translate model from Modelsim v.5.8 simulator launched by Xilinx ISE) for the DCM is shown in Figure (16).

Figure (16) DCM simulation waveforms

Reset is held high for around 20 ns, so CLKDV_OUT and CLKFX_OUT are zero. Reset is then driven low. The outputs are generated only after 10 input clock cycles. CLKFX_OUT is twice the input clock frequency (50 MHz × 2 = 100 MHz) and CLKDV_OUT is one fifth of it (50 MHz / 5 = 10 MHz).

FIFO with DCM: Synthesis and analysis

The Verilog code generated for the DCM by the architectural wizard is instantiated in the body of the top-level module fifo_top.v.

a_fifo5 a_fifo55(d_out,f_full_flag,f_half_full_flag,f_empty_flag,
                 f_almost_full_flag,f_almost_empty_flag,d_in,r_en,w_en,CLK0_OUT,CLKDV_OUT,reset); //instantiate fifo


LOCKED_OUT); //instantiate DCM

The FIFO module a_fifo5.v is also instantiated in the top-level module, and binary counters are instantiated inside a_fifo5.v. The complete hierarchy of the design is shown in Figure (17). The top-level module instantiates the modules dcm_fifo and a_fifo5: dcm_fifo generates the clocks required by the design, while a_fifo5 implements the FIFO memory controller. The B_counter module is instantiated twice in a_fifo5 to obtain the read and write address generators.

Figure (17) Hierarchical structure of the FIFO design

The FIFO top-level module fifo_top.v is verified with the help of a test bench generated by Xilinx ISE, and the behavioral simulation results are shown in Figure (19). The RTL schematic generated for the top module is shown in Figure (18). The schematic includes both the DCM module and the FIFO module, connected to each other as the design requires. Outputs x, y and z come from the DCM module and are not used. The simulation results confirm that the FIFO works without any error. The input clock CLKIN_IN itself becomes the read clock. Since w_clk is generated internally, it is not shown in the simulation waveform.

Figure (18) RTL schematic of DCM and FIFO

Figure (19) FIFO top module simulation waveform

The FPGA resources used by the complete design are summarized below. This device utilization summary is part of the synthesis report generated by the Xilinx ISE "Synthesize - XST" process.

Device utilization summary:

Selected Device : 3s200ft256-5

  Number of Slices:              28 out of 1920    1%
  Number of Slice Flip Flops:    23 out of 3840    0%
  Number of 4 input LUTs:        54 out of 3840    1%
    Number used as logic:        38
    Number used as RAMs:         16
  Number of IOs:                 29
  Number of bonded IOBs:         29 out of  173   16%
  Number of GCLKs:                3 out of    8   37%
  Number of DCMs:                 1 out of    4   25%

Timing Summary:

Speed Grade: -5

  Minimum period: 5.091ns (Maximum Frequency: 196.444MHz)
  Minimum input arrival time before clock: 6.185ns
  Maximum output required time after clock: 11.059ns
  Maximum combinational path delay: 6.662ns

From the timing summary it can be observed that the maximum operating frequency has increased from around 110 MHz to 196.444 MHz. This can be attributed to the DCM, which provides a stable clock signal to the hardware resources. Sixteen LUTs are used as distributed dual-port RAM. Synthesis of a_fifo5 alone used around 35 slices, but the complete implementation uses only 28. The total number of flip-flops used is also reduced.
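The figures in the report above can be cross-checked with a short script. This is only a sanity check written for this post, with hypothetical helper names: it assumes the percentages are truncated to whole numbers, which matches every line of the report, and it derives the maximum frequency from the printed minimum period. The small gap to the reported 196.444 MHz is presumably because the period is printed rounded to three decimals.

```python
def percent_used(used, available):
    """Integer utilization percentage, truncated as the report appears to do."""
    return (100 * used) // available


# (used, available, percentage printed in the report)
report = {
    "Slices":           (28, 1920, 1),
    "Slice Flip Flops": (23, 3840, 0),
    "4 input LUTs":     (54, 3840, 1),
    "bonded IOBs":      (29, 173, 16),
    "GCLKs":            (3, 8, 37),
    "DCMs":             (1, 4, 25),
}

for name, (used, available, reported) in report.items():
    assert percent_used(used, available) == reported, name

# LUTs used as logic plus LUTs used as RAM add up to the LUT total.
assert 38 + 16 == 54

# Maximum frequency in MHz is the reciprocal of the minimum period in ns.
min_period_ns = 5.091
max_freq_mhz = 1000.0 / min_period_ns
print(round(max_freq_mhz, 1))  # 196.4
```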

Merits and demerits of the design

One of the important advantages of the proposed design is that it performs efficiently with synchronous as well as asynchronous clocks, provided the frequencies stay within the maximum operating frequency. All status flags are asserted and deasserted with zero clock cycle delay. The design uses simple 4-bit binary counters to address the FIFO memory. Synchronization between the clock domains is achieved with the pointer difference concept, which is very easy to understand and implement. However, the design may be slower than a FIFO that uses the gray pointer approach; this remains to be tested.

This ends the article series on the asynchronous FIFO I designed. There are many methods by which the design can be implemented. I shared what I designed. Your comments are always welcome!
