www.ijpera.com ISSN: 2456-2734,

Volume 7, Issue 1 (May 2022), PP 117-125

# A Novel Low Power Technique for Contention Current Reduction in Wide Fan-In FinFET Based Dynamic OR Gate

# Arpana Tripathi and Ashish Dubey

VLSI Design Lab, Electronics & Comm. Department,

Shri Ram College of Engineering and Management, Morena, Gwalior-474010, Madhya Pradesh, India

Abstract—Register file structures in contemporary FinFET-based microprocessors typically employ wide fan-in domino FinFET OR structures. Weak keepers have traditionally been used to resolve the low noise margin problem of FinFET-based dynamic logic design. Aggressive scaling trends in CMOS as well as FinFET design have decreased the effectiveness of this weak P-FinFET keeper. On the other hand, a large-sized P-FinFET keeper used in a wide fan-in dynamic OR gate results in contention between the pull-down network (PDN) and the keeper. Because of this contention there is an unnecessary increase in power dissipation and a loss in performance. In this paper a new FinFET-based keeper design is proposed that is capable of reducing the contention between the keeper and the PDN, and is therefore capable of reducing power dissipation and improving performance. Simulation results at 45nm show that the power dissipation and delay have been reduced by 40% and 35% respectively in comparison to the wide fan-in dynamic OR gate with a conventional keeper.

**Keywords-**Dynamic CMOS logic; Noise immunity; Keeper Transistor

## I. INTRODUCTION

The wide fan-in dynamic OR gate forms an important structure in the critical path of modern high-speed microprocessors [1]. However, aggressive scaling trends in CMOS design [2], [3] lead to variation in the leakage current of gates. In such a scenario, a large-sized PMOS keeper is used to maintain an adequate noise margin for the wide fan-in OR gate; however, this large keeper results in significant contention between the pull-down network (PDN) and the keeper. This contention results in an unnecessary increase in power dissipation and delay. An effort has been made in this work to reduce the contention, resulting in low power dissipation and less delay.

In this section, the significance of the wide fan-in domino logic OR structure is first explained, and then the design issues with the wide fan-in domino OR structure are discussed. Later in this section a prior technique is discussed which is also capable of reducing the contention current to some extent. Finally, in the next section, a better contention current reduction scheme is proposed.

#### A. Requirement of wide fan-in OR structure in register file design

Fig. 1 shows the architecture of the ARM Cortex-A9 microprocessor. The high-performance ARM Cortex<sup>TM</sup>-A series processors are used as core processors in almost all the smart devices in use today, such as the iPhone, iPad, and mobile phones [4]. In this processor, two register files are deployed in the data path, which are boxed for emphasis. The register files are used in nearly every clock cycle, since to execute every instruction, data must either be read from or written to the register file. Therefore register files form an essential module in high-speed microprocessors.



Fig. 1. Register files deployed in ARM Cortex<sup>TM</sup>-A microprocessor [1].



Fig. 2. (a) Block diagram of a simplified register file and (b) read port implemented using 4x1 multiplexer (MUX) [1]

Fig. 2(a) shows the block diagram of such a register file, consisting of static RAM registers, a read port and a write port [1]. These ports are implemented using decoders and encoders, whose implementation is shown in Fig. 2(b). This figure illustrates a simple 4x1 multiplexer with four input lines. Note that an actual ARM microprocessor would have a 16- or 32-bit register file and would therefore need a 16- or 32-input OR structure. Therefore, the wide fan-in OR structure forms a vital structure in such a high-speed microprocessor. However, designing a highly robust wide fan-in dynamic OR gate is a difficult task in the sub-100nm regime [8]. High contention current is the factor that makes building the wide OR gate difficult, as discussed in the following sections.
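The link between register count and OR fan-in can be illustrated with a small behavioral sketch. It assumes a one-hot select scheme in which each output bit of the read port is the OR of per-register AND terms; the function names are hypothetical and the model says nothing about the transistor-level implementation in [1].

```python
def one_hot_decode(address, n_registers):
    """Return one-hot select lines for the addressed register."""
    if not 0 <= address < n_registers:
        raise ValueError("address out of range")
    return [1 if i == address else 0 for i in range(n_registers)]


def wide_or(terms):
    """Logical OR over all terms; the fan-in equals the number of terms."""
    out = 0
    for term in terms:
        out |= term
    return out


def read_bit(address, stored_bits):
    """Read one bit position from a register file.

    stored_bits[i] is the bit held by register i at this bit position.
    The OR gate needs one input per register, so a 16-entry register
    file leads to a 16-input OR structure.
    """
    select = one_hot_decode(address, len(stored_bits))
    return wide_or(s & b for s, b in zip(select, stored_bits))


# Hypothetical contents of one bit column of a 4-entry register file.
column = [0, 1, 1, 0]
for addr in range(len(column)):
    print(addr, read_bit(addr, column))
```

In this view the OR fan-in grows linearly with the number of registers, which is why 16- and 32-entry register files call for 16- and 32-input OR structures.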

#### 1.1 FinFET Structure

With this structure, the gate is able to fully deplete the channel, and thus has much better electrostatic control over the channel. FinFETs can be classified by gate structure or by substrate type. Depending on the gate structure, a FinFET can be either a shorted-gate (SG) or an independent-gate (IG) device. As shown in Figure 2, SG devices have their left and right gates connected together in a wrap-around manner; these devices can be used in place of planar devices, which also have a gate, a source, and a drain (three-terminal devices).

In IG FinFETs, the top section of the wrap-around gate is etched out, leaving two independent gates that can be controlled separately [9]. IG FinFETs offer more design options, but their fabrication costs are also generally higher. SOI FinFETs are built on SOI wafers, have a lower parasitic capacitance, and have very little leakage. Bulk FinFETs are built on bulk substrates, such as silicon wafers. Because designers are more familiar with bulk processes, fabrication costs are substantially lower, and heat dissipation is better than in SOI FinFETs, bulk FinFETs are commonly used for most digital applications. Both conventional bulk and SOI wafer fabrication processes are compatible with those of FinFET devices.

#### 1.2 Device Geometry and Sizing

For FinFET technologies, device width is quantized into whole fins, as opposed to the continuous transistor width available in planar technologies. Figure 1(b) illustrates the FinFET device with its effective gate width of $n(2H_{fin} + t)$, where $n$ is the number of fins, $t$ is the fin width, and $H_{fin}$ is the fin height. In a FinFET device the gate is designed to provide strong electrostatic control over the channel, and etching uniformity requirements mean that the fin dimensions (e.g., height $H_{fin}$) are not under the designer's control, so the device width cannot be chosen as freely as in a planar technology. Using more than one fin gives wider transistors with higher on-currents, but the range of choices is limited. This is known as the width quantization issue. Because of this quantization, it becomes hard to size devices, particularly in analog designs and SRAMs. At some point during the design phase, designers will have to adapt to this new constraint. An alternative solution would be for the foundry to provide designers with several versions of FinFETs with different fin heights. Work in [15] explored the design space of FinFETs with double fin heights and concluded that the lack of continuous sizing can be compensated; however, there are many unknowns regarding both fabrication cost and manufacturing difficulty, so this approach is unlikely to become widely available. Since most cell designs can be tailored to use a restricted choice of device widths, width quantization may not be a major problem for digital circuits.
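The width quantization described above can be made concrete with a short sketch. Only the expression $n(2H_{fin} + t)$ comes from the text; the function names and the fin dimensions (30 nm fin height, 10 nm fin width) are illustrative assumptions, not values used in this work.

```python
import math


def effective_width_nm(n_fins, h_fin_nm, t_fin_nm):
    """Effective gate width W = n * (2 * H_fin + t) of a multi-fin FinFET."""
    return n_fins * (2.0 * h_fin_nm + t_fin_nm)


def fins_for_target_width(target_nm, h_fin_nm, t_fin_nm):
    """Smallest whole number of fins whose effective width meets the target."""
    per_fin_nm = 2.0 * h_fin_nm + t_fin_nm
    return max(1, math.ceil(target_nm / per_fin_nm))


# Hypothetical fin dimensions, chosen only for illustration.
H_FIN_NM = 30.0
T_FIN_NM = 10.0

target_nm = 200.0
n = fins_for_target_width(target_nm, H_FIN_NM, T_FIN_NM)
achieved_nm = effective_width_nm(n, H_FIN_NM, T_FIN_NM)
print(f"target {target_nm} nm -> {n} fins, achieved {achieved_nm} nm")
```

Because the fin count must be an integer, the achieved width generally overshoots the target; this overshoot is the sizing granularity that the designer has to accept.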



Fig. 2. Cross-sectional differences between the two structures: (a) bulk FinFET and (b) SOI FinFET.

#### B. Contention problem in large fan-in OR structure

The contention problem in a wide fan-in dynamic OR gate can be explained with the conventional wide fan-in OR gate with a large keeper, as shown in Fig. 3. Consider the evaluation phase, when the clock is at logic '1', the dynamic node is charged to logic '1' and the output is at logic '0'. The logic '0' output keeps the keeper ON during the evaluation phase, compensating for any leakage through the pull-down network. Now if one of the inputs switches from logic '0' to '1', as shown in Fig. 3, then that NMOS turns ON and slowly tries to pull the dynamic node down to logic '0'. The slow discharge of the dynamic node is due to the fact that in a wide fan-in OR gate a large parasitic capacitance appears at the dynamic node because of the large number of NMOS transistors connected in the pull-down network. Due to such a slow discharge, the keeper remains ON until the dynamic node discharges to a value at which the static inverter switches state, the output becomes logic '1' and the keeper turns OFF. During this period the keeper tries to pull up the dynamic node while the ON NMOS tries to pull it down. The current flowing through the keeper during this time is known as contention current [5]. Such contention results in an unnecessary increase in delay and static power dissipation.
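The effect of contention on delay and energy can be approximated with a first-order, constant-current sketch. This is a simplification under stated assumptions, not the SPICE-level behavior: the pull-down and keeper currents are treated as constants, and the node capacitance, voltage swing and pull-down current below are hypothetical. Only the 16 µA keeper current and the 1 V supply echo figures quoted elsewhere in this paper.

```python
def discharge_time_s(c_dyn_f, delta_v, i_pulldown_a, i_keeper_a=0.0):
    """Time for the dynamic node to drop by delta_v.

    The keeper current opposes the pull-down current, so only the
    difference between the two discharges the node capacitance.
    """
    net_current_a = i_pulldown_a - i_keeper_a
    if net_current_a <= 0.0:
        raise ValueError("keeper overpowers the pull-down device")
    return c_dyn_f * delta_v / net_current_a


def keeper_energy_j(vdd_v, i_keeper_a, duration_s):
    """Energy drawn from the supply through the keeper during contention."""
    return vdd_v * i_keeper_a * duration_s


# Hypothetical values for illustration only.
C_DYN_F = 10e-15       # parasitic capacitance at the dynamic node
DELTA_V = 0.5          # swing needed to trip the output inverter
I_PULLDOWN_A = 50e-6   # current of the single ON NMOS
I_KEEPER_A = 16e-6     # contention current through the keeper
VDD_V = 1.0

t_no_keeper = discharge_time_s(C_DYN_F, DELTA_V, I_PULLDOWN_A)
t_with_keeper = discharge_time_s(C_DYN_F, DELTA_V, I_PULLDOWN_A, I_KEEPER_A)
wasted_j = keeper_energy_j(VDD_V, I_KEEPER_A, t_with_keeper)

print(f"discharge time without keeper: {t_no_keeper * 1e12:.1f} ps")
print(f"discharge time with keeper:    {t_with_keeper * 1e12:.1f} ps")
print(f"energy through keeper:         {wasted_j * 1e15:.2f} fJ")
```

Under this model a stronger keeper both lengthens the discharge and increases the energy drawn during it, which is the trade-off against noise margin discussed above.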



Fig. 3. Conventional wide fan-in OR gate with one of the inputs switching from logic '0' to '1' during evaluation phase

To illustrate this contention current, a conventional 16-input domino OR structure has been simulated for the worst-case delay condition, i.e. with a single input at '1'. This simulation is shown in Fig. 4 for 50nm technology at a frequency of 1.6 GHz. The rest of the simulation setup is described in section III.

The high level of the clock signal indicates the evaluation phase, while the low level indicates the precharge phase. The first waveform shows the contention current flowing through the keeper transistor. The simulation result clearly shows that the contention current is of the order of 16 $\mu A$ during the initial period of the evaluation phase. This contention current unnecessarily increases the static power dissipation of the design.

From the above discussion it is clear that the contention current is an important issue for wide fan-in gates. The proposed design deals with the reduction of contention current, as given in section II.



Fig. 4. Simulation result of the conventional 16-input dynamic OR structure...

#### C. A high performance low contention (HPLC) technique for wide fan-in dynamic OR gate

In this section a previously proposed technique [6] is reviewed and analyzed. Later, in simulation and analysis section III, this technique is used for comparison with the proposed technique.

The high performance low contention (HPLC) technique based wide fan-in dynamic OR gate is illustrated in Fig. 5. In this technique the output node is connected to the footer node of the OR gate, as shown in Fig. 5. With this topology the output node is maintained at a voltage higher than the ground voltage; this results in a reduction of keeper gate overdrive and hence in a reduction of contention current.

In this technique the strength of the keeper is reduced during the initial period of the evaluation phase to reduce the contention current. However, the keeper is still ON during the initial period of the evaluation phase, and hence some amount of contention current still flows. Moreover, the output node never reaches ground-level voltage, which may result in an erroneous output being propagated to the next stage of the OR gate.



Fig. 5. High performance low contention wide fan-in dynamic OR gate [6]

## II. PROPOSED TECHNIQUE

The proposed design illustrated in Fig. 6 aims at reducing the unwanted contention current in wide fan-in dynamic OR gates. In this section the design, analysis and operation of the proposed technique are described.

#### A. Design and Analysis:

The proposed wide fan-in dynamic OR gate with reduced contention is shown in Fig. 6. In the conventional keeper design the keeper is operated by sensing the output node; in the proposed design, however, the keeper operates by sensing the dynamic node and the clock.



Fig. 6. Proposed keeper design for FinFET based wide fan-in OR gate.

The main principle used here to reduce the contention current is to keep the keeper OFF during the period when there is a chance of contention between the pull-down network and the keeper. This is achieved by delaying the clock by a sufficient period and operating the keeper with this delayed clock. This ensures that no contention current can flow through the keeper. The delay is obtained by means of a buffer included in the proposed design. The delayed clock obtained from the buffer is applied to the gates of PMOS M18 and NMOS M19.

From Fig. 4 the approximate duration of the contention current can be easily obtained. To avoid the contention problem, the buffer transistors are designed such that the buffer has a delay less than or equal to this duration. An essential design constraint is that the delay of the buffer must not exceed this duration, since beyond it the keeper would remain OFF when it is needed during the evaluation phase, degrading the noise margin. Therefore, to obtain an appropriate delay from the buffer, a super buffer design has been used. Such a super buffer design is shown in Fig. 7 and has been used and explained in [5].



Fig. 7. Super Buffer Design

The sizes of the transistors needed to obtain the required delay can be obtained from equation 1.

$$\tau_{total} = (N+1)\tau_0 \left( \frac{C_d + \alpha C_g}{C_d + C_g} \right) \qquad (1)$$

Where:

 $\tau_{total}$  = Total delay of the buffer

N = Number of stages

 $\tau_0$  = Delay of first inverter

 $C_g$  = Gate capacitance of first inverter

 $C_d$  = Drain capacitance of first inverter

$\alpha$ = Scaling factor

Firstly the size of the first inverter is fixed, and then, with the appropriate value of total delay, the value of the scaling factor $\alpha$ is obtained from equation 1. With this value of the scaling factor the size of the second inverter is obtained as:

$$\left(\frac{W}{L}\right)_{NMOS\,(inverter\ 2)} = \alpha \left(\frac{W}{L}\right)_{NMOS\,(inverter\ 1)}$$

and

$$\left(\frac{W}{L}\right)_{PMOS\,(inverter\ 2)} = \alpha \left(\frac{W}{L}\right)_{PMOS\,(inverter\ 1)}$$
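The sizing procedure can be sketched numerically by rearranging equation 1 for $\alpha$, giving $\alpha = \left[\tau_{total}(C_d + C_g)/((N+1)\tau_0) - C_d\right]/C_g$. The function names and all numerical values below are hypothetical and serve only to show the order of the steps; they are not the values used in this work.

```python
def buffer_delay(alpha, tau_0, n_stages, c_g, c_d):
    """Total buffer delay from equation 1."""
    return (n_stages + 1) * tau_0 * (c_d + alpha * c_g) / (c_d + c_g)


def scaling_factor(tau_total, tau_0, n_stages, c_g, c_d):
    """Solve equation 1 for the scaling factor alpha."""
    ratio = tau_total / ((n_stages + 1) * tau_0)
    alpha = (ratio * (c_d + c_g) - c_d) / c_g
    if alpha <= 0.0:
        raise ValueError("requested delay is too small for these parameters")
    return alpha


# Hypothetical first-inverter parameters and target delay.
TAU_0 = 10e-12          # delay of the first inverter
N_STAGES = 1            # N in equation 1
C_G = 1.0e-15           # gate capacitance of the first inverter
C_D = 0.5e-15           # drain capacitance of the first inverter
TAU_TARGET = 45e-12     # must not exceed the contention duration

alpha = scaling_factor(TAU_TARGET, TAU_0, N_STAGES, C_G, C_D)

# The second inverter is the first one scaled by alpha (W/L ratios).
wl_nmos_1, wl_pmos_1 = 2.0, 4.0   # hypothetical W/L of inverter 1
wl_nmos_2 = alpha * wl_nmos_1
wl_pmos_2 = alpha * wl_pmos_1

print(f"alpha = {alpha:.3f}")
print(f"inverter 2 W/L: NMOS {wl_nmos_2:.2f}, PMOS {wl_pmos_2:.2f}")
```

Substituting the computed $\alpha$ back into `buffer_delay` returns the target delay, which gives a quick consistency check before the sizes are carried into simulation.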

#### B. Operation:

Like conventional dynamic logic, the proposed structure has two phases of operation, namely the precharge phase, when the CLOCK signal is low, and the evaluation phase, when the clock signal is high.

In the precharge phase transistor M5 is ON and M6 is OFF, ensuring that the keeper M1 is OFF during the precharge phase, when it is not needed. During the evaluation phase M5 stays ON and turns OFF only after the delay obtained from the buffer, so that the keeper stays OFF at the start of the evaluation phase. Now if any one of the inputs becomes high, the dynamic node slowly discharges and the keeper M1 stays OFF during this period, ensuring no contention current.

On complete discharge of the dynamic node, M7 turns OFF, keeping the keeper OFF for the rest of the evaluation phase. In the other case, when all the inputs are low in the evaluation phase, M7 stays ON while M6 turns ON after the delay from the buffer. This turns the keeper transistor ON for the rest of the evaluation phase, when it is needed.

This operation of the proposed design is oriented towards reducing contention, as illustrated by the simulation results in section III.
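The operation described above can be summarized as a behavioral rule: the keeper conducts only in the evaluation phase, only after the buffer delay has elapsed, and only if the dynamic node is still high. The sketch below encodes that rule; it is an abstraction of the description, does not model the individual transistors, and the function name and the 100 ps buffer delay are hypothetical.

```python
def keeper_is_on(clock_high, time_in_eval_ps, buffer_delay_ps, dynamic_node_high):
    """Behavioral keeper state for the delayed-clock scheme.

    clock_high        -- True in the evaluation phase, False in precharge
    time_in_eval_ps   -- time elapsed since the evaluation phase started
    buffer_delay_ps   -- delay of the clock buffer
    dynamic_node_high -- True while the dynamic node has not discharged
    """
    if not clock_high:
        return False              # precharge: keeper not needed
    if time_in_eval_ps < buffer_delay_ps:
        return False              # contention window: keeper held OFF
    return dynamic_node_high      # afterwards: ON only if still needed


BUFFER_DELAY_PS = 100.0  # hypothetical value

cases = [
    ("precharge", False, 0.0, True),
    ("early evaluation", True, 50.0, True),
    ("late evaluation, an input high", True, 150.0, False),
    ("late evaluation, all inputs low", True, 150.0, True),
]
for label, clk, t_ps, node_high in cases:
    state = keeper_is_on(clk, t_ps, BUFFER_DELAY_PS, node_high)
    print(f"{label}: keeper {'ON' if state else 'OFF'}")
```

The rule also shows why the buffer delay must not exceed the contention duration: for as long as the second condition holds the keeper is OFF even when all inputs are low, leaving the dynamic node unprotected against noise.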

## III. SIMULATION RESULTS

To study the relative performance of the proposed design in comparison with previous keeper designs, a 16-input wide fan-in OR gate is implemented using the proposed technique. For comparison, two 16-input wide fan-in OR gates are also designed with the conventional keeper technique and the high performance low contention (HPLC) technique [6], with equal transistor sizes and specifications. The simulation has been carried out with the Tanner Spice V14.1 simulator using PTM 50nm technology [] with a supply voltage of 1 V at room temperature. The 16-input dynamic OR gate is implemented for an ARM Cortex-A9 microprocessor as described in section I; therefore the operating frequency of the implemented design should match that of the application. ARM Cortex-A series processors have a maximum operating frequency of 1.6 GHz [4], and for this reason the operating frequency of the implemented 16-input dynamic OR gate is 1.6 GHz. The transistors have been sized such that a unity gain DC noise (UGDN) [9] of 0.2 V is maintained. The proposed design has been compared with the conventional keeper and the HPLC technique [6] described in section I-C. The comparison has been made on the basis of contention current, power and delay under the worst-case delay condition, when one of the inputs in the PDN is equal to logic '1'.

To demonstrate the reduction in contention current, the circuit has been simulated under the worst-case delay condition. Under this condition the current through the keeper has been measured and is illustrated in Fig. 8. On comparing the graphs of Fig. 4 and Fig. 8 it can be observed that the contention current has been reduced from 16 $\mu A$ to 5 $\mu A$.

TABLE I. AVERAGE CONTENTION CURRENT, DELAY AND POWER DISSIPATION COMPARISON OF THE PROPOSED TECHNIQUE WITH THE HPLC [6] AND CONVENTIONAL TECHNIQUE

| Technique                                  | Average contention current (µA) |      |       |       |      | Switching delay (ps) |         |         |        |        | Power dissipation (µW) |       |       |       |       |
|--------------------------------------------|---------------------------------|------|-------|-------|------|----------------------|---------|---------|--------|--------|------------------------|-------|-------|-------|-------|
| Vdd supply (V)                             | 1                               | 0.9  | 0.8   | 0.7   | 0.6  | 1                    | 0.9     | 0.8     | 0.7    | 0.6    | 1                      | 0.9   | 0.8   | 0.7   | 0.6   |
| Conventional keeper design                 | 4.8                             | 4.38 | 3.86  | 3.64  | 3.5  | 209.52               | 204.76  | 201.06  | 198.34 | 185.66 | 4.8                    | 3.951 | 3.088 | 2.54  | 2.1   |
| High performance low contention design [6] | 1.8                             | 1.45 | 1.125 | 0.841 | 0.79 | 139.94               | 131.455 | 127.135 | 122.5  | 115.56 | 1.8                    | 1.305 | 0.9   | 0.588 | 0.474 |
| Proposed design                            | 1.11                            | 0.9  | 0.73  | 0.66  | 0.51 | 136.8                | 127.396 | 121.52  | 120.96 | 111.5  | 1.11                   | 0.81  | 0.584 | 0.46  | 0.306 |
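Relative improvements can be recomputed directly from the entries of Table I. The sketch below does this for the 1 V column only; the dictionary layout and function name are hypothetical, while the numbers are copied from the table.

```python
# Table I entries at Vdd = 1 V.
TABLE_I_AT_1V = {
    "conventional": {"current_uA": 4.8, "delay_ps": 209.52, "power_uW": 4.8},
    "hplc": {"current_uA": 1.8, "delay_ps": 139.94, "power_uW": 1.8},
    "proposed": {"current_uA": 1.11, "delay_ps": 136.8, "power_uW": 1.11},
}


def percent_reduction(baseline, improved):
    """Reduction of `improved` relative to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


proposed = TABLE_I_AT_1V["proposed"]
for reference in ("conventional", "hplc"):
    ref = TABLE_I_AT_1V[reference]
    for metric in ("current_uA", "delay_ps", "power_uW"):
        reduction = percent_reduction(ref[metric], proposed[metric])
        print(f"proposed vs {reference}, {metric}: {reduction:.1f}% lower")
```

The same function can be applied to the other supply-voltage columns to see how the improvement varies with Vdd.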



Fig. 8. Simulation result of the proposed technique showing the reduced contention current through keeper.

This comparison shows that there is a drastic reduction in contention current, by almost 66%, as compared to the HPLC design [6]. Average contention current with variation in supply voltage for the three designs is illustrated in Fig. 9. Since a reduction in contention current reduces the delay incurred in discharging the dynamic node, the delay under the worst-case condition is also reduced. Fig. 10 verifies that the proposed design has a reduced delay with supply voltage variation as compared to the conventional keeper design and the HPLC design.

For this evaluation, the power dissipation of all the power sources has been considered. As seen in Fig. 8, the contention current has been reduced, which has a direct impact on the reduction of power dissipation in the proposed design. Fig. 11 shows that the proposed design has the lowest power dissipation with supply voltage variation compared to the conventional keeper design and the HPLC design [6]. Such a reduction is due to the reduction in contention current. Average contention current, delay and power dissipation for the three techniques are summarized in Table I.



Fig.9. Power dissipation comparison at 50nm.



Fig.10. Power dissipation comparison at 50nm.



Fig. 11. Delay comparison of the Conventional keeper, Process variation tolerant design and proposed design at 50nm.

## IV. CONCLUSION

Wide fan-in dynamic OR gates implemented in high-speed microprocessors such as ARM Cortex microprocessors have a major problem: contention current. The proposed design achieves a reduction in contention current, as illustrated by the simulation results. The proposed design has almost 25% and 35% reduction in delay compared to the HPLC technique [6] and the conventional keeper design respectively over the whole process variation range. In addition, the proposed design has 33% and 40% reduction in power dissipation compared to the HPLC design and the conventional keeper design respectively. Hence the proposed technique can be an effective method for reducing the contention problem in wide fan-in dynamic OR gates.

## REFERENCES

- [1]. H. F. Dadgour and K. Banerjee, "A Novel Variation-Tolerant Keeper Architecture for High-Performance Low-Power Wide Fan-In Dynamic OR Gates," *IEEE Transactions on VLSI Systems*, vol. 18, no. 11, pp. 1567-1577, Nov. 2010.
- [2]. P. Srivastava, A. Pua, and L. Welch, "Issues in the design of domino logic circuits," Proceedings of the 8th Great Lakes Symposium on VLSI, pp. 108-112, 19-21 Feb. 1998.
- [3]. H. Mahmoodi, S. Mukhopadhyay, and K. Roy, "High performance and low power domino logic using independent gate control in double-gate SOI MOSFETs," Proceedings of the 2004 IEEE International SOI Conference, pp. 67-68, 4-7 Oct. 2004.
- [4]. J. Koppanalil, G. Yeung, D. O'Driscoll, S. Householder, and C. Hawkins, "A 1.6 GHz dual-core ARM Cortex A9 implementation on a low power high-K metal gate 32nm process," 2011 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), pp. 1-4, 25-28 April 2011.
- [5]. Rakesh Gnana David Jeyasingh, Navakanta Bhat, and Bharadwaj Amrutur, "Adaptive Keeper Design for Dynamic Logic Circuits Using Rate Sensing Technique," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 2, pp. 295-304, Feb. 2011.
- [6]. P. Meher and K. K. Mahapatra, "A High-Performance Circuit Technique For CMOS Dynamic Logic," IEEE Conf. on Very Large Scale Integr. (VLSI) Syst., 2011, pp. 1080–1085.
- [7]. Lei Wang, R. K. Krishnamurthy, K. Soumyanath, and N. R. Shanbhag, "An energy-efficient leakage-tolerant dynamic circuit technique," Proceedings of the 13th Annual IEEE International ASIC/SOC Conference, pp. 221-225, 2000.
- [8]. S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi, and V. De, "Parameter variations and impact on circuits and microarchitecture," in *Proc. DAC*, pp. 338–342, Dec 2003.
- [9]. A. Alvandpour, R. Krishnamurthy, K. Soumyanath, and S. Borkar, "A conditional keeper technique for sub-0.13nm wide dynamic gates," in *Proc. VLSI Circuits*, pp. 29–30, Mar 2001.