# Design of Tapered Serial Chains for Reduced Delay and Power Dissipation

Brian S. Cherkauer and Eby G. Friedman

Department of Electrical Engineering, University of Rochester, Rochester, New York 14627

Abstract: In this paper, the design issues relating to channel width tapered, serially connected MOSFET chains are discussed. Channel width tapering is a method used to reduce the delay, area, and power dissipation of serial MOSFET chains. A design system for determining when tapering is appropriate, selecting the amount of tapering, and synthesizing the physical layout is presented. Physical layout issues unique to tapering are discussed, and fabricated test structures are described.

# I. INTRODUCTION

In order to improve the performance characteristics of CMOS circuits, integrated circuit designers apply various specialized techniques to decrease the time, area, and power required for signals to propagate through combinatorial networks. One such technique is the use of channel width tapering in those logic structures which contain serially connected MOSFET chains.

Many CMOS logic structures are composed of chains of MOSFETs serially connected between a power supply rail and the output of the subcircuit. These serially connected MOSFETs are a major source of delay and power dissipation [1]; therefore, optimal sizing of these transistors is important in reducing the delay and power dissipation of these circuit structures.

It has been previously established by Shoji and others that under certain circumstances the propagation delay of serial chains may be reduced through the use of channel width tapering [2-9]. It has been further shown that channel width tapering can reduce the power dissipation of these structures [10, 11]. In this paper, a design process for tapered serial chains is presented and experimentally validated.
In Section II, the speed and power dissipation advantages of tapered MOSFET structures are discussed, including the specific circuit constraints under which these performance advantages occur. An automated system for determining an application-specific tapering factor and synthesizing the physical layout is presented in Section III. Layout issues unique to tapered serial MOSFET chains are explored in Section IV. In Section V, fabricated tapered test structures are described. Finally, some conclusions are presented in Section VI.

This material is based upon work supported by the National Science Foundation under Grant No. MIP-9208165.

# II. ADVANTAGES OF TAPERED SERIAL MOSFETS

The transistors in a serially connected MOSFET chain are typically constrained to have equal channel dimensions. The magnitudes of these dimensions are chosen to satisfy application-specific design criteria for speed, power, and area. Shoji [3, 4] first pointed out that under certain circumstances (specifically, the load capacitance $C_L$ must be of the same order of magnitude as the parasitic drain/source capacitances $C_0$ between the serial transistors), this constant width approach to transistor sizing may be non-optimal. He proposed using either an exponential tapering of transistor aspect ratios [3] or a linear tapering of transistor aspect ratios [4], with the largest transistor closest to ground and the smallest closest to the load (see Figure 1).

Exponential tapering assumes a fixed ratio $\alpha$ (where $0 < \alpha \le 1$) between the channel widths of adjacent transistors. A tapering factor of $\alpha = 1$ implies that each channel has equal width, i.e., this structure represents an untapered chain. Shoji further demonstrated that it was often possible, with the proper choice of tapering factor, to produce a circuit which would discharge a capacitive load more quickly than an untapered chain and therefore provide a faster transient response.

Fig. 1. Untapered and exponentially tapered MOSFET chains

It has been shown that, in addition to the speed improvements that Shoji described, tapering also provides a method for reducing power dissipation in CMOS integrated circuits [10, 11]. Furthermore, the relative advantages of applying channel width tapering fall into one of three categories depending upon the ratio of the output load capacitance to the drain/source capacitance of the serial chain.

The first category, $C_L < C_0$, encompasses those circuits which Shoji studied. In this category, a reduction in propagation delay as well as a reduction in both short-circuit power [12], $P_{SC}$ (dissipated in the following stage), and dynamic power, $P_d$ (dissipated switching the serial chain), may be achieved through channel width tapering. For serial chains falling into this category, tapering is quite advantageous.

The second category, $C_L > C_0$, is characterized by a decrease in both dynamic and short-circuit power dissipation, while propagation delay is slightly increased. Note that it is an unusual circuit phenomenon for propagation delay to increase while short-circuit power dissipation in the following stage is reduced. This is due to a delayed but more rectangular voltage waveform at the output of the serial chain. Thus, short-circuit current conducted in the load circuit is reduced. Channel width tapering reduces both dynamic and short-circuit power dissipation of circuits belonging to this category.

The third category, $C_L \gg C_0$, is characterized by an increase in propagation delay coupled with a significant degradation of signal quality resulting in increased short-circuit power dissipation in the following stage.
Dynamic power dissipation is reduced, but this consideration is outweighed by these negative effects of tapering, and tapering is therefore not recommended for those circuits belonging to this category.

If the ratio of the load capacitance, $C_L$, to the parasitic capacitance at the drain/source nodes, $C_0$, is less than one, as is typically found in Domino logic [13], the circuit belongs to Category 1; if the ratio is between approximately one and two, the circuit belongs to Category 2; if the ratio is larger than two, as would be found in standard static CMOS logic gates driving global lines, the circuit belongs to Category 3. A summary of the effects of tapering on each category is presented in Table I and is shown pictorially in Figure 2.

Table I. Summary of effects of channel width tapering on circuit characteristics

| Category | Capacitance Ratio | Propagation Delay | Short-Circuit Power | Dynamic Power | Total Power |
|----------|-------------------|-------------------|---------------------|---------------|-------------|
| 1 | $C_L < C_0$ | ↓ | ↓ | ↓ | ↓ |
| 2 | $C_L > C_0$ | ↑ | ↓ | ↓ | ↓ |
| 3 | $C_L \gg C_0$ | ↑ | ↑ | ↓ | ↑↓ |

Fig. 2. Example circuit illustrating decreased power dissipation of channel width tapered circuit

The definition of $C_0$ needs to be clarified, as the drain/source diffusion capacitances are both voltage and tapering dependent. $C_0$ is defined as the zero-bias junction capacitance at a drain/source node of an untapered chain. Thus, $C_0$ is neither voltage nor tapering dependent and may be calculated from (1), where $A$ is the area of the drain/source diffusion region, $P$ is the perimeter of the diffusion region, and $C_{j0}$ and $C_{jsw0}$ are the zero-bias junction capacitances per unit area and per unit perimeter (sidewall), respectively.

$$C_0 = A\,C_{j0} + P\,C_{jsw0} \tag{1}$$

Using more complicated analytic expressions for the drain/source capacitance does not add significantly to the accuracy of the category classification.
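The classification rule above reduces to a short calculation. The following is a minimal sketch, not part of the described design system: the function names are hypothetical, the capacitance parameters are whatever consistent units the process data provide, and the boundaries at ratios of one and two are treated as sharp even though the text describes them as approximate.

```python
def zero_bias_node_capacitance(area, perimeter, cj0, cjsw0):
    """Evaluate equation (1): C_0 = A * C_j0 + P * C_jsw0.

    area and perimeter describe the drain/source diffusion region of the
    untapered chain; cj0 and cjsw0 are the zero-bias junction capacitances
    per unit area and per unit perimeter. Units must be consistent.
    """
    return area * cj0 + perimeter * cjsw0


def tapering_category(c_load, c0):
    """Return the tapering category (1, 2, or 3) from the ratio C_L / C_0.

    Assumption: the approximate boundaries quoted in the text (one and two)
    are used as hard thresholds.
    """
    ratio = c_load / c0
    if ratio < 1.0:
        return 1  # delay and power both improve with tapering
    if ratio <= 2.0:
        return 2  # power improves, delay increases slightly
    return 3      # tapering not recommended


if __name__ == "__main__":
    # Ratios taken from the example circuits of Table II.
    for ratio in (0.87, 1.2, 5.0):
        print(ratio, tapering_category(ratio, 1.0))
```

Because $C_0$ is defined at zero bias for the untapered chain, this check can be made before any value of $\alpha$ has been chosen.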
In Table II, the power dissipation advantages of tapering over uniform channel width serial chains for an example circuit are illustrated [10, 11]. As noted in [4], delay reductions of over 20% are achievable by tapering Category 1 circuits.

Table II. Total power dissipation relative to the untapered chain ($\alpha = 1.0$)

| Category | Capacitance Ratio | $\alpha = 1.0$ | $\alpha = 0.9$ | $\alpha = 0.7$ |
|----------|-------------------|----------------|----------------|----------------|
| 1 | $C_L/C_0 = 0.87$ | 100% | 84% | 67% |
| 2 | $C_L/C_0 = 1.2$ | 100% | 87% | 65% |
| 3 | $C_L/C_0 = 5.0$ | 100% | 88% | 79% |

# III. AUTOMATION OF CHANNEL WIDTH TAPERING

In order to exploit the speed and power dissipation characteristics of tapering, a method for determining the applicability and the amount of tapering must be devised. In this section, a simple automated design system for investigating these design tradeoffs is presented.

Applying the $C_L/C_0$ guideline described in the previous section, the tapering category of a specific serial chain is initially determined. If the circuit falls into a category such that tapering is beneficial, an appropriate value of $\alpha$ is selected. If speed is the only criterion of concern, a linear resistive model of the transistors in the serial chain could be used to determine a near-optimal tapering factor [7, 8, 14] through the use of RC delay approximations. However, an RC delay model provides no information about the shape of the discharge waveform. Thus, it is unable to predict variations in short-circuit power dissipation.

For this reason, the interactive design system described in this section uses SPICE [15] in order to accurately estimate the effects of channel width tapering on power dissipation [16, 17] as well as to provide timing information. Alternatively, a power estimation tool such as that described in [18] could be used. This would reduce the simulation time while incurring only a slight loss of accuracy as compared to SPICE.
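For illustration only, the linear resistive (RC) approximation mentioned above can be sketched as an Elmore delay estimate. This is not the procedure of [7, 8, 14] nor the SPICE-based flow of this paper. The sketch assumes that each transistor is a resistor inversely proportional to its channel width, that each internal node capacitance scales with the mean of the two adjacent widths (so that it equals $C_0$ when $\alpha = 1$), and that $C_L$ lumps everything on the output node. The function names and the normalized parameters `c0` and `r0` are hypothetical.

```python
def tapered_widths(alpha, n, w_first=1.0):
    """Channel widths of an exponentially tapered chain.

    Index 0 is the transistor closest to ground (the widest); each
    following width is alpha times the previous one.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    return [w_first * alpha ** i for i in range(n)]


def elmore_delay(alpha, n, c_load, c0=1.0, r0=1.0):
    """First-order (Elmore) discharge delay of an n-transistor chain.

    Assumed model: resistance r0 / width per transistor, internal node
    capacitance c0 * (mean of adjacent widths), output capacitance c_load.
    """
    widths = tapered_widths(alpha, n)
    delay = 0.0
    path_resistance = 0.0  # resistance from ground up to the current node
    for k in range(n):
        path_resistance += r0 / widths[k]
        if k < n - 1:
            node_cap = c0 * (widths[k] + widths[k + 1]) / 2.0
        else:
            node_cap = c_load  # output node of the chain
        delay += node_cap * path_resistance
    return delay


if __name__ == "__main__":
    for c_load in (0.5, 5.0):
        for alpha in (1.0, 0.9, 0.8, 0.7):
            print(c_load, alpha, round(elmore_delay(alpha, 4, c_load), 3))
```

Under these assumptions, a four-transistor chain with $C_L = 0.5\,C_0$ has a smaller estimated delay at $\alpha = 0.8$ than at $\alpha = 1.0$, while the same chain with $C_L = 5\,C_0$ has a larger one, which agrees in direction with Categories 1 and 3. As the text notes, an estimate of this kind says nothing about waveform shape and therefore nothing about short-circuit power.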
A block diagram of this design system is shown in Figure 3. The selection of the tapering factor is performed by automatically sweeping the tapering factor over a small range of $\alpha$ (typically $0.7 \le \alpha \le 1.0$). SPICE circuit simulation files are generated and analyzed for an appropriate application-specific tapering factor. The range and step size of the search are determined based on physical criteria.

Physical fabrication limitations constrain the minimum step size and range of the tapering factor, thereby ensuring that the design space is small. The two primary constraints are the minimum transistor dimensions and the minimum resolution of the optical reticles. The minimum transistor dimensions establish a lower limit on $\alpha$ beyond which the physical design rules would be violated. Similarly, the minimum resolution of the reticles limits the minimum variation in $\alpha$ which is physically realizable. This leads to a lower limit on step size. Thus, the search space is small, allowing the search procedure to be sufficiently accurate without incurring a significant penalty in search time. The range and step size may either be specified by the designer or determined automatically based on the minimum transistor dimensions and reticle resolution.

Fig. 3. Program flow of the automated tapering system

As the relative importance of speed and power dissipation greatly depends upon the application, selection of the proper tapering factor is done interactively by the designer. Once the proper tapering factor is chosen, a layout of the serial chain is automatically generated, as exemplified by Figure 4. Currently, the physical layout is generated in Magic database format conforming to MOSIS scalable CMOS design rules.

Fig. 4. Example of automated layout of tapered serial chain with $\alpha = 0.8$

# IV. LAYOUT CONSIDERATIONS

It should be noted that tapering may cause design rule checking programs to highlight a possible error based on the violation of a minimum spacing between a polysilicon and a diffusion layer. This warning is intended to prevent the unintentional creation of transistors through process misalignment in those places where polysilicon and diffusion are in close proximity. In the case of tapering, these misalignments can cause only slight variations in the drain/source capacitance and, in extreme cases, slight variations in the effective W/L ratio of the transistors. However, no violation is possible since the polysilicon gate remains completely overlapped across the entire diffusion island, thereby maintaining the correct operation of the original transistor. Since misalignment occurs globally on a layer, all transistors along the chain are affected proportionately, and therefore the tapered transistor behavior is preserved. Hence, this design rule violation does not apply in the case of tapering.

The minimum polysilicon overhang should be increased based on the maximum misalignment which may occur for the specific process technology. In order to guarantee that this overhang is not violated by misalignment, the overhang should be determined from the edge of the widest portion of the drain/source implant of each transistor. These design rule issues are illustrated in Figure 5.

Fig. 5. Layout issues with tapering

# V. FABRICATED TEST STRUCTURES

A simple test chip containing structures produced by this tapered design system was fabricated to verify the waveform characteristics of the tapered chains. An example of these test circuits, with a tapering factor of $\alpha = 0.7$, is shown in Figure 6. The tapered circuit structures were fabricated using an Orbit Semiconductor 2 $\mu$m double level metal, double polysilicon P-well CMOS process and are fully functional.
An oscilloscope photograph of the output traces from two otherwise identical test structures is depicted in Figure 7, one containing a tapered ($\alpha = 0.9$) serial chain and one containing an untapered serial chain ($\alpha = 1.0$). This photograph illustrates the phenomena typical of a Category 2 circuit. The delay is somewhat increased in the tapered structure over that of the untapered structure. However, the slope of the output is also increased over that of the untapered structure, leading to decreased short-circuit power dissipation in the following stage. Note that the buffering of the output of the serial chain is responsible for the reversed polarity of the transition shown in Figure 7.

Fig. 6. Photomicrograph of fabricated test structures with $\alpha = 0.7$

Fig. 7. Oscilloscope photograph of tapered ($\alpha = 0.9$) versus untapered Category 2 circuits

# VI. CONCLUSION

Channel width tapering can be used to reduce the delay and power dissipation of serially connected MOSFET chains. Power dissipation reductions of over 30% are shown for Category 1 and 2 circuits with $\alpha = 0.7$. Furthermore, power dissipation reductions of over 10% with $\alpha = 0.9$, coupled with either reduced delay for Category 1 circuits or a slight increase in delay for Category 2 circuits, without loss of signal quality, are demonstrated.

An interactive layout synthesis system which permits a designer to examine and exploit the speed and power dissipation tradeoffs available in tapered serially connected MOSFET chains has been developed. Upon determining the category of a specific circuit based on its $C_L/C_0$ ratio, the synthesis system provides delay and power dissipation information in order to determine the application-specific value of $\alpha$ for each circuit. This design system is then used to automatically generate a physical layout of the desired tapered circuit.
A fabricated integrated circuit containing test circuits is described which validates the utility of tapered MOSFET structures, and output waveforms comparing tapered and untapered structures are shown. Thus, the advantages of channel width tapering may easily be incorporated into those high performance circuits where speed and power dissipation are of primary concern.

# REFERENCES

- [1] T. Sakurai and A. R. Newton, "Delay Analysis of Series-Connected MOSFET Circuits," *IEEE Journal of Solid-State Circuits*, Vol. SC-26, No. 2, pp. 122–131, February 1991.
- [2] M. Shoji, "Electrical Design of BELLMAC-32A Microprocessor," *Proceedings of the International Conference on Circuits and Computers*, pp. 112–115, September 1982.
- [3] M. Shoji, "Apparatus for Increasing the Speed of a Circuit Having a String of IGFETs," U.S. Patent 4,430,583, issued February 7, 1984.
- [4] M. Shoji, "FET Scaling in Domino CMOS Gates," *IEEE Journal of Solid-State Circuits*, Vol. SC-20, No. 5, pp. 1067–1071, October 1985.
- [5] M. Shoji, *CMOS Digital Circuit Technology*, pp. 243–253. Prentice Hall, 1988.
- [6] M. Shoji, *Theory of CMOS Digital Circuits and Circuit Failures*, pp. 137–142. Princeton University Press, 1992.
- [7] S. S. Bizzan, G. A. Jullien, and W. C. Miller, "Analytical Approach to Sizing nFET Chains," *Electronics Letters*, Vol. 28, No. 14, pp. 1334–1335, July 1992.
- [8] L. T. Wurtz, "An Efficient Scaling Procedure for Domino CMOS Logic," *IEEE Journal of Solid-State Circuits*, Vol. SC-28, No. 9, pp. 979–982, September 1993.
- [9] S. R. Vemuru, "Scaling of Serially-Connected MOSFET Chains," *Proceedings of the Fourth Great Lakes Symposium on VLSI*, pp. 200–203, March 1994.
- [10] B. S. Cherkauer and E. G. Friedman, "The Effects of Channel Width Tapering on the Power Dissipation of Serially Connected MOSFETs," *Proceedings of the IEEE International Symposium on Circuits and Systems*, pp. 2110–2113, May 1993.
- [11] B. S. Cherkauer and E. G. Friedman, "Channel Width Tapering of Serially Connected MOSFETs with Emphasis on Power Dissipation," *IEEE Transactions on VLSI Systems*, Vol. VLSI-2, No. 1, pp. 100–114, March 1994.
- [12] H. J. M. Veendrick, "Short-Circuit Dissipation of Static CMOS Circuitry and Its Impact on the Design of Buffer Circuits," *IEEE Journal of Solid-State Circuits*, Vol. SC-19, pp. 468–473, August 1984.
- [13] R. H. Krambeck, C. M. Lee, and H.-F. S. Law, "High-Speed Compact Circuits with CMOS," *IEEE Journal of Solid-State Circuits*, Vol. SC-17, No. 3, pp. 614–619, June 1982.
- [14] G. A. Jullien, W. C. Miller, R. Grondin, Z. Wang, L. Del Pup, and S. Bizzan, "Woodchuck: A Low-Level Synthesizer for Dynamic Pipelined DSP Arithmetic Logic Blocks," *Proceedings of the IEEE International Symposium on Circuits and Systems*, pp. 176–179, May 1992.
- [15] A. Vladimirescu and S. Liu, "The Simulation of MOS Integrated Circuits Using SPICE2," ERL Memo M80/7, University of California, Berkeley, October 1980.
- [16] S. M. Kang, "Accurate Simulation of Power Dissipation in VLSI Circuits," *IEEE Journal of Solid-State Circuits*, Vol. SC-21, No. 5, pp. 889–891, October 1986.
- [17] G. J. Fisher, "An Enhanced Power Meter for SPICE2 Circuit Simulation," *IEEE Transactions on Computer-Aided Design*, Vol. CAD-7, No. 5, pp. 641–643, May 1988.
- [18] F. Rouatbi, B. Haroun, and A. J. Al-Khalili, "Power Estimation Tool for Sub-Micron CMOS VLSI Circuits," *Proceedings of the International Conference on Computer-Aided Design*, pp. 204–209, November 1992.