### From 100 Milliwatts/MIPS to 10 Microwatts/MIPS

Moderator: Eby G. Friedman, Department of Electrical Engineering, University of Rochester, Rochester, New York 14627 USA (friedman@ee.rochester.edu)

Forumists:

- Sung-Mo (Steve) Kang, University of Illinois, Urbana-Champaign, Illinois USA (kang@uivlsi.csl.uiuc.edu)
- Eric A. Vittoz, CSEM Centre Suisse d'Electronique et de Microtechnique SA, Neuchâtel, Switzerland
- David J. Allstot, Carnegie Mellon University, Pittsburgh, Pennsylvania USA (allstot@gauss.ece.cmu.edu)
- Erik P. Harris, International Business Machines, Yorktown Heights, New York USA (harris@watson.ibm.com)
- Ran-Hong Yan, AT&T Bell Laboratories, Holmdel, New Jersey USA (rhy@spin.att.com)

Sponsor: VLSI Systems & Applications Technical Committee of the IEEE Circuits and Systems Society

#### **ABSTRACT**

The design and application of low power VLSI-based circuits and systems is the focus of this forum and paper. Attention is placed on issues related to low power systems, such as the choice of implementing semiconductor technology, power supply voltage (e.g., 3.3 V, 2.5 V, 1.0 V), CAD for low power design and synthesis, micropower digital and analog circuit design techniques, system architectural tradeoffs for low power, selective clocking for power management, low power synchronization strategies, self-calibration circuit techniques for low power, and low power figures of merit.

### Eby G. Friedman

Eby Friedman (SM'90) received the B.S. degree in electrical engineering from Lafayette College, Easton, PA, in 1979 and the M.S. and Ph.D. degrees in electrical engineering from the University of California, Irvine, in 1981 and 1989, respectively. He was previously employed by Philips Gloeilampen Fabrieken in 1978 and Hughes Aircraft Company from 1979 to 1991, working in the areas of custom digital and analog IC design, supporting design methodologies and CAD tools, and high performance/high resolution DSP and oversampled systems. Dr.
Friedman has been in the Department of Electrical Engineering at the University of Rochester since 1991, where he is an Associate Professor and Director of the High Performance VLSI/IC Design and Analysis Laboratory. His current research and teaching interests are in the areas of high performance VLSI/IC design with an emphasis on niche technologies and their system applications.

#### INTRODUCTION

High speed and low power are conflicting requirements in the development of microelectronic systems. Historically, these performance goals have led to separate application paths: either high speed, computationally intensive, and easily maintainable systems, expending dozens to thousands of watts of power, or ultra-low power, battery operated, portable or highly isolated (difficult to maintain) systems, operating at power levels where tens of milliwatts matter. These paths represent the two primary dichotomous application areas for high speed, low power VLSI-based systems.

Throughout the late 80's and early 90's, the primary research focus was high speed, processor-oriented systems. More recently, since the early 90's, significant attention has been devoted to ultra-low power systems, best identified with portable laptop computers. However, a clear shift has become apparent: high speed and low power have both become of paramount importance.

This trend is primarily due to two developing application areas. One area, a continuation of the high speed systems of the late 80's, is extremely high speed, ultra-high density applications, exemplified by 200-plus MHz microprocessors with one to two million transistors or more on a single chip. In these systems, the power density of the transistors is of fundamental significance. The second area, an extension of the ultra-low power systems of the early 90's, is battery operated, portable computers which must operate at significantly greater speeds than are currently possible while dissipating very little power.
Both application areas represent important growth paths within the commercial semiconductor microelectronics sector. Furthermore, these areas of intense research and development have many features in common, and are expected to coalesce into a unified extremely high speed, ultra-low power semiconductor technology of general utility and applicability.

Research in this area has occurred at all levels of design and technology. These efforts range from low-level circuit design and technological issues to high-level topics, such as performance-based architectural tradeoffs, selective clocking, and low voltage operation.

To survey some recent topics in extremely high speed, ultra-low power systems, a panel of experts has been brought together. Issues such as state assignment in finite state machines for low power, micropower analog design, circuit design for low power, portable computers, and low voltage technology tradeoffs will be discussed. Biographies of these individuals with position statements and recent research results are provided below.

### Sung-Mo Kang

Sung-Mo (Steve) Kang (F'90) received the Ph.D. degree in electrical engineering from the University of California at Berkeley in 1975. Until 1985 he was with AT&T Bell Laboratories at Holmdel and Murray Hill, and he also served as a faculty member of Rutgers University. In 1985, he joined the University of Illinois at Urbana-Champaign where he is Professor of Electrical and Computer Engineering, Computer Science, and Research Professor of Beckman Institute of Advanced Science and Technology, Coordinated Science Laboratory, and Associate Director of NSF Engineering Research Center for Compound Semiconductor Microelectronics. He is also an Associate in the Center for Advanced Study, University of Illinois at Urbana-Champaign. In 1989 he was a Visiting Professor at the Swiss Federal Institute of Technology at Lausanne.
His research interests include VLSI systems design methodologies, optimization for performance, reliability and manufacturability, modeling and simulation of semiconductor devices and circuits, and high-speed optoelectronic circuits and fully optical network systems. He has served as AdCom member, Secretary and Treasurer, Administrative Vice President, and 1991 President of IEEE Circuits and Systems Society. He has served on the program committees and technical committees of ISCAS, ICCAD, ICCD, DAC, MWSCAS, MCMC, International Conference on VLSI and CAD (ICVC), International Conference on VLSI Design, LEOS Topical Meeting, and on the editorial boards of IEEE Transactions on Circuits and Systems, International Journal of Circuit Theory and Applications, and Circuits, Signals and Systems. Dr. Kang is listed in Who's Who in America; Technology; Engineering; Midwest; and is the Founding Editor-in-Chief of the IEEE Transactions on Very Large Scale Integration (VLSI) Systems and Founding Director of the Illinois Center for ASIC Research and Development. He has received the SRC Inventor Recognition Award (1993), IEEE CAS Darlington Prize Paper Award (1993) and other best paper awards (1979, 1987) and coauthored three books, *Design Automation For Timing-Driven Layout Synthesis*, *Hot-Carrier Reliability of MOS VLSI Circuits*, and *Physical Design for Multichip Modules*, from Kluwer Academic Publishers.

### State Assignment for Low Power FSM Synthesis Using Genetic Local Search

Eric Olson and S. M. Kang

We can reduce the power consumption of a finite state machine by designing a circuit that lowers the switching frequency of each gate while reducing the number of gates and their fanout capacitances. The choice of state assignments can affect the power efficiency of the resulting circuit.
By optimizing the sequential state assignment process to reduce both the gate switching frequency and the circuit size, we can reduce power consumption independently of the supply voltage, technology, architecture, and other power reduction methods.

So far, research on state assignment has concentrated on reducing circuit area and critical path delay. MUSTANG [1], JEDI [2], and MUSE [3] all perform state assignments that target minimal area multilevel logic implementations. To produce a low power state assignment, we create and optimize a cost function that combines area optimization and a reduction of the switching frequency:

$$\text{cost} = \alpha \sum C^{A}(k,l)\, \Delta_{k,l} + \beta \sum P^{st}(k,l)\, \Delta_{k,l} \tag{1}$$

The $\alpha$ term is the area cost function used by MUSTANG and JEDI. It performs common cube extraction to reduce the number of literals resulting from multilevel logic minimization, resulting in an implementation with fewer gates. The $\beta$ term, or switching cost, measures the average number of state bits that change every cycle. $\Delta_{k,l}$ is the Hamming distance, that is, the number of state bits that change during a $k \rightarrow l$ transition. $P^{st}(k,l)$ is the state transition probability, the probability that the FSM will make a $k \rightarrow l$ transition, derived from profile information or Markov chain simulation. The product of the Hamming distance and the state transition probability is the average number of state bits changing due to $k \rightarrow l$ transitions. The switching cost represents the average total state bit changes per transition.

For example, if the state encodings are $k = 001$ and $l = 010$, the $k \rightarrow l$ state transition causes two state input bits to change. If the encoding of $l$ is 011 instead, only one bit changes, and the transition potentially uses less power because only one memory element changes state.
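
The switching cost (the $\beta$ term of Equation 1) can be illustrated with a minimal sketch. The three-state machine, its transition probabilities, and the two candidate encodings below are hypothetical and chosen only for illustration; the area term $C^{A}(k,l)$ is not modeled, and the function names `hamming` and `switching_cost` are assumptions, not part of any tool named in the text.

```python
def hamming(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length encodings differ."""
    if len(a) != len(b):
        raise ValueError("encodings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def switching_cost(encoding: dict, transition_prob: dict) -> float:
    """Sum over transitions k -> l of P_st(k, l) * Hamming(enc[k], enc[l]).

    This is only the switching (beta) part of the cost function; the
    area (alpha) part is deliberately left out of this sketch.
    """
    return sum(
        p * hamming(encoding[k], encoding[l])
        for (k, l), p in transition_prob.items()
    )


# Hypothetical FSM: transition probabilities sum to 1 (assumed values).
probs = {("A", "B"): 0.5, ("B", "C"): 0.25, ("C", "A"): 0.25}

# Two candidate state assignments for the same machine (assumed values).
enc_one_hot = {"A": "001", "B": "010", "C": "100"}
enc_adjacent = {"A": "001", "B": "011", "C": "101"}

print(hamming("001", "010"))                # two bits change, as in the text
print(hamming("001", "011"))                # one bit changes, as in the text
print(switching_cost(enc_one_hot, probs))   # every transition flips two bits
print(switching_cost(enc_adjacent, probs))  # frequent A -> B flips only one
```

In this toy case the second encoding has the lower switching cost because the most probable transition changes a single state bit, which is the property the cost function rewards.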
In addition, a lower transition density for state inputs and outputs would probably cause fewer gate transitions in the combinational portion of the FSM. Therefore, state encodings should be assigned so that the most frequent state transitions cause the fewest state bits to change.

We optimize the cost function by solving the graph embedding problem using a genetic local search algorithm [4]. We can quickly find near optimal state assignments for finite state machines with fewer than fifty states and good assignments for larger benchmarks, producing 2.5% better solutions than JEDI's simulated annealing with comparable run times.

We compared our state assignment with JEDI by assigning states to the larger MCNC benchmarks, doing multilevel synthesis with SIS's script.rugged, and estimating the power. The power estimator calculates the gates' transition densities [5][6] using the sequential state transition probabilities. We produced designs that use 19% less power than current area minimization techniques, and gate areas were also 8% smaller. These designs use approximately 25% less power than random encodings.

Reducing the switching frequency of the state inputs helps to reduce the average power, but this is sometimes not enough to compensate for a larger area design. Area is an equally important design parameter for low power design.

### References

- [1] S. Devadas, H.-K. Ma, A. R. Newton, and A. Sangiovanni-Vincentelli, "MUSTANG: State Assignment of Finite State Machines Targeting Multilevel Logic Implementations," *IEEE Transactions on Computer-Aided Design*, Vol. 7, No. 12, pp. 1290-1300, December 1988.
- [2] B. Lin and A. R. Newton, "Synthesis of Multiple Level Logic from Symbolic High-Level Description Languages," *Proceedings of the International Conference on VLSI*, pp. 187-196, August 1989.
- [3] X. Du, G. Hachtel, B. Lin, and A. R. Newton, "MUSE: A MUltilevel Symbolic Encoding Algorithm for State Assignment," *IEEE Transactions on Computer-Aided Design*, Vol.
10, No. 1, pp. 28-38, January 1991.
- [4] N. L. J. Ulder, E. H. L. Aarts, H. Bandelt, P. J. M. van Laarhoven, and E. Pesch, "Genetic Local Search Algorithms for the Traveling Salesman Problem," *Proceedings of the 1st Workshop on Parallel Problem Solving from Nature*, pp. 109-116, October 1990.
- [5] F. N. Najm, "Transition Density: A New Measure of Activity in Digital Circuits," *IEEE Transactions on Computer-Aided Design*, Vol. 12, No. 2, pp. 310-323, February 1993.
- [6] A. Ghosh, S. Devadas, K. Keutzer, and J. White, "Estimation of Average Switching Activity in Combinational and Sequential Circuits," *Proceedings of the 29th ACM/IEEE Design Automation Conference*, pp. 253-259, June 1992.

### Eric A. Vittoz

Eric A. Vittoz (F'89) is Executive Vice-President, Integrated Circuits and Systems, at the Swiss Center for Electronics and Microtechnology (CSEM) in Neuchâtel. He has accumulated more than 25 years of experience in very low power circuits, originating in the early development of electronic watches. His present research interest is concentrated on massively parallel analog VLSI circuits for perception. Dr. Vittoz is also a professor at the Federal Institute of Technology in Lausanne (EPFL). He has authored or co-authored more than 90 papers on low power and analog design, and holds 25 patents.

### Micropower Analog Design: Limits and Basic Techniques

Eric A. Vittoz

The absolute minimum limit of the power consumed by analog circuits comes from the need to maintain the signal energy much higher than the thermal energy, in order to obtain the required signal-to-noise ratio S/N. For signal processing circuits, it can be expressed as $P > 8kTf \cdot S/N$, where $k$ is Boltzmann's constant, $T$ is the absolute temperature, and $f$ is the signal bandwidth. Even the best existing active filters come only within two orders of magnitude of this limit. It also applies to oscillators, and to amplifiers after multiplication by their voltage gain. The limit is very steep in S/N: every additional 10 dB of S/N requires a 10-fold increase in power.
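
The bound $P > 8kTf \cdot S/N$ can be evaluated numerically with a short sketch. The operating point used below (a 20 kHz bandwidth and a temperature of 300 K) is an assumption chosen for illustration and does not come from the text; the function name `min_analog_power` is likewise hypothetical.

```python
BOLTZMANN = 1.380649e-23  # J/K, exact SI value of Boltzmann's constant


def min_analog_power(bandwidth_hz: float, snr_db: float,
                     temperature_k: float = 300.0) -> float:
    """Lower bound P = 8 k T f (S/N) in watts.

    snr_db is the signal-to-noise power ratio in decibels; it is
    converted to a linear power ratio before use. The default
    temperature of 300 K is an assumption of this sketch.
    """
    snr_linear = 10.0 ** (snr_db / 10.0)
    return 8.0 * BOLTZMANN * temperature_k * bandwidth_hz * snr_linear


# Assumed example: 20 kHz bandwidth, swept over several S/N targets.
for snr_db in (40, 60, 80, 100):
    p = min_analog_power(20e3, snr_db)
    print(f"S/N = {snr_db:3d} dB -> P_min = {p:.3e} W")
```

Each 10 dB step multiplies the linear S/N, and therefore the bound, by ten, which is the steepness the text refers to.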
In comparison, the power consumed by digital circuits is only proportional to the number $n$ of bits (or to $n^2$, possibly $n^3$) and is therefore only weakly dependent on S/N.

Various techniques have been developed to implement low-voltage, very low-power circuits which approach these limits. Analog micropower techniques exploit the particular properties offered by MOS transistors operated in weak inversion. This mode is limited to low frequencies, but it provides minimum saturation voltage, maximum transconductance and minimum gate noise for a given current, maximum voltage gain, and minimum gate offset. Most of these features are offered by bipolar transistors as well; thus BiCMOS is the ideal process for high-frequency low-power analog circuits. Single-stage architectures and operation in class AB are used to maximize the power efficiency of amplifiers. Special cascode schemes and on-chip clock voltage multiplication for analog switches facilitate operation at low supply voltage. A low supply voltage does not help reduce the power of analog circuits, but is acceptable down to 1 V. Scaled down processes result in lower power and/or better performance, provided the circuits are adequately re-designed and not simply scaled down geometrically.

Because of the very steep dependency of their power on S/N, analog circuits are no match for digital circuits in implementing low-power signal processing functions with large S/N. Their role in this case is reduced to that of unavoidable interface circuits, for which innovative ideas are needed to push the power closer to the limit. New "analog floating point" techniques are being developed to extend the dynamic range of analog circuits while keeping their S/N at the minimum value which is acceptable for the application, in order to minimize power.

However, analog solutions are much more power efficient for carrying out tasks which require only a small S/N and a low precision.
Furthermore, analog circuits can exploit the transistor more effectively than as a mere switch to implement a diversity of dense, linear or non-linear, processing blocks suitable for collective operation. Massively parallel analog VLSI processing is therefore expected to be much more efficient than digital solutions to implement perception/action systems capable of processing and merging the dynamic flow of information delivered by large arrays of sensors, and/or to control distributed arrays of actuators.

### David J. Allstot

David J. Allstot is Professor of Electrical and Computer Engineering at Carnegie Mellon University and Associate Director of the SRC-CMU Center of Excellence in Computer-Aided Design. He has published about 100 technical papers with his students and colleagues and has received several best paper and teaching awards. He has advised about 35 M.S. and Ph.D. students. Prof. Allstot is a Fellow of the IEEE, and currently serves as Editor of the *IEEE Transactions on Circuits and Systems II*. He is a workaholic with no apparent hobbies.

### Will Circuits Keep the Wheels on the Low-Power Band Wagon

David J. Allstot

The key to reducing power dissipation in ICs is obviously a dramatic reduction in the supply voltage below the current 5/3.3 V standards. In order to maintain reasonable performance with very low supply voltages, it is necessary to similarly reduce MOSFET threshold voltages. Although it is not widely acknowledged at the moment, it is known that the magnitude of the threshold voltage variation associated with a typical CMOS process tends to stay constant, if not increase, as $V_T$ is reduced and technologies scale to smaller feature sizes. The result is huge variations in speed and power dissipation at low supply voltages, which appear to be the major impediment to the development of practical, manufacturable ultra-low-power systems.

Divine intervention to reduce or eliminate process spreads is always welcome.
Even more welcome is massive funding of low-power research. But if that effort is aimed solely at systems-level research, it is as likely as divine intervention to find a solution to the manufacturing problem. Hence, low-level circuit and/or process calibration techniques may provide an answer to this high-level problem. Several such techniques and strategies will be proposed in this panel presentation.

### Erik P. Harris

Erik P. Harris is Manager of Plans, Operations, and Business Development in the Systems, Technology, and Science Department at the IBM T. J. Watson Research Center in Yorktown Heights, NY. He received his PhD in Physics at the University of Illinois in 1967. He then joined IBM, where he has held various technical and management positions in microelectronic technology during his career.

### Technology Directions for Low-Power Systems

Erik P. Harris

Computers with reduced power consumption are assuming increased importance in today's environment. This is part of an overall trend in computing toward providing the customer with greater function at smaller size, lower weight, reduced electrical and thermal loads, improved reliability, and lower cost. When considering how to reduce power consumption in computers, one must take into account the power budget of the complete system, since the power dissipation of peripherals can exceed the power consumed by the processor chip or chipsets. This is perhaps best exemplified by technology trends and power budgets in portable computers. Portable computing is driving the development of many low-power technologies today, and the discussion which follows will focus on portable computers.

The term "portable computer" covers a broad range of system types, including palmtop and notepad computers, keyboard-input notebook computers, and larger AC-powered units. Of these, the greatest market growth so far has been for A4-format battery-powered notebook computers.
Typical notebook computers today, with performance of a few MIPS (Millions of Instructions Per Second) and a monochrome STN/LCD display, weigh 2-3 kilograms and dissipate 9-10 watts when running at full speed. They are typically powered from a rechargeable NiCd or Ni-metal-hydride battery that holds about 30 watt-hours of energy, thereby providing about 3-3.5 hours of battery life in continuous operation. With power-saving features enabled, battery life can be increased by about 1.5 times in many applications.

Two key directions for the future evolution of the notebook computer are envisioned. One direction is evolution toward reduced size and weight at "constant" function and performance, leading toward notepad and palmtop computers with significant computing capability and incorporating pen-input and wireless capability. A second direction is the incorporation over time of greater performance and function in notebook-size computers while continuing to reduce weight and increase battery life. By the latter part of this decade, notebook computers should be available with performance of tens of MIPS, an active-matrix color LCD display, and other advanced features, in a unit which is significantly thinner and lighter than today's notebook computers and has significantly longer battery life.

The power budgets of these future color notebook computers will be dominated by LCD backlight power dissipation, as opposed to the power budgets of today's systems, which are about evenly divided between display power, planar electronics power, and all other sources of power dissipation taken together. In fact, the dissipation of active-matrix color LCD displays is likely to exceed that of today's monochrome displays even if known possibilities for backlight power reduction are exploited.
Because of this, enhanced battery life in such systems will only be attained if other avenues of low-power technology development are aggressively exploited: for example, CMOS electronics power reduction by low-voltage design and capacitance minimization, disk-drive idle power reduction through miniaturization and electronics power reduction, and the development of high-energy-capacity Li-based batteries.

### Ran-Hong Yan

Ran-Hong Yan is a Member of Technical Staff in the Silicon Electronics Research Laboratory of AT&T Bell Laboratories, Holmdel, NJ. He received the Ph.D. degree from UC Santa Barbara in 1990. His present research interests include aggressive MOS scaling down to 0.1 µm and below, and high performance/low-power electronics covering technology, circuit, and system considerations, including ultra-low power digital signal processors for hand-held communication and computing applications.

### Low-Voltage CMOS Technology - Integrated Engineering Considerations

Ran-Hong Yan

Reducing the operating voltage is considered to be one of the most straightforward approaches for reducing power consumption in portable and mainframe applications. Figure 1 shows the expected power reduction (or more precisely, reduction in energy per instruction) made possible through aggressive voltage scaling of CMOS technology. In practice, however, voltage reduction calls for complicated optimization of a multi-parameter system, including CMOS processing technology, circuit topology, application performance requirements, and engineering/manufacturing cost. Depending on the performance/cost trade-offs, engineering solutions could range from as easy as modifying threshold voltage adjustment implants to as complicated as exploiting a completely new fabrication technology, with vastly different circuit and system layouts.
For applications where low cost is at least as important as low power consumption, for example, in a commodity-like market such as portable digital cellular phones, one might want to first characterize the technology at the lowest operating voltage possible while meeting the minimum performance requirements. Then, one would see how much performance enhancement could be obtained by adjusting threshold voltage (implants) in the same CMOS processing technology without having to re-design the chips. Since a process change normally requires some sort of re-qualification of the entire process, the additional nonrecurring engineering costs could exceed what is affordable. On the other hand, if the opportunity exists such that new designs could start with a new processing technology, one might have more options.

Figure 1: Reduction in energy per instruction by aggressive voltage scaling of the CMOS technology, assuming the same processor architecture.

Table I shows a comparison of several low-voltage CMOS approaches with different levels of engineering. The standard process @ 3.3 V is assumed to have SPEED=1 and POWER=1. Depending on the targeted applications, one could choose the appropriate approach accordingly. Not shown in the table are the costs associated with each different process, except for the first one, where the design cost remains the same. The testing cost might increase because of the different operating regime.

Table I: Comparison of several different low-voltage CMOS approaches (SPEED and POWER normalized to the standard process @ 3.3 V)

| NAND (FI=2, FO=4) | Standard Process @ 2 V | Vt-Adjusted @ 1 V | New Process @ 1 V |
|-------------------|------------------------|-------------------|-------------------|
| SPEED             | 0.5                    | 0.5               | 1.0               |
| POWER             | 0.37                   | 0.092             | 0.11              |
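
As a rough cross-check of Table I, the sketch below applies the familiar quadratic dependence of CMOS switching energy on supply voltage, normalized to 3.3 V. This model is an assumption introduced here, not something the text states; it treats the switched capacitance as unchanged and ignores short-circuit and leakage contributions. The function name `relative_switching_energy` is hypothetical.

```python
V_REF = 3.3  # volts; reference supply of the standard process (POWER = 1)


def relative_switching_energy(vdd: float, cap_ratio: float = 1.0) -> float:
    """Energy per switching event relative to the 3.3 V reference.

    Assumes energy scales as C * Vdd^2. cap_ratio is the switched
    capacitance relative to the reference process (1.0 = unchanged),
    which is an assumption of this sketch.
    """
    return cap_ratio * (vdd / V_REF) ** 2


for label, vdd in (("Standard process @ 2 V", 2.0),
                   ("Vt-adjusted @ 1 V", 1.0)):
    print(f"{label}: {relative_switching_energy(vdd):.3f}")
```

Under this assumption the two columns derived from the standard process come out close to the tabulated 0.37 and 0.092. The 0.11 entry for the new process is not reproduced by voltage scaling alone, and the sketch makes no attempt to explain it.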