# Tile-Based Power Delivery Networks for High Current, Voltage Stacked Systems

[Kan Xu](https://orcid.org/0000-0003-1418-1458), Nurzhan Zhuldassov, and Eby G. Friedman, *Fellow, IEEE*

*Abstract***— Due to the increasing throughput of high-performance integrated circuits, the power consumption of recent high-performance computing systems has grown significantly, leading to greater on-chip current demand. The large current flowing within the power delivery system produces challenging issues, such as electromigration, low power efficiency, and thermal hot spots. To reduce on-chip current demand, voltage stacking has become a topic of growing interest within the industrial and academic communities. The challenges of power delivery for on-chip voltage stacked systems are, however, significant. The power delivery system connects the individual stacks to each other and to the converters. Challenges include cross-core current paths. A tile-based power delivery design methodology is proposed for high current, voltage stacked systems. The tile-based power delivery system provides net separation, low parasitic impedance, and a scalable design process. The tile shape eases the design and characterization of the parasitic impedance, making this methodology effective for early stage exploration of the power delivery system within voltage stacked structures.**

*Index Terms***— Differential power processing, parasitic impedance, power delivery network, voltage stacked system.**

#### I. INTRODUCTION

Over the past several years, the number of cores and throughput of high-performance integrated circuits (ICs) have significantly increased [1]–[5]. The higher throughput of the processor and the slowdown in scaling the power supply voltage have led to much greater dynamic power consumption [6]. Aggressive CMOS device scaling has greatly increased both gate and channel leakage currents, which have become the primary component of the total power dissipated in modern ICs [7]. As a result, the power consumption of a high-performance computing system continues to increase. The power consumption of recent high-performance CPUs, GPUs, and application-specific integrated circuits (ASICs) can exceed 250 W [1]–[5]. The IEEE International Roadmap for Devices and Systems predicts a further increase in power consumption in high-performance processors [8].

Manuscript received April 27, 2021; accepted June 10, 2021. Date of publication June 18, 2021; date of current version July 20, 2021. This work was supported in part by the National Science Foundation under Grant CCF-1716091, in part by the Intelligence Advanced Research Projects Activity (IARPA) under Grant W911NF-17-9-0001, in part by Qualcomm, and in part by Synopsys. Recommended for publication by Associate Editor J. N. Tripathi upon evaluation of reviewers' comments. *(Corresponding author: Kan Xu.)*

The authors are with the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627 USA (e-mail: kxu8@ece.rochester.edu; nzhuldas@ece.rochester.edu; friedman@ece.rochester.edu).

Color versions of one or more figures in this article are available at https://doi.org/10.1109/TCPMT.2021.3090378.

Digital Object Identifier 10.1109/TCPMT.2021.3090378

High current challenges exist in high-performance computing systems, including power noise, electromigration, and power efficiency. By serially connecting multiple power domains, voltage stacking can significantly reduce the current flowing through a power delivery network. Voltage stacking, also referred to as charge recycling [9] and multistory power delivery [10], has recently drawn significant attention from both the industrial and academic communities [9]–[17]. The challenges of voltage stacked systems are, however, significant. Redesign of the on-chip and package power delivery network is required to support the flow of current through the serially connected power network within each stack. Cross-core current paths in the power delivery system lead to complex power integrity and reliability issues [18].

Another challenge is load imbalances between layers, which can lead to voltage variations within each power domain. Although circuit, architecture, and scheduling techniques can enhance the balance between the stacked systems, load imbalances cannot be entirely eliminated with these methods. Push–pull regulators are therefore required to regulate the voltage levels across each stack, supporting a range of load imbalances within a voltage stacked system [15]. The power delivery system connecting the regulators to the stacks is also challenging due to the already complex power network within voltage stacked systems. The parasitic impedance within this power delivery system plays an impactful role. A design methodology that considers these challenges in delivering power in voltage stacked systems is highly desirable. For the first time, the effects of the parasitic impedance of the power delivery system on the power integrity of a voltage stacked system are discussed. A tile-based power delivery system is introduced to provide a practical and scalable physical design methodology supporting net separation and large load imbalances within voltage stacked systems. This tile-based power delivery system is a promising platform to enable the full potential of voltage stacking with high current demand.

The rest of this article is organized as follows. Prior work related to power delivery networks within voltage stacked systems is discussed in Section II. The performance degradation of a voltage stacked system when ignoring the parasitic impedance within the power delivery network is also discussed. In Section III, power delivery in voltage stacked systems that considers the parasitic impedance is discussed.

2156-3950 © 2021 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.



Fig. 1. 16-core, four-layer voltage stacked system.

The parasitic impedance of the power delivery system within two converter topologies, stack-to-bus and stack-to-stack, is also described and compared. A tile-based power delivery design methodology for high current, voltage stacked systems is presented in Section IV, supporting multiple power domains while lowering the parasitic impedance. Some conclusions are offered in Section V.

## II. BACKGROUND AND PREVIOUS WORK

Power delivery within a voltage stacked system is highly complex. Voltage stacking is introduced in Section II-A. The challenges of load imbalances within conventional power delivery systems are also discussed. Existing work on developing balanced stacks and current balancing converters is reviewed in Section II-B. Performance degradation of a voltage stacked system when ignoring the parasitic impedance within the power delivery system is also described.

#### *A. Voltage Stacked Systems*

Voltage stacking is a circuit- and architectural-level technique that serially connects multiple voltage domains. A high input voltage and lower current are achieved, managing electromigration constraints, distribution losses, and thermal hot spots. The high input voltage also improves system efficiency due to the high-voltage transmission [19]. Voltage stacking is therefore a useful technique to alleviate electromigration in high-performance systems while requiring fewer metal resources for the power I/Os. Depending on where it is applied, voltage stacking can be categorized into board/package voltage stacking and on-chip voltage stacking. Board/package voltage stacking does not mitigate the high current flowing into the on-chip power delivery system [14]–[16]. Board/package voltage stacking is, therefore, not helpful in mitigating electromigration in the package balls due to the high on-chip current. On-chip voltage stacking is an effective candidate to resolve this issue. The focus of our work is, therefore, on on-chip voltage stacking.

An example of a 16-core, four-layer voltage stacked system is shown in Fig. 1, where a 16-core processor is divided into four voltage domains [18]. Note that the term, "layer," in this article describes those core/cores sharing the same voltage domain within a voltage stacked system, similar to a "stack" referenced in other work. The current passing through each stacked layer is ideally the same. In practice, load or current imbalances exist across the stacked layers.
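The current reduction offered by stacking can be illustrated with a short calculation. The following is a minimal sketch, assuming evenly loaded cores and ideal (lossless) stacking; the function name `supply_current` and the per-core values of 1 W and 0.8 V are illustrative assumptions rather than parameters of the system in Fig. 1.

```python
def supply_current(n_cores, n_layers, core_power_w, vdd_v):
    """Return (input voltage, input current) for evenly loaded cores.

    n_layers = 1 models a conventional, nonstacked system.
    Assumes ideal stacking: no converter loss and perfectly balanced layers.
    """
    if n_cores % n_layers != 0:
        raise ValueError("cores must divide evenly among the layers")
    total_power_w = n_cores * core_power_w
    input_voltage_v = n_layers * vdd_v  # layers are connected in series
    return input_voltage_v, total_power_w / input_voltage_v


# Illustrative (assumed) values: 16 cores, 1 W per core, 0.8 V per layer.
flat_v, flat_i = supply_current(16, 1, 1.0, 0.8)
stacked_v, stacked_i = supply_current(16, 4, 1.0, 0.8)
print(f"nonstacked: {flat_v:.1f} V, {flat_i:.1f} A")
print(f"four-layer stack: {stacked_v:.1f} V, {stacked_i:.1f} A")
```

Under these assumptions, the input current scales inversely with the number of layers while the input voltage scales proportionally, which is the property exploited by voltage stacking.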

These load imbalances lead to voltage variations across the $n$ layer voltage domains, where $n$ is the number of layers, degrading system performance and reliability. The effects of load imbalances within a four-layer voltage stacked system are evaluated in [18]. A significant voltage drop is observed even with a relatively minor load imbalance. It is also observed that placing decoupling capacitors within each layer is not effective in balancing the load. A 5% noise margin can only be satisfied with a minor load imbalance scenario while requiring an impractical amount of decoupling capacitance [18].

## *B. Existing Work*

A better approach in voltage stacked systems is to utilize a voltage regulator to manage the effects of load imbalances on the power noise. Significant work has focused on power converters that support load imbalances. A high-efficiency, high-power density, fully integrated switched-capacitor converter is proposed in [20], supporting a two-layer, voltage stacked system with on-chip trench capacitor technology. A fully integrated on-chip push–pull switched-capacitor converter is described in [11] to regulate the voltage across each layer when load imbalances occur. A four-layer voltage stacked system dissipating 17.5 mW is achieved [11]. A hybrid voltage stacked system is proposed in [9], where an off-chip voltage regulator combined with an on-chip integrated voltage regulator is utilized to address load imbalances. It is reported that $82.4 \text{ mm}^2$ of on-chip area is dedicated to the integrated voltage regulators [9]. A ladder topology switched-capacitor voltage regulator is described in [18] to support a high current, voltage stacked system. A $20 \times$ smaller voltage drop is observed as compared with a voltage stacked system without regulation or decoupling capacitors [18].

The complex nature of the on-chip power delivery system, including the current paths within an on-chip voltage stacked system, has, however, not been considered in existing work. Delivering power to voltage stacked systems is either oversimplified or assumed to behave the same as nonvoltage stacked systems. Due to the cross-core current paths, the parasitic impedance characteristics of voltage stacked systems are, in fact, different from nonvoltage stacked systems. Despite playing a critical role in voltage stacked systems, the effects of the parasitic impedance have been to date neglected in the literature.

To demonstrate the effects of the parasitic impedance, in [18] the parasitic resistance and inductance between adjacent layers are added to the same load balancing circuit. The effects of the parasitic impedance on the voltage drop are shown in Fig. 2. As noted in Fig. 2(a), the voltage drop significantly increases with greater parasitic resistance. This large voltage drop is due to the high current flowing through the parasitic resistance within the power delivery



Fig. 2. Voltage drop considering the effects of the parasitic impedance within a power delivery system. (a) Parasitic resistance increases from 0 to 5 m $\Omega$ , while the parasitic inductance is assumed negligible. (b) Parasitic inductance increases from 0 to 80 pH, while the parasitic resistance is assumed negligible.

system. The parasitic inductance also leads to a greater voltage drop, as shown in Fig. 2(b), although not as significant as the parasitic resistance. The high switching frequency of the switched-capacitor converter combined with the parasitic inductance leads to greater $L\,di/dt$ noise.

# III. PARASITIC IMPEDANCE-AWARE POWER DELIVERY SYSTEMS

Due to the serially stacked layers, delivering power to a high current, voltage stacked system with multiple voltage domains is highly complex. A conventional power delivery network within a voltage stacked system can produce either an unrealistic design specification or significant parasitic impedance, damaging system reliability and efficiency. A power delivery network dedicated to voltage stacked systems as well as load balancing circuits is presented in Section III-A. The primary differences between a conventional power network and a voltage stacked power network are reviewed. The power delivery system connecting the layers to the load balancing circuits, a differential power processing (DPP) system, is introduced in Section III-B. The effects of the topology of the DPP system on the power delivery network are also discussed. In Section III-C, a voltage stacked system with a load balancing converter utilizing a stack-to-bus topology is introduced. The effects of the parasitic impedance on the power delivery system are also reviewed.

#### *A. Power Delivery Paths Within Voltage Stacked Systems*

Power delivery systems should provide low-resistance and reliable paths to distribute current to the on-chip loads. Due to the serial connections between each layer, the current distribution paths in a voltage stacked system are different from a



Fig. 3. On-chip current paths among different cores. (a) Regular nonstacked, four-core system, and (b) four-layer voltage stacked system.

regular system. A comparison of the current paths between a four-core regular nonstacked system and a four-layer voltage stacked system is shown in Fig. 3. In a nonstacked system, the current distribution is dominated by the current flowing from the bumps to the on-chip load, as shown in Fig. 3(a). The current transferred from a bump is distributed within the power distribution cell [21]. A power distribution cell is a circular area within a power grid surrounding a power bump, where the on-chip load within this area draws current from this bump.

In a voltage stacked system, the current flows through the serially connected power network, producing a cross-core current path, as shown in Fig. 3(b). Header and footer layers are defined here, respectively, as the first and last layer along the current path. The current flows from the header layer (e.g., core 1) to the footer layer (core 4), passing through the intermediate layers (cores 2 and 3), as shown in Fig. 3(b). Due to this cross-core current path, the power bumps, connected to the power plane of the package, only exist in the footer and header layers of a voltage stacked system. No power I/Os exist within the intermediate layers connected to the power plane within the package. Assuming the same bump pitch as a nonstacked four-core system, only 1/4 of the power bumps are required in a voltage stacked system, thereby maintaining the same current density within each power bump. Bump resources and area can, therefore, be saved for high-bandwidth signals. The cross-core current paths, however, produce a large voltage drop across the parasitic impedance within the power network, degrading the performance of the voltage regulator within a voltage stacked system. Note that these cross-core current paths do not exist in a nonstacked power delivery system, where current is only distributed within the power distribution cell. This situation occurs since the power delivery network within a nonstacked system is not serially connected.

To evaluate the effects of the cross-core current paths on the power delivery system, a comparison of the current density between a regular power network and a voltage stacked power network is presented. The specifications of the on-chip power grid are listed in Table I, including the parasitic impedance

TABLE I SPECIFICATIONS OF THE ON-CHIP POWER NETWORK [22]

| Specs                            | Value                |
|----------------------------------|----------------------|
| Core size ($\mu$m)               | $1,000 \times 1,000$ |
| Power bump pitch ($\mu$m)        | 50                   |
| Power metal line pitch ($\mu$m)  | 50                   |
| Metal line width ($\mu$m)        | 1                    |
| Metal line depth ($\mu$m)        | 2.5                  |
| Power grid layer number          | 2                    |
| Cu conductivity (S/m)            | $5.88 \times 10^7$   |
| $V_{dd}$ (V)                     | 0.8                  |
| Average power consumption (W)    | 1                    |

Fig. 4. Current density model. (a) Cell-based regular nonstacked power grid. (b) Voltage stacked power grid with cross-core current paths.

of the on-chip power grid. The parasitic impedance rather than S-parameters is utilized in the remainder of this article since the size of the on-chip power grid is not sufficiently large and the frequency is not sufficiently high to consider transmission line effects. A current density model of a regular and voltage stacked power network is shown in Fig. 4. In a regular nonstacked power grid, the current flow is based on the power distribution cell, as shown in Fig. 4(a). In an ideal case, where the current loads are evenly distributed across an IC, the current from the package to the on-chip loads is limited to the power cell, producing a low impedance path. Alternatively, in a voltage stacked system, the cross-core current passes from the header layer to the intermediate layers through the same power grid, as shown in Fig. 4(b). This current path is, however, significantly longer than the current path in a regular nonstacked power grid.

The current density is

$$
J = \frac{I_{\text{total}}}{N \cdot A} \tag{1}
$$

where $I_{\text{total}}$ is the total current flowing through the cross-core path, $A$ is the cross-sectional area of the power metal line, and $N$ is the total number of metal lines supporting the cross-core current paths. $I_{\text{total}}$ and $A$ are the same in both nonstacked and voltage stacked power grids. In a nonstacked power grid, $N$ is $2 \times N_{\text{bump}}$, where $N_{\text{bump}}$ is the number of power bumps. Alternatively, in a voltage stacked power grid, $N$ is the number of metal lines connecting the adjacent layers. The current density of a nonstacked and voltage stacked power network is, respectively, 625 and 25000 A/mm$^2$. The voltage stacked power network, therefore, exhibits a $40\times$ higher current density as compared to a nonstacked power network, which is impractical due to electromigration constraints, as suggested by Black's equation [23].
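The reported current densities can be reproduced from (1) with the Table I values. The following is a minimal sketch under several assumptions that are not stated explicitly in the text: $I_{\text{total}}$ is taken as the average core power divided by $V_{dd}$, the power bumps are assumed to form a full array over the core at the listed pitch, and the number of lines connecting adjacent layers is assumed to be the core size divided by the power metal line pitch. All variable and function names are illustrative.

```python
# Values read from Table I (units in the names).
CORE_SIZE_UM = 1000.0
BUMP_PITCH_UM = 50.0
LINE_PITCH_UM = 50.0
LINE_WIDTH_UM = 1.0
LINE_DEPTH_UM = 2.5
VDD_V = 0.8
CORE_POWER_W = 1.0


def current_density_a_per_mm2(i_total_a, n_lines, area_mm2):
    """Equation (1): J = I_total / (N * A)."""
    return i_total_a / (n_lines * area_mm2)


# Assumption: the total current is the average core power divided by Vdd.
i_total_a = CORE_POWER_W / VDD_V
# Cross-sectional area of one power metal line (1 um^2 = 1e-6 mm^2).
area_mm2 = LINE_WIDTH_UM * LINE_DEPTH_UM * 1e-6

# Assumption: a full square array of power bumps covers the core.
n_bump = round(CORE_SIZE_UM / BUMP_PITCH_UM) ** 2
n_nonstacked = 2 * n_bump  # N = 2 x N_bump, as stated in the text
# Assumption: one connecting line per metal line pitch along the core edge.
n_stacked = round(CORE_SIZE_UM / LINE_PITCH_UM)

j_nonstacked = current_density_a_per_mm2(i_total_a, n_nonstacked, area_mm2)
j_stacked = current_density_a_per_mm2(i_total_a, n_stacked, area_mm2)
print(f"nonstacked: {j_nonstacked:.0f} A/mm^2")
print(f"voltage stacked: {j_stacked:.0f} A/mm^2")
print(f"ratio: {j_stacked / j_nonstacked:.0f}x")
```

With these assumptions, the sketch yields 625 and 25000 A/mm$^2$, consistent with the values reported above, although the original derivation may differ in detail.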

The dc voltage drop due to the cross-core current paths is

$$
V_{\text{drop}} = J \cdot A \cdot R_{\text{unit}} \cdot D_{\text{eff}} \tag{2}
$$

where $R_{\text{unit}}$ is the resistance per unit length of the metal line and $D_{\text{eff}}$ is the effective distance of this current path. $D_{\text{eff}}$ within the nonstacked and voltage stacked power grids is equal, respectively, to the radius of the power cell and the size of the layer. Combined with the $40\times$ greater current density, this longer path leads to a $1600 \times$ increase in voltage drop in a voltage stacked power grid as compared with a nonstacked power grid. Observe that a conventional on-chip power grid is not designed to consider cross-core current paths. Due to the significantly greater current density and voltage drop, dedicated power delivery is necessary for voltage stacked systems.
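The $1600\times$ ratio follows from (2) once the two effective distances are fixed. The following is a minimal sketch, assuming that the radius of the power cell is half of the power bump pitch (25 $\mu$m), that the size of the layer is the core size (1000 $\mu$m), and that $R_{\text{unit}}$ is the resistance per unit length $1/(\sigma A)$ of a copper line with the Table I cross section. These interpretations and all names are assumptions made for illustration.

```python
# Values read from Table I, converted to SI units.
CU_CONDUCTIVITY_S_PER_M = 5.88e7
LINE_WIDTH_M = 1.0e-6
LINE_DEPTH_M = 2.5e-6
BUMP_PITCH_M = 50e-6
CORE_SIZE_M = 1000e-6


def dc_drop_v(j_a_per_mm2, area_m2, r_unit_ohm_per_m, d_eff_m):
    """Equation (2): V_drop = J * A * R_unit * D_eff."""
    j_a_per_m2 = j_a_per_mm2 * 1e6  # 1 A/mm^2 = 1e6 A/m^2
    return j_a_per_m2 * area_m2 * r_unit_ohm_per_m * d_eff_m


area_m2 = LINE_WIDTH_M * LINE_DEPTH_M
# Assumption: R_unit is the per-unit-length resistance of one metal line.
r_unit = 1.0 / (CU_CONDUCTIVITY_S_PER_M * area_m2)

# Assumption: the power cell radius is half of the bump pitch.
d_eff_nonstacked_m = BUMP_PITCH_M / 2
# Assumption: the size of the layer equals the core size.
d_eff_stacked_m = CORE_SIZE_M

# Current densities reported in the text (A/mm^2).
v_nonstacked = dc_drop_v(625.0, area_m2, r_unit, d_eff_nonstacked_m)
v_stacked = dc_drop_v(25000.0, area_m2, r_unit, d_eff_stacked_m)
ratio = v_stacked / v_nonstacked
print(f"nonstacked drop: {v_nonstacked * 1e3:.3f} mV")
print(f"voltage stacked drop: {v_stacked * 1e3:.1f} mV")
print(f"ratio: {ratio:.0f}x")
```

Under these assumptions, the ratio of the two voltage drops is the product of the $40\times$ current density ratio and the $40\times$ ratio in effective distance.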

## *B. Power Delivery Network for DPP Systems*

As discussed in Section II, load balancing circuits are critical to achieving a target power noise. By processing the mismatched power between loads, DPP [24] is an effective technique to regulate different layers within a voltage stacked system. Two converter topologies exist in DPP: stack-to-stack and stack-to-bus. The power delivery system for a DPP converter varies significantly between these two topologies. The parasitic characteristics, therefore, affect these two topologies differently. The effects of the parasitic impedance on these two topologies are discussed here. Note that the power delivery system described in this section is different from the power system discussed in Section III-A. The focus of this section is on the power delivery system within a DPP system, which is placed between the voltage stacked system and the load balancing circuits. The current paths between the stacked layers and the converters are the primary issue. The focus of Section III-A is on the power delivery system between stacked layers, where the current path between layers is the primary issue.

The topology of a DPP system is a critical factor affecting the operational behavior and performance of load balancing. Multiple topologies for DPP can be applied to voltage stacked systems depending upon the connections between the inputs and outputs between the layers. Stack-to-bus and stack-to-stack are two common topologies for DPP, as shown in Fig. 5. In the stack-to-bus topology [see Fig. 5(a)], *n* dc/dc converters are required for an *n* layer voltage stacked system. The input of each dc/dc converter is connected to the same power bus, whereas the output of each dc/dc converter is connected to a nearby layer. In the stack-to-stack topology [see Fig. 5(b)],  $n - 1$  dc/dc converters are connected between a nearby layer and two adjacent layers. Multiple types of converters, including a buck–boost converter and



Fig. 5. Two VR topologies utilizing DPP in a voltage stacked system. (a) Stack-to-bus topology, where *n* dc/dc converters are required for an *n* layer voltage stacked system, and (b) stack-to-stack topology, where  $n - 1$  dc/dc converters are required for an *n* layer voltage stacked system.

a transformer-based converter, can be utilized as the dc/dc converter within these topologies. For example, a multiphase buck–boost converter can provide dc/dc conversion in a stack-to-stack topology [14], [15]. By varying the duty cycle of the buck converters, any voltage variations due to load imbalances can be regulated.

Due to the multidomain characteristics of a voltage stacked system and the complex nature of a DPP system, power delivery to the overall system can be quite complicated. The current paths and parasitic impedance of a power delivery system within a stack-to-bus and stack-to-stack topology are shown in Fig. 6. Five voltage domains, P1, G1 (P2), G2 (P3), G3 (P4), and G4, exist in a four-layer voltage stacked system with both DPP topologies. The ground net G of the upper layer is the power net P of the adjacent lower layer, as shown in Fig. 6.
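The net naming and converter counts described above generalize to an $n$ layer stack. The following is a minimal sketch of that bookkeeping; the function names `stack_nets` and `converter_count` are hypothetical helpers introduced for illustration and are not part of the proposed methodology.

```python
def stack_nets(n_layers):
    """List the n + 1 nets of an n layer stack, from header to footer.

    The ground net G_k of layer k is also the power net P_(k+1) of the
    adjacent lower layer, so each shared net carries both names.
    """
    if n_layers < 1:
        raise ValueError("at least one layer is required")
    nets = ["P1"]
    for k in range(1, n_layers):
        nets.append(f"G{k} (P{k + 1})")
    nets.append(f"G{n_layers}")
    return nets


def converter_count(n_layers, topology):
    """Number of dc/dc converters required by each DPP topology."""
    if topology == "stack-to-bus":
        return n_layers
    if topology == "stack-to-stack":
        return n_layers - 1
    raise ValueError(f"unknown topology: {topology}")


nets = stack_nets(4)
print(nets)
print(converter_count(4, "stack-to-bus"), converter_count(4, "stack-to-stack"))
```

For a four-layer system, the sketch lists the five voltage domains named in the text and returns four and three converters for, respectively, the stack-to-bus and stack-to-stack topologies.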

The effects of the parasitic impedance on these two DPP topologies are quite different. A zoom-in of net G1 (P2) within the stack-to-bus and stack-to-stack topologies is shown, respectively, in Fig. 6(a) and (b). The parasitic impedance within the power delivery system is shown, where S-S and S-VR represent, respectively, the parasitic impedance between adjacent stacked layers and the parasitic impedance between the layers and converters. The current flowing through S-S and S-VR is highlighted by an arrow. As shown in Fig. 6(a), in a stack-to-bus topology, converter *n* is only connected to layer *n*. A self-contained current loop is formed within layer *n*, where the current flows from converter *n* to layer *n* and returns to converter *n*. The current regulated by converter *n* does not flow to the other layers. This self-contained current loop does not exist in a stack-to-stack topology.

In a stack-to-stack topology, the current from converter *n* can either flow to layer *n* or layer  $n + 1$  depending upon the load imbalance scenario. In addition, multiple converters, including converter  $n-1$ , converter *n*, and converter  $n+1$ , are connected to layer *n*, as shown in Fig. 6(b). The voltage of layer *n* is therefore determined by the interactions among the three converters, leading to a complex control system for the DPP. A multi-input multi-output (MIMO) control system is required to coordinate the regulation of each layer, increasing the design complexity as well as the cost of the voltage



Fig. 6. Current path and parasitic impedance of a power delivery system within (a) stack-to-bus topology, and (b) stack-to-stack topology.



Fig. 7. Resonant converter-based DPP system with stack-to-bus topology in a four-layer voltage stacked system.

stacked system. Moreover, the current produced by the converter flows through the S-S parasitic impedance, producing a significant voltage drop within the stack-to-stack topology.

#### *C. Resonant Converter-Based Stack-to-Bus Topology*

Delivering power to a voltage stacked DPP system is discussed in Section III-B. A case study of both power delivery systems is provided. Due to certain advantages, such as a relatively straightforward regulation scheme, well-characterized current paths, and a self-contained current loop, a stack-to-bus DPP system is considered here. As shown in Fig. 7, the dc/dc converter is based on a transformer-based resonant converter [25], where the input of the converter is connected to the system power bus and the output is connected to the stacked layer. Transformer-based resonant converters are utilized in the case study due to the advantages of high efficiency and the inherent isolation between the input and output. Four converters are required to support a four-layer voltage stacked system.

The design specifications of the DPP system are listed in Table II. Note that the value of the parasitic impedances listed in Table II is based on an impedance extraction of a system-in-package [19]. The values can also be applied

TABLE II SPECIFICATION OF RESONANT CONVERTER-BASED DPP SYSTEM WITH A STACK-TO-BUS TOPOLOGY

| Specs                       | Value                 |
|-----------------------------|-----------------------|
| Frequency                   | 2 MHz                 |
| Parasitic resistance (S-VR) | $0.6 \text{ m}\Omega$ |
| Parasitic inductance (S-VR) | $1.6$ pH              |
| Parasitic resistance (S-S)  | $1.2 \text{ m}\Omega$ |
| Parasitic inductance (S-S)  | 8 pH                  |
| Load imbalance              | $8 - 80$ A            |
| di/dt                       | $7.2$ A/ns            |
| On-chip de-cap              | $4 \times 5 \mu F$    |





Fig. 8. Voltage drop as a function of parasitic resistance and inductance between a stacked layer and a converter. The parasitic resistance ranges from 0.1 to 5 m $\Omega$ . The parasitic inductance ranges from 0 to 50 pH.

to a stack-to-stack topology. Since a transformer-based resonant converter is considered here, on-chip integration of this DPP system is not practical. An off-chip DPP system is therefore assumed. The switching frequency of the converter is accordingly decreased as compared with the on-chip switched-capacitor converter described in Section II. The S-VR parasitic impedance should be carefully considered. Due to the DPP being off-chip, the S-VR parasitic impedance is no longer negligible. Also, due to the self-contained current loop in the stack-to-bus topology, the S-S parasitic impedance exhibits a negligible effect on the power noise. This behavior occurs because the differential current generated by converter *n* circulates within layer *n* without passing through the S-S parasitic impedance. A methodology for designing this power delivery system is introduced in Section IV.

The layers within the voltage stacked system are modeled as a resistive load with an 8-A load current. To model a worst case load imbalance scenario, the load current is increased from 8 to 80 A on layer two, whereas the load current on the remaining layers is maintained constant. To suppress the power noise due to this load imbalance, a voltage-controlled oscillator is utilized within each resonant converter. By varying the switching frequency of the resonant converter around the nominal frequency, the voltage across each layer is regulated.

Fig. 9. Cross-sectional view of a tile-based power delivery system within a package, including the P1, G1, and G2 nets.

To evaluate the effects of the parasitic impedance of the DPP power network, a range of parasitic resistances and inductances is explored. The limits of the parasitic impedance are set by the assumption that the DPP system is integrated at the package level, forming a system-in-package. The range of parasitic resistance and inductance is, respectively, 0.1–5 m$\Omega$ and 0–80 pH, common values for a system-in-package [19]. The voltage drop as a function of the parasitic impedance is shown in Fig. 8. The area highlighted by the lightest region illustrates the parasitic impedance that satisfies the noise margin, assuming a 5% noise margin with *V*<sub>DD</sub> equal to 0.8 V [18]. To ensure that the power noise is within the target noise margin, the parasitic resistance and inductance are maintained below, respectively, 1 m$\Omega$ and 20 pH. Note that the parasitic inductance significantly affects the voltage drop when the parasitic resistance is low [26]. The parasitic resistance has a dominant effect on the voltage drop once the magnitude of the parasitic resistance exceeds 1 m$\Omega$.
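The noise budget quoted above can be checked with a few lines of arithmetic. The following Python sketch computes the 5% margin for a 0.8 V supply and tests a candidate pair of parasitic values against the 1 m$\Omega$ and 20 pH limits stated in the text. The function and constant names are illustrative assumptions, not part of the original work, and the limit check is a simple threshold test rather than a substitute for the circuit simulation behind Fig. 8.

```python
# Values quoted in the text; all names are illustrative assumptions.
V_DD = 0.8             # supply voltage per layer, volts
NOISE_FRACTION = 0.05  # 5% noise margin
R_LIMIT_MOHM = 1.0     # parasitic resistance limit, milliohms
L_LIMIT_PH = 20.0      # parasitic inductance limit, picohenries


def noise_margin_mv(v_dd=V_DD, fraction=NOISE_FRACTION):
    """Allowed voltage drop in millivolts for a given supply and margin."""
    return v_dd * fraction * 1e3


def within_limits(r_mohm, l_ph):
    """True if both parasitic values fall below the limits quoted in the text."""
    return r_mohm < R_LIMIT_MOHM and l_ph < L_LIMIT_PH


if __name__ == "__main__":
    print(f"noise margin: {noise_margin_mv():.0f} mV")
    # S-VR values listed in the simulation parameter table.
    print("0.6 mOhm, 1.6 pH within limits:", within_limits(0.6, 1.6))
```

A 5% margin on 0.8 V corresponds to a 40 mV budget, which is the quantity the lightest region in Fig. 8 is measured against.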

# IV. TILE-BASED POWER DELIVERY DESIGN METHODOLOGY FOR VOLTAGE STACKED SYSTEMS

Designing the power delivery system within a voltage stacked DPP system is highly challenging. The major challenges include: 1) the complex nature of a multidomain power delivery network to support a multilayer voltage stacked system, as shown in Fig. 6; 2) the cross-core current paths due to the serially connected power network, as discussed in Section III-A; and 3) the effects of the parasitic impedance within the power network of a DPP system, as discussed in Section III-B. A methodology for designing a tile-based power delivery system, targeting these challenges, is proposed here for voltage stacked DPP systems.

This tile-based power delivery system is intended for a voltage stacked DPP system. Package-level DPP integration is assumed, where the power planes within the package are utilized to construct a tile-based power delivery system. Consider a four-layer voltage stacked system with a stack-to-bus DPP system, as shown in Fig. 9. Each square block represents one layer, connected to a dc/dc converter. Four layers are oriented to form a square shape, a practical shape for an IC. The current flow within the voltage stacked DPP system is highlighted by the arrows.

As discussed in Section III-A, conventional on-chip metal lines cannot manage the high currents being transferred from layer to layer. To manage the cross-core current paths, package resources are allocated for this current path in the tile-based power delivery system. A cross-sectional view of the proposed tile-based power delivery system is shown in Fig. 9. The location of the cross section is highlighted by the vertical surface. The power planes and vias connecting the VR, layer one, and layer two as well as the current flow paths are illustrated.

Fig. 10. Decomposition of the tile-based power delivery system for each power net.

To manage the challenges of a multidomain power network, a two-layer, interdigitated power plane is proposed for the tile-based power network. The power domains are isolated from each other. Note that two layers are the minimum number of power planes for this tile-based power delivery system, where a two-layer system is assumed here as an example for the following discussion. Consider the P1, G1, and G2 power domains, as shown in Fig. 9. The P1 power plane is beneath dc/dc converter 1 and layer 1. The G1 power plane is one layer above the P1 power plane, occupying the area beneath dc/dc 1, layer 1, and layer 2. The G2 power plane is one layer beneath the P1 power plane, occupying the area beneath layer 2. The P1 and G2 power domains are separated in Fig. 9 by a dashed line. Only two layers are required for the five power domains, P1, G1, G2, G3, and G4. The power planes for the different nets are separated by the square shape of each layer within the voltage stacked system, forming a tile-shaped power plane.
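The two-layer arrangement described above can be expressed as a small data structure and checked for consistency. The Python sketch below is a hypothetical assignment for illustration only: the region names (`conv1`, `layer1`, and so on), the extension of each net to every converter and layer it would connect in a four-layer stack, and the choice of placing G2 and G4 on the same plane as P1 are assumptions rather than details taken from the figures. The check confirms that each region has exactly one net on each of the two planes, so nets sharing a plane never overlap.

```python
from collections import defaultdict

# Hypothetical two-plane tile assignment (assumption, not from the figures).
# Each net maps to (plane, set of regions the tile lies beneath).
TILES = {
    "P1": ("bottom", {"conv1", "layer1"}),
    "G1": ("top", {"conv1", "layer1", "conv2", "layer2"}),
    "G2": ("bottom", {"conv2", "layer2", "conv3", "layer3"}),
    "G3": ("top", {"conv3", "layer3", "conv4", "layer4"}),
    "G4": ("bottom", {"conv4", "layer4"}),
}


def check_tiles(tiles):
    """Return a list of regions/planes not covered by exactly one net."""
    occupancy = defaultdict(list)  # (plane, region) -> nets placed there
    for net, (plane, regions) in tiles.items():
        for region in regions:
            occupancy[(plane, region)].append(net)

    all_regions = {r for _, regions in tiles.values() for r in regions}
    problems = []
    for region in sorted(all_regions):
        for plane in ("top", "bottom"):
            nets = occupancy[(plane, region)]
            if len(nets) != 1:
                problems.append(f"{region}/{plane}: {sorted(nets)}")
    return problems


if __name__ == "__main__":
    issues = check_tiles(TILES)
    print("consistent" if not issues else issues)
```

Under this assumed assignment, the five nets fill all sixteen plane positions (eight regions on two planes) without overlap, which is one way to see why two planes can suffice for five power domains.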

A top view of the tile-based power network as well as the decomposition of each power net is shown in Fig. 10. Four dc/dc converters and five power nets are included. The gray squares, numbered 1–4, represent the four layers in the voltage stacked system. The hatched squares, numbered 5–8, represent the four dc/dc converters in the DPP system. The dc/dc converters are oriented to ensure that each converter is next to the connected layer. The five tiles with increasing gray level represent power nets P1, G1, G2, G3, and G4. The numbers on the tiles indicate the horizontal location of each tile. Consider power net P1 as an example. The numbers 5 and 1 on this tile mean that the two squares within the tile are beneath converter 1 and layer 1, which are numbered, respectively, 5 and 1. The glowing effect around the tiles indicates the vertical location of each tile. A square without the glowing effect is located on the bottom layer of the two-layer power planes. A square with the glowing effect is located on the top layer of the two-layer power planes, as shown in Fig. 9. Each number has one square with the glowing effect and one square without the glowing effect, indicating that, for each converter square or stacked layer square, two power plane layers exist beneath the square. As discussed in Section III-B, the parasitic impedance between the stacked layer and converter produces significant power noise. To manage this challenge, the proposed tile-based power delivery network is designed to be scalable to a higher number of power planes, reducing the overall parasitic impedance within the DPP system.

Fig. 11. Tile-based power network matching the circuit model of the stack-to-bus topology.

Fig. 12. Voltage drops with a change of activity factor ranging from 10% to 50% for a tile-based system and a system without voltage regulators.

Consider an example of the G1 net utilizing the tile-based power delivery system. Three current paths flow within this power domain, as shown in Fig. 11: 1) the current loop between converter 2 and layer 2 for regulation; 2) the current from layer 2 to layer 3; and 3) the current loop between converter 3 and layer 3 for regulation. The tile-based power network matches quite well with the circuit model of the stack-to-bus topology, as discussed in Section III-B. In this case, parasitic extraction of the tile-based power delivery system can be directly integrated into circuit simulation and verification processes. The parasitic impedance is based on the square shape, which can be extracted by an EM solver [27]. Note that other power delivery systems that do not follow this proposed tile-based design methodology may introduce an additional parasitic impedance that is not characterized by the circuit model. The intuitive net separation and tile shape ease the design process and characterization of the parasitic impedance, making this tile-based design methodology a useful method for exploring different power delivery systems within a voltage stacked system. This design methodology can also be applied to the stack-to-stack topology.

The simulation framework discussed in Section III-C is applied to the proposed tile-based power delivery system. The parasitic extraction process is based on [27]. A comparison of the voltage drop within a tile-based system and a system without voltage regulators is shown in Fig. 12. With no change in the activity factor, the voltage drop of the two systems remains zero since the loads are balanced. If the activity factor of one of the layers in a voltage stacked system increases, the voltage in that layer becomes lower. The first group shown in Fig. 12 illustrates the voltage drop of a tile-based system, whereas the second group represents a system without voltage regulators. The tile-based system exhibits a much lower voltage drop with a change of activity factor ranging from 10% to 50%, as compared with the system without regulators.

TABLE III COMPARISON WITH OTHER VOLTAGE STACKED SYSTEMS

|           | Input voltage | Converter          | # of stacks | Voltage stacking level | Load imbalance | Power network parasitic impedance | Power network physical design |
|-----------|---------------|--------------------|-------------|------------------------|----------------|-----------------------------------|-------------------------------|
| [9]       | 4.1 V         | On/off-chip hybrid | 4           | On-chip                | N/A            | N/A                               | N/A                           |
| [11]      | 3.6 V         | Switched capacitor | 4           | On-chip                | 0 to 40 mA     | N/A                               | N/A                           |
| [14]      | 12 V          | Buck / boost       | 4           | Board level            | 8 to 14 A      | N/A                               | N/A                           |
| [15]      | N/A           | Multi-level ladder | 4           | Board level            | 0 to 500 mA    | N/A                               | N/A                           |
| [18]      | 3.2 V         | Switched capacitor | 4           | On-chip                | 40 to 44 A     | N/A                               | N/A                           |
| This work | 3.2 V         | Resonant converter | 4           | On-chip                | 8 to 80 A      | ✓ (S-S & S-VR)                    | ✓ (Tile-based)                |

TABLE IV STACK-TO-BUS DPP SYSTEM WITH A TILE-BASED POWER DELIVERY SYSTEM COMPOSED OF TEN POWER PLANES

Fig. 13. Maximum voltage drop as a function of the number of power planes in tile-based power delivery systems.

The tile-based power delivery system is also capable of handling a high load imbalance. A load imbalance ranging from 8 to 80 A is applied to the aforementioned simulation framework. A maximum voltage drop of 46 mV is exhibited, as listed in Table IV. Ten layers within the power plane of the package are assumed. Ten layers are a practical number within a package, typical in high-performance computing systems [19]. The maximum voltage drop significantly decreases with additional power planes, as shown in Fig. 13. Note that the layer number shown in Fig. 13 includes all of the power networks within the voltage stacked system. No additional power planes are required.
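A first-order intuition for the trend with additional power planes is that each extra two-layer tile set acts as a parallel path. The Python sketch below illustrates only that idealized scaling. It assumes identical plane pairs connected in parallel and ignores via resistance, mutual inductance, and current crowding, so it is not the extraction flow of [27] and does not reproduce the simulated values in Table IV or Fig. 13. The function name and the per-pair values in the example are hypothetical.

```python
def parallel_plane_parasitics(r_pair_mohm, l_pair_ph, n_planes):
    """Idealized R and L of n_planes power planes grouped into parallel pairs.

    Assumes identical two-layer tile sets in parallel, with no via
    resistance or mutual coupling (a deliberate simplification).
    """
    if n_planes < 2 or n_planes % 2 != 0:
        raise ValueError("n_planes must be an even number of at least 2")
    pairs = n_planes // 2
    return r_pair_mohm / pairs, l_pair_ph / pairs


if __name__ == "__main__":
    # Hypothetical per-pair parasitics, chosen only for illustration.
    for planes in (2, 4, 6, 8, 10):
        r, l = parallel_plane_parasitics(3.0, 8.0, planes)
        print(f"{planes:2d} planes: R = {r:.2f} mOhm, L = {l:.2f} pH")
```

Under these assumptions the parasitic impedance falls inversely with the number of plane pairs, so the largest improvement comes from the first few added pairs.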

A comparison between this work and state-of-the-art voltage stacked systems is listed in Table III. Voltage stacking can be applied both on-chip and off-chip. An *N* times current reduction through the BGA and bumps is noted if the stacks are on-chip. Note that the primary objective of this article is not converter design but rather to provide guidelines for the power delivery system within voltage stacked systems, supporting high currents and load imbalances between the layers. Due to the unique current paths within voltage stacked systems, the effects of the parasitic impedance of the power delivery system require reevaluation. A power integrity driven methodology for the physical design of the power delivery system is therefore necessary. These issues are the focus of this work.
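The *N* times current reduction noted above follows from delivering the same power at *N* times the voltage. The Python sketch below works through the four-layer case with the balanced 8 A per layer load and 0.8 V layer voltage used in this article. It assumes perfectly balanced layers and lossless delivery, and the function name is illustrative.

```python
def supply_current(total_power_w, v_layer, n_stacked=1):
    """Current drawn from the supply when n_stacked layers are in series.

    Assumes balanced layers and lossless power delivery.
    """
    return total_power_w / (v_layer * n_stacked)


if __name__ == "__main__":
    n_layers = 4
    i_layer = 8.0   # amperes per layer (balanced case)
    v_layer = 0.8   # volts per layer
    power = n_layers * i_layer * v_layer

    flat = supply_current(power, v_layer)               # all layers in parallel
    stacked = supply_current(power, v_layer, n_layers)  # layers in series
    print(f"power = {power:.1f} W, flat = {flat:.1f} A, stacked = {stacked:.1f} A")
```

With four layers the supply voltage becomes 3.2 V and the current through the package drops from 32 A to 8 A for the same 25.6 W.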

## V. CONCLUSION

The challenge of load imbalances in a high current, voltage stacked system is discussed in this article. The power delivery system for a voltage stacked DPP system is a primary issue. Note that the on-chip power grid cannot manage cross-core current paths within a voltage stacked system. Two topologies, stack-to-stack and stack-to-bus, for DPP systems are evaluated, demonstrating a challenging power delivery design process due to the effects of the parasitic impedance. Targeting these challenges, a design methodology for a tile-based power delivery system is proposed. This tile-based power delivery system is effective in mitigating certain challenges, such as the complex nature of a multidomain power network, the cross-core current paths flowing between layers, and the effects of the parasitic impedance on the DPP system. The tile-based design methodology provides an intuitive and efficient process for characterizing the impedance within a power delivery system in voltage stacked structures.

#### **REFERENCES**

- [1] T. Paul. (2018). [Online]. Available: https://www.nextplatform.com/2018/05/10/tearing-apart-googles-tpu-3-0-ai-coprocessor/
- [2] (2018). *Intel ARK (Product Specs)*. [Online]. Available: https://ark.intel.com/products/95830/Intel-Xeon-Phi-Processor-7295-16GB-1-50-GHz-72-core
- [3] S. K. Sadasivam, B. W. Thompto, R. Kalla, and W. J. Starke, "IBM Power9 processor architecture," *IEEE Micro*, vol. 37, no. 2, pp. 40–51, Mar. 2017.
- [4] (2018). *Nvidia*. [Online]. Available: https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf
- [5] (2017). *Radeon Technologies Group (AMD)*. [Online]. Available: https://www.amd.com/en
- [6] I. P. Vaisband, R. Jakushokas, M. Popovich, A. V. Mezhiba, S. Kose, and E. G. Friedman, *On-Chip Power Delivery and Management*, 4th ed. Springer, 2016.
- [7] E. Salman and E. G. Friedman, *High Performance Integrated Circuit Design*. New York, NY, USA: McGraw-Hill, 2012.
- [8] *International Roadmap for Devices and Systems*, IRDS Roadmap Teams, 2017.
- [9] A. Zou, J. Leng, X. He, Y. Zu, V. J. Reddi, and X. Zhang, "Efficient and reliable power delivery in voltage-stacked manycore system with hybrid charge-recycling regulators," in *Proc. 55th ACM/ESDA/IEEE Design Autom. Conf. (DAC)*, Jun. 2018, p. 43.
- [10] Q. Zhang, L. Lai, M. Gottscho, and P. Gupta, "Multi-story power distribution networks for GPUs," in *Proc. Design, Autom. Test Eur. Conf. Exhib. (DATE)*, 2016, pp. 451–456.
- [11] T. Tong, S. K. Lee, X. Zhang, D. Brooks, and G.-Y. Wei, "A fully integrated reconfigurable switched-capacitor DC-DC converter with four stacked output channels for voltage stacking applications," *IEEE J. Solid-State Circuits*, vol. 51, no. 9, pp. 2142–2152, Sep. 2016.
- [12] E. K. Ardestani *et al.*, "Managing mismatches in voltage stacking with CoreUnfolding," *ACM Trans. Archit. Code Optim.*, vol. 12, no. 4, pp. 43:1–43:26, Nov. 2015.
- [13] K. Blutman, H. Fatemi, A. Kapoor, A. B. Kahng, J. Li, and J. Pineda de Gyvez, "Logic design partitioning for stacked power domains," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 25, no. 11, pp. 3045–3056, Nov. 2017.
- [14] C. Schaef and J. T. Stauth, "Efficient voltage regulation for microprocessor cores stacked in vertical voltage domains," *IEEE Trans. Power Electron.*, vol. 31, no. 2, pp. 1795–1808, Feb. 2016.
- [15] K. Kesarwani, C. Schaef, C. R. Sullivan, and J. T. Stauth, "A multi-level ladder converter supporting vertically-stacked digital voltage domains," in *Proc. 28th Annu. IEEE Appl. Power Electron. Conf. Expo. (APEC)*, Mar. 2013, pp. 429–434.
- [16] P. S. Shenoy and P. T. Krein, "Differential power processing for DC systems," *IEEE Trans. Power Electron.*, vol. 28, no. 4, pp. 1795–1806, Apr. 2013.
- [17] K. T. Zhan *et al.*, "Serial power supply circuit virtual digital coin mining machine and computer server," China Patent CN 10 504 536 4A, 2016.
- [18] K. Xu and E. G. Friedman, "Challenges in high current on-chip voltage stacked systems," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, Oct. 2020, pp. 1–5.
- [19] K. Xu, B. Vaisband, G. Sizikov, X. Li, and E. G. Friedman, "Power noise and near-field EMI of high-current system-in-package with VR top and bottom placements," *IEEE Trans. Compon., Package. Manuf. Technol.*, vol. 9, no. 4, pp. 712–718, Apr. 2019.
- [20] L. Chang *et al.*, "A fully-integrated switched-capacitor 2:1 voltage converter with regulation capability and 90% efficiency at 2.3A/mm2," in *Proc. IEEE Symp. VLSI Circuits*, Jun. 2010, pp. 55–56.
- [21] A. V. Mezhiba and E. G. Friedman, "Scaling trends of on-chip power distribution noise," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 12, no. 4, pp. 386–394, Apr. 2004.
- [22] K. Xu, R. Patel, P. Raghavan, and E. G. Friedman, "Exploratory design of on-chip power delivery for 14, 10, and 7 nm and beyond FinFET ICs," *Integration*, vol. 61, pp. 11–19, Mar. 2018.
- [23] J. R. Black, "Electromigration—A brief survey and some recent results," *IEEE Trans. Electron Devices*, vol. ED-16, no. 4, pp. 338–347, Apr. 1969.
- [24] H. Jeong, H. Lee, Y.-C. Liu, and K. A. Kim, "Review of differential power processing converter techniques for photovoltaic applications," *IEEE Trans. Energy Convers.*, vol. 34, no. 1, pp. 351–360, Mar. 2019.
- [25] K. Xu, B. Vaisband, G. Sizikov, X. Li, and E. G. Friedman, "Distributed sinusoidal resonant converter with high step-down ratio," in *Proc. IEEE 26th Conf. Electr. Perform. Electron. Packag. Syst. (EPEPS)*, Oct. 2017, pp. 1–3.
- [26] M. A. El-Moursy and E. G. Friedman, "Shielding effect of on-chip interconnect inductance," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 13, no. 3, pp. 396–400, Mar. 2005.
- [27] K. Xu, E. G. Friedman, M. Popovich, and G. Sizikov, "Distributed port assignment for extraction of power delivery networks," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, Oct. 2020, pp. 1–4.



**Kan Xu** received the B.S. degree in electrical engineering from the North China University of Water Resources and Electric Power, Zhengzhou, China, in 2012, and the M.S. and Ph.D. degrees in electrical and computer engineering from the University of Rochester, Rochester, NY, USA, in 2015 and 2020, respectively.

His research interests include on-chip and package level power delivery networks, 3-D integration, and high-current HPC systems.



**Nurzhan Zhuldassov** received the B.S. degree in electrical and computer engineering from Nazarbayev University, Astana, Kazakhstan, in 2018, and the M.S. degree in electrical and computer engineering from the University of Rochester, Rochester, NY, USA, in 2019, where he is currently pursuing the Ph.D. degree in electrical and computer engineering.

His research interests include on-chip and package level power delivery networks and cryogenic operation of MOSFETs.



**Eby G. Friedman** (Fellow, IEEE) received the B.S. degree from the Lafayette College, Easton, PA, USA, and the M.S. and Ph.D. degrees from the University of California at Irvine, Irvine, CA, USA, all in electrical engineering.

He was with Hughes Aircraft Company, Glendale, CA, USA, from 1979 to 1991, rising to a Manager of the Signal Processing Design and Test Department, where he was responsible for the design and test of high-performance digital and analog ICs. He has been with the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY, USA, since 1991, where he is a Distinguished Professor and the Director of the High Performance VLSI/IC Design and Analysis Laboratory. He is also a Visiting Professor with the Technion-Israel Institute of Technology, Haifa, Israel. He has authored over 500 articles and book chapters and authored or edited 19 books in the fields of high-speed and low-power CMOS design techniques, 3-D design methodologies, high-speed interconnect, superconductive circuits, and the theory and application of synchronous clock and power distribution networks, and he holds 22 patents. His current research and teaching interests include high-performance synchronous digital and mixed-signal circuit design and analysis with application to high-speed portable processors, low-power wireless communications, and server farms.

Dr. Friedman is a Senior Fulbright Fellow, a National Sun Yat-sen University Honorary Chair Professor, and an Inaugural Member of the UC Irvine Engineering Hall of Fame. He was a recipient of the IEEE Circuits and Systems Mac Van Valkenburg Award, the IEEE Circuits and Systems Charles A. Desoer Technical Achievement Award, the University of Rochester Graduate Teaching Award, and the College of Engineering Teaching Excellence Award. He was the Editor-in-Chief and the Chair of the Steering Committee of the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS and the *Microelectronics Journal*, a Regional Editor of the *Journal of Circuits, Systems and Computers*, an editorial board member of numerous journals, and a program and technical chair of several IEEE conferences.