# Clock Distribution Models of 3-D Integrated Systems

Ioannis Savidis, Vasilis Pavlidis, and Eby G. Friedman

Department of Electrical and Computer Engineering

University of Rochester

Rochester, NY 14627

[iosavid, friedman]@ece.rochester.edu, vasileios.pavlidis@epfl.ch

Abstract—Clock distribution topologies in a three-tier 3-D integrated circuit are explored. Models of three different clock topologies are applied to determine the root to leaf delay. The models incorporate the impedance of the 3-D via between planes based on closed-form expressions of the resistance, inductance, and capacitance of a through silicon via (TSV). The resulting modeled delays are compared to experimental data. Good agreement between simulation and experimental data is achieved.

# I. INTRODUCTION

The era of rapid technology scaling has brought revolutionary advancements in systems level integration. A potential technology that continues the evolution towards gigascale complexity is three-dimensional integration.

Three-dimensional integration is a novel technology of growing importance with the potential to offer significant performance and functional benefits as compared to conventional 2-D ICs [1]. 3-D integration provides enhanced interconnectivity, a high device integration density, a reduction in the number and length of the long global wires, and the potential to combine disparate heterogeneous technologies [2]. The primary technological innovation required to exploit the benefits of 3-D integration is the through silicon via (TSV). Models characterizing the electrical behavior of both single [3] and bundled [4] TSVs have been developed.

Circuit level design of 3-D integrated systems is a topic of great urgency, and circuit synchronization is one critical component of this design. Delivering the clock signal is more complex in 3-D ICs, as sequential elements synchronized by the same clock signal can be located on multiple planes. In addition, since the clock network dissipates a significant portion of the total power consumed by a synchronous circuit [5], the design of a 3-D clock distribution network is further constrained by greater power density and related thermal concerns.

Symmetric interconnect structures, such as H- and X-trees, are often utilized to distribute the clock signal across a circuit [6]. The symmetry of these structures permits the clock signal to arrive at the leaves of the tree at approximately the same time, resulting in reduced clock skew between leaves. Maintaining this symmetry within a 3-D circuit is a significantly more complex task that requires additional design resources.

In this paper, models of three different clock distribution topologies are presented. The root to leaf delay on each plane of a three-plane 3-D integrated circuit is also determined. A brief description of the test circuit and fabrication technology is provided in Section II. An overview of the closed-form expressions that characterize the electrical properties of TSVs to propagate the clock signal between planes is presented in Section III. The clock distribution models are discussed in Section IV. The resulting root to leaf delay (for the leaves on each of the three planes) for the three different clock topologies is presented in Section V. A comparison of the modeled results with experimental data from a 3-D test circuit manufactured by MIT Lincoln Labs (MITLL) is also described in Section V. Finally, some concluding remarks are provided in the final section of the paper.

# II. OVERVIEW OF 3-D TEST CIRCUIT

The test circuit consists of three blocks. Each block includes the same logic circuit but implements a different clock distribution architecture. The total area of the test circuit is 3 mm  $\times$  3 mm, where each block occupies an approximate area of 1 mm<sup>2</sup>. Each block contains about 30,000 transistors with a power supply voltage of 1.5 volts.

The manufacturing process developed by MITLL for fully depleted silicon-on-insulator (FDSOI) 3-D circuits is summarized here [7], [8]. The MITLL process is a wafer level 3-D integration technology with up to three FDSOI wafers bonded to form a 3-D circuit. The diameter of the wafers is 150 mm. The minimum feature size of the devices is 180 nm, with one polysilicon layer and three metal layers interconnecting the devices on each wafer. A backside metal layer also exists on the upper two planes, providing the starting and landing pads for the TSVs, and the I/O, power supply, and ground pads for the entire 3-D circuit. An attractive feature of this process is the high density of the TSVs, which have dimensions of 1.75 $\mu$m $\times$ 1.75 $\mu$m.

# III. CLOSED-FORM EXPRESSIONS OF TSV ELECTRICAL PARAMETERS

Closed-form expressions of the TSV resistance, inductance, and capacitance are presented in this section. The TSV diameter, length, and plane-to-plane distance are based on the 3-D vias manufactured by MITLL. The resulting resistance, inductance, and capacitance are compared to numerical simulation, and listed in Table I.

The resistance of a 3-D via is [3]

$$R_{DC} = \frac{1}{\sigma_{w}} \frac{\mathfrak{L}}{\pi \mathfrak{R}^{2}},\tag{1}$$

$$R_{1GHz} = \begin{cases} \alpha \frac{1}{\sigma_{w}} \frac{\mathfrak{L}}{\pi[\mathfrak{R}^{2} - (\mathfrak{R} - \delta)^{2}]}, & \text{if } \delta < \mathfrak{R} \qquad (2) \\ \alpha \frac{1}{\sigma_{w}} \frac{\mathfrak{L}}{\pi\mathfrak{R}^{2}}, & \text{if } \delta \ge \mathfrak{R}, \qquad (3) \end{cases}$$

where $\mathfrak{R}$ and $\mathfrak{L}$ are the TSV radius and length, respectively, and $\sigma_w$ is the conductivity of tungsten. The skin depth $\delta$ reduces the effective cross-sectional area of the 3-D via. An empirical constant $\alpha$ is used to fit the 1 GHz resistance to the simulation data, and is based on the physical parameters $\mathfrak{L}$ and diameter $D$. The skin depth $\delta$ is provided in (4), and $\alpha$ is provided in (5) and (6).

This research is supported in part by the National Science Foundation under Contract Nos. CCF-0541206, CCF-0811317, and CCF-0829915, grants from the New York State Office of Science, Technology & Academic Research to the Center for Advanced Technology in Electronic Imaging Systems, and by grants from Intel Corporation, Eastman Kodak Company, and Freescale Semiconductor Corporation.

Fig. 1. Frequencies for which the closed-form inductance expressions are valid.

$$\delta = \frac{1}{\sqrt{\pi f \mu_o \sigma_w}},\tag{4}$$

$$\alpha = \begin{cases} 0.0472 D^{0.2831} \ln\left(\frac{\mathfrak{L}}{D}\right) + 2.4712 D^{-0.269}, & \text{if } \delta < \mathfrak{R} \qquad (5) \\ 0.0091 D^{1.0806} \ln\left(\frac{\mathfrak{L}}{D}\right) + 1.0518 D^{0.092}, & \text{if } \delta \ge \mathfrak{R}. \qquad (6) \end{cases}$$

For frequencies other than DC and 1 GHz, the resistance described by (1)-(3) is adjusted by (7),

$$R_{f_{new}} = (R_{1GHz} - R_{DC}) \sqrt{\frac{f_{new}}{f_{1GHz}}} + R_{DC}. \tag{7}$$
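The resistance expressions (1)-(7) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The tungsten conductivity `SIGMA_W` is an assumed bulk value that is not stated in the text, and the diameter $D$ in the fitting constant is assumed to be expressed in micrometers. The function names are hypothetical.

```python
import math

MU_0 = 4e-7 * math.pi   # permeability of free space (H/m)
SIGMA_W = 1.79e7        # ASSUMED bulk tungsten conductivity (S/m), not from the text


def skin_depth(freq_hz, sigma=SIGMA_W):
    """Skin depth from (4), in meters."""
    return 1.0 / math.sqrt(math.pi * freq_hz * MU_0 * sigma)


def r_dc(length_m, radius_m, sigma=SIGMA_W):
    """DC resistance from (1), in ohms."""
    return length_m / (sigma * math.pi * radius_m ** 2)


def alpha_fit(length_m, radius_m, delta_m):
    """Fitting constant from (5) and (6); D is ASSUMED to be in micrometers."""
    d_um = 2.0 * radius_m * 1e6
    aspect = length_m / (2.0 * radius_m)
    if delta_m < radius_m:
        return 0.0472 * d_um ** 0.2831 * math.log(aspect) + 2.4712 * d_um ** -0.269
    return 0.0091 * d_um ** 1.0806 * math.log(aspect) + 1.0518 * d_um ** 0.092


def r_1ghz(length_m, radius_m, sigma=SIGMA_W):
    """1 GHz resistance from (2) and (3), in ohms."""
    delta = skin_depth(1e9, sigma)
    alpha = alpha_fit(length_m, radius_m, delta)
    if delta < radius_m:
        # Current is confined to an annulus one skin depth thick.
        area = math.pi * (radius_m ** 2 - (radius_m - delta) ** 2)
    else:
        area = math.pi * radius_m ** 2
    return alpha * length_m / (sigma * area)


def r_at(freq_hz, length_m, radius_m, sigma=SIGMA_W):
    """Resistance at another frequency from (7), in ohms."""
    rdc = r_dc(length_m, radius_m, sigma)
    r1g = r_1ghz(length_m, radius_m, sigma)
    return (r1g - rdc) * math.sqrt(freq_hz / 1e9) + rdc
```

For a via with a radius of 1 $\mu$m and a length of 8.5 $\mu$m, calling `r_dc(8.5e-6, 1e-6)` and `r_1ghz(8.5e-6, 1e-6)` exercises the $\delta \ge \mathfrak{R}$ branch, since the skin depth of tungsten at 1 GHz exceeds the via radius under the assumed conductivity.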

The inductance of a 3-D via is described by (8)-(11) [3]. These four equations express the self-inductance $(L_{11})$ and mutual inductance $(L_{21})$ of a via at both DC and high frequency. The expressions for the high frequency inductance describe the asymptotic value of the inductance. The range of frequencies for which the closed-form inductance expressions are valid is depicted in Figure 1. The DC and $f_{asym}$ self-inductance of a TSV are described, respectively, by (8) and (10), while the DC and $f_{asym}$ mutual inductance of a TSV are described, respectively, by (9) and (11).

$$\text{DC}: \begin{cases} L_{11} = \alpha \frac{\mu_o}{2\pi} \left[ \ln\left(\frac{\mathfrak{L} + \sqrt{\mathfrak{L}^2 + \mathfrak{R}^2}}{\mathfrak{R}}\right)\mathfrak{L} + \mathfrak{R} - \sqrt{\mathfrak{L}^2 + \mathfrak{R}^2} + \frac{\mathfrak{L}}{4} \right], & (8) \\ L_{21} = \beta \frac{\mu_o}{2\pi} \left[ \ln\left(\frac{\mathfrak{L} + \sqrt{\mathfrak{L}^2 + P^2}}{P}\right)\mathfrak{L} + P - \sqrt{\mathfrak{L}^2 + P^2} \right], & (9) \end{cases}$$

$$f_{asym}: \begin{cases} L_{11} = \alpha \frac{\mu_o}{2\pi} \left[ \ln\left(\frac{2\mathfrak{L}}{\mathfrak{R}}\right) - 1 \right] \mathfrak{L}, & (10) \\ L_{21} = \beta \frac{\mu_o}{2\pi} \left[ \ln\left(\frac{\mathfrak{L} + \sqrt{\mathfrak{L}^2 + P^2}}{P}\right)\mathfrak{L} + P - \sqrt{\mathfrak{L}^2 + P^2} \right]. & (11) \end{cases}$$

The inductance expressions are dependent on the length $\mathfrak{L}$ and radius $\mathfrak{R}$ of the TSV. The radius is replaced by the pitch $P$ for the expressions characterizing the mutual inductance $L_{21}$ between two TSVs. The $\alpha$ parameter is used to adjust the partial self-inductance, and approaches unity at DC and 0.94 at high frequencies with increasing aspect ratio $\frac{\mathfrak{L}}{D}$. The $\beta$ parameter, used to adjust the partial mutual inductance, is unity at DC and ranges between 0.49 and 0.93 at high frequencies with increasing aspect ratio [3]. The $\alpha$ parameter is described by (12) and (13), and the $\beta$ parameter by (14) and (15). Each parameter is determined at DC and $f_{asym}$.

$$\alpha = \begin{cases} 1 - e^{\frac{-4.3\mathfrak{L}}{D}}, & \text{if } f = \text{DC}, \quad (12) \\ 0.94 + 0.52e^{-10\left|\frac{\mathfrak{L}}{D} - 1\right|}, & \text{if } f > f_{asym}, \quad (13) \end{cases}$$

$$\beta = \begin{cases} 1, & \text{if } f = \text{DC}, \quad (14) \\ 0.1535 ln(\frac{\mathfrak{L}}{D}) + 0.592, & \text{if } f > f_{asym}. \quad (15) \end{cases}$$
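The inductance expressions (8)-(15) can be evaluated with the short Python sketch below. It is an illustration under stated assumptions rather than the authors' code. The high frequency self-inductance is taken in the dimensionally consistent form $[\ln(2\mathfrak{L}/\mathfrak{R}) - 1]\mathfrak{L}$, and the exponent of the high frequency $\alpha$ is taken as $-10\,|\mathfrak{L}/D - 1|$. The function names are hypothetical, and all lengths are in meters.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space (H/m)


def _geometry_term(length, spacing):
    """Bracketed term shared by (8), (9), and (11), in meters."""
    root = math.sqrt(length ** 2 + spacing ** 2)
    return math.log((length + root) / spacing) * length + spacing - root


def self_inductance(length, radius, high_freq=False):
    """Partial self-inductance L11 from (8) at DC or (10) at f_asym, in henries."""
    aspect = length / (2.0 * radius)
    if high_freq:
        alpha = 0.94 + 0.52 * math.exp(-10.0 * abs(aspect - 1.0))   # (13)
        geom = (math.log(2.0 * length / radius) - 1.0) * length
    else:
        alpha = 1.0 - math.exp(-4.3 * aspect)                       # (12)
        geom = _geometry_term(length, radius) + length / 4.0
    return alpha * MU_0 / (2.0 * math.pi) * geom


def mutual_inductance(length, radius, pitch, high_freq=False):
    """Partial mutual inductance L21 from (9) at DC or (11) at f_asym, in henries."""
    aspect = length / (2.0 * radius)
    beta = 0.1535 * math.log(aspect) + 0.592 if high_freq else 1.0  # (15), (14)
    return beta * MU_0 / (2.0 * math.pi) * _geometry_term(length, pitch)
```

A call such as `self_inductance(8.5e-6, 1e-6)` or `mutual_inductance(8.5e-6, 1e-6, 5e-6, high_freq=True)` returns a value in the picohenry range. The results need not reproduce the entries of Table I exactly, since the dimensions behind those entries may differ slightly from the rounded values used here.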

The capacitance of a bulk 3-D via is [3]

$$C = \alpha \beta \cdot \frac{\varepsilon_{SiO2}}{t_{diel} + \frac{\varepsilon_{SiO2}}{\varepsilon_{Si}} x_{dT_p}} 2\pi \mathfrak{R} \mathfrak{L}, \tag{16}$$

where $\mathfrak{R}$ is the radius of the TSV, $\mathfrak{L}$ is the TSV length, $t_{diel}$ is the thickness of the dielectric surrounding the 3-D via, $x_{dT_p}$ is the depth of the depletion region of p-type silicon, and $\varepsilon_{SiO2}$ and $\varepsilon_{Si}$ are, respectively, the electrical permittivity of silicon dioxide and silicon. The depth of the depletion region depends on the p-type silicon work function $\phi_{f_p}$, where $n_i$ is the intrinsic semiconductor concentration, $N_A$ is the silicon doping concentration ($10^{21} \text{ m}^{-3}$), and $V_{th}$ is the thermal voltage. The expressions for $x_{dT_p}$ and $\phi_{f_p}$ are provided, respectively, in (17) and (18).

$$x_{dT_p} = \sqrt{\frac{4\varepsilon_{Si}\phi_{f_p}}{qN_A}},\tag{17}$$

$$\phi_{f_p} = V_{th} \ln\left(\frac{N_A}{n_i}\right). \tag{18}$$

The $\alpha$ and $\beta$ fitting parameters adjust the capacitance for two physical factors: 1) the distance to the ground plane ($S_{gnd}$ in (19)), and 2) the diminishing contribution of the upper portion of the 3-D via to the total capacitance relative to a ground plane below the via. These fitting parameters are

$$\alpha = (-0.0351 \frac{\mathfrak{L}}{D} + 1.5701) S_{gnd}^{0.0111 \frac{\mathfrak{L}}{D} - 0.1997}, \qquad (19)$$

$$\beta = 5.8934 D^{-0.553} \left(\frac{\mathfrak{L}}{D}\right)^{-(0.0031D+0.43)}. \tag{20}$$
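The capacitance expressions (16)-(20) combine as in the following Python sketch, offered as an illustration rather than the authors' implementation. Several inputs are assumptions that do not appear in the text: the intrinsic concentration `N_I`, the thermal voltage at about 300 K, the relative permittivities of silicon and silicon dioxide, and the use of micrometers for $D$ and $S_{gnd}$ in the fitting parameters. The dielectric thickness and ground plane distance are left as arguments because the text does not give them, and the function names are hypothetical.

```python
import math

Q = 1.602e-19            # electron charge (C)
EPS_0 = 8.854e-12        # permittivity of free space (F/m)
EPS_SI = 11.7 * EPS_0    # ASSUMED relative permittivity of silicon: 11.7
EPS_SIO2 = 3.9 * EPS_0   # ASSUMED relative permittivity of silicon dioxide: 3.9
V_TH = 0.0259            # ASSUMED thermal voltage at about 300 K (V)
N_I = 1.5e16             # ASSUMED intrinsic carrier concentration (m^-3)
N_A = 1e21               # doping concentration from the text (m^-3)


def depletion_depth(n_a=N_A, n_i=N_I):
    """Depletion depth x_dTp from (17) and (18), in meters."""
    phi_fp = V_TH * math.log(n_a / n_i)                   # (18)
    return math.sqrt(4.0 * EPS_SI * phi_fp / (Q * n_a))   # (17)


def fit_alpha(length_um, diameter_um, s_gnd_um):
    """Fitting parameter from (19); lengths ASSUMED in micrometers."""
    aspect = length_um / diameter_um
    return (-0.0351 * aspect + 1.5701) * s_gnd_um ** (0.0111 * aspect - 0.1997)


def fit_beta(length_um, diameter_um):
    """Fitting parameter from (20); lengths ASSUMED in micrometers."""
    aspect = length_um / diameter_um
    return 5.8934 * diameter_um ** -0.553 * aspect ** -(0.0031 * diameter_um + 0.43)


def tsv_capacitance(length_um, radius_um, t_diel_um, s_gnd_um):
    """TSV capacitance from (16), in farads."""
    diameter_um = 2.0 * radius_um
    # Oxide and depletion capacitances in series, per unit area.
    per_area = EPS_SIO2 / (t_diel_um * 1e-6 + (EPS_SIO2 / EPS_SI) * depletion_depth())
    area = 2.0 * math.pi * (radius_um * 1e-6) * (length_um * 1e-6)
    return (fit_alpha(length_um, diameter_um, s_gnd_um)
            * fit_beta(length_um, diameter_um) * per_area * area)
```

With the stated doping concentration and the assumed constants, `depletion_depth()` evaluates to slightly less than 1 $\mu$m. A call such as `tsv_capacitance(8.5, 1.0, 0.1, 10.0)` uses hypothetical values for the dielectric thickness and ground plane distance and is only meant to show how the terms combine.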

The electrical parameters describing a TSV (listed in Table I) are based on the 3-D via structures in the MITLL fabricated test circuit. The TSV characteristic parameters are  $\Re = 1 \ \mu m$ ,  $\mathfrak{L} = 8.5 \ \mu m$ , and  $P = 5 \ \mu m$ . The resulting resistance, inductance, and capacitance are used in the clock distribution network models discussed in Section IV. The equivalent electrical circuit of a TSV is shown in Figure 2.

TABLE I COMPARISON OF NUMERICAL SIMULATIONS AND ANALYTIC EXPRESSIONS OF THE TSV ELECTRICAL PARAMETERS.

| Electrical parameter                | Numerical simulation | Analytic expressions | % Error |
|-------------------------------------|----------------------|----------------------|---------|
| DC resistance (m$\Omega$)           | 148                  | 154                  | 4.1     |
| 1 GHz resistance (m$\Omega$)        | 166                  | 177                  | 6.6     |
| DC self-inductance (pH)             | 3.9                  | 3.9                  | 0       |
| $f_{asym}$ self-inductance (pH)     | 2.9                  | 3.1                  | 6.9     |
| DC mutual inductance (pH)           | 1.40                 | 1.32                 | -5.7    |
| $f_{asym}$ mutual inductance (pH)   | 1.10                 | 1.08                 | -1.8    |
| Capacitance (fF)                    | 1.43                 |                      |         |



Fig. 2. Structure of TSV and equivalent electrical model.

# IV. CLOCK DISTRIBUTION MODELS

Several clock network topologies for 3-D ICs are described and modeled in this section. These architectures combine different topologies common in 2-D circuits, such as H-trees, rings, and meshes [6]. Each of the three blocks includes a different clock distribution structure, schematically illustrated in Figure 3. The dashed lines depict vertical interconnects implemented by groups of through silicon vias. Multiple TSVs at the connection points between the clock networks are used to lower the resistance of the vertical path while enhancing reliability.
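As a first-order illustration of why groups of TSVs are used, and neglecting mutual coupling between the vias (an assumption made only for this example), $n$ identical TSVs in parallel present a resistance of $R_{TSV}/n$. Taking the analytic 1 GHz resistance from Table I and a group of four TSVs, with $R_{group}$ a name introduced here for the combined resistance:

$$R_{group} \approx \frac{R_{1GHz}}{n} = \frac{177\ \text{m}\Omega}{4} \approx 44\ \text{m}\Omega.$$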

As shown in Figure 3, these topologies range from purely symmetric to highly asymmetric networks. The effect that these topological choices have on the clock delay is modeled and experimentally verified. Since the clock signal is distributed in three dimensions, achieving equidistant signal propagation in a 3-D system is not straightforward. This task is further complicated by the different impedance characteristics of the vertical and horizontal interconnects. The symmetry of an H-tree topology and the load balancing characteristics of rings and meshes are thereby exploited.

In each of the circuit blocks, the clock driver for the entire clock network is located on the second plane. The location of the clock driver is chosen to ensure that the clock signal propagates through identical vertical interconnect paths to the first and third planes, ideally resulting in the same delay. The clock driver is implemented with a traditional chain of tapered buffers [9], [10]. Additionally, buffers are inserted at the leaves of each H-tree in all three topologies. The width of the branches within the H-tree is halved at each branch point [11], with an initial width of 8  $\mu$ m.

The architectures employed in the blocks are:

Block A: All of the planes contain a four level H-tree (equivalent to 16 leaves) with identical interconnect characteristics. All of the H-trees are connected through a group of TSVs at the output of the clock driver. Note that in Figure 3(a) the H-tree on the second plane is rotated by  $90^{\circ}$  with respect to the H-trees on the other two planes. This rotation eliminates inductive coupling between the H-trees. All of the H-trees are shielded with two parallel lines connected to ground.

Block B: A four level H-tree is included in the second plane. All of the leaves of this H-tree are connected by four TSVs to small local rings on the first and third planes, as illustrated in Figure 3(b). As in Block A, the H-tree is shielded with two parallel lines connected to ground. Additional interconnect resources form local rings.

Block C: The clock distribution network for the second plane is a shielded four level H-tree. Two global rings are utilized for the other two planes, as shown in Figure 3(c). Buffers are inserted to drive each ring, which are connected by TSVs to the four branch points on the second level of the H-tree. The rings on planes A and C are connected to the second level of the H-tree to avoid an unnecessarily long ring that would result in a significant capacitive load, and to maintain a ring with sides of equal length. Additionally, connecting the ring to the leaves at the perimeter of the H-tree would result in a considerable difference in the load among the sinks of the tree, since only the outer leaves would be connected to the ring.

Fig. 3. Three 3-D clock distribution networks: (a) H-trees, (b) H-tree and local rings, and (c) H-tree and global rings. $T_{C_A}$, $T_{C_B}$, and $T_{C_C}$ are the root-to-leaf delays for a clock signal propagating from the clock source to the leaves of planes A, B, and C, respectively.

The electrical characteristics of the clock distribution network on each plane are determined through numerical simulation. Trend lines for the capacitance, DC resistance, 1 GHz resistance, DC self- and mutual inductance, and the asymptotic self- and mutual inductance approximate the electrical parameters of different length interconnect segments within the clock network. These simulations include two ground return paths spaced 2  $\mu$ m from either side of the clock line. These return paths behave as ground for the electrical field lines emanating from the clock line, resulting in a more accurate estimate of the capacitance.

The electrical path of the clock signals propagating from the root to the leaves of each plane for the H-tree clock topology (see Figure 3(a)) is depicted in Figure 4. The size of the source follower NMOS transistor and the dimensions of the clock buffers at the root, leaves, and output circuitry are listed in Table II. The clock network on each plane is composed of 50 $\mu$m segments, where a $\pi$-model represents the electrical properties of each segment. These 50 $\mu$m segments model the distributive electrical properties of the interconnect. Similarly, when either meshes (Figure 3(b)) or rings (Figure 3(c)) are used on planes A and C (see Figure 4), each 50 $\mu$m segment is replaced with an equivalent $\pi$-model to more accurately represent the single mesh and ring structure within the test circuit. Note that for the mesh structures, the clock signal is distributed to planes A and C from the leaves of the H-tree in plane B. For the rings topology, the clock signal distributed to planes A and C is driven by buffers at the second level of the H-tree. Both the global rings and local meshes are square topologies with lengths of 500 and 200 $\mu$m, respectively. These lengths are composed of 50 $\mu$m long segments. Each segment is replaced with an equivalent $\pi$-model for a line width of 4 $\mu$m. The source follower NMOS transistor located in the output circuitry has a length of 180 nm and a width of 12 $\mu$m. The interconnect connecting the output pads and circuitry to the leaves on each of the three device planes varies in length from 0 to 150 $\mu$m depending upon the clock topology (with a line width of 2 $\mu$m), and is also modeled by an equivalent $\pi$-network. The delay from the root to the leaves of each plane is determined from these models.

Fig. 4. Structure of clock path from Figure 3(a) to model clock skew. The number within each oval is the number of parallel TSVs between device planes.
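The segmentation into 50 $\mu$m $\pi$-models can be illustrated with a simple delay estimate. The sketch below uses the Elmore RC approximation, which is a substitute chosen for brevity: it ignores the inductance included in the models of this paper and is not the simulation used to produce the reported delays. The per-segment resistance and capacitance, the driver resistance, and the load capacitance are hypothetical inputs, as is the function name.

```python
def elmore_delay(n_seg, r_seg, c_seg, r_drv=0.0, c_load=0.0):
    """Elmore delay of a line built from n_seg identical pi-segments.

    Each pi-segment has a series resistance r_seg with half of c_seg at
    either end, so interior nodes carry a full c_seg and the two end
    nodes carry c_seg / 2. The far end also carries c_load.
    """
    node_caps = [c_seg / 2.0] + [c_seg] * (n_seg - 1) + [c_seg / 2.0 + c_load]
    delay = 0.0
    for k, cap in enumerate(node_caps):
        # Resistance shared between the source and node k.
        delay += (r_drv + k * r_seg) * cap
    return delay


# One 500 um side of a global ring is ten 50 um segments.
# The values below are HYPOTHETICAL and serve only as placeholders.
side_delay = elmore_delay(n_seg=10, r_seg=0.5, c_seg=10e-15,
                          r_drv=50.0, c_load=20e-15)
```

Without a driver resistance or load, the estimate reduces to half the product of the total line resistance and total line capacitance, which is the familiar result for a distributed RC line.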

 TABLE II

 Transistor width of the clock buffers at the root, leaves, and output circuitry (all lengths are 180 nm).

| Buffer location  |          | $W_N \ (\mu m)$ | $W_P ~(\mu m)$ |
|------------------|----------|-----------------|----------------|
| Root             | Buffer 1 | 20              | 50             |
|                  | Buffer 2 | 54              | 136            |
| Leaf             | Buffer 1 | 15              | 38             |
|                  | Buffer 2 | 15              | 38             |
| Output circuitry | Buffer 1 | 2.5             | 7              |
|                  | Buffer 2 | 2.5             | 7              |

# V. COMPARISON OF MODEL WITH EXPERIMENTAL DATA

The clock delay of the three different 3-D clock distribution topologies is reviewed in this section. The modeled delay from the root to the leaves of each plane, and the resulting percent error as compared to experimental data, are listed in Table III for each topology. Good agreement between the model and experimental data is demonstrated, with an error of less than 10% for all clock paths except one. This one significant discrepancy between the model and experimental data, an error of 36.9%, occurs in the bottom device plane (plane A) for the local ring topology. This larger error is due to an inaccurate estimate of the capacitive load on the clock drivers at the leaves of this plane. The capacitance is not easily extracted due to a non-systematic approach to placing the decoupling capacitors, and related interplane capacitive coupling.

TABLE III MODELED CLOCK DELAY FROM THE ROOT TO THE LEAVES OF EACH PLANE FOR EACH BLOCK, AND PERCENT ERROR BETWEEN MODELED AND EXPERIMENTAL CLOCK DELAY.

| Clock distribution network | $t_A$ [ns] | $t_B$ [ns] | $t_C$ [ns] | $t_A$ error [%] | $t_B$ error [%] | $t_C$ error [%] |
|----------------------------|------------|------------|------------|-----------------|-----------------|-----------------|
| H-tree: Figure 3(a)        | 0.359      | 0.355      | 0.351      | 0               | -4.9            | -7.4            |
| Local rings: Figure 3(b)   | 0.325      | 0.323      | 0.321      | 36.9            | 8.4             | -6.5            |
| Global rings: Figure 3(c)  | 0.340      | 0.295      | 0.272      | 1.5             | 5.7             | 9.1             |
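The modeled delays in Table III also give the spread in root-to-leaf delay across the three planes for each topology. The short Python sketch below computes this interplane spread from the tabulated values. The variable and function names are hypothetical, and the result describes the modeled delays only, not the measured ones.

```python
# Modeled root-to-leaf delays (t_A, t_B, t_C) in nanoseconds, from Table III.
modeled_delay_ns = {
    "H-tree": (0.359, 0.355, 0.351),
    "Local rings": (0.325, 0.323, 0.321),
    "Global rings": (0.340, 0.295, 0.272),
}


def interplane_spread_ps(delays_ns):
    """Largest difference between plane delays, in picoseconds."""
    return round((max(delays_ns) - min(delays_ns)) * 1e3, 1)


spread_ps = {name: interplane_spread_ps(d) for name, d in modeled_delay_ns.items()}
```

The modeled spread is 8 ps for the H-tree topology, 4 ps for the local rings, and 68 ps for the global rings.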

# VI. CONCLUSIONS

The design of a clock distribution network for application to 3-D circuits is considerably more complex than the design of a 2-D clock distribution network. Three topologies to globally distribute a clock signal within a 3-D circuit have been evaluated. Clock delay models incorporating both numerical simulation and analytic expressions produce results comparable to clock delay measurements experimentally extracted from a fabricated 3-D test circuit, exhibiting less than 10% error for all but one clock path.

# REFERENCES

- [1] V. F. Pavlidis and E. G. Friedman, "Interconnect-Based Design Methodologies for Three-Dimensional Integrated Circuits," *Proceedings of the IEEE*, Vol. 97, No. 1, pp. 123–140, January 2009.
- [2] V. F. Pavlidis and E. G. Friedman, *Three-Dimensional Integrated Circuit Design*, Morgan Kaufmann, 2009.
- [3] I. Savidis and E. G. Friedman, "Closed-Form Expressions of 3-D Via Resistance, Inductance, and Capacitance," *IEEE Transactions on Electron Devices*, Vol. 56, No. 9, pp. 1873–1881, September 2009.
- [4] R. Weerasekera, System Interconnection Design Trade-offs in Three-Dimensional Integrated Circuits, Ph.D. thesis, KTH School of Information and Communication Technologies, Sweden, December 2008.
- [5] D. W. Bailey and B. J. Benschneider, "Clocking Design and Analysis for a 600-MHz Alpha Microprocessor," *IEEE Journal of Solid-State Circuits*, Vol. 33, No. 11, pp. 1627–1633, November 1998.
- [6] E. G. Friedman, "Clock Distribution Networks in Synchronous Digital Integrated Circuits," *Proceedings of the IEEE*, Vol. 89, No. 5, pp. 665–692, May 2001.
- [7] MIT Lincoln Laboratories, Cambridge, MITLL Low-Power FDSOI CMOS Process Design Guide, 2006.
- [8] J. A. Burns, B. F. Aull, C. K. Chen, C.-L. Chen, C. L. Keast, J. M. Knecht, V. Suntharalingam, K. Warner, P. W. Wyatt, and D.-R. W. Yost, "A Wafer-Scale 3-D Circuit Integration Technology," *IEEE Transactions on Electron Devices*, Vol. 53, No. 10, pp. 2507–2515, October 2006.
- [9] N. Hedenstierna and K. O. Jeppson, "CMOS Circuit Speed and Buffer Optimization," *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 6, No. 2, pp. 270–281, March 1987.
- [10] B. S. Cherkauer and E. G. Friedman, "A Unified Design Methodology for CMOS Tapered Buffers," *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 3, No. 1, pp. 99–111, March 1995.
- [11] H. B. Bakoglu, Circuits, Interconnections, and Packaging for VLSI, Addison-Wesley, 1990.