# Asynchronous Dynamic Single-Flux Quantum Majority Gates

Gleb Krylov, Student Member, IEEE, and Eby G. Friedman, Fellow, IEEE

Abstract—Among the major issues in modern large-scale rapid single-flux quantum (RSFQ) circuits are the complexity of the clock network, tight timing tolerances, poor applicability of existing CMOS-based design algorithms, and extremely deep pipelines, which reduce the effective clock frequency. In this article, asynchronous dynamic single-flux quantum majority gates are proposed to solve some of these problems. The proposed logic gates exhibit high bias margins and do not require significant area or a large number of Josephson junctions as compared to existing RSFQ logic gates. These gates exhibit a tradeoff among the input skew tolerance, clock frequency, and bias margins. Asynchronous logic gates greatly reduce the complexity of the clock network in large-scale RSFQ circuits, thereby alleviating certain timing issues and reducing the required bias currents. Furthermore, asynchronous logic allows existing design algorithms to utilize CMOS approaches for synthesis, verification, and testability. The adoption of majority logic in complex RSFQ circuits also reduces the pipeline depth, enabling higher clock speeds in very large scale integration RSFQ circuits.

*Index Terms*—Single-flux quantum (SFQ), superconductive digital electronics, superconductive integrated circuits.

## I. INTRODUCTION

THE slowdown in scaling of CMOS technology has led to increasing interest in beyond CMOS computing technologies, including cryogenic and superconductive electronics. Rapid single-flux quantum (RSFQ) [1] is a well established and widely used logic family in superconductive electronics [2]. The increased interest [3] combined with recent advances in fabrication technology for single-flux quantum (SFQ) circuits [4] provides enhanced complexity while requiring novel design algorithms, methodologies, and tools specifically targeting large scale RSFQ circuits [5], [6].

Distribution of the clock signals in very large scale integration (VLSI) RSFQ circuits is an important issue [7]. A distinctive feature of RSFQ circuits is clocked logic gates. In SFQ technology, the data are represented by the presence or absence of an SFQ pulse within a clock period. In logic gates, the input pulses are processed by switching Josephson junctions (JJs), and an output pulse is produced based on the target logic function. Most RSFQ logic gates require a clock signal either to release the output pulse or to reinitialize the state of the gate to process the next datum. A large scale circuit utilizing these gates requires a complex clock network. RSFQ circuits are capable of operating at extremely high clock frequencies (up to hundreds of gigahertz [8]), resulting in narrow timing tolerances.

Manuscript received August 7, 2019; revised November 8, 2019; accepted January 14, 2020. Date of publication March 4, 2020; date of current version May 5, 2020. This work was supported by the Department of Defense Agency, Intelligence Advanced Research Projects Activity, through the U.S. Army Research Office under Contract W911NF-17-9-0001. This paper was recommended by Associate Editor H. Rogalla. (*Corresponding author: Gleb Krylov.*)

The authors are with the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627 USA (e-mail: gleb.krylov@rochester.edu; friedman@ece.rochester.edu).

Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TASC.2020.2978428

Multiple synchronization approaches exist for reducing the effects of timing on circuit operation. These approaches range from globally asynchronous locally synchronous techniques [9], [10] to fully asynchronous structures [11], [12] and dual rail logic families [13]. These approaches, however, introduce significant performance and area overhead as compared to conventional synchronous RSFQ circuits, partially negating the primary incentives of RSFQ technology.

Another distinctive feature that adversely affects the increasing integration of RSFQ circuits is limited fanout. Typical RSFQ logic gates and flip flops exhibit a fanout of one. A splitter gate is used to deliver a signal to multiple inputs. This property necessitates large splitter trees that can dominate the area of a complex circuit. Insertion of these splitter trees into a logic path increases the delay of the path, significantly reducing the effective clock frequency of the overall system—degrading a primary advantage of RSFQ circuits. RSFQ clock networks consist mostly of splitters, decreasing the area available for logic. Novel solutions are, therefore, necessary to either support larger fanout or to reduce the number of splitters.

Existing RSFQ circuits typically utilize AND/OR/NOT logic with clocked AND and OR gates and, therefore, exhibit these aforementioned issues. An alternative functionally complete logic set is the MAJ/INV set, consisting of majority gates and inverters. Logic synthesis utilizing MAJ/INV logic has been shown to reduce logic depth, power, and delay in both CMOS benchmarks [14] and beyond CMOS technologies [15].
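The functional completeness of the MAJ/INV set can be illustrated with a short Python sketch. The helper names `maj3` and `inv` are hypothetical and purely logical; they model only the Boolean function, not any circuit described in this article. The check shows that tying one majority input to a constant yields a two-input AND or OR, and that the majority function is self-dual.

```python
from itertools import product


def maj3(a: int, b: int, c: int) -> int:
    """Three-input majority: 1 when at least two of the inputs are 1."""
    return int(a + b + c >= 2)


def inv(a: int) -> int:
    """Logical inversion of a single bit."""
    return 1 - a


# Tying one input to a constant recovers two-input AND and OR.
for a, b in product((0, 1), repeat=2):
    assert maj3(a, b, 0) == (a & b)
    assert maj3(a, b, 1) == (a | b)

# Majority is self-dual: inverting every input inverts the output.
for a, b, c in product((0, 1), repeat=3):
    assert maj3(inv(a), inv(b), inv(c)) == inv(maj3(a, b, c))
```

Because AND, OR, and NOT can all be expressed this way, any AND/OR/NOT netlist can in principle be rewritten with majority gates and inverters.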

No efficient RSFQ majority gate has currently been described in the literature. A majority gate is typically composed of a combination of AND/OR/NOT gates and requires a clock signal with a network of multiple splitters. This topology exhibits a large overhead in both area and delay and is, therefore, infeasible in complex circuits. In other SFQ logic families, such as quantum flux parametron (QFP) [16] and reciprocal quantum logic [17], majority gates exist and are widely used [18], [19].

1051-8223 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

See https://www.ieee.org/publications/rights/index.html for more information.



Fig. 1. DSFQ storage loops. (a) Traditional dynamic loop. (b) Novel dynamic loop introduced in [20].



Fig. 2. Loop current (a) within a loop with a single time constant (dashed line) and (b) within a loop with dual time constants (solid line).

In this article, novel asynchronous RSFQ majority gates are proposed based on a recently introduced dynamic SFQ (DSFQ) logic family [20]. In Section II, a discussion of DSFQ is provided, and the benefits of DSFQ storage loops are discussed. In Section III, circuit design issues and related parameters for DSFQ loops are discussed. In Section IV, majority gates are introduced, the operation of these gates is described, and margin characteristics are presented. In Section V, the application of majority gates to VLSI RSFQ circuits is discussed. Finally, Section VI concludes this article.

## II. DSFQ STORAGE LOOPS

A novel family of asynchronous RSFQ logic gates—DSFQ gates—is discussed in this section. DSFQ [20] is a recently introduced type of RSFQ logic, where the state of a gate is temporarily stored, and the gate self-resets to the initial state after a period of time. This capability enables asynchronous operation, significantly reducing the size and energy requirements of complex clock networks in large scale RSFQ circuits. DSFQ circuits are more similar to CMOS than regular RSFQ circuits and can be separated into sequential and combinatorial logic [20], enabling the use of relevant CMOS design techniques and algorithms.

The primary features of DSFQ are self-resetting storage loops, as shown in Fig. 1. In conventional RSFQ, logic gates contain storage loops, which temporarily store a state based upon the input pulses received by the gate. The state of these storage loops is typically reset by either the clock pulse (e.g., DFF) or by another input pulse (e.g., Muller C element). In the absence of a reset pulse, logic gates can store a state indefinitely. This capability greatly complicates the application of certain CMOS techniques to RSFQ circuits and exacerbates flux trapping effects, while increasing the complexity of the clock network. Self-resetting storage loops have been used for many years and utilize either resistors or overdamped JJs inserted within the loops for flux leakage [21]–[23]. These loops exhibit a narrow timing tolerance due to a nearly constant rate of flux leakage, controlled by a single time constant. An example of this type of dynamic loop is shown in Fig. 1(a). A recently introduced DSFQ storage loop [20], shown in Fig. 1(b), utilizes a critically damped JJ in parallel with a JJ in series with a resistor to reset the state. This topology produces two different time constants, for hold and reset, combining the slow leakage process during the hold period with fast relaxation to the initial state. A comparison of the current stored within a dynamic loop with single and double time constants, exhibiting a twofold increase in hold time, is shown in Fig. 2.

Novel DSFQ logic gates [20] combine the self-resetting storage loops with existing RSFQ circuits to achieve AND and OR logic functions. As the inversion operations can be propagated to the boundary of the combinatorial logic clouds [24] and integrated into registers with complementary outputs, these gates produce a functionally complete set.

In the following section, operation of DSFQ storage loops is described. The effects of different circuit parameters are also discussed.

## III. CIRCUIT DESIGN OF DYNAMIC LOOPS

In this section, the operation and circuit components of DSFQ loops are described. The effects of these components on the operation of the logic gate are discussed, and guidelines for these parameters are provided.

DSFQ storage loops require a leakage mechanism for the magnetic flux to escape a superconductive loop. In the DSFQ loop shown in Fig. 1(b), a combination of JJs and resistors is used. Initially, upon arrival of an input fluxon, the loop current is distributed between the two branches of the storage loop. The fraction of total current passing through resistor $R_L$ produces a small voltage $V_L$ (~15 $\mu$V) at the common node $n_1$. This current is dissipative and contributes to the initial slow leakage of flux. The voltage $V_L$ produces a gradual increase in the phase of the junction $J_H$, setting the hold time. This increase in phase redistributes the loop current between the branches, eventually producing a $2\pi$ phase change in junction $J_H$, dissipating any remaining flux, and resetting the loop to the initial state.

Fig. 3. Dependence of $\tau_H$ on $R_L$ and $I_C(J_H)$.
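The link between the small voltage $V_L$ and the gradual phase increase of $J_H$ can be estimated with the standard Josephson voltage-phase relation, $d\varphi/dt = 2\pi V/\Phi_0$. The sketch below is only a first-order estimate: it assumes that $V_L$ stays constant at the ~15 $\mu$V quoted above, which is a simplification, since the voltage in the actual loop changes as the current redistributes. The function name `phase_advance` is hypothetical.

```python
import math

PHI_0 = 2.067833848e-15  # magnetic flux quantum h/(2e), in webers


def phase_advance(voltage_v: float, duration_s: float) -> float:
    """Phase gained (radians) by a junction held at a constant voltage.

    Uses the Josephson relation d(phi)/dt = 2*pi*V/Phi_0 and assumes the
    voltage does not change over the interval (a simplifying assumption).
    """
    return 2.0 * math.pi * voltage_v * duration_s / PHI_0


v_l = 15e-6  # ~15 uV, the value quoted in the text

rate_per_ps = phase_advance(v_l, 1e-12)
print(f"phase rate at constant V_L: {rate_per_ps:.4f} rad/ps")
print(f"phase gained over 30 ps:    {phase_advance(v_l, 30e-12):.2f} rad")
```

Under this assumption, the phase advances by a few hundredths of a radian per picosecond, so tens of picoseconds are needed for an appreciable phase change. This time scale is consistent in order of magnitude with the hold times discussed in this section.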

Two major parameters characterizing a dynamic loop are the hold time $\tau_H$ and reset time $\tau_R$. The hold time is set by the voltage $V_L$ and is, therefore, affected by the resistance $R_L$ and critical current $I_C(J_H)$ of junction $J_H$. A larger $R_L$ and $I_C(J_H)$ reduce the fraction of current in the leaking branch, thereby increasing $\tau_H$. The dependence of $\tau_H$ on $R_L$ and $I_C(J_H)$ is shown in Fig. 3. While the same $\tau_H$ can be produced by different combinations of $R_L$ and $I_C(J_H)$, a larger $R_L$ for the same $\tau_H$ more quickly resets the gate to the initial state (with less current remaining within the loop after the same time), as shown in Fig. 4. A larger $R_L$ is, therefore, generally preferable for dynamic loops.

The junction  $J_D$  serves two purposes in DSFQ circuits. The nonlinear inductance of  $J_D$  initially affects the distribution of current between the two branches of the dynamic loop. At the time of reset, this junction contributes to resetting the loop. The effect of the critical current  $I_C(J_D)$  on  $\tau_H$  and  $\tau_R$  is depicted in Fig. 5. Although a larger  $I_C(J_D)$  increases  $\tau_H$ , it also increases  $\tau_R$  at a faster rate. It is, therefore, preferable to use the smallest possible  $I_C(J_D)$  to decrease the reset time.

## IV. MAJORITY GATES

In this section, novel majority gates are proposed, and an example circuit configuration of a gate suitable for use in large scale RSFQ/DSFQ circuits is described. A margin analysis is also presented, and tradeoffs between the dynamic hold time and maximum clock frequency are discussed.

Fig. 4. Dependence of $\tau_R$ on $R_L$ for similar $\tau_H$ (~45 to ~90 ps). (a) $R_L = 0.2\ \Omega$. (b) $R_L = 0.4\ \Omega$.

Fig. 5. Timing behavior of $I_C(J_D)$ in DSFQ loops. (a) Dependence of $\tau_H$ on $I_C(J_D)$ (solid line). (b) Dependence of $\tau_R$ on $I_C(J_D)$ (dashed line).

A proposed three-input DSFQ majority gate is shown in Fig. 6. The gate consists of three DSFQ loops, as described in Section III,  $J_1 - J_4$ ,  $J_2 - J_4$ , and  $J_3 - J_4$ , with large storage inductors,  $L_A$ ,  $L_B$ , and  $L_C$ . These loops share the common junction  $J_4$ . Individual loop currents combined with the bias current  $I_B$  contribute to the total current through  $J_4$ .

The proposed gate operates in a similar manner to the DSFQ AND gate [20]. Individual input pulses are temporarily stored in corresponding dynamic loops in the form of loop currents, and an output pulse is produced by  $J_4$  upon the arrival of a second pulse within the hold time window  $\tau_H$ . Individual loop currents decay shortly after this time, resetting the gate to the initial state. A third pulse, in the case of a "111" input, is stored for the same time  $\tau_H$ . Other pulses cannot be accepted during this time, corresponding to the traditional CMOS setup time [25].
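The firing condition described above can be captured in a purely behavioral Python sketch. This is an abstraction under stated assumptions, not a circuit simulation: each loop is assumed to hold its pulse for exactly $\tau_H$ and then vanish instantly, while the gate delay and the reset time $\tau_R$ are ignored. The function name `dsfq_majority_output` and its arguments are hypothetical.

```python
from typing import Optional, Sequence


def dsfq_majority_output(arrivals_ps: Sequence[Optional[float]],
                         tau_h_ps: float) -> Optional[float]:
    """Return the time (ps) at which the shared junction would fire, or None.

    arrivals_ps holds one entry per input: the pulse arrival time, or None
    when that input carries a logical 0 in this cycle.
    """
    n = len(arrivals_ps)
    threshold = n // 2 + 1  # 2 of 3 inputs, 3 of 5 inputs, ...
    times = sorted(t for t in arrivals_ps if t is not None)
    # Slide over the sorted pulses: the gate fires on the pulse that
    # completes a group of `threshold` pulses spanning at most tau_h.
    for i in range(len(times) - threshold + 1):
        last = times[i + threshold - 1]
        if last - times[i] <= tau_h_ps:
            return last
    return None


# Two pulses 5 ps apart with a 30 ps hold time: the gate fires on the second.
print(dsfq_majority_output([0.0, 5.0, None], tau_h_ps=30.0))
# The second pulse arrives after the first loop has decayed: no output.
print(dsfq_majority_output([0.0, 40.0, None], tau_h_ps=30.0))
```

The same threshold rule extends to an odd number of inputs greater than three, which is why the function derives the threshold from the number of inputs.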

A simulation of the operation of the proposed gate is depicted in Fig. 7. The circuits are simulated in WRSpice [26] based on the junction parameters of the MIT LL SFQ5ee process [27], with a JJ capacitance of 70 fF/$\mu$m<sup>2</sup>. The clock frequency is 10 GHz, and the input data skew is 5 ps. Parameters of the dynamic loops are tuned to produce a 30 ps hold time, which is the maximum input skew for this parameter set. The average output delay of a three-input majority gate is 5.5 ps.

Fig. 6. DSFQ three-input majority gate. $I_C(J_1, J_2, J_3) = 120\ \mu\text{A}$, critically damped. $L_1$, $L_2$, and $L_3$ = 9.5 pH (the input JJs and part of the inductance are shared with the input JTLs). $I_C(J_{L1}, J_{L2}, J_{L3}) = 50\ \mu\text{A}$, unshunted. $I_C(J_{H1}, J_{H2}, J_{H3}) = 106\ \mu\text{A}$, critically damped. $R_{L1}$, $R_{L2}$, and $R_{L3}$ = 1.4 $\Omega$. $I_C(J_4) = 160\ \mu\text{A}$, critically damped. $I_B = 82\ \mu\text{A}$.

Fig. 7. Operation of the proposed three-input majority gate. A, B, and C are inputs and D is the output.

A layout of the proposed majority gate in the Hypres 4.5 kA/cm<sup>2</sup> technology [28] is depicted in Fig. 8. The parasitic inductances have been extracted using InductEx [29]. The primary effect of these parasitic inductances resulting from the layout is a small increase in the inductance of the corner loops as compared to the central loop. In addition, the inductance of the corner loops exhibits a small variation between the different branches. This variation produces a different hold time between different branches of the gate. The largest variation in inductance for a three-input majority gate is 0.3 pH. For a more compact layout of the gate enabled by more advanced fabrication processes, this variation can be larger, reducing the bias margins. Wide conductors should therefore be used to connect the storage loops to $J_4$.

Fig. 8. Layout of a three-input majority gate in the Hypres 4.5 kA/cm<sup>2</sup> technology [28].

The simulated bias current margins for the proposed gate are  $\pm 20\%$ . A higher (lower) bias current increases (decreases) the hold time, causing timing violations, which places limits on the upper and lower bias margins—a notable tradeoff for the proposed gates. The delay and bias margins are for a manually optimized gate: these characteristics can be further enhanced.

Note that DSFQ gates also exhibit small variations in bias margins for different arrival times of the input pulses due to the dependence of the hold time on the bias current. In the simulations, all of the input pulses arrive within a 25 ps window, where the largest skew between individual inputs is 20 ps. For a 10 ps input skew, the bias margins are larger by 2 to 3%.

Another important tradeoff existing in the proposed gates is between the input skew tolerance and the maximum clock frequency. For high-frequency operation, it is desirable to decrease the hold time, allowing the gates to reset faster to accept a new set of inputs. This choice reduces the tolerance of the gate to the input skew. The frequency of operation can be increased if the input skew tolerance and target bias margins are relaxed.

Similarly, it is possible to increase the number of input loops for the proposed gate, producing a majority gate with five, seven, or more inputs. The five-input majority gate is schematically shown in Fig. 9. Operation of this gate is depicted in Fig. 10.

The extracted parasitic inductances of the five-input gate exhibit a small variation similar to that of the three-input gate. This small variation is a result of the wide conductors that connect the dynamic storage loops to $J_4$, which are used because of the large area of the gate. A more compact layout of a five (or more) input gate in a more advanced technology is likely to exhibit greater variations in inductance as compared to a three-input gate, reducing the bias margins. These gates, although larger and less robust, increase the flexibility and benefits of majority gate logic.



Fig. 9. DSFQ five-input majority gate.



Fig. 10. Operation of the proposed five-input majority gate. A, B, C, D, and E are inputs and F is the output.

## V. APPLICATIONS AND ADVANTAGES

Possible applications for the proposed majority gates and majority-based RSFQ circuits are discussed in this section. The benefits of this approach and compatibility with energy efficient RSFQ (ERSFQ) are also discussed.

The primary purpose of the proposed gates is to enable majority-based large scale RSFQ circuits. Majority logic has been shown to reduce power and area in CMOS circuits and emerging beyond CMOS technologies [15]. Due to the need for splitter trees to support fanout, RSFQ technology is sensitive to a large logic depth, greatly increasing the delay and thereby lowering the frequency of operation in combinatorial logic. Majority logic circuits partially alleviate this issue, increasing the performance of large scale RSFQ circuits.

The reduction in logic depth is offset by greater area and delay of the proposed gates, making this approach particularly suitable for specific classes of logic functions—in particular, nested expressions. Consider the following Boolean function

$$Y = A * (B + C * (D + E * (F + G * H))).$$
(1)

This function in conventional RSFQ AND/OR two-input logic consists of four AND gates and three OR gates and exhibits a logic depth of seven. A majority-inverter graph (MIG) representation [30] of this function is shown in Fig. 11(a), with the same logic depth. This majority-based expression can be optimized using the properties of majority logic [24] as follows:

$$Y = MAJ_3(A, 0, MAJ_3(B, 1, X))$$
(2)

$$X = MAJ_3(C, 0, MAJ_3(D, 1, Z))$$
(3)

$$Z = MAJ_3(E, 0, MAJ_3(F, 1, MAJ_3(G, 0, H)))$$
(4)

$$\begin{aligned} Z &= MAJ_3(E, 0, MAJ_3(F, E, MAJ_3(G, 0, H))) \\ &= MAJ_3(E, MAJ_3(F, E, 0), MAJ_3(G, H, 0)) \end{aligned}$$
(5)

$$\begin{aligned} X &= MAJ_3(C, 0, MAJ_3(D, C, Z)) \\ &= MAJ_3(MAJ_3(C, D, 0), C, Z) \end{aligned}$$
(6)

$$\begin{aligned} Y &= MAJ_3(A, 0, MAJ_3(B, A, X)) \\ &= MAJ_3(MAJ_3(A, B, 0), A, X). \end{aligned}$$
(7)

The resulting MIG, depicted in Fig. 11(b), is functionally equivalent to (1) and exhibits a logic depth of four. Specific degenerate majority gates in this graph, where one of the inputs is set to zero, can be mapped to regular DSFQ AND gates [20], further reducing the area.
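The equivalence of the original and optimized majority expressions to (1) can be checked exhaustively over all 256 input combinations. The sketch below is a plain truth-table comparison; the function names are hypothetical, and `maj3` models only the Boolean majority function.

```python
from itertools import product


def maj3(a: int, b: int, c: int) -> int:
    """Three-input majority: 1 when at least two of the inputs are 1."""
    return int(a + b + c >= 2)


def y_reference(a, b, c, d, e, f, g, h):
    """Eq. (1) in AND/OR form."""
    return a & (b | (c & (d | (e & (f | (g & h))))))


def y_mig_original(a, b, c, d, e, f, g, h):
    """Eqs. (2)-(4): direct translation with constant inputs, depth seven."""
    z = maj3(e, 0, maj3(f, 1, maj3(g, 0, h)))
    x = maj3(c, 0, maj3(d, 1, z))
    return maj3(a, 0, maj3(b, 1, x))


def y_mig_optimized(a, b, c, d, e, f, g, h):
    """Second lines of eqs. (5)-(7): optimized form, depth four."""
    z = maj3(e, maj3(f, e, 0), maj3(g, h, 0))
    x = maj3(maj3(c, d, 0), c, z)
    return maj3(maj3(a, b, 0), a, x)


mismatches = [
    bits for bits in product((0, 1), repeat=8)
    if not (y_reference(*bits) == y_mig_original(*bits) == y_mig_optimized(*bits))
]
print(f"mismatching input vectors: {len(mismatches)}")
```

In the optimized form, inputs A, C, and E each feed two gates, and four of the seven gates have one input tied to zero.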

A comparison of different circuits producing (1) is listed in Table I. For RSFQ, the Stony Brook cell library [31] is used to estimate the number of JJs, while the delay characteristics are based on [32]. For DSFQ, a delay of 3.5 ps [20] is used for the AND gate, while a delay of the DSFQ OR gate is assumed to be 5.5 ps—equal to the delay of the confluence buffer from [32, Fig. 3].

For (1), majority-based DSFQ logic produces a smaller output delay (by 33%) due to the reduced logic depth with an area overhead of 25% as compared to regular DSFQ. By replacing degenerate majority gates with DSFQ AND gates, the area overhead is reduced to 10%, while the timing characteristics are further improved with an over 50% decrease in delay.

Conventional RSFQ logic gates are clocked. A multigigahertz clock network [7] is necessary to support operation of VLSI SFQ circuits. This network adds significant area in both the JJ and wiring layers, dissipates additional power, and is a major source of added complexity for related design algorithms. The proposed majority gates do not require a clock signal. Asynchronous combinatorial logic is therefore an attractive solution.



Fig. 11. MIG for (1). (a) Original. (b) Optimized.

 TABLE I

 COMPARISON OF DIFFERENT LOGIC TYPES FOR (1)

| Logic type          | # of gates | # of JJs | Logic depth | Minimal delay, ps | Additional cost                    |
|---------------------|------------|----------|-------------|-------------------|------------------------------------|
| RSFQ [1]            | 7          | 80       | 7           | 56                | 7 splitters for clock tree (21 JJ) |
| DSFQ [20]           | 7          | 55       | 7           | 30.5              |                                    |
| DSFQ Majority logic | 7          | 70       | 4           | 22                | 3 splitters for inputs (9 JJ)      |
| DSFQ AND/MAJ        | 7          | 61       | 4           | 20                | 3 splitters for inputs (9 JJ)      |
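The minimal delay entries for the three DSFQ rows of Table I can be reproduced by summing per-gate delays along the critical path. The sketch below assumes that the path delay is a simple sum of the gate delays quoted in the text (3.5 ps for the DSFQ AND gate, 5.5 ps for the DSFQ OR gate, and the 5.5 ps average delay of the three-input majority gate), with interconnect and splitter delays neglected. The variable and function names are hypothetical.

```python
T_AND_PS = 3.5  # DSFQ AND gate delay, quoted from [20]
T_OR_PS = 5.5   # DSFQ OR gate delay, as assumed in the text
T_MAJ_PS = 5.5  # average three-input majority gate delay (Section IV)


def path_delay_ps(gate_delays):
    """Sum of gate delays along a path; wiring delay is neglected."""
    return sum(gate_delays)


# AND/OR form of (1): alternating AND and OR gates, logic depth of seven.
dsfq_and_or = [T_AND_PS, T_OR_PS] * 3 + [T_AND_PS]
# Optimized MIG: four majority gates on the critical path.
dsfq_majority = [T_MAJ_PS] * 4
# Degenerate first-level majority gate replaced by a DSFQ AND gate.
dsfq_and_maj = [T_AND_PS] + [T_MAJ_PS] * 3

for name, path in [("DSFQ", dsfq_and_or),
                   ("DSFQ majority logic", dsfq_majority),
                   ("DSFQ AND/MAJ", dsfq_and_maj)]:
    print(f"{name}: {path_delay_ps(path)} ps")
```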

Asynchronous RSFQ majority logic resembles conventional CMOS circuits by providing a boundary between the combinatorial logic and the sequential logic [33] and, therefore, facilitates the reuse of existing CMOS design algorithms, methodologies, and techniques [34]. Tools and methodologies developed for majority-based emerging technologies, including QFP logic [35], can also be partially reused.

The proposed gates require reasonable area and number of JJs, similar to clocked RSFQ gates. This area can be further reduced by sharing the input JJs ($J_1$, $J_2$, and $J_3$ in Fig. 6) and some of the storage inductance ($L_1$, $L_2$, and $L_3$ in Fig. 6) with the preceding JTLs.

DSFQ majority logic, like DSFQ in general, is self-resetting. This feature supports the initial reset of the circuit and reduces some of the possible complications due to flux trapping. Flux trapped in holes or moats located close to the gates can couple to storage loops [36]. Self-resetting of these loops can mitigate this effect. While the flux trapped within the JJs will not be affected, this issue is not significant in modern small junctions [37].

The proposed gates are fully compatible with ERSFQ bias schemes [38]. The junction at the bias point ( $J_4$  in Fig. 6) only switches once during a clock cycle. The average gate voltage, therefore, never exceeds the voltage of the bias bus produced by the feeding JTL.
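As a rough numerical illustration of this bound, assume the standard relation between the switching rate of a junction and its time-averaged voltage, together with the 10 GHz clock frequency used in the simulations of Section IV. A junction that emits at most one SFQ pulse per clock period then satisfies

$$\bar{V}(J_4) \le \Phi_0 f_{clk} \approx 2.07\ \text{mV} \cdot \text{ps} \times 10\ \text{GHz} \approx 20.7\ \mu\text{V}.$$

Here $\bar{V}(J_4)$ and $f_{clk}$ are symbols introduced only for this estimate. The upper limit is reached when an output pulse is produced in every clock cycle.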

## VI. CONCLUSION

Novel asynchronous RSFQ majority gates are introduced based on the recently proposed DSFQ logic topology. These gates utilize self-resetting storage loops to perform the majority function and reset to the initial state without requiring a clock signal. The proposed gates exhibit wide parameter margins and are capable of high-frequency operation. Majority gates in large scale DSFQ-based RSFQ circuits reduce the pipeline depth, increase performance, and simplify the design process. The use of asynchronous logic gates greatly simplifies and reduces the size of the clock network and enables the use of certain CMOS algorithms and techniques by providing a well defined boundary between the combinatorial and sequential logic.

## ACKNOWLEDGMENT

The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.

## REFERENCES

- [1] K. K. Likharev and V. K. Semenov, "RSFQ logic/memory family: A new Josephson-junction technology for sub-terahertz-clock-frequency digital systems," *IEEE Trans. Appl. Supercond.*, vol. 1, no. 1, pp. 3–28, Mar. 1991.

- [2] I. I. Soloviev, N. V. Klenov, S. V. Bakurskiy, M. Y. Kupriyanov, A. L. Gudkov, and A. S. Sidorenko, "Beyond Moore's technologies: Operation principles of a superconductor alternative," *Beilstein J. Nanotechnol.*, vol. 8, pp. 2689–2710, Nov. 2017.
- [3] M. A. Manheimer, "Cryogenic computing complexity program: Phase 1 introduction," *IEEE Trans. Appl. Supercond.*, vol. 25, no. 3, Jun. 2015, Art. no. 1301704.
- [4] S. K. Tolpygo *et al.*, "Advanced fabrication processes for superconductor electronics: Current status and new developments," *IEEE Trans. Appl. Supercond.*, vol. 29, no. 5, Aug. 2019, Art. no. 1102513.
- [5] K. Gaj, Q. P. Herr, V. Adler, A. Krasniewski, E. G. Friedman, and M. J. Feldman, "Tools for the computer-aided design of multigigahertz superconducting digital circuits," *IEEE Trans. Appl. Supercond.*, vol. 9, no. 1, pp. 18–38, Mar. 1999.
- [6] C. J. Fourie, "Digital superconducting electronics design tools—Status and roadmap," *IEEE Trans. Appl. Supercond.*, vol. 28, no. 5, Aug. 2018, Art. no. 1300412.
- [7] K. Gaj, E. G. Friedman, and M. J. Feldman, "Timing of multigigahertz rapid single flux quantum digital circuits," *J. VLSI Signal Process. Syst. Signal, Image Video Technol.*, vol. 16, no. 2, pp. 247–276, Jun. 1997.
- [8] W. Chen, A. V. Rylyakov, V. Patel, J. E. Lukens, and K. K. Likharev, "Rapid single flux quantum T-flip flop operating up to 770 GHz," *IEEE Trans. Appl. Supercond.*, vol. 9, no. 2, pp. 3212–3215, Jun. 1999.
- [9] R. N. Tadros and P. A. Beerel, "A robust and self-adaptive clocking technique for SFQ circuits," *IEEE Trans. Appl. Supercond.*, vol. 28, no. 7, Oct. 2018, Art. no. 1301211.
- [10] G. Krylov and E. G. Friedman, "Globally asynchronous, locally synchronous clocking and shared interconnect for large-scale SFQ systems," *IEEE Trans. Appl. Supercond.*, vol. 29, no. 5, Aug. 2019, Art. no. 3603205.
- [11] Y. Nobumori *et al.*, "Design and implementation of a fully asynchronous SFQ microprocessor: SCRAM2," *IEEE Trans. Appl. Supercond.*, vol. 17, no. 2, pp. 478–481, Jun. 2007.
- [12] Z. J. Deng, N. Yoshikawa, S. R. Whiteley, and T. Van Duzer, "Data-driven self-timed RSFQ digital integrated circuit and system," *IEEE Trans. Appl. Supercond.*, vol. 7, no. 2, pp. 3634–3637, Jun. 1997.
- [13] M. Maezawa, I. Kurosawa, M. Aoyagi, H. Nakagawa, Y. Kameda, and T. Nanya, "Rapid single-flux-quantum dual-rail logic for asynchronous circuits," *IEEE Trans. Appl. Supercond.*, vol. 7, no. 2, pp. 2705–2708, Jun. 1997.
- [14] L. Amarú, P. Gaillardon, and G. De Micheli, "Majority-inverter graph: A new paradigm for logic optimization," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 35, no. 5, pp. 806–819, May 2016.
- [15] E. Testa, M. Soeken, L. G. Amarú, and G. De Micheli, "Logic synthesis for established and emerging computing," *Proc. IEEE*, vol. 107, no. 1, pp. 165–184, Jan. 2019.
- [16] M. Hosoya *et al.*, "Quantum flux parametron: A single quantum flux device for Josephson supercomputer," *IEEE Trans. Appl. Supercond.*, vol. 1, no. 2, pp. 77–89, Jun. 1991.
- [17] Q. P. Herr, A. Y. Herr, O. T. Oberg, and A. G. Ioannidis, "Ultra-low-power superconductor logic," *J. Appl. Phys.*, vol. 109, no. 10, May 2011, Art. no. 103903.
- [18] K. Inoue, N. Takeuchi, K. Ehara, Y. Yamanashi, and N. Yoshikawa, "Simulation and experimental demonstration of logic circuits using an ultra-low-power adiabatic quantum-flux-parametron," *IEEE Trans. Appl. Supercond.*, vol. 23, no. 3, Jun. 2013, Art. no. 1301105.

- [19] A. L. Braun, "Large fan-in RQL gates," U.S. Patent 10,171,087, Jan. 1, 2019.
- [20] S. V. Rylov, "Clockless dynamic SFQ and gate with high input skew tolerance," *IEEE Trans. Appl. Supercond.*, vol. 29, no. 5, Aug. 2019, Art. no. 1300805.
- [21] A. Silver, R. Phillips, and R. Sandell, "High speed non-latching SQUID binary ripple counter," *IEEE Trans. Magn.*, vol. 21, no. 2, pp. 204–207, Mar. 1985.
- [22] O. A. Mukhanov and A. F. Kirichenko, "A superconductive high-resolution time-to-digital converter," in *Proc. Int. Supercond. Electron. Conf.*, Jun. 1999, pp. 353–355.
- [23] S. B. Kaplan, A. F. Kirichenko, O. A. Mukhanov, and S. Sarwana, "A prescaler circuit for a superconductive time-to-digital converter," *IEEE Trans. Appl. Supercond.*, vol. 11, no. 1, pp. 513–516, Mar. 2001.
- [24] L. Amarú, P. Gaillardon, A. Chattopadhyay, and G. De Micheli, "A sound and complete axiomatization of majority-n logic," *IEEE Trans. Comput.*, vol. 65, no. 9, pp. 2889–2895, Sep. 2016.
- [25] E. Salman and E. G. Friedman, *High Performance Integrated Circuit Design*. New York, NY, USA: McGraw-Hill, 2012.
- [26] S. R. Whiteley, "Josephson junctions in SPICE3," *IEEE Trans. Magn.*, vol. 27, no. 2, pp. 2902–2905, Mar. 1991.
- [27] S. K. Tolpygo *et al.*, "Advanced fabrication processes for superconducting very large-scale integrated circuits," *IEEE Trans. Appl. Supercond.*, vol. 26, no. 3, Apr. 2016, Art. no. 1100110.
- [28] D. Yohannes, S. Sarwana, S. K. Tolpygo, A. Sahu, Y. A. Polyakov, and V. K. Semenov, "Characterization of HYPRES' 4.5 kA/cm<sup>2</sup> & 8 kA/cm<sup>2</sup> Nb/AlO<sub>x</sub>/Nb fabrication processes," *IEEE Trans. Appl. Supercond.*, vol. 15, no. 2, pp. 90–93, Jun. 2005.
- [29] C. J. Fourie, O. Wetzstein, T. Ortlepp, and J. Kunert, "Three-dimensional multi-terminal superconductive integrated circuit inductance extraction," *Supercond. Sci. Technol.*, vol. 24, no. 12, Nov. 2011, Art. no. 125015.
- [30] L. Amarú, P. Gaillardon, S. Mitra, and G. De Micheli, "New logic synthesis as nanotechnology enabler," *Proc. IEEE*, vol. 103, no. 11, pp. 2168–2195, Nov. 2015.
- [31] P. Bunyk, K. Likharev, and D. Zinoviev, "RSFQ technology: Physics and devices," *Int. J. High Speed Electron. Syst.*, vol. 11, pp. 257–305, Mar. 2001.
- [32] D. Amparo, M. Çelik, S. Nath, J. P. Cerqueira, and A. Inamdar, "Timing characterization for RSFQ cell library," *IEEE Trans. Appl. Supercond.*, vol. 29, no. 5, Aug. 2019, Art. no. 1300609.
- [33] X. Liu, M. C. Papaefthymiou, and E. G. Friedman, "Retiming and clock scheduling for digital circuit optimization," *IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.*, vol. 21, no. 2, pp. 184–203, Feb. 2002.
- [34] G. Krylov and E. G. Friedman, "Design for testability of SFQ circuits," *IEEE Trans. Appl. Supercond.*, vol. 27, no. 8, Dec. 2017, Art. no. 1302307.
- [35] R. Cai *et al.*, "A majority logic synthesis framework for adiabatic quantum-flux-parametron superconducting circuits," in *Proc. ACM Great Lakes Symp. VLSI*, May 2019, pp. 189–194.
- [36] K. Jackman and C. J. Fourie, "Flux trapping analysis in superconducting circuits," *IEEE Trans. Appl. Supercond.*, vol. 27, no. 4, Jun. 2017, Art. no. 1300105.
- [37] V. K. Semenov and M. M. Khapaev, "How moats protect superconductor films from flux trapping," *IEEE Trans. Appl. Supercond.*, vol. 26, no. 3, Apr. 2016, Art. no. 1300710.
- [38] O. A. Mukhanov, "Energy-efficient single flux quantum technology," *IEEE Trans. Appl. Supercond.*, vol. 21, no. 3, pp. 760–769, Jun. 2011.