# An 8-bit 3.2GS/s CMOS Time-Interleaved SAR ADC with Non-Buffered Input Demultiplexing

Benjamín T. Reyes<sup>†</sup>, Laura Biolato<sup>†</sup>, Agustin C. Galetto<sup>†</sup>, Leandro Passetti<sup>†</sup>, Fredy Solis<sup>†</sup>, and Mario R. Hueda<sup>‡</sup>

<sup>†</sup> Fundación Fulgor - Romagosa 518 - Córdoba (5000) - Argentina

<sup>‡</sup> Laboratorio de Comunicaciones Digitales - Universidad Nacional de Córdoba - CONICET

Av. Vélez Sarsfield 1611 - Córdoba (X5016GCA) - Argentina

Emails: {benjamin.reyes, mario.hueda}@unc.edu.ar

Abstract—An 8-bit, 3.2 GS/s, time-interleaved (TI) successive approximation register (SAR) analog-to-digital converter (ADC) with a non-buffered hierarchical demultiplexing architecture is proposed and fabricated. Compared to a typical hierarchical TI-ADC, (*i*) all track-and-hold (T&H) related noise sources and (*ii*) wide-band amplifiers for buffering of the input signal are avoided. In this way, the proposed solution can improve the signal-to-noise ratio and reduce power consumption. The concept is demonstrated in an 8-bit 3.2 GS/s TI-ADC design based on 32 asynchronous SAR ADCs and fabricated in a $0.13\,\mu\mathrm{m}$ CMOS process. The prototype includes (*i*) a programmable delay cell array to adjust four front sampling phases, and (*ii*) a 25.6 Gb/s low voltage differential signaling (LVDS) interface. Measurements of the fabricated TI-ADC show 44.6 dB of peak signal-to-noise-and-distortion ratio and 105 mW of power consumption at 1.2 V.

#### I. INTRODUCTION

Time-interleaved analog-to-digital converters (TI-ADCs) are widely used to supply the high sampling rates required in digital communication receivers (e.g. wireless, wireline and optical links). A basic TI-ADC includes $M$ single converters (or channels) operating in parallel at frequency $1/T_s$, but with different sampling phases, in order to achieve an overall sampling rate of $M/T_s$. In recent years, TI-ADCs based on successive approximation register (SAR) ADCs have been used for high-speed, medium-resolution applications because of their power efficiency [1]–[4].

It is well known that the number of SAR converters increases with the sampling rate. This not only increases the TI-ADC complexity (e.g. clock synchronization problems, more phases to calibrate, signal routing complexity), but also reduces the input bandwidth (BW) due to the large equivalent input capacitance [3], [5]. To address these drawbacks of TI-ADCs with a large number of SAR converters, different hierarchical architectures have been proposed [1]-[3]. Fig. 1 depicts a typical hierarchical implementation. The input signal is first sampled on the capacitors $C_S$ of track-and-hold (T&H) circuits and then, during the hold time, the signal is buffered and re-sampled over $C_{SAR}$ in a sub-interleaving stage (i.e. the SAR DAC arrays). The objective of this hierarchical approach is to reduce the number of *front* channels in order to mitigate some of the limitations previously mentioned (e.g. time error, BW reduction). Unfortunately, an increase in capacitor size is required to compensate for the performance degradation caused by the new noise sources ($kT/C_S$ and buffer noise), which impacts the area and BW of the TI-ADC. Noise is a main limiting factor for most giga-sample ADCs, even at low resolution [5]. To reduce its impact, extra area and power are required to minimize the kT/C and active-circuit noise contributions. Furthermore, the use of power-hungry T&H buffers appreciably degrades the original SAR ADC efficiency.

Figure 1: Typical hierarchical TI-ADC architecture.

Figure 2: Proposed non-buffered TI-ADC architecture.

In this work, an 8-bit, $3.2\,\mathrm{GS/s}$ TI-SAR ADC with a non-buffered hierarchical demultiplexing architecture is designed and fabricated in a $0.13\,\mu\mathrm{m}$ CMOS process (see Fig. 2). The proposed TI-ADC avoids all extra T&H noise sources, enabling a reduction of the area required by each SAR ADC unit. In addition, since T&H buffers are not required, the efficiency of the TI-ADC is similar to that achieved by a single SAR ADC unit. We highlight that our proposal maintains the features of hierarchical demultiplexing, with the added advantage that there is no trade-off between the *hold* time of the front phases and the settling time of the SAR re-sampling phases. The fabricated prototype includes (*i*) a programmable delay cell array to adjust four front sampling phases, and (*ii*) a 25.6 Gb/s low voltage differential signaling (LVDS) interface.



Figure 3: Capacitance vs. Resolution bits.

The rest of this paper is organized as follows. Section II compares the trade-offs of the two sampling architectures mentioned above. Section III explains the implemented architecture along with its building blocks. Experimental results are presented in Section IV, and conclusions are drawn in Section V.

#### II. SAMPLING ARCHITECTURE

The basic performance of a SAR ADC can be characterized by its *signal-to-noise-and-distortion ratio* (SNDR). For a sinusoidal input signal $V_{in}\cos(2\pi f_{in}t)$, the SNDR is given by

$$SNDR = 10 \log_{10} \frac{V_{in}^2}{2(\sigma_T^2 + \sigma_n^2)}, \qquad (1)$$

where  $\sigma_n^2$  is the thermal noise power due to sampling circuits (see (4) and (5)), whereas  $\sigma_T^2$  is given by

$$\sigma_T^2 = \frac{V_{FS}^2}{2^{2N}} \left(\frac{1}{12} + \frac{1}{4} \sigma_{DNL}^2 + \sigma_{INL}^2\right) + (\sqrt{2}\pi f_{in} V_{in} \sigma_j)^2 + \sigma_{cmp}^2. \qquad (2)$$

In (2), $V_{FS}$ is the full-scale voltage, $N$ is the resolution in bits, $\sigma_{cmp}^2$ is the input-referred comparator noise, $(\sqrt{2}\pi f_{in}V_{in}\sigma_j)^2$ represents the noise power due to clock jitter, $\frac{V_{FS}^2}{12\times2^{2N}}$ is the quantization noise, and $\frac{V_{FS}^2}{2^{2N}} \times (\frac{1}{4}\sigma_{DNL}^2 + \sigma_{INL}^2)$ accounts for the nonlinearity effects [5].
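To make the roles of the terms in (1) and (2) concrete, the following minimal sketch evaluates both expressions numerically. The function names, the choice of a full-scale sinusoid with amplitude $V_{in} = V_{FS}/2$, and the jitter value are illustrative assumptions, not values taken from the measurements; $\sigma_{DNL}$ and $\sigma_{INL}$ are assumed to be expressed in LSB.

```python
import math

def sigma_t_sq(v_fs, n_bits, f_in, v_in, sigma_j, sigma_cmp,
               sigma_dnl=0.0, sigma_inl=0.0):
    """Noise power of Eq. (2): quantization + nonlinearity + jitter + comparator."""
    lsb_sq = v_fs ** 2 / 2 ** (2 * n_bits)
    static = lsb_sq * (1.0 / 12.0 + sigma_dnl ** 2 / 4.0 + sigma_inl ** 2)
    jitter = (math.sqrt(2.0) * math.pi * f_in * v_in * sigma_j) ** 2
    return static + jitter + sigma_cmp ** 2

def sndr_db(v_in, sig_t_sq, sig_n_sq):
    """SNDR of Eq. (1) for a sinusoid of amplitude v_in."""
    return 10.0 * math.log10(v_in ** 2 / (2.0 * (sig_t_sq + sig_n_sq)))

# Ideal 8-bit converter driven at full scale (assumed amplitude V_FS / 2).
ideal = sndr_db(0.2, sigma_t_sq(0.4, 8, 0.0, 0.2, 0.0, 0.0), 0.0)

# Same converter with a hypothetical 200 fs rms jitter at a 1.6 GHz input.
with_jitter = sndr_db(0.2, sigma_t_sq(0.4, 8, 1.6e9, 0.2, 200e-15, 0.0), 0.0)

print(f"ideal SNDR: {ideal:.2f} dB, with jitter: {with_jitter:.2f} dB")
```

With all non-idealities set to zero, the expression collapses to the familiar ideal 8-bit figure of about 49.93 dB, which is the reference used for the SNR penalty discussed next.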

Since the linearity, jitter and comparator noise terms do not depend on the sampling architecture, they can be neglected for comparison purposes. Thus, the signal-to-noise ratio (SNR) penalty caused by $\sigma_n^2$ in an otherwise ideal ADC can be defined as:

$$\Delta SNR = 10 \log_{10} \frac{\frac{1}{12} \frac{V_{FS}^2}{2^{2N}} + \sigma_n^2}{\frac{1}{12} \frac{V_{FS}^2}{2^{2N}}}. \qquad (3)$$

For a typical hierarchical architecture (Fig. 1), $\sigma_n^2$ is given by

$$\sigma_n^2 = \frac{kT}{C_S} + \frac{kT}{C_{SAR}} + \sigma_{buff}^2, \qquad (4)$$

where $\sigma_{buff}^2$ represents the buffer noise, and $kT/C_S$ and $kT/C_{SAR}$ are the kT/C noise contributions of $C_S$ and $C_{SAR}$, respectively (with $k$ being the Boltzmann constant and $T$ the temperature). On the other hand, for the non-buffered hierarchical architecture considered here (see Fig. 2), the thermal noise due to sampling circuits is reduced to

$$\sigma_n^2 = \frac{kT}{C_{SAR}}. \qquad (5)$$

As an example, let us consider a total penalty of  $\Delta SNR = 1 \,\mathrm{dB}$  over an 8-bit ideal ADC SNR (49.93 dB).

For $V_{FS} = 400\,\mathrm{mV}$, this penalty corresponds, in a typical hierarchical architecture, to $\sigma_n^2 = (230\,\mu\mathrm{V})^2$, which can be assumed to be equally distributed among the three noise terms. That is, $kT/C_S = kT/C_{SAR} = \sigma_{buff}^2 = (230\,\mu\mathrm{V})^2/3$, which results in $\sigma_{buff} \approx 132\,\mu\mathrm{V}$ and $C_S = C_{SAR} = 236\,\mathrm{fF}$ (two 472 fF single-ended capacitances).

In contrast, for the non-buffered hierarchical architecture, the total sampling noise budget of 1 dB can be assigned to $kT/C_{SAR} = (230\,\mu\mathrm{V})^2$. This results in $C_{SAR} = 79\,\mathrm{fF}$ (two single-ended capacitors of 158 fF), that is, three times smaller than each capacitor in the buffered architecture. Fig. 3 shows how the capacitance of each architecture scales with the ADC resolution, for a fixed SNR penalty of 1 dB.
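The sizing example above can be reproduced with a few lines of arithmetic. The sketch below inverts (3) to obtain the allowed $\sigma_n^2$ and then applies (4) and (5). The temperature of 300 K is an assumption (it is not stated in the text, but it reproduces the quoted values), and the function names are illustrative.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K
TEMP_K = 300.0              # assumed temperature, not stated in the text

def noise_budget(v_fs, n_bits, delta_snr_db):
    """Invert Eq. (3): thermal noise power allowed for a given SNR penalty."""
    quantization = v_fs ** 2 / (12.0 * 2 ** (2 * n_bits))
    return quantization * (10.0 ** (delta_snr_db / 10.0) - 1.0)

def cap_for_noise(sigma_sq, temp_k=TEMP_K):
    """Differential sampling capacitance whose kT/C noise equals sigma_sq."""
    return K_BOLTZMANN * temp_k / sigma_sq

budget = noise_budget(0.4, 8, 1.0)          # total sigma_n^2 for a 1 dB penalty
c_buffered = cap_for_noise(budget / 3.0)    # Eq. (4), budget split in three
c_non_buffered = cap_for_noise(budget)      # Eq. (5), whole budget on C_SAR
sigma_buff = math.sqrt(budget / 3.0)

print(f"sigma_n        = {math.sqrt(budget) * 1e6:.0f} uV")
print(f"sigma_buff     = {sigma_buff * 1e6:.0f} uV")
print(f"C_S = C_SAR    = {c_buffered * 1e15:.0f} fF (buffered)")
print(f"C_SAR          = {c_non_buffered * 1e15:.0f} fF (non-buffered)")
```

Because the budget is split into three equal parts in the buffered case, the capacitance ratio between the two architectures is exactly three, independently of the resolution or the chosen penalty.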

Considering the architecture of Fig. 1 with 4 T&H stages and 8 SAR ADCs per T&H, the total capacitor area is that required by a 34 pF capacitor. However, in the proposed architecture (Fig. 2), it is reduced to the equivalent of a 10.1 pF capacitor. Another advantage is the absence of a high speed buffer and the reduced power consumption. In addition, DAC  $(C_{SAR})$  reduction means a size decrease of all logic gates and control switches, improving SAR-ADC efficiency.
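The totals quoted above can be checked by counting sampling capacitors. The sketch below assumes that every sampler is differential (two single-ended capacitors) and that, in the buffered case, the four front T&H capacitors are counted together with the 32 SAR capacitors. This counting rule is an inference that reproduces the quoted figures, and the function name is illustrative.

```python
def total_capacitance(n_front, n_sar_per_front, c_single_ended, buffered):
    """Total sampling capacitance, counting two single-ended caps per sampler."""
    n_samplers = n_front * n_sar_per_front
    if buffered:
        n_samplers += n_front  # the front T&H stages add their own C_S
    return 2.0 * c_single_ended * n_samplers

c_total_buffered = total_capacitance(4, 8, 472e-15, buffered=True)
c_total_proposed = total_capacitance(4, 8, 158e-15, buffered=False)

print(f"buffered: {c_total_buffered * 1e12:.1f} pF")
print(f"proposed: {c_total_proposed * 1e12:.1f} pF")
```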

The bandwidth of the previous circuits can now be analyzed using first-order approximations. In the first case (Fig. 1), the BW is primarily defined by $C_S$ and the switch resistance $R_S$, so that the time constant is $\tau_1 = R_S C_S$. For example, let us consider $R_S = 100\,\Omega$ and non-overlapping sampling phases in the front switches. Under these conditions, and using $C_S = 236$ fF, the cutoff frequency of the sampling circuit is $f_{C_1} = 1/(2\pi\tau_1) \approx 6.7$ GHz. In the second case (Fig. 2), there are two switches in series $(2R_S)$ and $C_{SAR} = 79$ fF. Now, for similar conditions to the previous case, $\tau_2 = 2R_S C_{SAR}$ and $f_{C_2} = 1/(2\pi\tau_2) \approx 10$ GHz, which is ~50% larger<sup>1</sup>.
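The first-order bandwidth comparison can be verified directly from the RC time constants. This is a minimal sketch using the example values from the text; the function name is illustrative and wiring parasitics are ignored.

```python
import math

def cutoff_hz(r_ohm, c_farad):
    """First-order RC cutoff frequency, 1 / (2 * pi * R * C)."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)

R_S = 100.0  # switch resistance from the example, ohms

f_c1 = cutoff_hz(R_S, 236e-15)        # buffered: one switch, C_S
f_c2 = cutoff_hz(2.0 * R_S, 79e-15)   # non-buffered: two switches, C_SAR

print(f"f_C1 = {f_c1 / 1e9:.1f} GHz")
print(f"f_C2 = {f_c2 / 1e9:.1f} GHz")
print(f"ratio = {f_c2 / f_c1:.2f}")
```

The doubled series resistance is more than offset by the threefold reduction in capacitance, so the time constant shrinks by a factor of about 1.5.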

A further key feature of the proposed architecture is the lack of a *hold* time trade-off between the *front* and SAR clock signals. In buffered architectures (Fig. 1), a relatively long hold time on $C_S$ is required so that there is enough settling time for the $C_{SAR}$ re-sampling (i.e. reducing the tracking time and/or increasing parallelism). This sequence also requires a power-hungry wide-bandwidth buffer and a precise synchronization between the front and SAR clock signals. In contrast, for the circuit of Fig. 2, no hold time is required, and the synchronization between the front and SAR clock signals is more robust under process, temperature, and voltage variations because the front tracking pulse falls inside the SAR sampling window, which is twice as wide.

Another drawback of the circuit of Fig. 1 is the complexity of implementing a *reset* before sampling when it operates at high sampling rates. The lack of a reset on $C_S$, combined with incomplete settling, degrades the input signal due to inter-symbol interference (ISI). In the case of Fig. 2, the reset is executed directly on $C_{SAR}$ after quantization and prior to sampling; thus, no ISI will occur even if the sampling settling is incomplete.

<sup>1</sup>These assumptions are only valid for a typical implementation where wiring parasitics are not dominant over $C_{SAR}$.



Figure 4: Implemented TI-ADC chip architecture.



Figure 5: Asynchronous SAR ADC topology.

#### III. CIRCUIT IMPLEMENTATION

#### A. Time-Interleaved ADC

The implemented chip architecture is presented in Fig. 4. The TI-ADC consists of four main interleaved channels, each one composed of one front sampler followed by 8 sub-interleaved SAR ADCs, with no buffer between the stages. Four 50% duty cycle, 800 MHz clock signals with a $90^{\circ}$ phase shift between them are provided by a multiple-phase clock generator to drive the front sampler switches. The SAR sampling switches are driven by 100 MHz, 12.5% duty cycle clock signals derived from a frequency divider (by 8) and combinational logic. The front and SAR clock signals are timed to track the input signal sequentially on one SAR DAC capacitor (see Fig. 6). After the SAR switch turns off, the sampled value is quantized while this sequence is repeated on every SAR ADC. Finally, the SAR ADC digital outputs (32 lines of 8 bits at 100 MHz) are multiplexed (16 lines at 1.6 GHz) and sent off-chip through an interface of 16 parallel LVDS channels. Since no decimation is applied, a total of 25.6 Gb/s is transmitted off-chip in real time.
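The clocking figures in this description follow from simple bookkeeping, sketched below. The round-robin mapping in `route` (consecutive samples rotate over the four front samplers, and each front sampler rotates over its eight SAR ADCs) is an assumption used for illustration; the exact on-chip ordering is defined by the clock generator and is not detailed here.

```python
N_FRONT = 4            # front samplers
N_SAR_PER_FRONT = 8    # sub-interleaved SAR ADCs per front sampler
N_BITS = 8             # resolution
F_S = 3.2e9            # aggregate sampling rate, samples/s

def route(n):
    """Map sample index n to (front sampler, SAR index), assumed round-robin."""
    return n % N_FRONT, (n // N_FRONT) % N_SAR_PER_FRONT

front_clock_hz = F_S / N_FRONT                      # 800 MHz per front phase
sar_clock_hz = front_clock_hz / N_SAR_PER_FRONT     # 100 MHz per SAR ADC
output_bps = N_FRONT * N_SAR_PER_FRONT * N_BITS * sar_clock_hz

front_track_s = 0.5 / front_clock_hz    # 50% duty cycle tracking pulse
sar_window_s = 0.125 / sar_clock_hz     # 12.5% duty cycle sampling window

print(f"front clock: {front_clock_hz / 1e6:.0f} MHz")
print(f"SAR clock:   {sar_clock_hz / 1e6:.0f} MHz")
print(f"output rate: {output_bps / 1e9:.1f} Gb/s")
print(f"SAR window / front pulse = {sar_window_s / front_track_s:.1f}")
```

Under these duty cycles the SAR sampling window (1.25 ns) is twice the front tracking pulse (0.625 ns), which is the timing margin referred to in Section II.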

It is well known that the sampling time error due to process mismatch in clock phase generators and T&H circuits degrades the high-frequency performance of TI-ADCs [6]. Thus, a calibration circuit based on programmable delay cells was implemented on each front clock [1]. Each cell consists of 3 delay stages, each using a capacitive DAC with 7-bit time-step resolution, combined with programmable CMOS buffers. The implemented circuit is capable of operating over a wide clock frequency range (200 MHz to over 1 GHz), providing a flexible time-step configuration (100 fs to 750 fs) and a wide calibration range (40 ps to 300 ps).

Figure 6: Asynchronous SAR ADC timing diagram.

#### B. Asynchronous SAR ADC

Fig. 5 depicts the asynchronous SAR ADC topology. It comprises a charge-redistribution DAC, a self-clocked comparator, an asynchronous control logic cell and a sampling signal generator. The DAC is based on a binary-weighted split capacitor array [4]. Note that the metal-fringe unit capacitor of the DAC was set to 3.3 fF because of the relatively large mismatch of the technology. As a result, each *half* DAC is ~425 fF single-ended, which is larger than the value required by the $kT/C_{SAR}$ limit, but still valid to demonstrate the concept.

Fig. 6 shows the asynchronous SAR timing diagram. After the SAR reset, the sample signal enables tracking of the input signal through one of the front switches. Then, the asynchronous control logic enables the MSB comparison and waits for the decision. After the decision, the result is stored and the DAC is set for the next comparison using a high-speed logic path. In parallel, the comparator is reset by the detection logic. After the LSB is quantized, the comparator inputs are shorted and an additional comparison is performed to calibrate the input offset. As shown in Fig. 6, the calibration circuit stores the result of this extra decision as a small fixed charge (positive or negative) that is then averaged in an integration capacitor $C_{cal}$ [2], [7]. After the offset comparison, the comparator self-clock is disabled and the cycle is restarted by new *reset/sample* signals. If any comparison cycle takes excessive time (e.g. due to a long metastable state), one calibration cycle can be lost without affecting the conversion results. The common-mode voltage is generated on-chip from external, non-buffered voltage references (see Fig. 4). These references have been carefully distributed, and decoupling capacitors have been added to ensure low impedance at high frequency.
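The conversion sequence described above can be illustrated with a purely behavioral model. The sketch below is not the implemented circuit: it assumes an ideal binary-weighted DAC, a noiseless comparator, and a differential input range of $\pm V_{FS}/2$. The offset loop mimics the extra comparison with shorted inputs by adding a small fixed correction of the decided sign on every cycle; the step size and the number of cycles are hypothetical.

```python
def sar_convert(v_in, v_fs=0.4, n_bits=8, offset=0.0):
    """Ideal SAR binary search, MSB first, for v_in in [-v_fs/2, v_fs/2)."""
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        # Threshold produced by the DAC for the trial code.
        v_dac = (trial / 2 ** n_bits - 0.5) * v_fs
        # Comparator decision, including any residual input offset.
        if v_in + offset >= v_dac:
            code = trial
    return code

def calibrate_offset(true_offset, step=20e-6, cycles=2000):
    """Bang-bang offset loop: one shorted-input decision per conversion cycle."""
    correction = 0.0
    for _ in range(cycles):
        # With the inputs shorted, the comparator only sees the residual offset.
        decision = 1 if (true_offset - correction) >= 0.0 else -1
        # A small fixed charge of the decided sign is integrated on C_cal.
        correction += decision * step
    return correction

print(sar_convert(0.051))                 # mid-scale positive input
print(sar_convert(-0.2), sar_convert(0.199))
print(f"{calibrate_offset(3e-3) * 1e3:.2f} mV")
```

In this model the correction settles to within one step of the true offset and then dithers around it, which is the expected behavior of an integrating loop driven by single-bit decisions.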

#### IV. EXPERIMENTAL RESULTS

Figure 7: Chip die micrograph. Die size: $2\,\mathrm{mm} \times 2\,\mathrm{mm}$.

Table I: Performance summary and comparison.

|                              | This Work | [7]    | [8]    | [9]    |
|------------------------------|-----------|--------|--------|--------|
| Architecture                 | TI-SAR    | TI-SAR | TI-SAR | TI-SAR |
| Process [nm]                 | 130       | 130    | 130    | 40     |
| Resolution [bits]            | 8         | 6      | 10     | 8      |
| $F_S$ [GS/s]                 | 3.2       | 2.0    | 1.35   | 2.64   |
| $V_{DD}$ [V]                 | 1.2       | 1.2    | 1.2    | 1.2    |
| ENOB<sub>Nyq</sub> [bits]    | 5.68      | 4.92   | 7.7    | 6.0    |
| Power [mW]                   | 105       | 192    | 168    | 39     |
| FoM [fJ/conv-step]           | 640       | 3163   | 600    | 230    |
| Area [mm<sup>2</sup>]        | 1.1       | 3.24   | 1.6    | 0.18   |

The prototype chip is fabricated using a $0.13\,\mu\mathrm{m}$ CMOS process (see Fig. 7). The measurement setup is similar to [7]. In Fig. 8, the ADC performance is plotted as a function of the input frequency $f_{in}$. The measurement conditions are a sampling rate of 3.2 GS/s, $V_{FS} = 0.4\,\mathrm{V}$, and a TI-ADC supply voltage of 1.2 V. Additionally, the DC offset of each SAR ADC is self-calibrated on-chip. The front phase errors are detected with a foreground sine-fit LMS algorithm and then adjusted on-chip by the programmable delay cells. Nevertheless, the prototype is designed to use background phase calibration algorithms such as [1] for communication applications.
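As an illustration of how front phase errors can be detected from a sine-wave test, the sketch below estimates the timing skew of each of the four front channels relative to channel 0. It is not the LMS algorithm used in the measurements: it swaps in a simple quadrature correlation at a known input frequency, which under coherent sampling is equivalent to a least-squares sine fit. The record length, the skew values and the function name are hypothetical, and the record is chosen so that the double-frequency term averages out in every channel.

```python
import cmath
import math

F_S = 3.2e9          # aggregate sampling rate
T_S = 1.0 / F_S
N_FRONT = 4          # front sampling phases

def front_phase_skews(samples, f_in):
    """Per-front-channel timing skew relative to channel 0, in seconds.

    samples[n] is nominally taken at n * T_S by front channel n % N_FRONT.
    """
    phases = []
    for k in range(N_FRONT):
        acc = 0j
        for n in range(k, len(samples), N_FRONT):
            # Correlate against a complex reference at the nominal instant.
            acc += samples[n] * cmath.exp(-2j * math.pi * f_in * n * T_S)
        phases.append(cmath.phase(acc))
    return [(p - phases[0]) / (2.0 * math.pi * f_in) for p in phases]

# Synthetic test record with hypothetical skews on each front phase.
f_in = 399e6
true_skews = [0.0, 1.0e-12, -2.0e-12, 0.5e-12]
samples = [
    math.cos(2.0 * math.pi * f_in * (n * T_S + true_skews[n % N_FRONT]))
    for n in range(1600)
]
estimated = front_phase_skews(samples, f_in)

for k, skew in enumerate(estimated):
    print(f"front phase {k}: {skew * 1e12:+.2f} ps")
```

In a calibration loop, each estimated skew would be converted into a delay-cell code of the opposite sign; that mapping depends on the programmed time step and is not modeled here.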

A single SAR ADC achieves a peak SNDR of 45.3 dB (7.23 effective number of bits (ENOB)) at $f_{in} = 399$ MHz. At the same $f_{in}$, the SNDR of the phase-calibrated TI-ADC is 44.6 dB (7.12 ENOB). The calibrated TI-ADC shows up to 3.79 dB of SNDR degradation compared to a single SAR ADC. The measured input BW is over 1 GHz, which is mainly limited by the QFN64 package.
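The ENOB values quoted above follow from the standard conversion between SNDR and effective bits, $ENOB = (SNDR - 1.76)/6.02$. A minimal sketch (the function name is illustrative):

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02

print(f"single SAR ADC: {enob(45.3):.2f} bits")
print(f"TI-ADC:         {enob(44.6):.2f} bits")
```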

Each SAR ADC consumes 3.28 mW and achieves a figure-of-merit $FoM = Power/(2^{ENOB_{Nyq}} \times F_s) = 218\,\mathrm{fJ/conv\text{-}step}$. Table I summarizes the performance of our design and compares it with other TI-SAR ADCs with similar $F_S$. To the best of our knowledge, this prototype presents the highest sample rate for the same topology in this technology. The total chip consumption from a 1.2 V source (including TI-ADC and clock buffering/generation) is 260 mW. The LVDS Tx draws 15.2 mW per channel from a 2.5 V supply at 1.6 Gb/s. The complete Tx consumes 258 mW (16 ch. + 1 ref. clock).
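The figures of merit and power totals can be cross-checked with the FoM definition given above. In the sketch below, the single-SAR figure is reproduced by assuming the peak ENOB of 7.23 bits and the 100 MS/s per-unit rate; this choice of operands is an inference from the quoted 218 fJ/conv-step, and the function name is illustrative.

```python
def fom_fj(power_w, enob_bits, f_s_hz):
    """Figure of merit, Power / (2^ENOB * F_s), in fJ per conversion step."""
    return power_w / (2.0 ** enob_bits * f_s_hz) * 1e15

fom_ti_adc = fom_fj(105e-3, 5.68, 3.2e9)       # Table I, this work
fom_single_sar = fom_fj(3.28e-3, 7.23, 100e6)  # assumed operands, see lead-in

ti_adc_power_mw = 32 * 3.28      # 32 SAR ADC units
lvds_power_mw = 17 * 15.2        # 16 data channels + 1 reference clock

print(f"TI-ADC FoM:     {fom_ti_adc:.0f} fJ/conv-step")
print(f"single-SAR FoM: {fom_single_sar:.0f} fJ/conv-step")
print(f"TI-ADC power:   {ti_adc_power_mw:.0f} mW")
print(f"LVDS Tx power:  {lvds_power_mw:.0f} mW")
```

The same function applied to the other columns of Table I, using their listed power, Nyquist ENOB and sampling rate, gives values close to the tabulated FoM entries.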

#### V. CONCLUSIONS

A non-buffered hierarchical TI-ADC architecture has been proposed. As a proof of concept, an 8-bit 3.2 GS/s TI-SAR ADC has been designed, fabricated, and characterized. The design interleaves several SAR ADC units very efficiently using a simplified high-speed front sampling stage. The implemented phase calibration capabilities and the automatic offset adjustment allow a significant performance improvement. The proposed architecture is especially suitable for newer CMOS technologies owing to their good capacitor matching and the improvements in SAR ADC implementation efficiency.

Figure 8: Measured SNDR and ENOB vs. input frequency.

#### ACKNOWLEDGMENT

The authors would like to thank MOSIS for fabricating their design through the MEP research program.

#### REFERENCES

- [1] B. T. Reyes, R. M. Sanchez, A. L. Pola, and M. R. Hueda, "Design and Experimental Evaluation of a Time-Interleaved ADC Calibration Algorithm for Application in High-Speed Communication Systems," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 64, no. 5, pp. 1019–1030, May 2017.
- [2] L. Kull, T. Toifl, M. Schmatz, P. A. Francese, C. Menolfi, M. Braendli, M. Kossel, T. Morf, T. M. Andersen, and Y. Leblebici, "A 90GS/s 8b 667mW 64x interleaved SAR ADC in 32nm digital SOI CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2014 IEEE International*, Feb. 2014, pp. 378–379.
- [3] Y. M. Greshishchev, J. Aguirre, M. Besson, R. Gibbins, C. Falt, P. Flemke, N. Ben-Hamida, D. Pollex, P. Schvan, and S.-C. Wang, "A 40GS/s 6b ADC in 65nm CMOS," in *2010 IEEE International Solid-State Circuits Conference - (ISSCC)*, San Francisco, CA, USA, Feb. 2010, pp. 390–391.
- [4] V. Tripathi and B. Murmann, "An 8-bit 450-MS/s single-bit/cycle SAR ADC in 65-nm CMOS," in ESSCIRC (ESSCIRC), 2013 Proceedings of the, 2013, pp. 117–120.
- [5] B. Razavi, "Design Considerations for Interleaved ADCs," *IEEE Journal of Solid-State Circuits*, vol. 48, no. 8, pp. 1806–1817, 2013.
- [6] C. Vogel, "The impact of combined channel mismatch effects in time-interleaved ADCs," *IEEE Transactions on Instrumentation and Measurement*, vol. 54, no. 1, pp. 415–427, 2005.
- [7] B. T. Reyes, G. Paulina, R. Sanchez, P. S. Mandolesi, and M. R. Hueda, "A 2GS/s 6-bit CMOS time-interleaved ADC for analysis of mixed-signal calibration techniques," *Analog Integrated Circuits and Signal Processing*, vol. 85, no. 1, pp. 3–16, Jul. 2015.
- [8] S. M. Louwsma, A. J. M. van Tuijl, M. Vertregt, and B. Nauta, "A 1.35 GS/s, 10 b, 175 mW Time-Interleaved AD Converter in 0.13 um CMOS," *IEEE Journal of Solid-State Circuits*, vol. 43, no. 4, pp. 778–786, Apr. 2008.
- [9] S. Kundu, J. H. Lu, E. Alpman, H. Lakdawala, J. Paramesh, B. Jung, S. Zur, and E. Gordon, "A 1.2 V 2.64 GS/s 8bit 39 mW skew-tolerant time-interleaved SAR ADC in 40 nm digital LP CMOS for 60 GHz WLAN," in *Proceedings of the IEEE 2014 Custom Integrated Circuits Conference*, Sep. 2014, pp. 1–4.