
Channel Adaptive ADC and TDC for 28 Gb/s PAM-4 Digital Receiver
Aurangozeb, AKM Delwar Hossain, Masum Hossain
ECE Dept., University of Alberta, Edmonton, Canada
masum@ualberta.ca

Abstract— A low-power, channel-adaptive 28 Gb/s PAM-4 receiver is presented that uses a predictive ADC, a SAR TDC, and an FFE in the digital domain. The variable-resolution flash ADC provides 5.5-bit resolution using only 16 comparators by taking advantage of the channel ISI. The ADC resolution is programmable from 2 bits to 5.5 bits, consuming 40 mW to 90 mW, respectively. The SAR TDC generates a 5-bit output with 2-bit ISI information and 3-bit timing error information to assist low-latency, low-jitter timing recovery. Following that, a programmable 3-to-8 tap FFE is used to equalize up to 30 dB of loss with BER lower than 10^-8. Measured power consumption is 130 mW (excluding DSP), and the active chip area is 0.2025 mm^2. As a result, the receiver enables energy efficiency proportional to loss compensation.

Keywords—Channel adaptive ADC; SAR TDC; PAM-4 digital receiver; digital equalization.

Fig. 1. ASICs with transceiver supporting LR, MR and SR channels and their frequency responses. [Figure labels: SR channel (< 15 dB loss), MR channel (15 to 25 dB loss), LR channel (> 25 dB loss); plotted channel response (dB) versus frequency (Hz) for SR ch. (10 dB), MR ch. (17 dB) and LR ch. (30 dB).]

[Fig. 2 panel annotations: LR channel (30 dB loss): ADC res. = 5.5, FFE taps = 7; MR channel (20 dB loss): ADC res. = 4.5, FFE taps = 4. Vertical axis: equalized eye (mV).]

Fig. 2. Equalized ADC input-referred eye opening as a function of ADC resolution and number of FFE taps in DSP. The circled point shows the resolution and taps needed to achieve a 40+ mV opening for BER < 10^-6.

I. INTRODUCTION
High-speed signaling and SerDes architectures are evolving rapidly to accommodate higher data rates and higher insertion loss. Digital equalization is a natural progression of that trend, as traditional analog equalization techniques are becoming more challenging to realize in deep-submicron technologies, where the scaled supply makes it difficult to meet linearity requirements. In addition, variation and mismatch in these technologies make the equalizer design more challenging. ADC-based receivers, in which equalization is done digitally, can therefore take advantage of technology scaling and enable advanced equalization that compensates higher loss than traditional mixed-signal equalization. However, the challenge in this architecture is the front-end ADC, which consumes significant power to provide the required resolution. Compared to a traditional mixed-signal receiver, the power consumption of ADC-based solutions is 2x higher [3-4]. This limits their adoption in large ASICs, where hundreds of transceivers are used within a strict thermal budget for the entire device, including the digital core. Note that an ASIC needs to support many links whose channel loss varies over a wide range depending on the channel length (Fig. 1).

In this work, we introduce three techniques to improve the energy efficiency of an ADC-based receiver to less than 4 pJ/bit. First, rather than designing a general-purpose ADC, we use ADCs that take advantage of the ISI in the channel. In simple words, ISI creates correlation between consecutive samples, and by taking advantage of this correlation the ADC power can be significantly reduced. Second, to address the critical timing margin of PAM-4 signaling, we use a multi-bit TDC instead of a single-bit PD. As it turns out, the ADC and TDC work collaboratively to generate multi-bit data and edge information while keeping the power low. Finally, the ADC and the following equalization are built with scalable resolution to save power. FFE-only equalization is adopted, which does not have a critical timing constraint like a DFE. As shown in Fig. 2, the required ADC resolution and number of FFE taps are proportional to channel loss.

The main contribution of this work is a digital receiver solution that adapts to the channel loss and sets its resolution and number of taps accordingly. The main component of such a solution is a variable-resolution ADC that maintains good energy efficiency over different resolutions. In addition, the number of taps and their resolution are adjustable to reduce power consumption for lower-loss cases.

II. ISI AWARE ADC DESIGN
The ADC resolution requirement is set by the channel loss. For SR channels, after linear equalization we can directly make symbol decisions with a 2-bit ADC. However, for LR channels, where most of the equalization is done in DSP, we need a 5.5-bit ADC to keep the quantization noise low. Therefore, we need to cover 2-bit to 5.5-bit resolution, and unfortunately, existing SAR or flash ADCs fail to remain efficient over this

978-1-5090-5191-5/17/$31.00 ©2017 IEEE
Fig. 3. (a) ADC block diagram, (b) prediction-based reference switching, (c) offset correction for fine comparators, (d) measured SNDR/SFDR vs. input frequency for Fs = 14 GS/s with and without offset correction, (e) measured FFT for a Nyquist input with and without offset correction, (f) measured INL/DNL with and without offset correction.
[Annotations recoverable from the panels: (e) Fs = 12.5 GHz; ENOB = 3.67 (before) and 4.52 (after); SNDR = 23.9 dB (before) and 29 dB (after); SFDR = 30.35 dB (before) and 33.97 dB (after). (f) DNL = +0.50 / -1.00 LSB (before) and +0.38 / -0.31 LSB (after); INL = +1.00 / -0.00 LSB (before) and +0.19 / -0.41 LSB (after).]
range. Flash ADC power increases exponentially with resolution; therefore, beyond 4-bit resolution SAR ADCs are preferred [3]. However, in this design we choose a 5.5-bit flash ADC for several reasons. First, flash ADCs can be configured to directly decode 2-bit information using 4 coarse comparators, which is more efficient for SR channels. Second, the faster conversion rate of the flash ADC allows a lower-latency, wider-bandwidth timing recovery loop. Finally, channel ISI allows the comparators to be recycled, and as a result the power consumption is lower even compared to a SAR ADC. The receiver architecture shown in Fig. 3a includes 3 fixed edge comparators, 4 floating coarse comparators and two sets (even and odd) of 5 floating fine comparators to support both NRZ and PAM-4. However, this work focuses only on PAM-4.

Fig. 4. ADC resolution vs. power at 14 GS/s in 65 nm CMOS. [Plot of power (mW) versus ADC resolution (2 to 5.5 bits) for SAR, flash, and this work.]
Fig. 5. Input eye and digitally reconstructed eye generated from ADC output. [Panels (a) and (b): scale bars 90 mV and 10 ps; ADC code (0 to 31) versus time (UI).]

A. Variable Resolution ADC
The ADC resolution is programmable from 2 bits to 5.5 bits. The low-resolution modes (2-bit or 3-bit) are mainly intended to support a loop-unrolled DFE in NRZ mode or PAM-4 SR channels. In 3-bit or 4-bit modes, the ADC operates similarly to the unified two-step sub-ranging flash ADC described in [1], with one exception: rather than subtracting the MSBs (coarse output) from the input, an identical signal path is introduced for the fine comparators, and the appropriate reference levels are selected for them. Note that this technique does not require any additional gain calibration or mismatch correction. In 5-bit mode, the ADC takes advantage of the ISI: 4 coarse comparators can generate 3-bit information by covering only half of the signal range at a time, based on the edge comparators' output. These 3 bits along with
2 bits generated from the fine comparators provide 5 bits of resolution.

The outputs of the edge comparators are used to predict the placement of the coarse comparators, given that the analog input does not change abruptly within 0.5 UI because of ISI. The edge comparators' decision and the reference update for the data comparators settle within 2 UI. The coarse comparators' decision and the fine reference settling take another 3 UI. Therefore, the fine comparators are triggered after 5 UI. Note that the half-rate S/H holds the data for 7 UI, providing sufficient time for the two conversion cycles to complete (Fig. 3b). When all 5 fine comparators are used, the resolution of the fine conversion cycle improves to 2.5 bits, providing an overall resolution of 5.5 bits.

Comparator outputs are processed in segmented thermometer-to-binary (T-to-B) converters; therefore, complexity increases linearly with resolution, as opposed to the exponential increase in a conventional implementation. This also allows us to turn off the unused T-to-B sections when operating at lower resolution.

B. Offset Correction
Because of their relaxed resolution, the coarse comparators can be corrected once during initial power-up. However, the fine comparators need periodic offset correction to track voltage and temperature (V, T) changes in order to achieve better than 5 mV resolution. Since the even and odd comparators are enabled in a time-interleaved manner, their offset can be corrected during downtime. Each comparator has four NMOS inputs and a StrongARM latch. Capacitive load arrays are attached to the inputs of the StrongARM latch (Fig. 3c). During offset correction, all inputs are tied together to Vcm, and the offset correction block unbalances the load array so that the load on the faster input path becomes higher than that on the slower input path [2]. To reduce hardware, a common offset correction control block is used, and offset correction is performed sequentially. The bit decisions for the unbalanced load array are stored in conventional 6T SRAM instead of flip-flops.

Fig. 3d shows the measured SFDR and SNDR vs. input frequency with and without calibration at a 14 GHz sampling frequency. Offset correction improves SNDR by 4.7 dB on average. The measured FFT at a sampling rate of 12.5 GHz for a Nyquist input, with and without offset correction, is shown in Fig. 3e. Offset correction improves SNDR and SFDR from 23.9 dB and 30.35 dB to 29 dB and 33.97 dB, respectively. Fig. 3f shows the measured DNL and INL before and after comparator offset calibration. For the given resolution range (2-bit to 5.5-bit), the proposed ADC achieves the highest energy efficiency (Fig. 4) and yet achieves sufficient SNDR to regenerate the input eye in the digital domain, as shown in Fig. 5.
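The prediction-based two-step conversion described in Section II-A can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the circuit's actual reference-switching logic: the input range is normalized to 0..1, the coarse grid has 8 uniform segments, the edge sample simply selects the lower or upper half of the range, and the names coarse_convert, fine_convert, convert and resolution_bits are hypothetical.

```python
import math

# Assumption: 3-bit coarse grid over a normalized 0..1 input range.
COARSE_SEGMENTS = 8


def coarse_convert(x, edge_sample):
    """Return the coarse segment index (0..7) using only 4 comparators.

    Because ISI limits how far the input can move in half a UI, the edge
    sample is used to decide which half of the range the 4 comparators cover.
    """
    # Lower half: thresholds at 1/8..4/8. Upper half: thresholds at 4/8..7/8.
    first = 1 if edge_sample < 0.5 else 4
    thresholds = [k / COARSE_SEGMENTS for k in range(first, first + 4)]
    fired = sum(x >= t for t in thresholds)  # thermometer code of 4 comparators
    return first - 1 + fired


def fine_convert(x, segment, fine_levels):
    """Resolve fine_levels sub-levels inside one coarse segment.

    Uses fine_levels - 1 comparators whose references are placed inside the
    segment selected by the coarse step (no residue subtraction).
    """
    lo = segment / COARSE_SEGMENTS
    step = 1 / (COARSE_SEGMENTS * fine_levels)
    thresholds = [lo + k * step for k in range(1, fine_levels)]
    return sum(x >= t for t in thresholds)


def convert(x, edge_sample, fine_levels=4):
    """Two-step conversion: predicted coarse placement, then fine step."""
    segment = coarse_convert(x, edge_sample)
    return segment * fine_levels + fine_convert(x, segment, fine_levels)


def resolution_bits(fine_levels):
    """Equivalent resolution of this model in bits."""
    return math.log2(COARSE_SEGMENTS * fine_levels)
```

In this model, convert(0.30, 0.35) returns code 9, the same as an ideal 5-bit quantizer, as long as the edge sample points to the correct half of the range. Setting fine_levels=6 (five fine comparators) gives log2(48), about 5.58 bits, which lines up with the quoted 5.5-bit mode only under the assumption that the fine step resolves six levels.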
III. SAR TDC BASED TIMING RECOVERY
The time-to-digital converter (TDC) generates multi-bit timing error information, adjustable from 3 bits to 5 bits. Similar to the ADC, the TDC resolution is set based on channel loss. Unlike the conventional multi-bit TDC used in digital PLLs, the TDC in this implementation is simpler and is based on a bang-bang phase detector (BBPD), as shown in Fig. 6.

Fig. 6. TDC based timing recovery loop. [Block diagram labels include: SAR logic, PDAC, IDAC, SAR update path, fixed skew, pulse generator, 4 edge sampling phases, 4 data sampling phases.]

For bi-modal (NRZ/PAM-4) operation, the BBPD considers only the transitions close to zero crossings (Fig. 7), i.e., it ignores the top and bottom eyes in PAM-4. One or three edge comparators can be used, depending on the channel loss. For SR channels only a single edge comparator is used, but 3-bit edge information can still be extracted using the SAR algorithm. After each UP/DN decision, the SAR logic directly updates the edge clock through a phase (Φ) mixer in which the edge phases are blended (Fig. 8). However, the DAC input to the VCO remains unchanged; therefore, the data sampling phase remains unchanged. After three consecutive decisions the SAR conversion cycle completes, generating a 3-bit phase code. These 3 bits, along with the 2-bit ISI code, are applied directly to a 5-bit DAC that controls the digitally controlled oscillator (DCO) and thereby updates the data sampling phase.

Fig. 7. Timing error and ISI information from three edge comparators. (a) All cases of region of transition. (b) Truth table of the BBPD for top-to-bottom transitions. [Table columns: transition region, CE2, CE1, CE0, PD output, ISI info; transitions shown: 3→0, 2→0, 3→1, 2→1. The table entries are not recoverable from the extracted text.]
Fig. 8. Cycle by cycle PD output and SAR TDC operation. [Example shown: 0→3 transition, PD → Late; 3→1 transition, PD → Early; 1→2 transition, PD → Early; after 3 transitions, 3-bit SAR output → 101; ISI information → 1+0+0 → 01.]

Compared to a conventional implementation, this provides a 4x improvement in dithering jitter while achieving the same tracking bandwidth. For LR channels, 3 edge comparators are used to serve two purposes: (a) to provide the timing error information with higher resolution and (b) to predict the approximate location of the signal so that the data comparators can be placed in the vicinity of the next sample. The three edge comparators generate 2-bit timing error information from ISI-free transitions (Fig. 8). For the edges affected by ISI, 1-bit timing error information is extracted using the top and bottom edge comparators. Putting all this information together, the 5-bit TDC output has 2-bit ISI
information and 3-bit timing error information that directly updates the PDAC of the VCO.

The advantage of the TDC is clearly visible in the recovered clock's jitter profile and phase noise in Fig. 9 and Fig. 10. The ADC bypass path allows much lower loop latency and, therefore, a peaking-free jitter transfer that results in improved jitter tolerance (Fig. 12).

Fig. 9. Jitter histogram of recovered clock.
Fig. 10. Phase noise plot of recovered clock.
Fig. 12. Jitter tolerance @ 28 Gb/s with BER < 10^-9 and receiver BER for 25 dB and 30 dB loss channels. [Axes: jitter tolerance (UIpp) vs. frequency (MHz), with the equipment limit marked; BER vs. phase code (1 UI = 64), with the BER target marked.]

IV. HARDWARE EFFICIENT DIGITAL EQUALIZATION
The implemented prototype of the ADC and TDC is shown in Fig. 11. The FFE is implemented in an FPGA; however, its power is estimated from transistor-level simulation in a 65 nm process. The digital equalizer is built as a programmable 3-to-8 tap FFE, with 1 pre-cursor and 7 post-cursor taps. Unlike a DFE, an FFE does not have a critical latency concern. Therefore, the FFE implementation is simpler, can take full advantage of supply scaling, and its area scales gracefully with technology. More importantly, unused post-cursor taps can be gated to reduce FFE power from 30 mW to 16 mW. For the 30 dB loss case, we needed all 8 taps with the ADC set to its highest resolution of 5.5 bits to achieve BER lower than 10^-8 (Fig. 12). However, for lower loss, at 20 dB, we needed only 4.5-bit resolution with 4 FFE taps. For 15 dB or lower, the front-end linear equalizer with Tx-FIR alone is sufficient. Therefore, this solution allows us to scale the power consumption linearly with loss (Fig. 13). Considering a large ASIC with 10% LR, 50% MR and 40% SR links, this solution can achieve better than 4 pJ/bit average energy efficiency.

Fig. 11. Implemented prototype in TSMC 65nm. [Die photo labels: digital AFE, digital interface, coarse comps, edge comps, fine comps, off. cal., REF GEN, clocking, CH0, CH90, CH180, CH270; dimensions 450 µm × 450 µm.]
Fig. 13. Power consumption of the receiver for different channel losses. [Stacked bars of ADC, TDC and FFE power (mW) at 28 Gb/s versus channel loss (dB); bar labels from lowest to highest: 2.1, 2.1, 3.25, 4.6 and 5.7 pJ/bit.]

TABLE I. COMPARISON WITH STATE-OF-ART
Parameter | Shafik ISSCC 2015 [4] | Frans VLSI 2016 [5] | Cui ISSCC 2016 [3] | Rylov ISSCC 2016 [6] | This Work
Technology | 65 nm CMOS | 16 nm FinFET | 28 nm CMOS | 32 nm CMOS | 65 nm CMOS
Data rate (Gb/s) | 10, NRZ | 56, PAM-4 | 32, PAM-4 | 25, NRZ | 28, PAM-4
ADC architecture | 32x TI SAR ADC | 32x TI SAR ADC | 32x TI SAR ADC | 4x Flash ADC | 4x Flash ADC
ENOB @ Nyquist | 4.74 | 4.9 | 5.85 | 4 | 4.1
Timing recovery | N/A | Baud-rate | Baud-rate | Baud-rate | Edge & data sampled (TDC)
Tracking BW | --- | --- | --- | --- | 10+ MHz
Jitter tolerance | --- | --- | --- | --- | 0.2 UIpp @ 50 MHz
Channel loss equalization | 36.4 dB @ 5 GHz | 25 dB @ 14 GHz | 32 dB @ 8 GHz | 40 dB @ 12 GHz | 30 dB @ 7 GHz
Power (mW) | 79 (w/o DSP), 87 (with DSP) | 410 (w/o DSP) | 320 | 453 | 130 @ 30 dB, 45 @ 15 dB (w/o DSP); 160 @ 30 dB, 60 @ 15 dB (with DSP)
FOM (pJ/bit) | 8.7 | 7.32 | 10 | 18.12 | 5.71 @ 30 dB, 2.14 @ 15 dB (with DSP)

V. CONCLUSION
The proposed ADC-TDC based solution improves FoM by 2x compared to the state-of-the-art ADC-based solutions in Table I. Most of this improvement comes from the variable-resolution, ISI-aware ADC, as opposed to a general-purpose ADC. Similarly, the TDC allows ISI filtering and a higher-resolution PD to achieve 10+ MHz tracking bandwidth.

REFERENCES
[1] R. C. Taft and M. R. Tursi, "A 100-MS/s 8-b CMOS subranging ADC with sustained parametric performance from 3.8 V down to 2.2 V," JSSC, 2001.
[2] P. Nuzzo et al., "A 6-Bit 50-MS/s Threshold Configuring SAR ADC in 90-nm Digital CMOS," TCAS-I, Jan. 2012.
[3] D. Cui et al., "3.2 A 320mW 32Gb/s 8b ADC-based PAM-4 analog front-end with programmable gain control and analog peaking in 28nm CMOS," ISSCC, 2016.
[4] A. Shafik et al., "3.6 A 10Gb/s hybrid ADC-based receiver with embedded 3-tap analog FFE and dynamically-enabled digital equalization in 65nm CMOS," ISSCC, 2015.
[5] Y. Frans et al., "A 56Gb/s PAM4 wireline transceiver using a 32-way time-interleaved SAR ADC in 16nm FinFET," VLSI, 2016.
[6] S. Rylov et al., "3.1 A 25Gb/s ADC-based serial line receiver in 32nm CMOS SOI," ISSCC, 2016.
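The SAR-based phase search described in Section III can be sketched as a binary search driven by 1-bit phase detector decisions. This is a minimal illustration under stated assumptions, not the implemented logic: the timing offset is expressed in LSBs of the phase mixer, the PD polarity (which decision keeps a bit) is assumed, the ISI code is assumed to occupy the upper 2 bits of the 5-bit DAC word, and the names sar_tdc and dac_word are hypothetical.

```python
def sar_tdc(edge_offset, bits=3):
    """Binary-search a phase code from 1-bit early/late style decisions.

    edge_offset: timing offset to be found, in phase-mixer LSBs
                 (0 .. 2**bits - 1). Hypothetical units for this sketch.
    Returns the final code and the list of trial codes applied to the
    edge clock, one trial per transition.
    """
    code = 0
    trials = []
    for bit in reversed(range(bits)):
        trial = code | (1 << bit)  # SAR logic sets the next bit on the edge clock
        trials.append(trial)
        # Assumed polarity: the PD decision keeps the bit when the trial
        # phase has not yet passed the data transition.
        if edge_offset >= trial:
            code = trial
    return code, trials


def dac_word(isi_code, phase_code, phase_bits=3):
    """Combine the 2-bit ISI code and the 3-bit phase code into a 5-bit word.

    The bit ordering (ISI code in the upper bits) is an assumption.
    """
    return (isi_code << phase_bits) | phase_code
```

For an offset of 5 LSBs, sar_tdc(5) applies the trial codes 100, 110 and 101 and settles on 101 after three decisions. These are the same code values that appear in Fig. 8, although the mapping of Early/Late to keeping or clearing a bit is an assumption of this sketch. The data sampling phase would only be updated once, with the combined word from dac_word, after the three decisions are complete.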
