

IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 32, NO. 7, JULY 1997

Clock/Data Recovery PLL Using Half-Frequency Clock


M. Rau, T. Oberst, R. Lares, A. Rothermel, R. Schweer, and N. Menoux

Abstract: A clock and data recovery PLL is described for serial nonreturn-to-zero (NRZ) data transmission. The voltage-controlled oscillator (VCO) works at half the data rate, which means that for a 1-Gb/s data rate, the VCO runs at 500 MHz. A specially designed phase comparator uses a delay-locked loop (DLL) to generate the required sampling clocks to compare clock and data. The VCO can typically be tuned from 350 MHz to 890 MHz, and the phase-locked loop (PLL) locks between 720 Mb/s and 1.3 Gb/s. Data recovery is error free up to 1.2 Gb/s with a 9-b pseudorandom data sequence. The core consumes 85 mW (3.3 V) at 1 Gb/s.

Index Terms: Bang-bang control, CMOS digital integrated circuits, data communication, high-speed integrated circuits, phase-locked loops, synchronization.

Fig. 1. Classic PLL.

I. INTRODUCTION

Digital signal processing is becoming economical in consumer applications. The main requirement there is low cost in mass production. Digital processing and transmission have to be carried out with low power and in cheap IC packages. Data transmission between different digital signal processing ICs significantly influences the power consumption and the system cost. For video signal transmission in 100 Hz TV sets, typically 16 data lines in parallel are driven with 27 MHz rail-to-rail nonreturn-to-zero (NRZ) data signals. Sharp data transitions are used to ensure reliable synchronous operation. A power-saving alternative could be found in low-swing high-speed serial data transmission in the range of 500 Mb/s or more. However, this kind of high-speed data transmission has to be asynchronous. The most economical solution avoids separate transmission of the clock. In that case, clock recovery from the NRZ data stream is required. In this paper we describe a phase-locked loop (PLL) which is designed to process more than 1 Gb/s data in a 0.5-µm CMOS technology.

II. ARCHITECTURE

The PLL generally consists of three building blocks (Fig. 1): 1) phase comparator, detecting the phase difference between the data and the recovered clock; 2) loop filter, filtering the phase detector output and forming the control signal for the oscillator; 3) voltage-controlled oscillator (VCO).

The unusual feature in our design is the phase detector, which uses a delay-locked loop (DLL) to generate multiple sampling clocks. Thus, the VCO can run at only half the data rate, which means that we can detect a 1-Gb/s serial data stream with a 500-MHz VCO. This relieves the timing constraints in the phase detector logic and results in well-correlated and data-independent control signals. Also, at the lower frequency the VCO tuning range is large enough to compensate for all technology parameter variations. With this architecture we could achieve higher data rates.

The block diagram of the circuit is shown in Fig. 2. No external components are required for the PLL. The loop filter capacitor is integrated on chip together with the VCO, the phase comparator, and a charge pump. The data stream is retimed in two flip-flops with the inverted and noninverted clock. Two flip-flops are required because the clock has only half the data rate. These two half-speed data streams are combined in a multiplexer, forming an output stream at the original data rate. A lock-in circuit is realized on chip, because the phase comparator is not frequency sensitive.

III. PHASE COMPARATOR

The PLL adjusts the clock to an incoming data stream. Because of the random nature of data, there is not necessarily a data transition at every clock cycle. The loop has to handle sequences of consecutive zeroes or ones in the data stream. The following phase comparator output signal properties are essential. First, the phase comparator must not give any output signal if there is no data edge. Second, the duration of the control signal pulses at the data transitions is important, especially if there are few of them. In general, for good loop performance the control signal should be proportional to the phase error.

However, for very high operating frequencies, analog signals depend on the data pattern and become highly nonlinear, because they do not settle during the bit duration. It was found by simulation that different phase detectors with analog outputs [1], [2] limit the PLL operating frequency. On the other hand, clock recovery schemes based on sampling techniques [3], [4] result in uniform digital control pulses. They are
Manuscript received December 15, 1996; revised February 6, 1997. This work was supported in part by the German Ministry for Education and Research under Contract 01M2880A.
M. Rau was with the University of Ulm, Germany. He is now with Siemens AG, 81359 Munich, Germany.
T. Oberst was with the University of Ulm, Germany. He is now with DASA, D-89077 Ulm, Germany.
R. Lares and A. Rothermel are with the Microelectronics Department, University of Ulm, D-89081 Ulm, Germany.
R. Schweer is with Thomson Multimedia, D-78048 Villingen-Schwenningen, Germany.
N. Menoux is with Thomson, 38240 Meylan, France.
Publisher Item Identifier S 0018-9200(97)04386-2.

0018-9200/97$10.00 © 1997 IEEE


Fig. 2. Clock recovery block diagram.

Fig. 3. Phase comparator.

best suited to support the highest possible data rates in a given technology. The phase comparator used here is an extension of the circuit from [3], modified to work with half the normal clock frequency (Fig. 3). The data stream is sampled at four equally spaced time points. The logic circuitry driven by the flip-flops generates the up and down control pulses for the VCO according to Fig. 4. Because these control pulses are generated by clocked flip-flops, they are of well-defined width. The advantage is that they do not depend on the data pattern. On the other hand, they do not reflect the amount of the phase error, either. The pulse width is constant, even for very small phase errors. This so-called bang-bang operation generates increased jitter in the locked state. However, the magnitude is much smaller than that introduced by data-dependent and nonlinear analog pulses at high frequencies. The phase logic evaluates only rising signal edges, in order not to depend on duty cycle variations of the input signal. There is an issue to be taken care of when dimensioning the flip-flops and the phase logic. The stable operating point of the loop is reached when the signal is sampled exactly at its transition (see Fig. 4). Thus the loop forces the flip-flop to sample the metastable state, which is not allowed in normal flip-flop operation. In this application, however, it is not critical for the operation. If the metastable state is sampled, it does not matter whether it will be interpreted as up or down, because any decision is equally wrong, as we are at the stable operating point, i.e., zero phase error. Only the jitter of the

Fig. 4. Operation of the phase detector: (a) data at sampling time B equals the data at the preceding sampling time A ⇒ data transition is late ⇒ frequency up, (b) data at sampling time B equals the data at the following sampling time A ⇒ data transition is early ⇒ frequency down, (c) data at sampling time A equals the data at the preceding sampling time A ⇒ no data edge, no control signal output.
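The three cases of Fig. 4 can be written down as a small decision function. This is only a behavioral sketch in Python, not the gate-level logic of Fig. 3; the function name, the argument names, and the `rising_only` switch are assumptions introduced for illustration.

```python
def phase_decision(a_prev: int, b: int, a_next: int, rising_only: bool = True) -> str:
    """Return "up", "down", or "none" for one group of three data samples.

    a_prev, a_next: data sampled at two consecutive sampling times A
    b:              data sampled at sampling time B, between the two A samples
    """
    if a_prev == a_next:
        return "none"            # case (c): no data edge, no control pulse
    if rising_only and (a_prev, a_next) != (0, 1):
        return "none"            # the phase logic evaluates only rising edges
    if b == a_prev:
        return "up"              # case (a): data transition is late
    return "down"                # case (b): data transition is early


for samples in [(0, 0, 1), (0, 1, 1), (1, 1, 1), (1, 0, 0)]:
    print(samples, phase_decision(*samples))
```

The output is one of three fixed symbols and carries no information about the size of the phase error, which is the bang-bang property discussed in the text.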

bang-bang operation results. Also, there is an increased short-circuit current inside the flip-flops that has to be limited.

For uniform pulses and small jitter, absolutely identical sampling intervals are required. Therefore, a DLL has been implemented to generate four 90°-shifted clock phases clk1 to clk4 from the VCO output signal (Fig. 5). The loop compares the phase of the original clock to a clock fed through four adjustable delay elements. The clock signal repeats with a period T. A delay element in Fig. 5 can therefore delay by T/4 or, equally well, by T/4 plus a whole number of periods. By rearranging the output signals, other odd multiples of T/4 are also possible. With a delay time of T/4 per delay element, it is not possible to compensate for all technology and environment variations. Therefore, it is necessary to select a larger value for the delay, to just be able to deal with all technology parameter variations.
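A short sketch of why more than one element delay yields the same set of four sampling phases. It assumes four identical delay elements and expresses the per-element delay as a fraction of the clock period; the function name and the choice of example delays are illustrative and not taken from the paper.

```python
from fractions import Fraction


def tap_phases(delay_in_periods: Fraction, taps: int = 4) -> list:
    """Phase in degrees (modulo 360) at the output of each delay element."""
    return [(k * delay_in_periods * 360) % 360 for k in range(1, taps + 1)]


quarter = tap_phases(Fraction(1, 4))         # one quarter period per element
five_quarters = tap_phases(Fraction(5, 4))   # one quarter plus a full period
three_quarters = tap_phases(Fraction(3, 4))  # same phases, different tap order

print(quarter)
print(five_quarters)
print(three_quarters)
```

In this model the last tap always lands back on the phase of the input clock, which is the condition the DLL regulates. A longer element delay only changes which tap carries which phase, so the outputs can be reassigned.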


Fig. 5. DLL to generate all 90° phase-shifted sampling clocks with high accuracy.

Fig. 7. VCO schematic.

Fig. 8. VCO frequency versus control voltage.

Fig. 6. Current mirror charge pump.

VI. LOCK-IN CIRCUIT

The bang-bang operation and the data-dependent phase detector output signal require a narrow loop bandwidth for low jitter. This results in a reduced pull-in range of the PLL. Instead of adapting the loop bandwidth during operation, we created a lock-in circuit which is active only after power-up. For lock-in, a 1010 sequence has to be fed to the circuit. The VCO is swept, starting with the highest frequency. When clock and input frequencies are the same, the sampled data (before the Mux) do not change. An edge-triggered monoflop then stops the frequency sweep and closes the PLL.

VII. LAYOUT

Fig. 9 shows the test chip. A large area is used for the on-chip loop filter capacitor (upper left). A comparable area is required for the ring oscillator, including its load capacitors (lower left). Because the series resistance of those load capacitors is more critical than that of the loop filter capacitor, a finer finger structure was chosen. All capacitors have been realized as MOS-transistor gates. No special mask is required. The lock-in circuit and the DLL with its loop filter are located in the top right area, whereas buffers and control logic can be seen in the lower middle and to the right.

VIII. MEASUREMENT RESULTS

We verified locking of the PLL at data rates from 720 to 1300 Mb/s with pseudorandom sequences up to bit at the data input. However, data recovery is not guaranteed under these conditions because of the clock jitter. Fig. 10 shows the maximum available data rates for different lengths

The stability of the system containing two coupled loops can be guaranteed for two reasons. First, the DLL is a first-order loop and inherently stable. Second, the time constants of the two loops differ by two orders of magnitude.
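The bang-bang behavior described in this section can be illustrated with a deliberately simplified model. The sketch assumes a first-order loop in which every evaluated data transition moves the clock phase by one fixed step toward the data; the loop filter, the VCO, and the DLL are not modeled, and the step size and starting error are hypothetical values in arbitrary units.

```python
def bang_bang_track(phase_error: float, step: float, transitions: int) -> list:
    """Phase error after each evaluated data transition.

    The correction has constant magnitude `step`; only its sign depends on
    the current phase error, as with the fixed-width up/down pulses.
    """
    history = []
    for _ in range(transitions):
        correction = step if phase_error >= 0 else -step
        phase_error -= correction
        history.append(phase_error)
    return history


# Hypothetical numbers: initial error 10 units, correction step 3 units.
print(bang_bang_track(10.0, 3.0, 8))
```

Once the error is smaller than one step, the model no longer converges but alternates around zero. This is the residual jitter in the locked state that the text attributes to bang-bang operation, and in this model its size is bounded by the step.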

IV. CHARGE PUMP AND LOOP FILTER

The control pulses drive a current mirror charge pump [6] (Fig. 6), which ensures that the charge delivered to the loop filter does not vary with the VCO control voltage. The charge pump allows the realization of an ideal integrator transfer function (pole at s = 0) with no additional active amplifier, resulting in zero phase error in steady state. A simple RC network shown in Figs. 2 and 6 is used for the low-pass loop filter. The current level of the charge pump, and hence the charge delivered at every rising data transition, can be set to a small value. This allows the implementation of the loop capacitor on chip.
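The link between a small pump current and an on-chip capacitor follows from the charge balance ΔV = I·t/C. The sketch below only illustrates that relation: all component values are hypothetical and not taken from the paper, and the series resistor of the RC network is ignored.

```python
def control_voltage_step(pump_current_a: float,
                         pulse_width_s: float,
                         loop_cap_f: float) -> float:
    """Voltage change on the loop capacitor caused by one control pulse."""
    charge_c = pump_current_a * pulse_width_s   # charge delivered per pulse
    return charge_c / loop_cap_f


# Hypothetical values: 10 uA pump current, 2 ns pulse, 100 pF capacitor.
step_v = control_voltage_step(10e-6, 2e-9, 100e-12)
print(f"{step_v * 1e3:.3f} mV per pulse")

# Halving the current permits half the capacitance for the same step size.
same_step_v = control_voltage_step(5e-6, 2e-9, 50e-12)
print(f"{same_step_v * 1e3:.3f} mV per pulse")
```

For a given voltage step per pulse, the required capacitance scales with the pump current, which is why a low current level keeps the capacitor small enough to integrate.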

V. VCO

Both a high oscillation frequency and a wide tuning range are required. We chose a ring oscillator design with variable load capacitors (Fig. 7) based on [5]. Duty cycle is not an issue here, because the flip-flops are all triggered with the same edge; the DLL generates the required phase shifts. This circuit can safely cope with all parameter variations. Fig. 8 shows the VCO tuning characteristic.
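As a quick plausibility check of the half-rate scheme, the reported lock range can be compared with the reported VCO tuning range. The figures are those given in the abstract; the helper names are illustrative only.

```python
VCO_RANGE_MHZ = (350.0, 890.0)      # typical VCO tuning range (abstract)
LOCK_RANGE_MBPS = (720.0, 1300.0)   # reported PLL lock range (abstract)


def vco_frequency_mhz(data_rate_mbps: float) -> float:
    """Half-rate clocking: the VCO runs at half the bit rate."""
    return data_rate_mbps / 2.0


def within_tuning_range(data_rate_mbps: float) -> bool:
    """True if the VCO frequency needed for this data rate can be reached."""
    low, high = VCO_RANGE_MHZ
    return low <= vco_frequency_mhz(data_rate_mbps) <= high


for rate in LOCK_RANGE_MBPS:
    print(rate, "Mb/s ->", vco_frequency_mhz(rate), "MHz", within_tuning_range(rate))
```

Both ends of the lock range map to VCO frequencies (360 MHz and 650 MHz) inside the typical tuning range, so the upper lock limit is not set by the nominal VCO range alone.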


Fig. 11. VCO clock output and data output eye pattern at 1 Gb/s with a (2¹⁵ − 1)-bit length pseudorandom input sequence.

Fig. 9. Chip micrograph.

Chip core area is 0.38 mm², power consumption without pad drivers is 85 mW at 1 Gb/s, 0.5-µm CMOS, 3.3 V supply. Only 1/4 of the power consumption is proportional to the clock frequency; 3/4 is constant. The circuit consumes 91 mW at 1.3 Gb/s. The power saved by using only half the conventional clock frequency is partly used to supply the DLL, which needs 21 mW (about 1/4 of total power at 1 Gb/s). No external components are required, except one reference current, which is not very critical (a 20% variation is allowed).

IX. CONCLUSION

Complete on-chip clock and data recovery at 1 Gb/s is feasible with a standard 0.5-µm CMOS technology. The on-chip clock is only 500 MHz in this case. Data are directly demultiplexed one to two in the retiming flip-flops. A multiplexer to regenerate the original data stream was included for measurement purposes only. In applications, serial-to-parallel conversion will normally follow the PLL. In that case, the halved clock frequency is an advantage, because the following blocks can be designed more easily.

ACKNOWLEDGMENT

The authors gratefully acknowledge perfect layout support by Y. A. Savalle and G. Kimmich from TCEC. They thank J. Borel from SGS-Thomson for providing the design kit and acknowledge the fast sample production in the factory.

REFERENCES
[1] T. H. Lee, "A 155-MHz clock recovery delay- and phase-locked loop," IEEE J. Solid-State Circuits, vol. 27, pp. 1736–1746, Dec. 1992.
[2] B. Thompson, "A 300-MHz BiCMOS serial data transceiver," IEEE J. Solid-State Circuits, vol. 29, pp. 185–192, Mar. 1994.
[3] B. Lai and R. C. Walker, "A monolithic 622 Mb/s clock extraction data retiming circuit," in Int. Solid-State Circuits Conf., San Francisco, CA, 1991, vol. 306, pp. 144–145.
[4] A. Pottbaecker, U. Langmann, and H.-U. Schreiber, "A Si bipolar phase and frequency detector IC for clock extraction up to 8 Gb/s," IEEE J. Solid-State Circuits, vol. 27, pp. 1747–1751, Dec. 1992.
[5] M. Bazes, "A novel precision MOS synchronous delay line," IEEE J. Solid-State Circuits, vol. 20, pp. 1265–1271, Dec. 1985.
[6] A. Waizman, "A delay line loop for frequency synthesis of de-skewed clock," in Int. Solid-State Circuits Conf., San Francisco, CA, 1994, pp. 298–299.

Fig. 10. Maximum data rate versus pseudorandom sequence length for error-free receiving during time of measurement (complies with error rate smaller than 1 × 10⁻¹¹).

of the pseudorandom sequences for correct data recovery. Measurement period was 10¹¹ clocks (corresponding to a bit error rate of 10⁻¹¹). At very high data rates, clock and data phase precision has to be better at the input of the retiming flip-flops, because the eyes become smaller. The lower required phase jitter corresponds to shorter pseudorandom sequences. Fig. 11 shows the locked PLL at 1 Gb/s with a (2¹⁵ − 1)-bit length pseudorandom sequence. The clock jitter is about 350 ps, which is caused mainly by the bang-bang operation of the phase comparator. We believe that this behavior can be improved by reducing the uncertain time interval of the sampling flip-flops, i.e., reducing their setup-and-hold times and increasing the clock slope. All measurements have been done with the IC housed in a standard 16-pin dual in-line ceramic package, which shows rather poor high-frequency performance. It was our goal to demonstrate the circuit in a critical environment. Better results could be expected when using packages with shorter leads.
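The reported power figures (85 mW at 1 Gb/s, one quarter of it proportional to clock frequency, 91 mW at 1.3 Gb/s) are mutually consistent, as a simple linear model shows. The model is an assumption made for this cross-check, with the dynamic part scaling linearly with data rate; the function and parameter names are illustrative.

```python
def core_power_mw(data_rate_gbps: float,
                  power_at_1gbps_mw: float = 85.0,
                  dynamic_fraction: float = 0.25) -> float:
    """Core power under a constant-plus-linear model, normalized to 1 Gb/s."""
    constant = power_at_1gbps_mw * (1.0 - dynamic_fraction)
    dynamic = power_at_1gbps_mw * dynamic_fraction * data_rate_gbps
    return constant + dynamic


print(core_power_mw(1.0))   # reference point of the model
print(core_power_mw(1.3))   # compare with the reported 91 mW
```

At 1.3 Gb/s the model gives about 91.4 mW, in line with the measured 91 mW.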
