4.4 A VLSI Analog Computer/Math Co-Processor for a Digital Computer

Glenn E. R. Cowan, Robert C. Melville, Yannis P. Tsividis

Columbia University, New York, NY

Analog computers of the 1960s were physically large, tedious to program, and required significant user expertise. However, once programmed, they rapidly solved a variety of mathematical problems, notably differential equations, without time-domain discretization artifacts [1]. Analog computers were superseded by digital ones long ago; the technical community has hardly considered the advantages that modern VLSI techniques could impart to analog computers. Reported in this paper are the results of an attempt to investigate whether analog computers could be revived in a modern VLSI context, with significant advantages as complements to digital computers. A single-chip VLSI analog computer (AC), capable of handling large-order problems (e.g., nonlinear differential equations of up to 80th order) is presented. This AC is meant to operate in a symbiotic environment with a digital computer (Fig. 4.4.1), complementing the latter and acting as a co-processor to it. The combination has several features: (a) tight coupling to the digital computer through an A/D-D/A interface, which facilitates a pre-computation calibration process, (b) digital programmability through a standard on-screen Simulink interface, (c) real-time simulations and real-time observation of the effects of mathematical parameter changes, (d) fast approximate solutions to difficult mathematical problems, sometimes much faster than is possible with a digital computer, and with guaranteed convergence to physical solutions, (e) capability of passing an approximate solution as a first guess to a digital computer, thus speeding up a numerical solution, and (f) interfacing to measurement instruments.

Analog circuits can solve ordinary differential equations (ODEs) of the form $\dot{x} = f(x, u, t)$, where x is a state-variable vector of length n, u is an input vector of length m, and f is a length-n vector of possibly nonlinear functions. To solve the ODE, an AC needs n integrators, m inputs, and sufficient circuitry to implement f or an adequate approximation of it. Several techniques for converting partial differential equations (PDEs) to ODEs of the above form exist, e.g., the method of lines.

The reported IC contains 416 functional blocks, as detailed in Fig. 4.4.6, and a large number of signal-routing switches. Variables are represented by differential currents (i.e., the circuits are current-in/current-out) and hence signals are added by connecting multiple signal routes together. Cross-coupling of a differential signal allows it to be inverted or subtracted from another signal. The chip contains 160 blocks that allow a signal to be fanned out to at most three other blocks. Extensive use of weak-inversion circuits keeps power dissipation low despite the large amount of circuitry (the chip area is 1 cm²). High-speed computation is still possible with the ensuing low currents, due to the inherent speed advantages of analog computation.

The circuits are divided into a 4×4 array of identical macroblocks (MBs) (Fig. 4.4.7). Each block's input is connected to a wire running horizontally (Fig. 4.4.2) and each block's output is connected to a wire running vertically. These wires extend outside of the MB to allow for connection between blocks in other MBs. For simplicity, each block is shown with one input and one output. Each wire represents a pair of wires, carrying a differential current. Wherever two groups of wires cross, there is an array of CMOS pass-transistor switches and SRAM that holds their states. In Fig. 4.4.2, the solid, bold line shows how the output of block X is routed to the input of block Y within the same MB. To route a block's output to the input of a block not in the same MB, shared, global wires are used. The dotted, bold line in Fig. 4.4.2 shows how the output of a block in MB W is routed to the input of a block in MB Z.

Digital programmability has been incorporated into most blocks. For example, the programmable mirrors (labeled A:B and B:A in Fig. 4.4.3) allow for variable input and output bias/signal ranges, while maintaining a fixed bias of 1 µA to the core. Composite devices ARRAY1, ARRAY2, and ARRAY3 generate the different bias currents, while remaining in a nearly constant level of inversion. The core contains two differential to single-ended log-domain integrators similar to [2]. With M1 and M2 off, the voltage necessary to cancel the integrator's offsets (input and output) can be sampled through M3 and M4 and stored on CHold1 and CHold2. The integrator also features common-mode feedback, a DAC for tuning ITUNE, and a memory block for storing range settings and the DAC's input.

The 100 mm² AC chip dissipates 300 mW. Performance is summarized in Figure 4.4.6. The rest of the paper presents representative results of solutions using the AC. Figure 4.4.4 shows results for a one-dimensional heat equation solved using the method of lines on the AC. The spatial variable, x, was discretized into 16 points, requiring 14 integrators; each end is defined by a boundary condition. For a one-degree step at one end, the maximum difference between the AC's solution and a numerical solution computed using MATLAB was 0.025 degrees, with the former taking 1.2 ms. The same approach was extended to the solution of a 45th-order two-dimensional heat equation; the rms steady-state error is 2.6%. Such errors may be reduced by improving the calibration techniques.

A second example is a transient noise simulation. Such simulations take a long time on a digital computer because including wideband noise necessitates small time steps, and the generation of meaningful statistics can necessitate a long simulation interval. Figure 4.4.5 shows results for a simple example. The AC and MATLAB simulations agree well. The MATLAB simulation takes 96 s (running on a Sun Blade 1000), whereas the AC takes 4 s, a factor of 24 faster. This simulation used only one MB, allowing for 15 other simulations to take place simultaneously and a potential speed-up of nearly 400.

As a third example, the AC's periodic solution of Duffing's equation has been used as the initial guess to reduce the number of iterations of a Newton-Raphson based steady-state ODE solver (based on algorithms in [3]) from 37 to 5 and its computation time from 12.5 s to 0.76 s.

Work in progress, in consultation with applied mathematicians and numerical analysts, aims to apply the AC to the study of problems such as nonlinear stochastic PDEs and real-time control of chemical reactions, for which existing numerical methods are problematic.

The authors thank Y. Kevrekidis, D. Keyes, B. Ogunnaike, S. de la Veaux, and M. Weinstein for their valuable suggestions, MOSIS for fabrication and packaging, and NSERC for graduate student funding.

References:
[1] G.A. Korn and T.M. Korn, Electronic Analog and Hybrid Computers, McGraw-Hill Book Company, 1964.
[2] M. Punzenberger and C. Enz, “A New 1.2V BiCMOS Log-Domain Integrator for Companding Current-Mode Filters,” Proc. ISCAS, pp. 125-128, May, 1996.
[3] K. Kundert, J. White, and A. Sangiovanni-Vincentelli, Steady-State Methods for Simulating Analog and Microwave Circuits, Kluwer, 1990.

82 • 2005 IEEE International Solid-State Circuits Conference 0-7803-8904-2/05/$20.00 ©2005 IEEE.

ISSCC 2005 / February 7, 2005 / Salon 8 / 3:15 PM

[Block diagram. Recoverable labels: Matlab & Simulink; card (in a PC); analog computer / math co-processor; test equipment (optional, for real-time simulation and observation); computational (accelerated by analog computer's solution).]
Figure 4.4.1: Analog computation environment.

Figure 4.4.2: Block diagram of the analog computer.
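
The "accelerated by analog computer's solution" path in Fig. 4.4.1 corresponds to feature (e) and to the Duffing example, where a good starting point cut Newton-Raphson iterations from 37 to 5. The sketch below is only a toy illustration of that effect on a scalar equation; it is not Duffing's equation and not the steady-state solver of [3]. The test function `g`, its derivative `dg`, and both starting points are assumptions chosen for illustration.

```python
def newton(g, dg, x0, tol=1e-12, max_iter=100):
    """Plain Newton-Raphson; returns (root, number of iterations used)."""
    x = x0
    for iteration in range(1, max_iter + 1):
        step = g(x) / dg(x)
        x -= step
        if abs(step) < tol:
            return x, iteration
    raise RuntimeError("Newton iteration did not converge")


def g(x):
    """Hypothetical scalar test problem, standing in for a hard nonlinear solve."""
    return x**3 - 2.0 * x - 5.0


def dg(x):
    """Analytic derivative of g."""
    return 3.0 * x**2 - 2.0


# Cold start: a guess far from the root.
root_far, iters_far = newton(g, dg, x0=50.0)
# Warm start: a rough approximate solution, playing the role of the AC's output.
root_near, iters_near = newton(g, dg, x0=2.0)

if __name__ == "__main__":
    print(f"cold start: root = {root_far:.10f}, iterations = {iters_far}")
    print(f"warm start: root = {root_near:.10f}, iterations = {iters_near}")
```

Both runs reach the same root; the warm start simply begins inside the region where Newton's method converges quadratically, which is the role the text assigns to the AC's approximate solution.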
[Schematic. Recoverable labels: 10-bit DAC setting ITUNE; integrator core with a mirror-image half (offset cancellation and bias circuits duplicated); programmable mirrors A:B and B:A; transistors M1 to M4; CHold2; V1, V2; offset cancellation; composite devices ARRAY1, ARRAY2, ARRAY3 (nine transistors connected in one of three ways: for 100 nA bias, for 1 µA bias, for 20 µA bias). The currents iin1, iin2, and icore (both polarities) each ride on a 1 µA bias. Annotated relations:
$\frac{d}{dt}(i_{out+} - i_{out-}) = K I_{TUNE} (i_{in+} - i_{in-})$
$i_{in1+} = i_{in2+} = \frac{B}{A} i_{in+}$
$i_{out+} = \frac{A}{B} i_{core+}$]
Figure 4.4.3: Integrator core, input and output circuitry.

[Plots of the boundary temperatures T(0, t) at x = 0 and T(1, t) at x = 1 (amplitude mark of 1 on the T(1, t) panel) for
$\frac{\partial T(x, t)}{\partial t} = D \frac{\partial^2 T(x, t)}{\partial x^2}$
where T is the temperature along the rod and D is 0.0178.]
Figure 4.4.4: Results for a 1-D heat equation.
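
The method-of-lines setup behind Fig. 4.4.4 (16 spatial points, 14 integrators, D = 0.0178) can be sketched digitally as follows. This is a minimal sketch, not the authors' MATLAB reference: it assumes a rod spanning x = 0 to 1, a rod initially at 0, a unit step applied at x = 1 with the x = 0 end held at 0, and plain forward-Euler time stepping. Each entry of `temps` plays the role of one integrator's state on the AC.

```python
"""Method-of-lines sketch for the 1-D heat equation dT/dt = D * d2T/dx2."""

D = 0.0178                 # diffusion constant quoted with Fig. 4.4.4
N_POINTS = 16              # spatial points, including both boundary points
N_STATES = N_POINTS - 2    # 14 interior points, one integrator each
H = 1.0 / (N_POINTS - 1)   # grid spacing (assumption: rod spans x = 0 .. 1)


def heat_rhs(temps, t_left, t_right):
    """Return dT_i/dt at the interior points (central second difference)."""
    padded = [t_left] + list(temps) + [t_right]
    return [
        D * (padded[i - 1] - 2.0 * padded[i] + padded[i + 1]) / (H * H)
        for i in range(1, len(padded) - 1)
    ]


def simulate(t_end, dt=0.05, t_left=0.0, t_right=1.0):
    """Forward-Euler integration, starting from a rod at 0 everywhere."""
    temps = [0.0] * N_STATES
    steps = int(round(t_end / dt))
    for _ in range(steps):
        rates = heat_rhs(temps, t_left, t_right)
        temps = [temp + dt * rate for temp, rate in zip(temps, rates)]
    return temps


if __name__ == "__main__":
    final = simulate(t_end=60.0)
    for i, temp in enumerate(final, start=1):
        print(f"x = {i * H:.3f}  T = {temp:.4f}")
```

Forward Euler is stable here only for dt below H²/(2D), about 0.125 in these units, which is the kind of time-step restriction the AC avoids by integrating in continuous time. For long simulated times the profile approaches the straight line T = x between the two boundary values.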

Results for:
$\frac{dx}{dt} = -\frac{df(x)}{dx} + n(t)$
where n(t) is noise with a probability density function (PDF) shown on the right and $f(x) = 0.19x^4 + 0.014x^3 - 0.25x^2$ (also shown on the right). The solution for x was computed using Matlab and the AC and is summarized by PDFs, shown on the right.
[Plot "Statistics for the solution of x". Horizontal axis: x, from −2 to 2. Vertical axis: scaled noise PDF, f(x), PDFs of x, from −0.5 to 1.5. Curves: noise PDF; f(x); PDF of x, Matlab; PDF of x, AC.]
Figure 4.4.5: Simple differential equation with noise.

Supply voltage                                        2.5 V core, 3.3 V digital I/O
Technology                                            TSMC025 (CM025, one poly, five metal)
Die area                                              100 mm²
Number of integrators                                 80
Number of VGAs / two-input multipliers                80
Number of fanout blocks                               160
Number of programmable nonlinear blocks               64 or 32 (1)
Number of logarithmic blocks                          16
Number of exponential blocks                          16
Total number of functional blocks                     416
Number of analog inputs                               64
Number of analog outputs                              64
Integrator time constant range                        5 - 60 µs
Fanout blocks RMS deviation from unity gain           0.2%
VGA linearity for unity gain and a gain of 2.5 (2)    0.044%, 0.11%
Power dissipation, with all circuits active           300 mW

(1) The 64 programmable nonlinear blocks can each implement sign, absolute value or saturation. When used in pairs, they can implement minimum, maximum, greater than, less than, gate, track and hold, or sample and hold.
(2) RMS nonlinearity over +/- 60% of the input range across all VGAs.
Figure 4.4.6: Performance summary.
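
A digital counterpart of the Fig. 4.4.5 experiment can be sketched as below, which also shows why such runs are slow: many small time steps are needed to build up statistics. The sketch assumes the equation reads dx/dt = −df(x)/dx + n(t) (the minus sign is what keeps x bounded for this quartic f) and replaces the paper's noise source, whose PDF is only shown graphically, with white Gaussian noise integrated by the Euler-Maruyama method. The noise strength `sigma`, the step `dt`, and the histogram settings are illustrative assumptions, not values from the paper.

```python
import math
import random


def potential(x):
    """f(x) as given with Fig. 4.4.5."""
    return 0.19 * x**4 + 0.014 * x**3 - 0.25 * x**2


def drift(x):
    """Deterministic part of the ODE, -df/dx."""
    return -(0.76 * x**3 + 0.042 * x**2 - 0.5 * x)


def simulate(n_steps, dt=0.01, sigma=0.5, x0=0.0, seed=0):
    """Euler-Maruyama integration; sigma and dt are assumed values."""
    rng = random.Random(seed)
    kick = sigma * math.sqrt(dt)   # standard deviation of one noise increment
    x = x0
    samples = []
    for _ in range(n_steps):
        x += drift(x) * dt + kick * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


def histogram(samples, lo=-2.0, hi=2.0, bins=40):
    """Normalized histogram as a crude estimate of the PDF of x."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for s in samples:
        if lo <= s < hi:
            counts[min(int((s - lo) / width), bins - 1)] += 1
    return [c / (len(samples) * width) for c in counts]


if __name__ == "__main__":
    pdf = histogram(simulate(200_000))
    for k, p in enumerate(pdf):
        print(f"x = {-2.0 + (k + 0.5) * 0.1:+.2f}  pdf = {p:.3f}")
```

Because f(x) has two wells, the estimated PDF is expected to be bimodal; a smooth estimate needs a long run, and one MB of the AC delivers the equivalent statistics in continuous time.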

Figure 4.4.7: Chip micrograph.
