Optical Computing (TDMA)

Time Multiplexed Optical Computers
Harry F. Jordan and Vincent P. Heuring

Center for Optoelectronic Computing Systems
and Department of Electrical and Computer Engineering
Campus Box 425 Boulder CO 80309-0425
1991 ACM 0-89791-459-7/9110370 $01.50

Abstract

This paper describes a technique for time multiplexing multiple processors on a single hardware platform. This technique should be useful whenever an architecture employs devices that have high bandwidth and simultaneously high temporal latency. These architectures were explored during the development of a bit serial optical computer using optical fibers for interconnection and directional couplers as logic devices. We describe an optical counter developed in our laboratory as a proof-of-principle experiment that permits two simultaneous, independent counts to be accumulated on the same hardware. Future machines may be envisioned implementing hundreds to thousands of processors running on one optical computer.

1. Introduction

High speed switching and logic devices have two distinct speed limitations. One is bandwidth, the number of switching or logic operations that can be done per second. The other is latency, the time from the presentation of correct values at the inputs to the development of a logically correct output signal. Digital electronic logic has usually been done with devices whose latency is small compared with their reciprocal bandwidth. Microwave amplifiers, on the other hand, often have a latency longer than the cycle time of any frequency within their passband. Thus there is no direct, device independent relationship between bandwidth and latency.

This paper shows how to make use of devices with long latency but high bandwidth in digital logic circuits so that the effective bit rate is equal to the bandwidth rather than the reciprocal of the latency. The technique is to time multiplex streams of independent information on a single set of hardware. The simultaneous availability of multiple independent data streams relates the situation to that of parallel processing, while the temporal aspects make it correspond directly to pipelined processing.

The architectural techniques are illustrated with circuits and experiments using travelling wave optical switches. Such switches potentially require lower switching energy per bit than optical devices that are fast by both latency and bandwidth criteria. Optical interconnections give a minimum latency, limited only by the physical distance between devices. Digital optical logic is thus an effective domain for studying architectural techniques for dealing with latency.

2. Background

2.1. Serial Optical Architectures

The background of this work is a project of the Optoelectronic Computing Systems Center at the University of Colorado to build an optical, stored program, digital computer[1]. It uses fiber optics and guided-wave optical switch technology developed in the communications industry to build a bit serial computer that exploits the optical information capacity of the time domain and also uses fewer components by operating on all bits of a data item using the same logic elements. A bit serial architecture for a computer with n bit words requires about a factor of n less logic than an equivalent word parallel machine. This is important because currently available fast optical switches are expensive. At the same time serial operation focuses attention on high bandwidth: capacitive and inductive effects are absent when bits are encoded photonically. Very short optical pulses can be produced and transmitted with very low dispersion, giving perhaps three orders of magnitude improvement in time domain capacity over electronics.

The optical bit serial computer has some similarity to a conventional one. It has a program counter, instruction register, accumulator, ALU, main memory, and control unit with an instruction set equivalent to that of a simplified microcomputer. Memory and registers are delay lines. The main memory is a large fiber loop containing, say, m words of n bits as one cycle of $m \times n$ bits. A register is an n bit loop and a single bit store is represented by a loop with an optical transit time of one clock period. The computer is constructed using precisely trimmed lengths of fiber and approximately 75 optical logic elements. The similarity to conventional architecture is deceptive because the machine contains no flip-flops or latches. All storage is by means of delay loops, and fiber lengths are designed to ensure simultaneous
arrival of pulses to be combined by a logic gate. The synchronization conditions are described in [2]. These "time-of-flight," latchless architectures are possible because of the low dispersion of optical pulses and have the characteristic of being speed scalable. Speed scalability means that the architecture is invariant to clock speed in the sense that as clock speed increases, the only change that needs to be made in the architecture is to shorten the distances between devices in proportion to the increase in clock speed, provided the device bandwidth allows operation at the higher speed.

Few optical logic gates are commercially available at this time, although new devices are rapidly being developed. Choice of an implementation domain for the computer was strongly influenced by device availability. Prior work in optics applying most directly to this type of system is in communications and signal processing. Single and multi-mode fiber and connector systems have been developed and commercialized[3]. Static directional couplers are available with specified power splitting ratios and can be used for fan-out or for combining noninterfering signals. Electrically switched directional couplers[4] are commercially available, with bandwidths above 1 GHz. They are used for modulation, multiplexing and demultiplexing[5] of optical communications signals and can be used as optical logic elements. To get a component with all inputs and outputs optical, we add a photodetector, amplifier and electrode driver to allow the switch to be optically controlled. The above devices are used as shown in Fig. 1 to provide an implementation domain for digital optical computing. The computer uses intensity encoding of bits and synchronous operation. A light pulse at a clock time is a logic one, and no light represents a zero. The waveguide switch computes the multiplexer function shown, and is logically complete. Interconnection is done with single mode fiber, and fanout by 3 dB fiber couplers, which are also used to merge signals from two sources when they are non-interfering. The delay schematic shown in Fig. 1 represents a coil of fiber of delay $K\Delta$, which can implement a K bit serial memory if $\Delta$ is the clock period.

[Figure 1 panels: LiNbO3 Waveguide Switch, with outputs D and E computed from inputs A and B under control C; 3 dB Static Directional Coupler, used as fanout and as wired OR; Fiber Delay Memory Schematic]
Figure 1: Guided Wave Implementation Domain.

2.2. Potentially High Speed Limited by Latency

To understand the potential value of speed scalable architectures, one can extrapolate system speeds for devices that are still in the research stage. It is physically possible to produce and propagate 10 femtosecond optical pulses, which translates to a bandwidth of 100 terabits/second. Haner[6] has demonstrated 100 femtosecond resolution in a time compressed waveform, showing the attainability of 10 terabits/second data rates. A fast, logically complete optical switch has been demonstrated by Islam[7], who built NOR gates using 300 femtosecond solitons. His gates show that optical switching and transmission may attain similar speeds. Significant speed improvement may also be expected for integrated electro-optic switches, waveguides, detectors and electronics in a III-V materials system[8].

For time-of-flight, latchless architectures, the minimum achievable length for the shortest delay loop determines the clock period. As an example, consider the one bit storage element for the carry of a binary adder. The minimum propagation time in the carry feedback loop places an upper bound on the clock rate. Two reasons for a minimum loop time arise in the implementation described: the directional coupler length and the latency of the electronics used to drive the switching electrodes. These effects are independent of miniaturization possible with integrated optics and combine to increase the end-to-end latency of the switches. Their bandwidth, however, is independent of latency. This well known characteristic of high speed electronic amplifiers extends to the control of directional couplers, where traveling wave electrodes can be used to effect the optical switching.

As an extreme example of long latency, the soliton gates[7] cited as a demonstration of very high speed logic used 20 meters of fiber to obtain sufficient interaction length yet have bandwidths in the terahertz region. Such extreme ratios of reciprocal latency to bandwidth are not expected in mature optical devices, but interaction lengths on the order of a centimeter in terahertz bandwidth gates would not be surprising because a long interaction length can lead to lower power switching. Thus long latency limits the minimum feedback loop length. In the next section, we propose multiplexing or pipelining such devices to increase their effective throughput.
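To make the latency limit of Section 2.2 concrete, the following minimal sketch converts the length of a feedback loop into the highest clock rate a non-multiplexed, latchless circuit could use. It assumes light travels at about 2x10^8 m/s in fiber, which is the figure the paper uses in its soliton oscillator example; the function names and the 1 cm case are illustrative assumptions, not part of the original design.

```python
# Illustrative sketch only: names and sample values are assumptions.
FIBER_SPEED_M_PER_S = 2.0e8  # approximate propagation speed of light in fiber


def loop_latency_s(fiber_length_m, electronics_latency_s=0.0):
    """Transit time of a feedback loop: optical path plus drive electronics."""
    return fiber_length_m / FIBER_SPEED_M_PER_S + electronics_latency_s


def max_clock_rate_hz(fiber_length_m, electronics_latency_s=0.0):
    """A one-bit loop must hold exactly one clock period, so the loop
    latency is the shortest clock period the circuit can use."""
    return 1.0 / loop_latency_s(fiber_length_m, electronics_latency_s)


# 20 m of interaction fiber, as in the soliton gate example.
soliton_rate = max_clock_rate_hz(20.0)
# A hypothetical 1 cm interaction length with no electronic delay.
short_gate_rate = max_clock_rate_hz(0.01)

print(f"{soliton_rate / 1e6:.1f} MHz, {short_gate_rate / 1e9:.1f} GHz")
```

Under these assumptions a 20 m interaction length holds a single-bit loop to roughly 10 MHz even though the gate itself has terahertz bandwidth, which is the gap that time multiplexing is meant to close.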
[Figure 2 diagram: Input Streams -> Time Mux -> Bit Serial Computer -> Time DeMux -> Output Streams]
Figure 2: Time Multiplexed Multiprocessor

3. Time Multiplexed Architectures

3.1. Multiplexing Many Computers on the Same Hardware

Time multiplexing is one way to use high speed, long latency devices at their full bandwidth, as opposed to running them at a maximum clock rate of their reciprocal latency. This technique decouples consecutive bits of the serial data stream and makes use of high bandwidth in spite of a limit on the length of the smallest loop. Time multiplexing interleaves independent bit streams on the same hardware. Assume the clock consists of N minor cycles within a major cycle. The data of a bit serial computer has a separation between bits of one major cycle. But multiplexing N independent bit streams means that one bit passes through each logic element every minor cycle, where the bandwidth of the logic element exceeds the reciprocal of the shortest loop time by a factor of N. These time multiplexed multiprocessors require multiplexing of processor inputs and demultiplexing of outputs, as shown in Fig. 2. Since multiplexing and demultiplexing do not require feedback, they can be implemented with long latency devices.

Time multiplexed multiprocessors have been built with electronics. An early commercial one implemented the ten peripheral processors of the CDC 6600[9], and a more recent pipelined multiprocessor, the Denelcor HEP[10], multiplexed up to 128 instruction streams on one set of processor hardware. Pipelined vector units in current supercomputers time multiplex independent vector components to achieve high speed. Pipelining for latency tolerance is done at the numeric operation level in arithmetic pipelines and systolic arrays, but pipelining at the gate level occurs only in the highest speed designs. The optical domain considered here can give insight into systems pipelined at the gate level, especially if such designs use no latches. The elimination of latches is not intrinsically desirable, but since latching implies a device entering a stable state, and since time constants associated with stable states are long compared to those of unstable states, the highest speed designs may well avoid latches.

Input signals can be multiplexed with differential delays and passive couplers if signals can withstand a factor of N power loss. Alternatively, the active switches of Fig. 1 can multiplex differentially delayed signals, or the delays can be built into the multiplexer tree. On output, a demultiplexer tree routes pulses from different multiplexed time slots to different spatial points. Any differential delays required before detecting or further processing the outputs can either be built into the tree or follow it. Figure 3 shows a passive input multiplexer and active output demultiplexer.

3.2. Example: A Time Multiplexed Binary Counter

The general discussion can be motivated by a simple existing system. We have implemented time multiplexing for a simple four bit serial binary counter. Figure 4 shows a non-multiplexed four bit counter consisting of one half-adder and an OR gate to insert the increment signal, which appears as a one bit every fourth bit time. As shown, the counter is about to increment from 0001 to 0010. This counter was built[11] using the implementation domain of Fig. 1. The bandwidth required for 100 MHz operation was not difficult to achieve[12], but low latency was harder to obtain. The initial implementation ran at 50 MHz simply because the optical path for the carry loop passed through two switches, two 3 dB couplers, connectorized fiber and the drive electronics of a switch. Its length could not be shortened to the required 10 ns. Redesign of the counter with the goal of minimizing the number of switches in the carry loop led to a design which has only one switch in the carry loop. It was successfully run at 100 MHz.

Time multiplexing the counter hardware is a more fundamental way to achieve high bandwidth circuits using long latency devices. 100 Mbits per second could move through the original counter design, provided that a feedback signal generated from a bit at time t need not combine with a bit arriving any sooner than time t + 20 ns. Thus another way to achieve 100 MHz operation is to use the original circuit but interleave two independent count
values, which are independently incremented by interleaved Increment signals. A diagram for a dual, time multiplexed, counter at the level of Fig. 4 appears in Fig. 5. The figure shows the counter associated with the bits in the white background boxes about to be incremented from 3 to 4 while that associated with the stippled background boxes is about to change from 8 to 9. In this design, a carry feedback generated from a bit at time t combines with an increment input bit no sooner than two minor cycles later. The two Increment input streams can be multiplexed using only differential delay and a 3 dB coupler, and the count outputs can be demultiplexed by one switch toggling at the minor cycle rate. The success of pipelining rests on the independence of values that occupy pipelines simultaneously, in this case the carry and count loops.

[Figure 3 diagram: inputs I1 to I8 merged into Mux In; Mux Out routed to outputs O1 to O8]
Figure 3: Time multiplexing and demultiplexing of input and output

[Figure 4 diagram: half adder with sum (S) and carry (C) outputs and a Carry feedback loop]
Figure 4: A Bit Serial Binary Counter of Four Bits.
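The dual counter of Section 3.2 can be mimicked in software with two delay lines. The sketch below is a behavioral model only, not the authors' circuit: the function name, the slot ordering, and the choice to discard the carry out of the top bit are assumptions made for illustration. The count loop holds n bits for each of N interleaved counters, and the carry loop holds N slots so that each carry meets the next bit of its own counter.

```python
from collections import deque


def run_multiplexed_counter(initial_counts, increment_passes, n_bits=4):
    """Simulate N bit-serial counters sharing one half adder.

    initial_counts   -- starting value of each interleaved counter
    increment_passes -- one tuple per word time; entry k is 1 if counter k
                        receives an Increment pulse in that word time
    """
    n_streams = len(initial_counts)
    # Count loop: n_bits * N slots, ordered bit 0 of every counter, then
    # bit 1 of every counter, and so on (least significant bit first).
    count_loop = deque(
        (value >> bit) & 1
        for bit in range(n_bits)
        for value in initial_counts
    )
    # Carry loop: N slots, so a carry meets the next bit of its own counter.
    carry_loop = deque([0] * n_streams)

    for increments in increment_passes:
        for bit in range(n_bits):            # major cycles
            for k in range(n_streams):       # minor cycles
                inc = increments[k] if bit == 0 else 0
                x = carry_loop.popleft() | inc   # OR gate merges carry and increment
                a = count_loop.popleft()
                count_loop.append(a ^ x)         # half adder sum
                carry = a & x                    # half adder carry
                # Assumption: the carry out of the top bit is discarded so it
                # cannot re-enter at the next word's least significant bit.
                carry_loop.append(carry if bit < n_bits - 1 else 0)

    bits = list(count_loop)
    return [
        sum(bits[bit * n_streams + k] << bit for bit in range(n_bits))
        for k in range(n_streams)
    ]


# The situation drawn in Fig. 5: counters at 3 and 8, both incremented once.
print(run_multiplexed_counter([3, 8], [(1, 1)]))  # prints [4, 9]
```

Because the two counters never share a slot in either loop, their values stay independent even though every bit passes through the same half adder.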
3.3. Physical Parameters of Multiplexing

Some of the parameters important to understanding the multiplexing process are bandwidth, latency and multiplexing factor. The influence of both devices and system logic design on these parameters is considered, and the various clocks in such a system and their duty cycles are described.
[Figure 5 diagram: half adder with sum (S) and carry (C) outputs, a Carries loop and a Counts loop]
Figure 5: Two Independent Counters Multiplexed on the Same Hardware

Define device bandwidth as the maximum rate at which pulses can be inserted into the device and produce a correctly shaped pulse at the device output, and system bandwidth as the maximum clock rate at which interconnected devices can be stimulated and retain the pulse shape at their outputs in order to correctly stimulate devices further down the logic chain. Bandwidth could be more precisely defined, but the definitions given above are sufficient for this discussion. Device latency is defined as the end-to-end propagation time when performing logic on pulses as defined above. Latency is assumed to be independent of the rate at which bits arrive at device inputs, and of pulse amplitude and duration.

For a specific device, define a device multiplexing factor, n, as device latency, $T$, times device bandwidth, $1/\tau$:

    $n = T/\tau$.

If there are several device types in the system, latencies and bandwidths are defined for each. The bandwidth of interconnecting fibers is assumed to be higher than that of any device, but connectors and layout will lead to irreducible interconnection latencies.

The system latency and the system multiplexing factor for a specific logic design can now be derived. The formulation of [2] is used to represent an optical circuit as a graph with nodes representing logic devices, fan-in or fan-out, and edges representing interconnections. A cycle in the graph is associated with a lumped delay that specifies correctly synchronized arrival of logic signals at combining points. It is thus useful to represent the graph by a loop basis. Let j range over the loops of the basis and for each loop define:

    $\Delta_j$: lumped delay specified for loop j;
    $L_j$: sum of device latencies in loop j;
    $\sigma_j$: sum of minimum interconnect delays in loop j.

Assume that the system runs without time multiplexing with a clock period $\Delta$. Lumped delays are always an integer multiple of the clock period, so take $\Delta_j = m_j \Delta$. Then the clock period has a lower limit given by:

    $\Delta \ge \max_j \frac{\sigma_j + L_j}{m_j}$.

Intuitively, this equation reflects the fact that the irreducible loop delay, $\sigma_j + L_j$, can be distributed evenly over $m_j$ clock periods. If all devices have a bandwidth greater than $N/\Delta$, time multiplexing can be used to interleave a number of independent copies of the system data given by a system multiplexing factor N. If i ranges over devices in the system, and $\tau_i$ is the reciprocal bandwidth of device i,

    $N = \frac{\max_j (\sigma_j + L_j)/m_j}{\max_i \tau_i}$.

That is, the maximum number of independent processors that can be multiplexed on system hardware being clocked at its maximum rate is given by the bandwidth of the slowest device in the system divided by that maximum clock rate. In the implementation domain for optical logic described above, only the directional coupler switch is taken to have a finite bandwidth and N is at least as large as its device multiplexing factor, n.
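The lower bound on the clock period and the system multiplexing factor defined above reduce to two small maximizations. The sketch below is a hedged illustration: the function names and the tuple layout (interconnect delay, device latency, number of clock periods in the loop) are assumptions, and the sample numbers are the ones the paper quotes for its single-switch LiNbO3 oscillator (1 ns of fiber, 9 ns of electronics, a 36 GHz switch).

```python
def min_clock_period(loops):
    """Smallest usable clock period, in seconds.

    loops -- iterable of (sigma_j, L_j, m_j): minimum interconnect delay,
             summed device latency, and lumped delay in clock periods.
    """
    return max((sigma + latency) / m for sigma, latency, m in loops)


def system_multiplexing_factor(loops, reciprocal_bandwidths):
    """Bandwidth of the slowest device divided by the maximum clock rate."""
    return min_clock_period(loops) / max(reciprocal_bandwidths)


# One feedback loop spanning a single clock period.
loops = [(1e-9, 9e-9, 1)]
# One device type: a switch demonstrated at 36 GHz.
taus = [1 / 36e9]

delta = min_clock_period(loops)
n_factor = system_multiplexing_factor(loops, taus)
print(f"clock period {delta * 1e9:.1f} ns, multiplexing factor about {n_factor:.0f}")
```

The result is a real number; in practice it would be rounded down, and possibly reduced further to leave room for guard bands or synchronizing pulses.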
The system multiplexing factor can be viewed as the number of minor cycles which can be accommodated within one major cycle of the multiplexed system. This may be smaller than the above limit because of the need to provide "guard bands" between data pulses, the need to provide synchronizing pulses for phase locking, or for other systems-related reasons. If the clock period $\Delta$ of the non-multiplexed circuit were to be made larger than the limit imposed by the shortest feedback loop then N may be increased by the factor by which $\Delta$ is larger than the limit computed above. In this case, more systems could be multiplexed on the same hardware, but each would run at lower speed.

Now define the period of a minor cycle clock in a multiplexed system as $t_m = \Delta/N$, and assume a duty cycle of d, say 50%, for this clock. Duty cycle is the ratio of the active period of the clock pulse to the total clock period. Input or output stream k, $1 \le k \le N$, of Fig. 2 is associated with a "phase" clock $\phi_k$ which has period $\Delta$, duty cycle $d/N$ and is shifted in time by $t_m$ with respect to $\phi_{k-1}$. The clocks and their relationships are shown in Fig. 6. Because of the lack of synchronizing memory elements (flip-flops), synchronism with the minor cycle clock must be maintained by other means. Signal restoring switches[13] can be used to perform this function with the implementation domain of Fig. 1. This method of resynchronizing requires gating a standardized clock pulse with a stretched version of a degraded signal pulse. The relationship between duty cycle, pulse stretch, fiber length error and bit error rate is discussed in reference [15].

[Figure 6 diagram: multiplexed clock with cycle time $t_m$ and duty cycle d; I/O stream clocks $\phi_1$, $\phi_2$, ..., $\phi_N$ with cycle time $\Delta = N t_m$ and duty cycle $d/N$]
Figure 6: Master Clock and I/O Stream Clocks

If system input streams are derived from electronic data and output streams are converted to electronic form for later use, the phase clocks have implications for the optoelectronic conversion process. Most photodetectors integrate incoming optical power over a time period determined by the physics of the device. The length of this period determines the detector's bandwidth, since two pulses separated by a time less than this period cannot be distinguished. Output stream k data in a pulse coded system is a modulated version of the phase clock $\phi_k$. The response time of an output detector can be almost as large as $\Delta$, provided there is sufficient energy in a pulse of duty cycle $d/N$ to trigger it. At input k, a data bit must gate a pulse of the $\phi_k$ clock, which is $\Delta d/N$ in duration, but the input bit doing the gating need only be shorter than $\Delta$.

As an example of determining multiplexing factors, consider an oscillator of period $2\Delta$ built with a single directional coupler switch. Referring to Fig. 1, input A is attached to a constant clock source, and output E is fed back to the control terminal C to form an oscillator. A switching rate of 36 GHz has been demonstrated[14] for traveling wave electrode LiNbO3 switches. This corresponds to $\tau = 2.8 \times 10^{-11}$ sec. A packaged switch with feedback loop for the oscillator will involve at least 20 cm of fiber, or 1 ns of optical path latency. The photodetector and amplifier at the control terminal input add about 9 ns of latency, for a total of 10 ns. The multiplexing factor for this simple system is potentially

    $N = \frac{10^{-8}\ \mathrm{sec}}{2.78 \times 10^{-11}\ \mathrm{sec}} = 360$.

Thus 360 independent oscillators of 50 MHz each could be time multiplexed on this one switch feedback loop.

As a second example, using one of Islam's NOR gates[7] to build an oscillator requires 20 m of fiber for the gate alone, so we can neglect interconnection. The demonstrated bandwidth was 0.33 THz or $\tau = 3 \times 10^{-12}$ sec. The minimum non-multiplexed clock period $\Delta$ (half the
oscillator period) is

    $\Delta = \frac{20\ \mathrm{m}}{2 \times 10^{8}\ \mathrm{m/s}} = 100\ \mathrm{ns}$.

Thus

    $N = \frac{10^{-7}\ \mathrm{sec}}{3 \times 10^{-12}\ \mathrm{sec}} \approx 3 \times 10^{4}$

oscillators, each running at 5 MHz, could be multiplexed on the hardware. For this system, a lower limit would probably be imposed by the ability to synchronize the system. For example, temperature effects in fiber allow an accuracy of only one part in $10^4$ per degree C variation[15]. Thus only about $10^4$ multiplexed oscillators running on one soliton gate could be kept synchronized without some additional means of synchronization control if environmental temperature varied by one degree.

4. COMMUNICATION AMONG PROCESSORS

The processors in a time multiplexed multiprocessor must occasionally exchange information. One can think of exchanging information in main memory, although the same principles would apply to information in registers or the ALU. Information exchange is implemented by rearranging the order of individual bits in time, in effect permitting access by a given processor to information in the multiplex slot of another. Schemes for implementing interprocessor communication in these time-multiplexed architectures include minor cycle bit phase shifting and time slot interchanging.

Minor cycle bit phase shifting involves providing paths to memory for communicating processors that have a delay differential of the number of minor cycles between the multiplex slots of the two processors. Circuitry to implement this shift may be as simple as delaying memory data to or from a given processor by an amount equal to the time shift of this processor with respect to the first minor cycle, effectively permitting all processors access to the same global memory, or complex enough to permit access to any other processor's memory space through activation of the appropriate control.

Using terminology from time division multiplexing in communications, one bit from each of N processors constitutes a frame consisting of N time slots. Time slot interchangers in communications map an incoming transmission associated with the i-th time slot to an outgoing message for a receiver associated with the j-th time slot. In a multiplexed computer, communication among processors corresponds to time slot interchange of bits within a major cycle so that they appear at new minor cycles. An N way time multiplexed n bit register contains n frames of N bits each. Applying the same interchange operation to each of the n frames will exchange n bit words among the N processors sharing the register. The permutation of information from slots of an input signal into slots in different relative positions of the output frame is associated with a frame delay if any slot moves toward the start of the frame. Time multiplexed communications can be switched by demultiplexing into separated channels, switching in the space domain, and re-multiplexing the result. An architecture developed by Thompson[16] uses waveguide switches to demultiplex an input stream into individual time slots, uses fiber loops to individually delay them, and uses more switches to multiplex them into the output stream. Leaving out switches needed to vary the delays, $2N - 2$ switches are used in the multiplexer and demultiplexer.

A time slot interchanger in a time multiplexed multiprocessor corresponds to a multiprocessor interconnection network in a spatially parallel multiprocessor system. By pursuing time domain equivalents of multistage spatial interconnection networks, Ramanan[17] developed architectures for time slot interchange using only order $\log_2 N$ fiber delays and exchange switches. The basic building block is a switch connected to a delay loop in a feedback configuration. It can selectively interchange pairs of time slots separated by a fixed time given by the length of the delay loop, which is taken as a multiple of the slot time, $\delta$. Figure 7 shows the situation for a delay of one slot time. Any number of pairs can be interchanged by setting the control for exchange (x) for all time slots except the second of a pair to be exchanged, for which it is set for straight connection (=). The Benes[18] network, with $(N/2)(2\log_2 N - 1)$ exchange switches, is a universal space domain switch which can be described by a recursive construction involving pairwise exchanges. Ramanan's time domain analog of this network can perform any time slot permutation on a frame of $N = 2^k$ slots with only $2\log_2 N - 1$ of the above building blocks.

[Figure 7 diagram: Input Seq., Control Seq. and Output Seq. for an exchange stage]
Figure 7: Exchange of Time Slot Pairs

One block with delay loop of length $N/2$ can selectively exchange any pair of slots separated by $N/2$ units.
The frame suffers an overall delay of $N/2$ slot times. If we now use an $N/2$ exchange switch at both input and output of a universal interchanger for frames of length $N/2$, as shown in Fig. 8, we have a recursive construction for a universal interchanger of length N. The input stage allows time slots to be selectively exchanged between first and last half frames, the center section permutes each half frame arbitrarily, and the output stage again allows selective exchange of pairs between half frames. This is sufficient to apply the Benes looping algorithm[19] to show that if the center can permute frames of $N/2$ slots, the whole network can permute frames of length N. If $N = 2^k$ is a power of two, continuing the recursion until a one block exchanger for adjacent slots is left in the center yields a general time slot interchanger with $2\log_2 N - 1$ switches and delay loops, as shown in Fig. 9. An alternative design in which the delays increase toward the center as powers of two is also possible but more difficult to describe. Thompson's design requires $2N - 2$ switches for the demultiplexer and multiplexer alone. For permuting 1024 time slots, the new design requires 19 switches compared with more than 2046 for the other architecture.
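The behavior of one exchange stage, and the switch counts quoted above, can be checked with a short model. This is a sketch under assumptions: the function names are invented here, the switch is modeled as either passing the input straight through while the loop recirculates, or sending the loop output onward while the input enters the loop, and trailing idle slots are added only to flush the loop.

```python
from collections import deque


def exchange_stage(slots, straight_at, delay=1, idle=None):
    """Pass a frame through one switch with a feedback delay loop.

    slots       -- incoming time slots, in arrival order
    straight_at -- arrival indices at which the switch is set straight (=);
                   at every other slot it is set to exchange (x)
    delay       -- loop length in slot times
    """
    loop = deque([idle] * delay)
    out = []
    # Extra idle slots flush the last data out of the loop.
    for t, incoming in enumerate(list(slots) + [idle] * delay):
        head = loop.popleft()
        if t in straight_at:
            out.append(incoming)   # input goes straight to the output
            loop.append(head)      # loop contents recirculate
        else:
            out.append(head)       # loop contents leave
            loop.append(incoming)  # input enters the loop
    return out[delay:]             # drop the frame delay


def interchanger_switches(n_slots):
    """Switch count 2*log2(N) - 1 for N a power of two."""
    assert n_slots >= 2 and n_slots & (n_slots - 1) == 0
    return 2 * (n_slots.bit_length() - 1) - 1


def demux_mux_switches(n_slots):
    """Switch count 2N - 2 for a demultiplexer plus multiplexer."""
    return 2 * n_slots - 2


print(exchange_stage("abcd", {1}))              # ['b', 'a', 'c', 'd']
print(exchange_stage("abcd", {2, 3}, delay=2))  # ['c', 'd', 'a', 'b']
print(interchanger_switches(1024), demux_mux_switches(1024))  # 19 2046
```

Setting the switch straight only at the second slot of a pair swaps that pair, while every other slot simply picks up the stage's fixed frame delay, matching the control rule described for Fig. 7.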
The above form of the time slot interchanger cannot be implemented with long latency devices, since its shortest feedback loop must be one minor cycle, $t_m$, in order to rearrange bits in a multiplexed frame of one bit per processor. However, with two switches and three delay loops of lengths $\delta$, $K\delta$ and $(K+1)\delta$, where $\delta$ is the slot time, an exchange stage performing the function of Fig. 7 but having a frame delay of $K\delta$ instead of $\delta$ can be built. Using such stages in the recursive construction of Fig. 8 also yields a universal time slot interchanger but with a longer frame delay and twice as many switches. If the constant K is chosen so that $K t_m$ is at least twice the switch latency, a distributed delay design for the stage can be realized with non-negative fiber lengths.
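The condition on K is a simple ceiling computation. The sketch below assumes times are given as integers in a common unit such as picoseconds, to avoid floating point rounding at the boundary; the function name and the sample values are hypothetical.

```python
def min_stage_constant(switch_latency, minor_cycle):
    """Smallest integer K with K * minor_cycle >= 2 * switch_latency.

    Both arguments are positive integers in the same time unit
    (for example picoseconds).
    """
    if switch_latency <= 0 or minor_cycle <= 0:
        raise ValueError("times must be positive")
    # Integer ceiling division of 2 * latency by the minor cycle.
    return -(-2 * switch_latency // minor_cycle)


# Hypothetical values: 10 ns switch latency, 250 ps minor cycle.
print(min_stage_constant(10_000, 250))  # prints 80
```

A larger K is always permitted; it only lengthens the frame delay of each stage.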
The time domain network can interconnect N independent time multiplexed processors just as the Benes network can interconnect N processors separated in space. This new architecture shows how optics can give insight into time-space tradeoffs which may even have advantages for electronic implementation. Since time slot interchange forms a large fraction of all telecommunications switching, the practical value of a method using order $\log N$ equipment instead of order N may be large.
5. CONCLUSIONS

The paper has considered computer architectures in a realm where device latency is long relative to reciprocal bandwidth. These "latency limited" architectures have the characteristic that their maximum clock rate is limited by the size of their smallest feedback loop rather than the switching speed of their logic devices. We propose a solution to this problem that employs multiple processors executing on the same hardware, and we show how to compute the upper bound on the number of processors that a given architecture can support. We also discuss several means of permitting communications between processors. While the device technology that leads to these architectures is in its infancy, we can expect such devices to become increasingly prevalent in the future.
[Figure 8 diagram: N/2 exchange stages at input and output around an N/2 Permute block (used twice)]
Figure 8: Recursive Construction of a General Time Slot Interchanger.
Figure 9: A Time Slot Interchanger with $2\log_2 N - 1$ Switches.
6. ACKNOWLEDGEMENT
The Center for Optoelectronic Computing Systems

is sponsored in part by NSF grant number CDR 8622236
as part of the Engineering Research Centers Program, and
the Colorado Advanced Technology Institute (CATI), an
agency of the State of Colorado.
7. REFERENCES
[1] V.P. Heuring, H.F. Jordan, J.P. Pratt, "Bit Serial
Optical Computer Design," Proc. SPIE, V. 963, pp.
346-353 (1988). Also available as Technical
Report 88-01, Optoelectronic Computing Systems
Center, University of Colorado, Boulder CO
80309-0525 (Jan. 1988)†.
[2] J.P. Pratt and V.P. Heuring, "Designing Continuous
Dataflow Optical Computing Systems, I. Synchronization,"
Optical Society of America 1990 Annual
Meeting Technical Digest, V. 15, p. 123 (Nov. 4-9,
1990).
[3] C.K. Kao, Optical Fiber Systems: Technology,
Design and Applications, McGraw-Hill (1986).
[4] S.K. Korotky and R.C. Alferness, "Waveguide
Electro-optic Devices for Optical Fiber Communication,"
in Optical Fiber Telecommunications, I.P.
Kaminow and S.E. Miller, Eds., Academic Press
(1988).
[5] R.S. Tucker, S.K. Korotky, G. Eisenstein, U.
Koren, G. Raybon, J.J. Veselka, L.L. Buhl, B.L.
Kasper and R.C. Alferness, "4 Gb/s optical time
division multiplexed system experiments using
Ti:LiNbO3 switch/modulators," Tech. Digest Top.
Meet. on Photonic Switching, Paper FD3, Incline
Village, NV (1987).
[6] M. Haner and W.S. Warren, "Generation of arbitrarily
shaped picosecond optical pulses using an
integrated electrooptic waveguide modulator,"
Applied Optics, V. 26, No. 17, pp. 3687-3694
(1987).
[7] M.N. Islam, C.E. Soccolich and D.A.B. Miller,
"Low energy ultrafast fiber soliton logic gates,"
Optics Letters, V. 15, p. 909 (1990).
[8] O. Wada, "Optoelectronic integration based on
GaAs material, a tutorial review," Optical and
Quantum Electronics, V. 20, p. 441 (1988).
[9] J.E. Thornton, Design of a Computer: The Control
Data 6600, Scott, Foresman and Co., Glenview IL
(1970).
[10] J.S. Kowalik, Ed., Parallel MIMD Computation:
The HEP Supercomputer and its Applications, MIT
Press (1985).
[11] A.F. Benner, J. Bowman, T. Erkkila, R.J. Feuerstein,
V.P. Heuring, H.F. Jordan, J. Sauer and T.
Soukup, "Digital Optical Counter using Directional
Coupler Switches," submitted to Applied Optics.
Also available as Technical Report 90-31,
Optoelectronic Computing Systems Center, University
of Colorado, Boulder CO 80309-0525 (1990).
[12] R.J. Feuerstein, T. Soukup, V.P. Heuring, "100
MHz Optical Counter using Directional Coupler
Switches," submitted to Optics Letters.
[13] Harry F. Jordan, "Fiber Optic Computer Architectures,"
Proc. Digital Optical Computing (Critical
Reviews), SPIE Critical Reviews Vol. CR 35
(1990).
[14] S.K. Korotky and J.J. Veselka, "Efficient switching
in a 72-Gbit/s Ti:LiNbO3 binary
multiplexer/demultiplexer," in Digest of Conf. on
Optical Fiber Communication, 1990 Tech. Digest
Series, V. 1, Optical Society of America, Washington,
DC, p. 32 (1990).
[15] D.B. Sarrazin, H.F. Jordan and V.P. Heuring,
"Fiber optic delay line memory," Applied Optics, V.
29, No. 5, pp. 627-637 (10 Feb. 1990).
[16] R.A. Thompson, "Architectures with Improved
Signal-to-Noise Ratio in Photonic Systems with
Fiber-Loop Delay Lines," IEEE J. on Selected
Areas in Comm., V. 6, pp. 1096-1106 (1988).
[17] S.V. Ramanan, H.F. Jordan, and J.R. Sauer, "A
New Time Domain Multistage Permutation Algorithm,"
IEEE Trans. on Inform. Theory, Vol. IT-36,
No. 1, pp. 171-173 (Jan. 1990).
[18] V.E. Benes, Mathematical Theory of Connecting
Networks and Telephone Traffic, Academic Press
(1965).
[19] D.C. Opferman and N.T. Tsao-Wu, "On a Class of
Rearrangeable Switching Networks - Part I: Control
Algorithm; Part II: Enumeration Studies of Fault
Diagnosis," Bell System Technical Journal, pp.
1579-1618 (1971).

†Center Technical Reports are available without charge by
writing to the address above.

1991 ACM 0-89791-459-7/91/0370 $01.50
