

8, AUGUST 2011


A Low-Cost Stand-Alone Multichannel Data

Acquisition, Monitoring, and Archival System
With On-Chip Signal Preprocessing
Mohammed Abdallah, Member, IEEE, Omar Elkeelany, Senior Member, IEEE, and
Ali T. Alouani, Senior Member, IEEE

Abstract: The objective of this paper is to design a new generation of affordable sophisticated data acquisition and processing
(DAQP) systems. Because of the proposed system hardware reconfigurability, it can be used to meet the need of many real-world
applications while keeping the cost of a device low. The hardware
implementation of the different processing functions of the device
allows for high-speed processing without the need of expensive
general-purpose processors, as is the case of computer-based or
microcontroller-based data acquisitions (DAQs). The target technology of implementing the proposed design is the system on chip
via field-programmable gate array (SoC-FPGA). A four-channel
DAQP was designed, developed, and tested in the Embedded
Systems Design Laboratory, Tennessee Technological University,
Cookeville. Various modules of the conceptual design are implemented and verified. Performance evaluation and cost comparisons are provided. The comparison showed that the results of the
proposed instrument are comparable to existing technologies at a
fraction of the cost.
Index Terms: Analog-to-digital (ADC) conversion, data acquisition (DAQ), data processing, field-programmable gate array
(FPGA), multiplexing.


Manuscript received June 2, 2009; revised October 14, 2009. Date of
publication June 2, 2011; date of current version July 13, 2011. This work was
supported by the Center of Manufacturing Research, Tennessee Technological
University, Cookeville. The Associate Editor coordinating the review process
for this paper was Dr. Dario Petri.
The authors are with the Department of Electrical and Computer Engineering, Tennessee Technological University, Cookeville, TN 38505 USA.
Color versions of one or more of the figures in this paper are available online.
Digital Object Identifier 10.1109/TIM.2009.2036402

IN MANY real-world applications, multichannel data acquisition (DAQ) is needed for the purpose of surveillance,
monitoring, and/or control. These applications include wideband communications, command communication and control,
space exploration, medical diagnosis, etc. Many sophisticated
DAQ systems exist in the market. However, they are either
expensive or cumbersome or both [1]-[4]. For example, the
cost of an analog-to-digital converter (ADC) board can be as high as
$3000 [5]. Furthermore, in another example, a computer-based
biomedical DAQ system consumes 600 W of power and thus
requires an isolated power supply unit. That system costs
around $5500, not including a laptop [6]. To make, for example,
medical diagnosis affordable, one would want to be able to buy
a similar sophisticated device and use it at home. In such a case,
affordability plays a major role in the decision of a patient with
chronic disease that requires frequent monitoring of some of
his/her body signals. The processing capability of the proposed
system allows for the patient to obtain preliminary diagnosis
without leaving home/work. The same can be done for complex
diagnosis where the collected data can be securely sent to
appropriate laboratories using the built-in network capability of
the DAQ and processing (DAQP) system. Simple DAQP devices, such as blood
sugar and vital-sign monitors, exist in the market. Although such devices
are affordable, they are very limited in terms of the number
of channels and diagnosis functions. The ultimate goal of this
research is to design a DAQP system that is capable of simultaneously
acquiring multiple heterogeneous signals and processing them
in real time. In the medical field, for example, this capability
of a portable device can be crucial to saving life in emergency
and/or war situations.
Any DAQ system needs some sort of display capability. The
display may enable the viewer to overlap different data on the
same display. Hence, it becomes easy to compare acquired data
with reference data and see the difference between two signals
when they have been plotted as graphs [7]. Moreover, in vehicle
maintenance, for example (e.g., in wheel alignment), a DAQ
system with a display module is used to collect the data and
provide instructions on the display on how to adjust the wheels
[8], [9]. In the medical field, DAQ systems with a display
are known to display temperature, respiration rate, and pulse
rate, to mention a few. In addition to display capabilities, any
DAQ system should provide a data-archiving mechanism. For
example, a graphical documentation system allows clinicians to
record information about specific zones of an anatomical image
in a fast, intuitive, and easy-to-understand manner [10], [11].
Existing DAQ systems can be classified into three main
categories. The first one is the computer-based class, which
utilizes computer-processing power to perform the desired data
manipulation, visualization, storage, and/or decision making.
In this category, DAQ can be further subdivided into internal
and external modules. Internal DAQs take the form of extension
cards that connect to PC expansion slots [Peripheral Component Interconnect (PCI) and Industry Standard Architecture]. It
is worth mentioning that DAQs in this category become obsolete
along with the PC in use. External DAQs take the form of
a module that connects to computer ports (parallel,
serial, universal serial bus, etc.) [12]. External DAQ modules
often require special cables to connect to the PC [13]. This
computer-based DAQ category has the disadvantages
of high cost, cumbersomeness, high power consumption, and
fixed hardware architecture.

0018-9456/$26.00 © 2011 IEEE

The second DAQ category is the embedded microcontroller-based systems [14]. These systems have advantages such as
portability and high performance, but they have fixed architecture [15]. In fixed-architecture DAQs, robustness and fault
tolerance are achieved by redundancy. This, in turn, increases
the size, the power consumption, and the cost of the system.
Hence, another DAQ technology is needed, which is adaptive
to the changing requirements, robust against faults, and small
in size. Dynamically reconfigurable DAQ systems will have a
crucial impact in some applications such as hazardous environments or isolated areas for remote architecture reconfiguration
[16], [17].
However, there exists a third category of DAQs, which
involves a hardware reconfigurable field-programmable gate
array (FPGA). National Instruments (NI), which is a leading company in the DAQ industry, has recently provided PCI and PCI
extensions for instrumentation DAQ cards with FPGAs, such
as the NI LabVIEW FPGA module [18]. However, since its
DAQs are computer based, the FPGA is only used for limited
purposes such as timing and triggering or reconfigurable control
algorithms. DAQs involving FPGAs can be redesigned while
mounted on the target system (hence the name field programmable) to reach fine-tuned performance or to reroute a faulty
circuit to a new place. The recently introduced high-capacity
FPGA allows for the integration of multiple components on a
single chip. In addition, it can have all the processing, storage,
and input-output capabilities that are needed by a DAQ system.
Nallatech [19]. It also has only dual channels. Therefore, it
is not scalable. It also lacks the adaptive optimized sampling
technique. Some other DAQs such as those from Lyrtech, Bittware,
Hunt Engineering, and Southwest Research Institute [20]-[23]
only use the FPGA for limited purposes, where the FPGA
works as a coprocessor for fixed-architecture-based processing
units. This prevents the design from achieving the low-cost and
compact-size advantages that it would have if it were fully integrated
in a high-capacity FPGA. Other research teams [24]-[29] have
utilized the FPGA (fully or partially), but they can only acquire
a single input channel. Our proposed research provides a design
philosophy that takes full advantage of the capabilities of the
FPGA, as well as uses a single multiplexed ADC for multichannel DAQP. This will lead to small size, cost, and power
consumption for the DAQP, as well as design hardware scalability (i.e., to add more channels as desired without changing the
system board). The optimal sampling capability of the device
allows for the sampling of a large number of heterogeneous
signals without increasing the size of the ADC.
In Table I, the literature review on DAQ systems is summarized illustrating the contribution of some research teams and/or
affiliations. One can notice that each listed instrument has its
own advantages. However, the proposed system is unique as
related to existing technologies. Instead of using multiple ADCs
for simultaneous multichannel DAQ, the proposed design uses
a single high-speed ADC along with a multiplexer (MUX)
to perform quasi-simultaneous DAQ. In the medical field, for
example, where various biomedical signals are in the low-frequency range from 25 Hz to 5 kHz [30], the proposed DAQ
can be appropriate without the need for additional hardware
or cost. For applications that require very fast simultaneous
multichannel DAQ, such as in the military field, a dedicated
ADC per channel will be more appropriate. First, a single superhigh-speed ADC can be efficiently used with an optimal sampling schedule to acquire multiple channels. Hence, this can
reduce the circuit size, the cost, and the power consumption and improve the
system scalability. Second, full system reconfigurability based
on the FPGA is the best solution in terms of fault tolerance
and portability, and the system can be reused with different
configurations. Third, hardware real-time adaptive sampling is
only available in the proposed system. It also improves design
security, because a design implemented in hardware is harder to
reverse engineer. In the case of
input signals with different bandwidths, hardware real-time
adaptive sampling is the best way to optimize the ADC sampling rate. It also reduces the overall sampling rate
that is required, which lowers the cost of the
ADC needed for a large number of channels, particularly for
high-frequency inputs.

Fig. 1. Mapping of functions in an existing computer-based DAQP system to the proposed system using the FPGA system-on-chip technology.

From Table I, one can see that the proposed instrument is
unique, as compared with that of any of the research teams
mentioned in the table.

A. Why Reconfigurable FPGA?

It is well known that application software can be automatically converted or custom converted to equivalent hardware
in the FPGA technology using hardware description languages
(HDLs). For example, a software segment that adds elements
of an array can be converted to a hardware register file, with an
adder and an accumulator using HDLs. These HDL modules are
typically called cores, which can be acquired or custom designed.
The reconfigurable feature of the FPGA can help the user
to change the design after acquiring the system. For example,
one can choose to apply any of the signal preprocessing cores
such as the fast Fourier transform (FFT) core, the discrete
cosine transform (DCT) core, or even a custom core as desired.
For faster performance, these on-chip cores can be designed
and integrated into the system's FPGA chip without the need
to replace any hardware or make major changes in the software.
This way, the FPGA-based system will have faster processing
(compared with comparable software-based fixed architecture)
at lower clock frequencies, lower power consumption, and
limited cost of manufacturing.

Conceptually, to eliminate the need for cumbersome hardware, interface, and a PC, one has to map all the functions
of a classical DAQ system inside a single FPGA chip. This
is summarized in Fig. 1. As can be seen from the figure, all
the needed functions of the computer-based DAQP system are
mapped into the proposed portable multipurpose system. Note
that the FPGA will be designed to manage all the aspects of
subsystem communication, processing, and storage, as needed.



Fig. 2. Conceptual design of the UHIM.

In particular, to make the proposed DAQP a low-cost stand-alone reconfigurable instrument, one has to build in the following capabilities.
Capability 1: To accept various input signals with different amplitudes and frequencies. It is desired to accept analog input
voltage signals of an amplitude with a range from millivolts
to volts. Furthermore, it is desired to accept input signals
in the frequency range from hertz to megahertz. This will
allow the system to accommodate a variety of sensors at the
same time (e.g., low-frequency electric pulses, acoustic,
ultrasounds, etc.).
Capability 2: To perform automatic signal conditioning such
as bias addition and removal, adaptive signal scaling, and
filtering. Only the information about the signal type will be
needed. A library that contains the parameters of filters and
amplifiers will be built.
Capability 3: To store the acquired signals without the need
for an external computer, a detachable Flash memory write
module will be built. A network control module will be
needed to securely transmit the acquired signals to authenticated destinations via the Internet.
Capability 4: To have the built-in capability to perform digital
signal processing such as FFT and DCT for a 1-D digital signal.
Capability 5: To perform adaptive scheduling for the ADC
multiplexed interface for a variable number of channels.
If all channels have the same characteristics, then it will
be equivalent to a round-robin sampling technique (i.e.,
uniform sampling, one sample per channel per cycle).

Although some of the above-listed capabilities (1-5) may
be achieved by existing DAQ technology for a limited and
fixed number of channels, none of the existing DAQs are
capable of performing automatic adaptive optimal sampling. In
addition, existing DAQs have one or more of the following disadvantages: cumbersomeness, high cost, and limited hardware
scalability. This is because reconfigurable hardware design,
as discussed next in the proposed approach, was not fully
The aforementioned capabilities require the design of the
following modules, which are depicted in Figs. 2 and 3.
A. On-Chip UHIM
Existing multichannel ADCs have a limited and
fixed number of channels due to technical limitations [31], [32].
Therefore, one has to choose the design of the architecture
and the maximum number of channels. In addition, existing
multichannel converters use more than one internal ADC modulator. This leads to hardware redundancy (high cost) and
causes synchronization problems (complex design). To avoid
the hardware redundancy and synchronization problems, one
can use a single high-speed ADC in conjunction with
a high-speed analog MUX. However, as the MUX switches
between input channels, transient effects happen at the ADC
output [33], [34]. This can be alleviated using a superfast ADC
[35], [36] and bandpass or high-pass filters.
Fig. 3. Conceptual design of the RAMM and the NCM.

In a single multiplexed ADC, it is known that the sampling
frequency $F_x$ must satisfy the Nyquist sampling criterion for
all channels, i.e.,

$$F_x \geq N \times F_s$$
$$F_s \geq 2 \times B_w$$

where $N$ is the total number of channels, $F_s$ is the per-channel
sampling frequency, and $B_w$ is the highest
frequency component of the input channels (widest bandwidth)
[37]. This means that the single ADC must be faster than the
Nyquist rate multiplied by the number of channels. If input
signals are not in the same frequency range, then an optimized
ADC scheduler will be required to perform adaptive sampling,
instead of using the highest frequency to set the sampling rate.
First, it is a challenge to achieve the optimal control module of the
ADC without affecting the quality of the high-frequency input
channels or oversampling low-frequency spectrum channels.
Second, the determination of optimal scheduling requires identification of input features, which are generally not known at
design time. Hence, optimal scheduling cannot be static. Real-time dynamic (optimal) scheduling is proposed in the conceptual design. The timing of such a scheduler must be accurate
to avoid channel skipping or data corruption [38]. It also has
to adapt to changing input features from one application to another.
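As a rough illustration of the sampling-rate bound discussed above, the following Python sketch compares the aggregate ADC rate needed when every channel is sampled at the rate dictated by the widest-bandwidth channel with an idealized lower bound in which each channel is sampled at its own Nyquist rate. The function names, the example bandwidths, and the per-channel lower bound (the sum of 2 × B_w over all channels) are illustrative assumptions and are not taken from the proposed design.

```python
def uniform_adc_rate(bandwidths_hz):
    """Aggregate rate for round-robin sampling: F_x = N * F_s with F_s = 2 * max(B_w)."""
    n_channels = len(bandwidths_hz)
    per_channel_rate = 2 * max(bandwidths_hz)
    return n_channels * per_channel_rate


def adaptive_adc_rate(bandwidths_hz):
    """Idealized lower bound (assumption): each channel sampled at its own Nyquist rate."""
    return sum(2 * bw for bw in bandwidths_hz)


# Hypothetical mix of two slow and two fast channels (bandwidths in Hz).
example_bandwidths = [25, 100, 5000, 5000]
uniform = uniform_adc_rate(example_bandwidths)
adaptive = adaptive_adc_rate(example_bandwidths)
print(uniform, adaptive)
```

When all channels share the same bandwidth, the two figures coincide, which matches the observation that adaptive scheduling then reduces to uniform round-robin sampling.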
The universal host interface module (UHIM) is the module
that will manage and perform ADC conversion of various
analog signals. By designing an ADC scheduler inside the
FPGA, the need for an external (i.e., a fixed structure as in
[31] and [32]) DAQ device will be eliminated. The conceptual
design of the UHIM is shown in Fig. 2. The FPGA will have
mixed analog and digital inputs. For particular applications that
require the acquisition of up to 30 1-D signals at once, an
FPGA chip with 30 analog inputs (such as [39]) will be needed.
For more than 30 analog signals, external analog MUXs can
be used along with the FPGA board. For example, the Actel
FPGA Fusion AFS1500 [40] with an embedded 30-analog-input MUX can be expanded with 16 MUXs of 4:1 size to
quasi-simultaneously collect up to 120 1-D signals, because the
time required is very short, on the order of 20 µs, which is negligible in
many applications such as medical diagnosis.
A multipurpose electronic interface will be needed to acquire
different types of input signals. This will be achieved via the
bridging block shown in Fig. 2, which consists of electric
circuits that match the output impedance of the sensor to the
input impedance of the analog filter bank. This is necessary
to maximize the power transfer from the sensor to the DAQ.
In addition, each channel will be modulated at the appropriate
sampling frequency to maximize the total number of simultaneous channel acquisitions. An optimal ADC scheduler will be
needed in the FPGA to manage the variable switching time of
the ADC MUX such that an arbitrarily large number of channels
can be sampled without loss of signal quality.
B. On-Chip RAMM
Data storage plays an increasingly essential role in data
monitoring, control, and safety protection. Analog data storage
is subject to deformation with time and poor privacy protection.
On the other hand, digital instrumentation technologies are
known for high processing capabilities, which allow them to
perform intelligent onboard computing that supports functionality such as universal data storage. In addition, they also provide improved accuracy, flexibility, and easier data protection.
This is because the conversion from various analog signal types
to a digital format simplifies universal data archiving and unifies
data protection and communication schemes. This is applicable
to 1-D time-varying signals (heart electrocardiogram, motor
vibration, etc.), as well as to 2-D imaging signals (magnetic
resonance images, etc.). One of the evolving standards for digital data storage in portable digital devices is the Secure Digital
nonvolatile memory devices, which are commonly known as
SD cards or Flash disks.
Existing embedded data archiving systems need an embedded microcomputer to operate and a real-time operating system
(RTOS), which is the brain that manages every task according
to a timely schedule. To do away with the need for these two
components, one has to map the functions of the RTOS inside
the FPGA. Given the complexity of this task, a prerequisite to
the success of this mission is to design new optimized modules
inside the FPGA to map the functions of the RTOS.
To provide visual monitoring of the acquired data, a
real-time archiving and monitoring module (RAMM) will
have to interface to a high-resolution graphical liquid crystal display (LCD) to continuously plot time-varying signals.
High-resolution, low-power, compact, and colored LCDs are
available to integrate in this device [41], [42]. The FPGA must
directly interface with the LCD to minimize circuit components
and reduce power consumption.
C. On-Chip NCM
If the proposed DAQP instrument is used in remote applications, where privacy and remote connectivity are required, then
an on-chip hardware reconfigurable network controller module
(NCM) will be needed and incorporated in the FPGA. Such a
controller will securely transmit stored data to the Internet with
a minimal interface and thus eliminate the need for an external
network interface card. Since the FPGA has reconfigurable
resources, no redundancy (high cost) is needed. The chip will
only contain required components. This, in turn, keeps the
instrument cost at a minimum.
The NCM must securely communicate with the Internet using two subsystems. First, the FPGA will form network packets
that will contain data identification numbers and collected data.
It is crucial that these data are securely sent to authenticated
destinations, such as authorized laboratories. More critically,
the authenticated laboratory must be immune to false data
origins and process samples that are provided only by legitimate
patients. If the portable handheld device of a legitimate user
is used by unauthorized people without the user's consent, the
laboratory can identify and reject the data if it uses
an authentication server. This can be achieved by designing an
optimal and flexible authentication algorithm in the FPGA that
is based on arbitrarily complex digital keys for encryption
and decryption purposes.
Different data records will be supported in the NCM. As
the data are stored in the Flash memory, a database must be
built based on a unique identifier for different types of sensors
(e.g., audio sensors, electric pulse sensors, and thermometers).
As the NCM retrieves stored data records, it will also retrieve
metadata such as time stamps and identification numbers that
will be embedded in each packet of transmission, in case further
analysis is needed.
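To make the packet idea concrete, the following Python sketch builds a packet that carries a sensor identifier, a time stamp, and the sampled data, and lets the receiver reject packets that do not come from a holder of the shared key. The field layout, the function names, and the use of HMAC-SHA256 as the authentication mechanism are assumptions made for illustration only; the text does not specify the packet format or the authentication algorithm used in the NCM.

```python
import hashlib
import hmac
import struct

# Hypothetical header: sensor id (16 bit), time stamp (32 bit), payload length (16 bit).
HEADER = struct.Struct(">HIH")


def build_packet(key: bytes, sensor_id: int, timestamp: int, samples: bytes) -> bytes:
    """Prefix the samples with metadata and append an authentication tag."""
    header = HEADER.pack(sensor_id, timestamp, len(samples))
    tag = hmac.new(key, header + samples, hashlib.sha256).digest()
    return header + samples + tag


def parse_packet(key: bytes, packet: bytes):
    """Return (sensor_id, timestamp, samples), or None if authentication fails."""
    header = packet[:HEADER.size]
    sensor_id, timestamp, length = HEADER.unpack(header)
    samples = packet[HEADER.size:HEADER.size + length]
    tag = packet[HEADER.size + length:]
    expected = hmac.new(key, header + samples, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # unknown origin or corrupted data: reject
    return sensor_id, timestamp, samples


device_key = b"hypothetical-shared-key"
packet = build_packet(device_key, sensor_id=3, timestamp=1_000_000, samples=b"\x01\x02\x03\x04")
print(parse_packet(device_key, packet))
```

A receiver that holds a different key, or that receives a modified packet, gets None back and can discard the record.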

Fig. 4. Host interface module for the time-multiplexed ADC.

In the computer-based systems, database management engines can be used. However, in the proposed FPGA device,
these systems are not supported since they require a database
management layer to add, update, delete, and search records.
Hence, it is a technical challenge to design a database classification component that integrates the required functions of the
database management layer on the target FPGA chip.
Fig. 3 shows a conceptual design of both the RAMM and
the NCM. The RAMM is responsible for recording acquired
signals on a secure storage medium. A user ID will be entered at
the time of recording to protect the privacy of the input signals.
We developed a subsystem that writes temporarily stored digital
data directly to a Flash memory card. This subsystem has a
minimal interface and thus eliminates the need for an external
Flash controller.
To implement the proposed design with minimal cost and
circuit size, we made the strategic decision not to use any off-the-shelf
RTOS. The challenge with no RTOS is the
necessity to design each driver on the chip. In the following,
each system component is introduced in more detail.
A. ADC Interface
Let $N$ channels be simultaneously acquired by the proposed
instrument design. Each channel is assigned a different number
of time slots of the MUX time schedule. Let $S_i$ be the number
of time slots that are assigned to channel $i$. Let $n$ be the
number of samples per time slot, $t_{sa}$ be the sample acquisition
time, and $T$ be the elapsed time. As shown in Fig. 4, the
acquired samples from the first channel are stored in a buffer.
If the buffer is not full (i.e., $T < n \times \sum_{i=1}^{N} S_i \times t_{sa}$), the buffer
index is incremented, and the channel-assigned time slot is
checked. If this channel has more time slots, the ADC will acquire
more data from this channel. If not, the selection lines (sel) are
changed. This change allows data to be collected from another
channel and stored in the same buffer to optimize
the utilization of the ADC. This utilization is achieved by
reducing the time consumed between two consecutive samples.
If both the sampling phase and the processing phase were done
concurrently, the time spent between two consecutive samples
would increase. Thus, all samples are first collected into a single
buffer without any delay, and thereafter, the desired processing
is done. After data are collected from all channels and the buffer
is full, all data are ready to be processed and thereafter stored
into a storage device.

Fig. 5. ADC interface design.

In the special case when $S_i$ is constant for all channels, the
situation reduces to static scheduling. With one time slot per channel, the buffer is not yet full
in this case while $T < n \times N \times t_{sa}$. A technique similar to
earliest-deadline-first scheduling is used to dynamically adjust
the select line of the analog MUX. Fig. 5 shows a basic flow
diagram of the ADC data interface.
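The slot-based acquisition described above can be sketched in Python as follows. This is a simplified software model under stated assumptions: the slot assignment is fixed for one pass rather than adjusted dynamically, time is counted in integer microseconds, and the names `acquire_frame` and `read_sample` are hypothetical. It only shows how the select line advances once a channel has used its slots and how the elapsed time relates to the buffer size.

```python
def acquire_frame(slots, samples_per_slot, t_sa_us, read_sample):
    """Fill one buffer by visiting each channel for its assigned number of time slots.

    slots[i] is S_i, samples_per_slot is n, and t_sa_us is the sample
    acquisition time in microseconds. Returns (buffer, elapsed_us).
    """
    buffer = []
    elapsed_us = 0
    for sel, channel_slots in enumerate(slots):   # sel models the MUX select lines
        for _ in range(channel_slots):            # channel keeps the ADC while it has slots left
            for _ in range(samples_per_slot):
                buffer.append((sel, read_sample(sel)))
                elapsed_us += t_sa_us
    return buffer, elapsed_us


# Hypothetical example: three channels with 1, 3, and 2 slots, two samples per slot.
frame, elapsed = acquire_frame([1, 3, 2], samples_per_slot=2, t_sa_us=20,
                               read_sample=lambda channel: channel * 10)
print(len(frame), elapsed)
```

With one slot per channel, the same function performs plain round-robin sampling, and the buffer holds n × N samples.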
B. On-Chip RAMM
The on-chip real-time archiving is designed to copy the
collected data that are temporarily stored in the FPGA internal
buffers to nonvolatile memory. This module is designed and
embedded in the FPGA to directly write data into the Flash
memory card with a minimal interface.
In traditional systems, Flash memories are written by a host
PC through a permanent interface (i.e., soldered chips on circuit
boards). Such use of Flash memory devices is common in
embedded systems to store configuration information [43]. One
design choice made in the proposed system is to use detachable
Flash memory devices. Sampled data must be written into
the Flash memory device to allow further remote analysis
and interpretation. This introduces technical design challenges.
In particular, Flash memory must be written by customized
hardware and not a PC. Hence, the proposed Flash memory
controller must address the following design requirements.
Synchronization: A detachable Flash memory should have a
small and robust physical interface. This limits the maximum
number of data and control pins in the interface. Consequently,
this affects the writing speed, as serial data communication
must be employed. We use a high clock frequency in serial
mode to overcome this challenge. Timing and clock signals
between the Flash memory controller and the DAQ units must
be properly matched. We use a phase-locked loop and clock
dividers to achieve this matching.
Integrity: Since a Flash memory is detachable, it needs to be
detected before the beginning of writing operations. Hence, the
Flash memory must be initialized before the storing process.
The storing process cannot be done unless the Flash memory is
attached. Continuous monitoring of the Flash memory during
both the reading process and the writing process is done to
ensure that data are passed to the Flash memory. A continuous
error check code is used to check the valid arrival and storage of
the data into the Flash device to avoid error due to disconnection
while writing.
The writing process is limited by the access time of the
Flash memory device. For data integrity, error checking is
continuously performed to write properly ordered blocks (i.e.,
no incomplete or out-of-order blocks). There is no need to store
data that are incomplete or out of order. Our solution
is to use an internal buffer. We assume no data compression and
a fixed data arrival rate.
Writing Rate: For real-time operation, the processing rate
must be faster than the arrival rate. It is advantageous to use
the FPGA internal memory cells for buffering because their
access time is much smaller than that of external memories.
The Flash memory controller will follow an error detection
algorithm in the data writing process to ensure the integrity
of the stored data. Overall, the FPGA will record acquired
data after filling its internal storage buffers. To optimize the
acquire-and-write processes, the Flash memory controller can
use single- or multiple-block-writing mechanisms [24]. Single-block writing ensures data integrity by using an
integrity check value (cyclic redundancy check) at the end of
each block. However, if multiple blocks are to be sequentially
written into the Flash memory, the total write time can be significantly reduced by using the
multiple-block-writing mechanism. This improves the overall performance of
the acquire-and-write cycles.
Flash memory write module: As stated before, there are
two approaches to write data into Flash memory devices. The
first one is to use a single-block write approach. A single-block
data write into the Flash memory device involves the write of
512 B of data. The integrity of the data written into the Flash
memory device from the NIOS-II processor is ensured
using a cyclic redundancy code (CRC) checksum. A 16-bit
CRC is appended to each data block. The Flash memory
device controller continuously checks the CRC field, and if the
CRC is correct, then the data are written. Otherwise,
the block is rejected and retransmitted by the FPGA to the
Flash memory device. The number of blocks can vary but is
limited by the size of the file in which these blocks will be
stored. In the single-block write approach, a write command
is needed for each block. Therefore, if we need to capture and
store x blocks into the detachable Flash card, this command
must be called x times. Fig. 6 shows the basic steps of reading,
storing, and processing any number of blocks by the single-block write approach. Details about the Flash card,
how to write into it, and the contents
of each command and the CRC are available in [44].
The second approach is to use multiple-block write. A write
command is called once to write all blocks into the Flash
memory device. When this command is called, it is followed by a sequence of the desired blocks. This process is terminated either by a stop command after all blocks are
written or by reaching a number of blocks that was predetermined
in advance. In the proposed approach, the number of blocks
is predetermined before running the program. To
compare the single-block write approach and the multiple-block write approach,
the experiment has been repeated with different numbers of
blocks. For each number of blocks, the time required to write
the input data into the Flash memory device has been measured.
Fig. 7 shows the flowchart of an acquire-write cycle for
a variable number of blocks by the multiple-block write approach. As shown in this figure, after reading the input signals
and storing them in the internal buffer, the writing function
is called. In the beginning of this function, the multiple-block
write command is called once. Then, 512 B for the first block
is sent to the Flash memory device with a proper CRC. The
block is successfully written into the Flash memory device if
the Flash memory controller receives a valid CRC response.
Then, this process is repeated for the next blocks until all the
desired blocks are written.

Fig. 6. Single-block write approach flowchart.
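The difference between the two write approaches can be illustrated with a small Python model. Each 512-B block is framed with a 16-bit CRC, the receiving side accepts a block only if the CRC matches, and the function counts how many write commands are issued. The CRC variant (polynomial 0x1021, initial value 0) and all function names are assumptions chosen for illustration; the exact command and CRC formats are those documented in [44] and are not reproduced here.

```python
BLOCK_SIZE = 512  # bytes per block, as stated in the text


def crc16(data: bytes, crc: int = 0) -> int:
    """Bitwise 16-bit CRC with polynomial 0x1021 and initial value 0 (assumed variant)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def frame_block(payload: bytes) -> bytes:
    """Append the 16-bit CRC to one data block."""
    if len(payload) != BLOCK_SIZE:
        raise ValueError("a block must hold exactly 512 bytes")
    return payload + crc16(payload).to_bytes(2, "big")


def card_accepts(frame: bytes) -> bool:
    """Model of the card-side check: recompute the CRC and compare."""
    payload, received = frame[:-2], int.from_bytes(frame[-2:], "big")
    return crc16(payload) == received


def write_blocks(blocks, multiple: bool) -> int:
    """Write all blocks and return the number of write commands issued."""
    commands = 1 if multiple else 0      # multiple-block write: one command up front
    for payload in blocks:
        if not multiple:
            commands += 1                # single-block write: one command per block
        if not card_accepts(frame_block(payload)):
            raise IOError("block rejected; it would have to be retransmitted")
    return commands


blocks = [bytes([i]) * BLOCK_SIZE for i in range(4)]
print(write_blocks(blocks, multiple=False), write_blocks(blocks, multiple=True))
```

The model ignores timing, so it shows only why the command overhead grows with the number of blocks in the single-block approach and stays constant in the multiple-block approach.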
Display module: As stated before, part of the RAMM is
the display and monitoring capability. The purpose is to enable
the user to monitor different acquired data on the same display.
Since, as stated before, no RTOS is to be used in the design,
all the FPGA-to-display communication logic needs to be designed
and embedded in the FPGA. An initial problem was mapping
the display physical coordinates to data-plotting coordinates to
be able to plot a data point into the 2-D plane of the liquid
crystal module (LCM). Such a problem was solved by hardware
design and not by using software drivers. In addition, dealing
with colors requires extra hardware complexity as well. Giving
a color to a point takes three steps because this LCM
works in the red-green-blue mode. Therefore, it needs three
clock pulses from the FPGA logic. Two separate counters must
be used to convert the sequential count of the scan pulses into
2-D x- and y-coordinates. Accordingly, there is a need for
precise synchronization. At the same time, the LCM needs
a specific number of idle clock pulses (considered as dark
margins of the display area) in the x- and y-directions. After
overcoming these problems, drawing a point with a specific
color in a specific location is possible.
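The coordinate mapping described above can be modeled in a few lines of Python. This is a behavioral sketch only, not the hardware design: the function name `scan_position` and its parameters are hypothetical, the idle pulses are assumed to precede the visible area in each direction, and the color channels are assumed to arrive in red, green, blue order.

```python
def scan_position(clock, width, height, x_idle, y_idle):
    """Map a clock-pulse count to (x, y, channel), or None inside the dark margins."""
    # Three clock pulses per point: one each for red, green, and blue.
    pixel, phase = divmod(clock, 3)
    line_length = x_idle + width        # idle pulses plus visible points per line
    lines_per_frame = y_idle + height   # idle lines plus visible lines per frame
    pixel %= line_length * lines_per_frame  # the scan repeats every frame
    # Two counters: y counts completed lines, x counts positions within a line.
    y_count, x_count = divmod(pixel, line_length)
    if x_count < x_idle or y_count < y_idle:
        return None
    return (x_count - x_idle, y_count - y_idle, "RGB"[phase])
```

For an illustrative geometry of `width=4`, `height=3`, `x_idle=2`, and `y_idle=1`, the first visible point `(0, 0)` is reached at clock pulse 24, and its three color channels occupy pulses 24 to 26.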



signal by the ADC board. This bias of 2.5 V was added to
allow an analog input range from −1.25 to +1.25 V to be
acquired. This dc bias had to be removed to recover the original
signal. Second, the acquired signal suffered from amplitude
attenuation. This complicated the bias removal because the bias was also
subject to amplitude attenuation. To calibrate the system
against these problems, two steps were taken, each of which is
addressed in the following. Real-life test signals were then
used to check the soundness of this calibration, as shown in
Section IV.
Removal of a Bias Component in the Acquired Signal: For
the sake of calibration, a 1-kHz sine wave, with zero mean
and a 1-V amplitude, is applied to the input of the ADC.
Then, the acquired/stored signal is analyzed using MATLAB.
In Fig. 10(a), the stored sine wave is plotted without any
preprocessing. One can see that there is a bias component: all
the sine-wave values are greater than zero.
Therefore, the stored signal has a positive mean value.
To overcome the bias component, we calculated the mean of
the stored signal and then subtracted it from the stored signal
values. In Fig. 10(b), the zero-mean stored signal is shown after
the bias removal. The developed on-chip processor will perform
this bias removal after acquisition and before storing.
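The mean-subtraction step can be illustrated with a short Python sketch. The analysis in the text was done in MATLAB; the function name `remove_bias` and the calibration record below (offset, attenuated amplitude, record length) are illustrative assumptions, not measured values.

```python
import math


def remove_bias(samples):
    """Subtract the mean of the stored samples so the result has zero mean."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]


# Hypothetical calibration record: a 1-kHz sine sampled at 50 kS/s
# (50 samples per period, 10 full periods), attenuated to 0.9 and
# riding on an offset of 1.0. These numbers are illustrative only.
fs, f0 = 50_000, 1_000
stored = [1.0 + 0.9 * math.sin(2 * math.pi * f0 * k / fs) for k in range(500)]
centered = remove_bias(stored)
```

Because the record spans whole periods, the mean of `stored` equals the offset, and `centered` oscillates symmetrically about zero.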
Scaling: As shown in Fig. 10(a) and (b), there is a slight
signal attenuation in the ADC-acquired signal. Hence, amplification is needed to restore the signal to the original voltage
amplitude. This amplification is simply done in the FPGA by
multiplying the acquired signal values by a factor before storing
them into the Flash memory. In Fig. 10(c), the amplified stored
signal is presented after getting rid of the dc component.
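A minimal sketch of the scaling step is given below. The text states only that the samples are multiplied by a factor; deriving that factor from the ratio of the known calibration amplitude to the measured peak is an assumption made here for illustration, as are the function names.

```python
def estimate_gain(zero_mean_samples, reference_amplitude=1.0):
    """Estimate the correction factor from a bias-free calibration record."""
    peak = max(abs(s) for s in zero_mean_samples)
    if peak == 0:
        raise ValueError("calibration record contains no signal")
    return reference_amplitude / peak


def apply_gain(samples, gain):
    """Multiply every acquired sample by the correction factor."""
    return [gain * s for s in samples]
```

For a calibration sine of 1-V amplitude that is stored with a peak of 0.9, the estimated factor is about 1.11, and applying it restores the peak to 1.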
Fig. 7. Multiple-block write approach flowchart.

Fig. 8 illustrates the display module architecture. The objective of the monitoring module is to be able to draw a number
of points in a specific area in the LCM concurrently with
two straight lines for x- and y-axes. This was accomplished
by giving the module the variable coordinates of each point
in sequential and periodic fashion. Points must be repeatedly
printed so that when a point is drawn, the previous one is
not erased. This is due to the way the scan of typical display
modules works. Hence, it was necessary to buffer the collection
of variable display points before they are sent to the display
module. At this step, real-time curves could be generated for
acquired data with arbitrary colors. The LCM data buffer is
treated in a cyclic fashion, assuming that the scan rate of the
display module will be much faster than the buffer fill rate.
Filling this buffer with the data values will be done by an
independent unit. A flowchart of the implementation of this
module is shown in Fig. 9.
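The cyclic treatment of the LCM data buffer can be sketched as follows. This is a software model under assumptions: the class name `DisplayPointBuffer` and its methods are hypothetical, and a fixed-length `deque` stands in for the hardware buffer. The filling unit calls `push`, while every display scan reads the whole buffer through `frame`, so that previously drawn points are redrawn rather than erased.

```python
from collections import deque


class DisplayPointBuffer:
    """Fixed-capacity cyclic buffer of (x, y, color) points for the display."""

    def __init__(self, capacity):
        # With maxlen set, appending to a full deque drops the oldest point.
        self._points = deque(maxlen=capacity)

    def push(self, x, y, color):
        """Called by the independent unit that fills the buffer."""
        self._points.append((x, y, color))

    def frame(self):
        """Called on every display scan; returns all points to be redrawn."""
        return list(self._points)
```

Reading the buffer does not consume it, which mirrors the requirement that points be printed repeatedly on each scan.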
C. Preprocessing Unit
Preliminary experiments with a single-harmonic sinusoidal wave
showed that the acquired/stored data had
minor problems. First, a dc bias was added to the analog input

Here, different experiments are implemented for the proposed multichannel DAQ instrument. Different signals with different characteristics are applied to the DAQ. A signal generator
is used to generate the input signals, which, in turn, are applied
to the input of the MUX in the case of the multichannel DAQ.
The configuration of the ADC is controlled by the FPGA ADC
interface. The MUX selection lines are controlled by the FPGA
as well. The 14-bit digital data are fed into the off-chip memory
in parallel. Unlike the case of the single-channel DAQ, in the
case of the multichannel DAQ, a sample from each channel
is stored in that buffer. In other words, the first location of
the buffer contains the first sample from channel 1, the next
location contains the first sample from channel 2, and so on. After
N locations, where N is the number of channels, the second
sample from channel 1 is stored. When the buffer is full, the
channel-1 samples are extracted and stored together in
a single destination on the SD card. The same is done
for the samples of every other channel. The proposed instrument
design is tested and evaluated in terms of performance and cost.
Each parameter is discussed in more detail here.
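The buffer layout described above, and the per-channel extraction performed when the buffer is full, can be sketched in Python. The function name `deinterleave` is hypothetical, and the sketch assumes that the buffer holds a whole number of samples per channel.

```python
def deinterleave(buffer, n_channels):
    """Split an interleaved sample buffer into one list per channel."""
    if len(buffer) % n_channels != 0:
        raise ValueError("buffer must hold the same number of samples per channel")
    # Channel ch occupies locations ch, ch + N, ch + 2N, ... of the buffer.
    return [buffer[ch::n_channels] for ch in range(n_channels)]
```

For a four-channel buffer `[11, 21, 31, 41, 12, 22, 32, 42]`, where the first digit is the channel and the second is the sample index, the result is `[[11, 12], [21, 22], [31, 32], [41, 42]]`; each inner list would then be written to its own destination.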
The NI DAQ card is chosen because it has the closest similarity to the proposed instrument (although it is computer based,
it uses adaptive sampling and a MUX). The NI test bench is a
PCI 6024E 200-kS/s 16-channel DAQ card. An experimental



Fig. 8. Display module architecture.

tor Corporation FST3253), a 3.6-in Terasic LCM, and a Texas
Instruments ADC (ADS7891). This particular low-power ADC
consumes 85 mW and has an analog input range from −0.2 to
+0.2 V. An operational amplifier (THS4031) is used to extend
the analog input range to between −1.25 and +1.25 V. This ADC
operates on a 3-MHz clock frequency. For the sake of fairness,
the sampling rate is fixed for both systems to be 50 kS/s/channel
in the case of multichannel acquisition. The sampling rate is
chosen to be 50 kS/s because four channels are acquired, and
the overall sampling rate of the used NI card is 200 kS/s. The
selected FPGA is Cyclone II (EP2C35F672C6) from Altera,
with a main clock frequency of 50 MHz. The system consumes
power as low as 12 mW.
B. Experiment Limitations

prototype of the proposed system with four multiplexed channels is shown in Fig. 11. A 16-channel system will be built in
future work, which will have an extra cost of about $5 for a
16 : 1 external analog MUX such as the Intersil DG406DY [45].

In this paper, to demonstrate the multiplexing capability, four
inputs were provided. Capability 3 is demonstrated below, with
two types of input signals: generated sine waves and real-life
sound waves. Since the prototype was tested with a 3-MHz
ADC, the collective multiplexed sampling rate could not exceed
3 MS/s. In addition, the bias removal was done by on-chip
software processing. A known dc bias of 2.5 V was used to
ensure that the input signal is always in the positive range.
After the acquisition, the added bias is removed to recover
the original signal. Moreover, faulty data were not simulated
since they are expected to occur only when the Flash memory device
is removed or malfunctions. However, the block write is retried,
as discussed in Section III-B1, to remedy
nonpermanent errors if they occur. In such cases, the CRC
check flags an incorrect write, and a retransmission is
attempted.

A. Experimental Setup

C. Flash Memory Module Verification

A signal generator (Wavetek) is used to generate 1-, 4-, 8-,
and 10-kHz sine waves as four inputs to the DAQs. To validate
the implemented methodology, several tests were performed.
The developed FPGA prototype, shown in Fig. 11, is set up
with a dual 4 : 1 MUX/demultiplexer (Fairchild Semiconduc-

The Altera SoPC builder has been used to configure the
NIOS-II processor, the peripheral components, and the memory
for the target FPGA. Then, the design has been compiled and run
in the Altera Quartus environment. Finally, the
NIOS-II integrated development environment (IDE) has been

Fig. 9. Flowchart of the display module.



Fig. 10. Test signal (1-kHz sine wave) with various preprocessing stages. (a) Unprocessed. (b) DC bias removal. (c) Gain adjustment.

numbers of blocks have been considered as the inputs of the
experiment. The time it takes to store the blocks into the Flash
memory device is considered as the output of this experiment.
Fig. 12 illustrates the performance of both approaches. The
x-axis represents the number of blocks, and the y-axis represents the time it takes to store these blocks. It is noticed, as
shown in the figure, that the multiple-block write approach is
more efficient than the single-block write approach. It takes less
time to store the desired blocks into the Flash memory device.
By the same token, the write time of both
approaches grows approximately linearly: as the number
of written blocks increases, the time it takes also increases.
However, the multiple-block write approach performs
better than the single-block approach. If the number of
blocks is $n$ and the time required to write a block in the single-block write approach is $s$, then

$T_s = n s.$


Fig. 11. FPGA prototype of four multiplexed sound sensors.

used to compare both approaches. Both approaches have been
applied in this IDE via calling two different functions, one for
each approach. To get the results of each approach, different

$T_s$ is the total time to write $n$ blocks using the single-block
write approach. On the other hand, if the time it takes to execute
a multiple-block write command is $\tau$ and the actual time it takes
to write a block is $m$, then the total time to write $n$ blocks
using the multiple-block write approach is

$T_m = \tau + n m.$

Subtracting the time taken by the multiple-block
write from the time taken by the single-block write, the
result is an almost linear function of the number of blocks.

Fig. 12. Comparison between the write time of the single- and multiple-block write approaches.

Fig. 13. Data recorded using the existing DAQ system.

Fig. 14. Data recorded using the proposed compact integrated DAQ system.

Fig. 15. (Time zoom) Superimposed data recorded using both DAQ systems.
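The two timing models can be compared numerically with a short Python sketch. The function and parameter names are hypothetical: `command_time` stands for the one-time cost of issuing the multiple-block write command, and the values used in the example are arbitrary integers chosen for illustration, not measurements from the experiment.

```python
def single_block_time(n, s):
    """Total time to write n blocks when each block costs s."""
    return n * s


def multiple_block_time(n, m, command_time):
    """Total time to write n blocks: one command plus n block transfers."""
    return command_time + n * m


def time_saved(n, s, m, command_time):
    """Single-block time minus multiple-block time for n blocks."""
    return single_block_time(n, s) - multiple_block_time(n, m, command_time)
```

With the illustrative values `s=5`, `m=3`, and `command_time=4`, the saving is 16 for 10 blocks and 36 for 20 blocks, so it grows by `s - m` for every additional block, which is the linear behavior noted above.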
D. Real-Life Data Verification
Several real-life tests have been performed using this prototype. We have used the existing computer-based system to acquire a lung sound. The same signal was provided to the newly
designed device (Fig. 11). Only the results of one experiment
are reported in this paper.
The computer-based system acquired the lung sound shown
in Fig. 13. The experiment run length is 2.6 s (horizontal axis).
The normalized signal amplitude is plotted on the vertical axis.
The data recorded using the proposed system are shown in
Fig. 14. Fig. 15 provides a zoomed view in time to show an
example of the faithful reproduction of the body sound signal.

Comparisons are done (using the MATLAB tool) between both
signals (i.e., computer based versus FPGA based). Initially,
calculations were done for the first 5000 samples. The absolute
maximum FPGA-based signal deviation from the computer-based signal is $2.7346 \times 10^{-4}$. The dynamic range of the input
signal was from +0.22 to −0.291, as shown in Figs. 13 and
14. It was found that the maximum absolute deviation of the
FPGA-based signal from the input signal is 0.0165, which is
3.22% of the dynamic range of 0.511. This occurred at
acquired sample number 194. For the same samples, the calculated RMS error is 0.0041. Furthermore, the RMS error was
recalculated over the whole length of the acquisition experiment
(124 800 samples), and its value is 0.0262.
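The comparison metrics quoted above can be computed as in the following Python sketch. The original analysis was done in MATLAB, so the function name `compare_signals`, the zero-based sample index, and the definition of the dynamic range as the maximum minus the minimum of the reference signal are assumptions made for illustration.

```python
import math


def compare_signals(reference, measured):
    """Return deviation metrics of a measured signal against a reference."""
    if len(reference) != len(measured):
        raise ValueError("signals must have the same length")
    errors = [m - r for r, m in zip(reference, measured)]
    abs_errors = [abs(e) for e in errors]
    max_dev = max(abs_errors)
    dynamic_range = max(reference) - min(reference)
    return {
        "max_abs_deviation": max_dev,
        "worst_sample": abs_errors.index(max_dev),  # zero-based index
        "dynamic_range": dynamic_range,
        "percent_of_range": 100.0 * max_dev / dynamic_range,
        "rms_error": math.sqrt(sum(e * e for e in errors) / len(errors)),
    }
```

As a check on the arithmetic in the text, a deviation of 0.0165 over a dynamic range of 0.511 is about 3.2% of the range.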
E. Display Module Verification
The display that has been used in this system is the
TRDB_LCM. It is a 3.6-in digital panel. The feature set of the
TRDB_LCM has been listed in Table II [46].




Fig. 17. Comparison between the proposed instrument and the NI-based DAQ
for the 1-kHz sine wave in a multichannel DAQ (one of four channels).

Fig. 16. LCM data plots using the FPGA display module (window size of
300 μs).

Several tests have been performed using this prototype to
display different analog signals. There are two axes that have
been drawn in blue. The background is black. These colors are
optional and can be changed, as shown in Fig. 16. The switches
on the FPGA board that control the sampling rate
are also used by the proposed display system. Fig. 16(a) shows the
audio data acquired from a medical device that monitors the sound
of a patient's lung. Fig. 16(b) shows the displayed signal while
applying a 10-kHz sinusoidal wave for a preset window size of
300 μs.
In the case of a simultaneous multichannel DAQ system and
due to the small size of the display module (LCM), the switches
that are available in the FPGA board have been used to select a
channel signal to be displayed.
F. Performance Evaluation
The proposed FPGA-based DAQ instrument stores the input
signal as a .wav file into a Flash memory. It works in a
stand-alone mode without the need of any computer. All the
processing and control are done by the FPGA. On the other
hand, the other DAQ needs a LabVIEW program that runs on a
computer to store the input signal into a file on the computer
to which the card is attached. The signals acquired and stored by
both systems are analyzed in MATLAB. The comparison of the
acquired data is shown in Fig. 17. One can find small deviations
in the FPGA-based stored signal. In this phase of our research, a
hybrid hardware and software acquisition is used. For a precise
acquisition and timing, a fully hardware acquisition should be
used. This is a subject of future research to further improve the
quality of the FPGA-based acquisition.
G. Cost Comparison
At current market prices, the FPGA board, which
includes the Flash memory and the LCD, used in the proposed
instrument costs $320, and the ADC board costs $90. This includes
the cost of the external four-channel MUX (FST3253 dual 4 : 1
MUX/demultiplexer bus switch). Therefore, the total manufacturing cost of the proposed system, which does not
need a computer, is $410. As mentioned before, in Section IV, this
experimental prototype can be upgraded to have 16 analog
channels instead of 4 with negligible extra cost, i.e., about
$5. On the other hand, the cost of the comparable NI DAQ
is about $500. However, the NI DAQ requires a computer to
operate. In the final prototype of the proposed instrument, only
the needed components of the FPGA board will be integrated.
This customization and integration may even reduce the cost of
the proposed instrument to below $300.
Moreover, the proposed instrument will have an adaptive
scalable controller implemented in the hardware. This option
is not available in existing DAQ technologies.




In this paper, the conceptual design of a low-cost multipurpose stand-alone DAQP device has been provided. The
instrument is universal, which means it can be employed in
various areas such as wideband communications and medical,
environmental, and radar applications. In applications where
sampling rates are greater than 2 GS/s, such as wideband
multichannel communication, the significance of the proposed
instrument increases. As shown in Section IV-G, the cost of
the overall DAQ system is reduced with the use of the FPGA
reconfigurable hardware.
The main functions of the DAQ part have been designed,
built, and tested for a four-channel DAQ. The stand-alone DAQ
performs comparably to the computer-based DAQ it was tested against.
The device cost is only a fraction of
that of existing multichannel DAQs.
The design detail, implementation, and testing of the optimal
on-chip scheduler for the time-multiplexed single ADC are
subjects of a future paper. In addition, future research will target
the network control module and the processing and diagnosis
capability of the stand-alone device.

The authors would like to thank G. Vince, a former graduate
student, for his help in the preliminary work of this paper.
[1] K. Arshak, A. Arshak, E. Jafer, D. Waldern, and J. Harris, "Low-power wireless smart data acquisition system for monitoring pressure in medical application," Microelectron. Int., vol. 25, no. 1, pp. 3–14, 2008.
[2] White paper, Technical Series on Data Acquisition. Accessed: September, 2008. [Online]. Available:
[3] B. S. Pimentel, J. H. de Aliva Valgas Filho, R. L. Campos, A. O. Fernandez, and C. J. Nunes Coelho, "A FPGA implementation of a DCT-based digital electrocardiographic signal compression device," in Proc. IEEE 14th Symp. Integr. Circuits Syst. Des., 2001, pp. 44–49.
[4] J. Gray, "Building a RISC system in an FPGA," Circuit Cellar Mag., no. 116–118, Mar.–May 2000. Accessed: September, 2008. [Online].
[5] Maxim Direct. Accessed: May, 2009. [Online]. Available: http://www.
[6] Comet Inc., Portable EEG and PSG System, 2007. Accessed: January 31, 2007. [Online]. Available:
[7] Yokogawa Elect. Corp., DAQ Station Dx Series Software Package for the Pharmaceutical and Biotechnology Industries, Tokyo, Japan, 2000.
[8] Campbell Scientific Inc., Logan, UT, Oct. 2007. Accessed: May, 2008. [Online]. Available:
[9] RSTech, CAESAR Data Systems, World Class Testing Technology Road Load Data Acquisition, RS Technol., Ltd., Farmington Hills, MI, Oct. 2007. Accessed: May, 2008. [Online]. Available: http://www.
[10] E. L. Hudspeth, "Data acquisition, storage and display system," U.S. Pat. 4 053 951, Oct. 11, 1977. Accessed: October, 2007. [Online].
[11] A. W. Dudley, R. F. Dayhoff, and R. S. Ledley, "Muscle biopsy data acquisition and display," in Proc. 7th Annu. Symp. Comput. Appl. Med. Care, 1983, pp. 763–766.
[12] NI-SCXIUSB Data Acquisition, Jun. 2008. Accessed: December, 2008. [Online]. Available:
[13] S. Martin, "PC-based data acquisition in an industrial environment," in Proc. IEE Colloq. PC-Based Instrum., 2002, pp. 2/1–2/3.
[14] I.-Y. Chen and C.-C. Huang, "A service oriented agent architecture to support telecardiology services on demand," J. Med. Biomed. Eng., vol. 25, no. 2, pp. 73–79, 2005.

[15] Dewetron, DEWETRON PC Instruments, National Instruments, Data Acquisition Card for Laptops and PDAs. Accessed: February, 2007. [Online]. Available:
[16] M. Surratt and H. H. Loomis, "Challenges of remote FPGA configuration for space applications," presented at the IEEE Aerospace Conf., Big Sky, MT, Nov., 20041, Paper 1437.
[17] K. Y. Park and H. Kim, "Remote FPGA reconfiguration using MicroBlaze or PowerPC processors," Xilinx, Inc., San Jose, CA, XAPP441, Sep. 2006. v1.1.
[18] NI LabVIEW FPGA Module, Nat. Instrum., Austin, TX. Accessed: June 2008. [Online]. Available:
[19] Virtex-4, Dual 3 GSPS ADC, Nallatech, Eldersburg, MD. Accessed: March, 2009. [Online]. Available:
[20] VHS-ADC, Lyrtech, Quebec, QC, Canada. Accessed: March, 2009. [Online]. Available:
[21] Tetra-PMC, Bittware, Concord, NH. Accessed: March, 2009. [Online].
[22] HERON-IO5 Module, Hunt Engineering, Somerset, U.K. Accessed: March, 2009. [Online]. Available:
[23] L. M. Theis and S. C. Persyn, "Development of a high-speed multichannel analog data acquisitioning architecture," in Proc. IEEE Aerosp. Conf., 2006, pp. 9–18.
[24] C. F. M. Loureiro and C. M. B. A. Correia, "Innovative modular high-speed data-acquisition architecture," IEEE Trans. Nucl. Sci., vol. 49, no. 3, pp. 858–860, Jun. 2002.
[25] J. M. Cardoso, J. B. Simões, and C. M. B. A. Correia, "A high performance reconfigurable hardware platform for digital pulse processing," IEEE Trans. Nucl. Sci., vol. 51, no. 3, pp. 921–925, Jun. 2004.
[26] M. Bautista-Palacios, L. Baldez, and J. A. Herms-Berenguer, "Configurable hardware/software architecture for data acquisition: Implementation on FPGA," in Proc. IEEE Field Programmable Logic Appl., Aug. 2005, pp. 241–246.
[27] T. Lin and Z. Zhengou, "The implementation of 100 MHz data acquisition based on FPGA," in Proc. 3rd IEEE Int. Workshop Syst.-on-Chip Real-Time Appl., Aug. 2005, pp. 241–246.
[28] M. Abdallah and O. Elkeelany, "System-on-chip technology-based on-the-fly audio data acquisition, monitoring and displaying system using FPGA," in Proc. ISOCC, 2008, pp. 218–221.
[29] Y.-H. G. Lee and C.-I. H. Chen, "Dynamic kernel function fast Fourier transform with variable truncation scheme for wideband coarse frequency detection," IEEE Trans. Instrum. Meas., vol. 58, no. 5, pp. 1555–1562, May 2009.
[30] J. D. Bronzino, Medical Devices and Systems, 3rd ed. Boca Raton, FL: CRC Press, 2006.
[31] S. Nadeem, C. G. Sodini, and H.-S. Lee, "16-channel oversampled analog-to-digital converter," IEEE J. Solid-State Circuits, vol. 29, no. 9, pp. 1077–1085, Sep. 1994.
[32] J. Z. Wu, "The design of LVDS interface for a multi-channel A/D converter," Texas Instruments Chip Center, Sep. 21, 2003. Accessed: February, 2007. [Online]. Available:
[33] T. J. Sobering and R. R. Kay, "The impact of multiplexing on the dynamic requirements of analog-to-digital converters," IEEE Trans. Instrum. Meas., vol. 45, no. 2, pp. 616–620, Apr. 1996.
[34] S. P. Lloyd, "Least squares quantization in PCM," IEEE Trans. Inf. Theory, vol. IT-28, no. 2, pp. 129–137, Apr. 1982.
[35] D. Petrinovic, "High efficiency multiplexing scheme for multi-channel A/D conversion," in Proc. IEEE Midwest Symp. Syst. Circuits, 1998, pp. 534–538.
[36] J. Balent and T. Kerns, Understanding Real Time for Measurement and Automation. Austin, TX: Nat. Instrum. Accessed: September, 2008. [Online]. Available:
[37] M. Meurer and R. Raulefs, "Enhancement of multi-channel ADC conversion by a code division multiplex approach," in Proc. IEEE 6th Int. Symp. Spread-Spectrum Technol. Appl., Sep. 2000, pp. 641641.
[38] D. Petrinovic, "High efficiency multiplexing scheme for multichannel ADC conversion," in Proc. Midwest Symp. Circuits Syst., 1998, pp. 534–537.


[39] Multi-Channel Analog Voltage Comparator in Fusion FPGAs, Actel Corp., Mountain View, CA, 2007. Accessed: March, 2008. [Online].
[40] Actel Corp. [Online]. Available:
[41] Standard Graphic LCD Modules, CrystalFontz-Liquid Crystal Display, Spokane Valley, WA, Feb. 8, 2007. Accessed: 2007. [Online].
[42] Color TFT display modules, 3.5-in 640 × 480 Color Monitor, 2007. Accessed: February, 2007. [Online]. Available: http://www.purdyelectronics.
[43] M. Abdallah, O. Elkeelany, and A. Alouani, "An efficient hardware reconfigurable multi-channel audio data acquisition, storing and monitoring system," in Proc. ICCE, 2009, pp. 1–2.
[44] A. Kawaguchi, S. Nishioka, and H. Motoda, "A Flash-memory based file system," in Proc. USENIX Tech. Conf., 1995, p. 13.
[45] IC MUX ANALOG 16:1 28-SOIC DG406DY. Accessed: August, 2009. [Online]. Available:
[46] Terasic TRDB_LCM, ver. 1.2, Terasic, Hsinchu, Taiwan, Nov. 2006.

Mohammed Abdallah (M'07) received the B.Sc.
degree (degree of honor, ranked second) in computer and systems engineering and the M.Sc. degree
in automatic control system engineering, with the
master research area in ad hoc wireless networks,
from the University of Mansoura, Mansoura, Egypt,
in 2002 and 2005, respectively.
From August 2002 to January 2007, he was a
Teaching Assistant/Lecturer with the Computers and
Systems Engineering Department. From September
2006 to December 2006, he was a Visiting Faculty
with the B.Sc. Credit Hours Program in Communications and Information
Engineering. From February 2004 to January 2007, he was an Instructor with
the Engineering Programming Unit, Faculty of Engineering, University of
Mansoura. Since January 2007, he has been with the Tennessee Technological University (TTU), Cookeville, as a Ph.D. Research Assistant. His Ph.D.
research area is high-performance embedded systems using reconfigurable
hardware. In May 2009, he was a Lecturer with the Electrical and Computer
Engineering, TTU.
Dr. Abdallah was a recipient of the TTU Graduate Student Research
Award in 2009. He was also a recipient of a monetary award (from Boeing
and Dr. A. Atkins) in recognition of presenting the best poster in the TTU
Research Day.


Omar Elkeelany (M'99–SM'09) received the B.Sc.
and M.Sc. degrees in computer science and automatic control from the University of Alexandria,
Alexandria, Egypt, in 1992 and 1998, respectively,
the Ph.D. degree from the University of Missouri,
Kansas City, in May 2004, and the Doctor of Research degree from the International Institute of Science and Technology, Independence, MO.
He was with a research team of Wideband Corporation for one year. He has strong knowledge in the
areas of network management and embedded system
design, comprising both academic research and industrial work experience.
Since August 2005, he has been with the Tennessee Technological University,
Cookeville, as an Assistant Professor. His research interest includes the areas
of embedded systems for computer networks, network security, data streaming,
video encoding, performance analysis of computer networks using simulation
models, design of high-speed network devices, and the use of hardware description language in building new efficient handheld designs. After joining the
Tennessee Technological University, he established the Embedded Systems
Design Laboratory. This laboratory enables multiple embedded systems research and adds to the education program for both undergraduate and graduate students.
Dr. Elkeelany is a Member of the American Society for Engineering Education, the Association for Computing Machinery, and the Eta Kappa Nu
Honorary Association.

Ali T. Alouani (M'83–SM'91) received the Diplôme
d'Ingénieur Principal from l'École Nationale
d'Ingénieurs de Tunis, Tunis, Tunisia, in 1981 and the
Ph.D. degree in electrical and computer engineering
from the University of Tennessee, Knoxville,
in 1986.
He is currently a Professor in electrical and computer engineering with the Tennessee Technological
University, Cookeville. His research interests include
intelligent systems design and control, application
of signal processing to the medical field, and sensor
data fusion. He has authored more than 120 journal and conference papers. He
is the holder of four U.S. patents.