
Embedded Vision System Using an AVR 8-bit Microcontroller ATmega64 and an Omnivision OV6620 CMOS Image Sensor
H.N. Nguyen and K.J. Lin
Abstract—In this project, we introduce a small, low-power, real-time image processing engine capable of tracking colored objects. Our system uses the low-cost Atmel AVR 8-bit microcontroller ATmega64, connected directly to the C3088 camera module built around the Omnivision OV6620 CMOS image sensor. On this modest hardware architecture we implement simple computer vision algorithms that previously required a considerable amount of computing resources, consumed a lot of power, and were out of reach for this kind of small, cheap hardware. The system was specifically designed to provide a low-cost, low-power but effective embedded vision system with simple hardware that can be easily added to other applications (robots, security, monitoring, etc.). With careful attention to algorithmic efficiency in the software, fast tracking of colored objects at 30 frames/second is accomplished using commodity hardware. The result is a functioning vision system with a well-defined interface, accessible through a standard serial port, that provides high-level, post-processed image information to a primary system (a PC, another microcontroller, etc.). This removes the burden of performing the image processing on the main system and allows the developer to concentrate on how to use the high-level image data to perform the task at hand.

I. INTRODUCTION
Robotic systems are becoming smaller, lower power, and cheaper, enabling their application in areas previously impossible, and the same is true of vision systems. Many relatively simple computer vision algorithms have proved extremely useful in a variety of applications [1], [4], [6], [7]. However, the hardware to implement them is usually complex and expensive. Traditionally, these systems comprise a camera, a frame grabber, and a powerful computer to interface with the frame grabber and execute the algorithms. Recent developments in low-cost CMOS color camera modules and high-speed microcontrollers make it possible to build a simpler and cheaper system [2]. The well-known CMUCAM2, which combines an Omnivision OV6620 CMOS camera, a Ubicom SX52 microcontroller operating at 75 MHz, and an AL422B frame buffer chip, is a considerable advance and among the simplest and cheapest such systems. However, for some small applications such as line-following robots or object tracking, it is still quite complicated and expensive at a price of 199 USD.

H.N. Nguyen and K.J. Lin are with the Department of Mechanical Engineering, Southern Taiwan University of Technology, Taiwan. Email: M951Y205@webmail.stut.edu.tw
In CMOS camera technology, the pixel-grabbing circuitry is integrated with an analog-to-digital converter, so a separate frame grabber is not necessary. Microcontrollers are becoming more powerful and faster while their costs continue to decrease. In particular, the Atmel AVR 8-bit microcontroller family can run at speeds of up to 20 MHz at a cost of just a few USD. Such a microcontroller still looks too modest for an embedded vision system, even compared with the CMUCAM2 and its 75 MHz Ubicom SX52. However, we have found that with careful attention to algorithmic efficiency, a functioning embedded vision system can be built from commodity image-capture and CPU hardware. We have constructed a real-time image processing engine capable of tracking colored objects, consisting of an Omnivision OV6620 CMOS camera module and two Atmel AVR 8-bit microcontrollers (an ATmega64 operating at 17.7344 MHz as the main microcontroller and an ATmega8535 as a boot processor), at a cost below 100 USD. A fast and cheap color image segmentation was implemented to attain this result. We present the whole design in this paper.

II. SYSTEM ARCHITECTURE


Our system is designed to provide frame-rate color region tracking, a low-overhead interface to a host for use as a simple vision system, and basic video capture. High-level information extracted from the captured image is sent to a primary system (a PC, another embedded controller, etc.) for use. In a typical application, the external processor in a mobile interactive robot could configure the vision system's color-tracking mode to send back tracking packets. The main microprocessor then processes the data in real time and outputs high-level, post-processed information [5].
A. Hardware system

[Fig. 1 (hardware block diagram) shows the OV6620, clocked at 17.7344 MHz, with its Y bus connected to Port C and its UV bus to Port A of the ATmega64; VSYNC and HREF drive external interrupts Ext2 and Ext3, PCLK drives Timer/Counter1, and the SDA/SCL lines are shared with the ATmega8535, which also controls the ATmega64 reset line; a MAX232 level-shifts the UART Tx/Rx lines, and the color arrays are stored in EEPROM.]

Fig. 1. Hardware block diagram.


The hardware of our project is very straightforward. It is a four-chip design: an Omnivision OV6620 CMOS image sensor, an ATmega64 as the main processor, an ATmega8535 as a boot controller that also provides an I2C link for integration into another embedded controller, and a simple level shifter for the RS-232 serial data. To keep the design simple, the OV6620 image sensor is connected directly to the mega64, which accesses the important camera signals (the pixel clock, horizontal and vertical sync, and, of course, the data buses). The mega64 waits for incoming data to stream from the camera and processes it in real time. It then sends the extracted high-level, post-processed information to the outside world via its internal UART port.
The image input to the system is provided by an Omnivision OV6620 CMOS camera-on-a-chip [3]. The CMOS camera is mounted on a carrier board that includes a lens and supporting passive components. By itself, the board is free-running and outputs a stream of 8-bit RGB or YCrCb color pixels. Synchronization signals, including a pixel clock, are used to read out the data and to indicate new frames and horizontal lines. The OV6620 supports resolutions of up to 352 x 288 at a maximum refresh rate of 60 frames per second (fps), and camera parameters such as color saturation, brightness, contrast, white balance, exposure time, gain, and output mode are programmable through a standard serial I2C interface. An analog monochrome output is also available for external monitoring of the image.
The main microcontroller used to process the video data is the RISC-architecture ATmega64 operating at 17.7344 MHz. By executing powerful instructions in a single clock cycle, the ATmega64 achieves throughputs approaching 1 MIPS per MHz, allowing the system designer to optimize power consumption versus processing speed [9]. It provides the following features: 64K bytes of in-system programmable flash with read-while-write capability, 2K bytes of EEPROM, 4K bytes of SRAM, 53 general-purpose I/O lines, 32 general-purpose working registers, a Real Time Counter (RTC), four flexible Timer/Counters with compare modes and PWM, two USARTs (USART0 is used to interface with the PC), a byte-oriented Two-wire Serial Interface used as the I2C interface to the camera module, an 8-channel 10-bit ADC with an optional differential input stage with programmable gain, a programmable watchdog timer with internal oscillator, an SPI serial port, an IEEE Std. 1149.1 compliant JTAG test interface (also used for accessing the on-chip debug system and for programming), and six software-selectable power-saving modes. The Idle mode stops the CPU while allowing the SRAM, Timer/Counters, SPI port, and interrupt system to continue functioning. This helps us reduce power consumption, since the CPU can be shut down whenever it is not needed.
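As one illustration of this power-saving feature (a minimal sketch using the standard avr-libc sleep API, not taken from the authors' firmware), the CPU can be idled between frames and woken by the next camera interrupt:

#include <avr/interrupt.h>
#include <avr/sleep.h>

/* Enter Idle mode: the CPU core stops, but the UART, Timer/Counters,
   SPI port, and interrupt system keep running, so the next camera
   sync interrupt wakes the processor immediately. */
static void idle_until_interrupt(void)
{
    set_sleep_mode(SLEEP_MODE_IDLE);
    sei();          /* interrupts must be enabled so the MCU can wake */
    sleep_mode();   /* sleep here; execution resumes after the ISR */
}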
A separate boot controller chip is needed because the main camera processor (the mega64) is clocked by a 17.7344 MHz output signal from the video module, and this clock is not available at power-up. On power-up, the boot processor (the mega8535) holds the main processor in reset while it issues configuration commands to the video module. These commands turn on the high-speed clock output that drives the main processor. The boot processor then releases the camera control signals and allows the main processor to proceed.
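The sketch below makes this start-up sequence concrete. It is an illustration under stated assumptions, not the authors' firmware: the reset-line wiring (PB0), the mega8535 clock value, and the clock-control register and value are hypothetical placeholders, and twi_write_reg() is a minimal blocking TWI master write; only the OV6620's I2C write address (0xC0) comes from the sensor documentation.

#define F_CPU 8000000UL          /* assumed mega8535 clock for delay math */
#include <avr/io.h>
#include <util/delay.h>

#define OV6620_ADDR       0xC0   /* OV6620 I2C (SCCB) write address */
#define MEGA64_RESET_PIN  PB0    /* assumed wiring of the mega64 reset line */
#define OV6620_CLK_REG    0x00   /* hypothetical clock-control register */
#define OV6620_CLK_ON     0x00   /* hypothetical value enabling clock output */

/* Minimal blocking TWI master write: START, SLA+W, register, value, STOP. */
static void twi_write_reg(uint8_t dev, uint8_t reg, uint8_t val)
{
    uint8_t bytes[3] = { dev, reg, val };
    TWCR = _BV(TWINT) | _BV(TWSTA) | _BV(TWEN);    /* send START */
    while (!(TWCR & _BV(TWINT)));
    for (uint8_t i = 0; i < 3; i++) {
        TWDR = bytes[i];
        TWCR = _BV(TWINT) | _BV(TWEN);             /* send one byte */
        while (!(TWCR & _BV(TWINT)));
    }
    TWCR = _BV(TWINT) | _BV(TWSTO) | _BV(TWEN);    /* send STOP */
}

int main(void)
{
    /* Hold the main processor (ATmega64) in reset. */
    DDRB  |= _BV(MEGA64_RESET_PIN);
    PORTB &= ~_BV(MEGA64_RESET_PIN);

    _delay_ms(100);               /* let the camera module power up */

    TWBR = 32;                    /* TWI bit rate; value depends on F_CPU */

    /* Configure the video module: turn on the 17.7344 MHz clock
       output that will drive the main processor. */
    twi_write_reg(OV6620_ADDR, OV6620_CLK_REG, OV6620_CLK_ON);

    /* Release the camera control signals and the main processor. */
    PORTB |= _BV(MEGA64_RESET_PIN);

    for (;;) { /* idle; may bridge I2C commands from a host here */ }
}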
B. Firmware
All firmware for the vision board was written in C and compiled with the WINAVR compiler. When compiled, the current firmware requires 13 Kbytes of flash ROM and 768 bytes of EEPROM.

Processing data on the fly. The firmware plays a vital part in the success of our project. The ATmega64 has limited RAM and ROM, and we do not use any frame buffer, so the programming was constrained by the strict code-timing requirements of processing the data "on the fly": all processing must be done while the data stream is arriving from the camera. The main processing loop looks like the following: start loop; wait for a valid pixel on the data bus (PCLK high); read the red value from the Y data bus; read the green value from the UV data bus; wait for the blue value on the data bus (PCLK goes low and high again); sample the blue value on the UV data bus; perform image processing; end loop. All of this processing must be done in around 16 clock cycles to keep up with the full-speed image stream from the OV6620, yet polling PCLK wastes 3 or 4 cycles per check, looping back whenever it has not changed state. To solve this, we use the same clock source for the mega64 and the camera module, so we can be sure that every 6 clock cycles the data on the bus will have changed and will be ready for reading. Thus we do not need to sample the PCLK line between pixels; we just sync up with it once at the beginning of each line, eliminating all the cycles spent checking the state of PCLK.
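A minimal C sketch of this inner loop is given below. It is an approximation under assumptions: the port assignments follow Fig. 1 (Y bus on Port C, UV bus on Port A), the PCLK pin (PE7) is an assumed input used only for the once-per-line sync, classify_pixel() is a placeholder for the color lookup described next, and the cycle-exact timing of the real firmware (fixed instruction counts instead of polling) is only indicated in comments.

#include <avr/io.h>
#include <stdint.h>

/* Placeholder for the color-classification step (see below). */
extern void classify_pixel(uint8_t r, uint8_t g, uint8_t b);

/* Process one scan line "on the fly": no frame buffer, each pixel is
   consumed as it streams from the camera. */
void process_line(uint16_t npixels)
{
    /* Sync with PCLK once at the start of the line (assumed on PE7).
       Since the mega64 and the OV6620 share one clock source, the bus
       then holds fresh data every fixed number of CPU cycles. */
    while (!(PINE & _BV(PE7)))
        ;

    for (uint16_t i = 0; i < npixels; i++) {
        uint8_t r = PINC;   /* red value from the Y data bus (Port C) */
        uint8_t g = PINA;   /* green value from the UV data bus (Port A) */
        /* The real firmware burns a fixed number of cycles here instead
           of polling PCLK for its next low-to-high transition. */
        uint8_t b = PINA;   /* blue value, next on the UV data bus */
        classify_pixel(r, g, b);  /* must fit the ~16-cycle budget */
    }
}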
Tracking colored objects. We employ a fast and cheap color segmentation method [1], [4] that enables our system to track up to 8 colored objects at 30 Hz on this commodity hardware. The approach uses thresholds in a three-dimensional RGB color space. To illustrate it, consider the following example. Suppose we discretize the RGB color space to 10 levels in each dimension. Orange, for example, might then be represented by assigning the following values to the elements of each array:
RClass[] = {0,1,1,1,1,1,1,1,1,1};
GClass[] = {0,0,0,0,0,0,0,1,1,1};
BClass[] = {0,0,0,0,0,0,0,1,1,1};

Thus, to check whether a pixel with color values (1,8,9) is a member of the color class "orange", all we need to do is evaluate the expression RClass[1] AND GClass[8] AND BClass[9], which in this case resolves to 1, or true, indicating that the color is in the class "orange". The advantage of this approach is that it can determine a pixel's membership in multiple color classes simultaneously. As an example, suppose the region of the color space occupied by blue pixels were represented as follows:
RClass[] = {0,1,1,1,1,1,1,1,1,1};
GClass[] = {1,1,1,0,0,0,0,0,0,0};
BClass[] = {0,0,0,1,1,1,0,0,0,0};

We could combine the orange and blue representations as follows:
RClass[] = {00,11,11,11,11,11,11,11,11,11};
GClass[] = {01,01,01,00,00,00,00,10,10,10};
BClass[] = {00,00,00,01,01,01,00,10,10,10};

Here the first (high-order) bit of each element represents orange and the second bit represents blue. Thus we can check whether (1,8,9) is in either of the two classes by evaluating the single expression RClass[1] AND GClass[8] AND BClass[9]. The result is 10, indicating that the color is in the orange class but not the blue one. In our system, each array has 256 elements and each element is an 8-bit integer. It is therefore possible to evaluate membership in 8 color classes at once with two AND operations. This requires 8-bit resolution for each color channel and 256 x 3 bytes of EEPROM to store the three color-channel arrays.
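In C, the whole membership test reduces to two AND operations on three table lookups. The sketch below is illustrative only: set_range() is a hypothetical setup helper, the threshold values for "orange" merely scale the 10-level example above to 8-bit resolution, and the arrays are kept in RAM here, whereas the real firmware stores them in EEPROM.

#include <stdint.h>

/* One 256-entry array per channel; bit k of an element is 1 when that
   channel value lies inside color class k's threshold range. With
   8-bit elements, up to 8 color classes are evaluated at once. */
static uint8_t rclass[256], gclass[256], bclass[256];

/* Hypothetical helper: mark class `bit` as covering [lo, hi] on one
   channel's array. */
static void set_range(uint8_t *arr, uint8_t lo, uint8_t hi, uint8_t bit)
{
    for (uint16_t v = lo; v <= hi; v++)
        arr[v] |= (uint8_t)(1u << bit);
}

/* Membership test: two ANDs on three lookups. A nonzero result means
   the pixel is in at least one class; each set bit names one class. */
static inline uint8_t classify(uint8_t r, uint8_t g, uint8_t b)
{
    return rclass[r] & gclass[g] & bclass[b];
}

/* Example: class 0 = "orange", scaling the 10-level thresholds
   (R: 1-9, G: 7-9, B: 7-9) up to the 0-255 range. */
void setup_orange(void)
{
    set_range(rclass,  26, 255, 0);
    set_range(gclass, 179, 255, 0);
    set_range(bclass, 179, 255, 0);
}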
Connected regions. After the various pixels have been color-classified, connected regions are formed by examining the classified pixels [4]. In many robotic vision applications, significant changes between adjacent image pixels are relatively infrequent. By grouping similar adjacent pixels into a single run, we can compute a run-length encoded (RLE) representation of each row. The merging procedure then scans adjacent rows and merges runs that are of the same color class and overlap under four-connectedness. Each resulting set of merged runs corresponds to one connected region.
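A sketch of the row-encoding step is shown below, assuming a simple (class, start, length) run record; the four-connected merging pass, which compares each run against the previous row's runs and joins those of the same class with overlapping column ranges, is omitted for brevity.

#include <stdint.h>

/* A run of consecutive pixels on one row sharing a color class. */
typedef struct {
    uint8_t  cclass;  /* class bitmask from the lookup (0 = background) */
    uint16_t start;   /* column of the first pixel in the run */
    uint16_t length;  /* number of pixels in the run */
} run_t;

/* Encode one row of classified pixels into runs; returns the number of
   runs written to `out` (capacity `max_runs`). Because adjacent pixels
   rarely change class, a row compresses to very few runs. */
uint16_t rle_encode_row(const uint8_t *classes, uint16_t width,
                        run_t *out, uint16_t max_runs)
{
    uint16_t nruns = 0;
    uint16_t col = 0;
    while (col < width && nruns < max_runs) {
        uint8_t  c     = classes[col];
        uint16_t start = col;
        while (col < width && classes[col] == c)
            col++;
        out[nruns].cclass = c;
        out[nruns].start  = start;
        out[nruns].length = (uint16_t)(col - start);
        nruns++;
    }
    return nruns;
}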
C. Interface
As mentioned above, the main processor is connected directly to the camera module. It acquires each pixel block (R, G, B), performs the mapping into an actual color, computes the RLE, and stores the information in a buffer. No polling of the camera data needs to be done, because this interface is synchronized with the pixel data (the OV6620 and the mega64 use the same clock source). A pixel block looks like this:
...G G G G... (row x)
...B R B R... (row x+1)
A pixel block is defined as a contiguous group of 4 pixels that are combined to form a specific color [5]. Typically, it is formed by sampling a green value followed by a red and a blue value (since we are dealing with Bayer color data). As pixel blocks are sampled, the red, green, and blue values are used to index into their respective color arrays. The color arrays return values that can be logically ANDed together so that a particular RGB triplet results in a single bit being set after the AND operation. This single bit indicates which configured color the block maps to. It is also possible for no bits to be set after the AND operation, indicating that the RGB triplet does not map to any of the colors configured in the color arrays. Our system communicates with a PC via the standard 9-pin serial port. We can command the camera to take full-resolution snapshots (176 x 144 pixels) and analyze them to determine the colors present in the captured image. We then build the color arrays, indicating which colors the system should track, by selecting those colors in the image. Once all the colors have been selected, the color arrays are sent down to the system via the serial port.
D. Results
Our final embedded vision system operates at a maximum rate of 30 frames per second at image resolutions of up to 88 x 144 pixels, providing real-time tracking statistics (number of objects, color, bounding box, and more) through a standard serial port (UART). Power consumption is low: 100 mA at 5 V at full speed. The system can also take a full-color snapshot (176 x 144) and display it as raw Bayer data or as interpolated data. The results for each tracked object are displayed in real time with color and bounding-box information.

Fig. 2. Capturing images and tracking objects.

Fig. 3. Embedded vision system hardware.

III. CONCLUSION
The goal of this work is to implement a simple, low-power embedded vision system that replaces the fairly complicated, expensive systems that are out of reach for many developers with moderate image processing requirements. Its feasibility is demonstrated by the ability to track colored objects in real time at 30 frames/second. In the future, we will need to apply the system to specific applications to evaluate its performance under various lighting conditions. We would also like to add more functionality, such as support for QCIF data (the fastest stream that a small microcontroller can keep up with in real time), increased image resolution, and higher image processing speed.
REFERENCES
[1] A. Rowe, C. Rosenberg, and I. Nourbakhsh, "A Low Cost Embedded Vision System," Proceedings of IROS, 2002.
[2] A. Rowe, C. Rosenberg, and I. Nourbakhsh, "A Second Generation Low Cost Embedded Vision System," Proceedings of the 2005 IEEE Computer Society Conference, 2005.
[3] Omnivision Technologies Incorporated, "OV6620 Single-Chip CMOS CIF Color Digital Camera Technical Documentation," http://www.ovt.com/
[4] J. Bruce, T. Balch, and M. Veloso, "Fast and Inexpensive Color Image Segmentation for Interactive Robots," Proceedings of IROS 2000, 2000.
[5] J. R. Orlando, website, http://www.jrobot.net.
[6] I. Horswill, "Polly: A vision-based artificial agent," Proceedings of the Eleventh National Conference on Artificial Intelligence, 1993.
[7] R. Sargent, B. Bailey, C. Witty, and A. Wright, "Dynamic Object Capture Using Fast Vision Tracking," AI Magazine, vol. 18, no. 1, 1997.
[8] I. Ulrich and I. Nourbakhsh, "Appearance-Based Obstacle Detection with Monocular Color Vision," Proceedings of AAAI Conference, pp. 866-871, 2000.
[9] Atmel Corp., website, http://www.atmel.com.
