
PAPER PRESENTATION ON

Image Processing
and
Pattern Matching

Presented By

V. Somasekhar Reddy, IV B.Tech (IT), GPCET (somasekhar56@gmail.com)
V. Krishna Chaitanya Kumar, B.Tech (IT), GPCET (chaitu.0755@gmail.com)
Abstract

The growth of electronic media, process automation and especially the outstanding growth of attention to national and personal security in the past few years have all contributed to the growing need to automatically detect features and occurrences in pictures and video streams on a massive scale, in real time and without human intervention. To date, all technologies available for such automated processing have come short of supplying a solution that is both technically viable and cost-effective.

This white paper details the basic ideas behind a novel, patent-pending technology called Image Processing over IP networks (IPoIP™). As its name implies, IPoIP provides a solution for automatically extracting useful data from a large number of simultaneous image (video or still) inputs connected to an IP network, but unlike other existing methods, does so at reduced costs without compromising reliability. The document will also outline the existing image-processing architectures and compare them to IPoIP. Ending this document will be a short chapter detailing several possible implementations of IPoIP in existing applications.

Introduction

A tremendous amount of research effort has been put in recent years into the ability to extract meaningful data out of captured images (both video and still). As a result, a large number of proven algorithms exist for both real-time and offline applications, implemented on platforms ranging from pure software to pure hardware. These platforms, however, are generally designed to deal with a relatively small number of simultaneous image inputs (in most cases actually no more than one). They are designed in one of two main architectures: Local Processing and Server Processing.
Local Processing Architecture

This is by far the most commonly available system architecture for image processing. The main idea behind it is that all the processing is done at the camera location by a processing unit, and the results are then transmitted through a network connection to the monitoring area. The processing unit is usually PC-based for the more complex solutions, but the recent trend is to move the processing to standalone boxes based on a DSP or even an ASIC. This unit performs the entire image-processing task and outputs a suitable message to the network when an event is detected. Also residing at the camera location is a video encoder that is used for remotely viewing the video through the IP network. It can be configured to transmit the video at varying qualities depending on the available bandwidth. The video is transmitted using standard compression techniques such as MJPEG, MPEG-4 and others.

When cost is less of an issue, this architecture provides an adequate solution for running a single type of algorithm per camera. However, when the number of cameras increases and a more robust solution is needed (which is often the case), this solution falls short for the following reasons:
• Each camera requires its own dedicated processing resources, causing the system cost to scale linearly with the number of cameras. No cost reduction is possible when dealing with a large-scale system.
• Each additional type of algorithm requires additional processing resources, and integration between various algorithms is costly.
• For cameras distributed outdoors, PC-based products provide an inadequate solution due to space limitations and their inability to withstand harsh environmental conditions.
• DSP-based solutions require a much higher development effort because of limited resources and inferior development tools.

Server Processing Architecture

The second type of system architecture (although far less common) is the "Server Processing" architecture. All of the image processing tasks are put on one single powerful server that serves many cameras. From a hardware point of view, this solution is more cost-effective and is suitable for large-scale deployments. This architecture is made possible by the fact that there is only a small percentage of "interesting" occurrences in each camera, requiring only a small amount of actual processing power and allowing one server to deal with many cameras. Where this
architecture comes short is on the network side: it has extraordinary bandwidth requirements. Because all of the image-processing functions are performed at the server, it needs to receive very high quality images in order to provide accurate results, which creates a need for significant network resources. When the application runs on a LAN with a relatively small number of cameras this may be possible, but for distributed applications with large numbers of cameras the solution becomes impractical because of the costly network infrastructure required. This is also why this type of architecture is usually used in applications where the algorithm works on a single frame at a time and not on a full video stream.

Requirements for a Viable Solution (Both Technical and Cost)

Having understood the limitations of the existing image processing architectures, let us now look at the requirements for a cost-effective and technically viable solution. Such a system must have the following characteristics:
• Scalability and mass-scale abilities: the system must be able to handle deployments ranging from a few dozen cameras up to thousands of cameras simultaneously.
• Scalability from a cost perspective: no matter what the scale of the deployment, the system has to provide a cost-effective solution.
• It must be possible to install the cameras in geographically remote locations (under the assumption that there is an IP network connection to these locations).
• It must be possible to view each camera remotely from a monitoring station connected to the network.
• One or more image processing algorithms need to be applied to each camera at any given moment. The outputs of these algorithms need to be collected in a central database and should also be viewable on the monitoring station.
• It should be possible to easily add new algorithms or customize existing ones without requiring massive upgrades to the system.
• There is a need to detect both single-camera and multi-camera events. Multi-camera events fuse the information from several sensors to create a higher-level event.
• In rural areas (tracks, pipelines, borders) where there is no infrastructure, power requirements and bandwidth (especially over wireless links) are very important. For these types of installations, where power consumption is critical, installing PCs is not an option.

IPoIP Architecture

The IPoIP architecture was designed to answer the needs defined above with the following key goals in mind:
• Providing a cost-effective solution for image processing applications over a large number of cameras without sacrificing detection probability or increasing the False Alarm Rate (FAR).
• Enabling the application of any algorithm to any camera, even one in a geographically remote location with limited supporting facilities.
• Providing the ability to apply a wide range of algorithms simultaneously to any camera, without limiting the user to a single application at a time.

The uniqueness of IPoIP is its distributed image processing architecture. Instead of performing the image-processing task either at the camera or in the monitoring area, as in the two aforementioned architectures, the algorithms are performed in both locations. They are segmented into two parts and divided between the video encoder hardware and the central image processing server. In this way IPoIP is able to retain the strengths of both the "Local" and "Server" architectures while avoiding their limitations.

The idea behind this division is based on the fact that a processing unit already exists near each camera, inside the video encoder (used to compress the video). This existing processing unit is a low-cost fixed-point processor and is highly suitable for performing several operations (described below) that allow sending only a small amount of information to the image processing server for the main analysis. In this way, the system utilizes both the high resolution of the original video and the computing strength and flexibility of the central server, without the need for a costly network.

Feature Extraction Near the Camera

The initial part of the processing, which is done by the video encoder, is called Universal Feature Extraction (UFE). This process is the part of the algorithm that works at the pixel level and extracts condensed information ("features") from the image pixels. It works on the incoming images while they are at their highest quality, before any data has been lost to image compression. When a suitable feature is located, it is sent over the IP network to the central server for further analysis. Since the feature data is very compact, it requires a negligible amount of network bandwidth (only around 20 Kbps for each camera). Many types of features can be identified in this manner, including but not limited to:
• Segmentation of foreground and background
• Motion vectors, generated by tracking areas of the image between successive frames
• Histograms
• Specific color value ranges in a specified color space (RGB, YUV, HSV)
• Edge information
• Problems with the input video image, such as image saturation, overall image noise and more

Additionally, upon request from the server, the video encoder can send the actual pixel data for a certain portion of the image. For example, when performing automatic license plate recognition, the video encoder can send only the pixels of the license plate to the server, eliminating the bandwidth that would be needed to send the whole picture. The common attribute of all these features is that they can be implemented very efficiently on fixed-point DSP processors on the one hand, and provide excellent building blocks for a wide variety of algorithms on the other (hence the name Universal Feature Extractor).
Feature Analysis at the Central Server

The main part of the processing is performed by the IPoIP server. The server is able to dynamically request specific features from each camera, according to the requirements of the algorithms currently being applied.

The server analyzes the feature data collected from each camera and dynamically allocates computational resources as needed. In this way the server is able to utilize large-scale system statistics to perform very complex tasks when needed, without requiring a huge and expensive network for support.

The part of each algorithm that runs on the server performs the following main tasks:
1. Request specific features from the remote UFE.
2. Analyze the incoming features over time and extract meaningful "objects" from the scene.
3. Track all moving objects in the scene in terms of size, position and speed, and calibrate this data into real-world coordinates. The calibration process transforms the two-dimensional data received from the sensors into three-dimensional data using various calibration techniques. Many such techniques can be implemented in accordance with the specific scene being analyzed.
4. Classify these objects into one of several major classes, such as vehicles, people, animals and static objects. The classification can be done using various parameters such as size, speed and shape (pattern recognition).
5. Obtain additional information regarding objects of interest, such as color or sub-classification (type of vehicle, etc.).
6. Optionally extract unique identifying features for an object, such as license plate recognition or facial recognition.
7. Decide, based on all the gathered information and on the active detection rules, whether or not an event needs to be generated and the system operator informed.
8. Receive and analyze information from any other algorithm running on the server at the same time. This very powerful capability enables easy implementation of tasks such as inter-camera tracking: a specific moving object (a person or vehicle) can be accurately tracked as it moves from the field of view of one camera to the next, with the system operator always viewing the correct image. It also enables creating sequences of rules, where a rule on one camera only becomes activated (or deactivated) when a rule on another camera detects an event.

It is important to note that the algorithms at the server are constantly gathering information about the scene, even though most of the time no events are being generated. This information can be stored as metadata along with the video recording and later enable very fast and efficient searches over large amounts of recorded video content.
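The calibration in task 3 can be sketched with a planar homography, one common calibration technique: assuming objects move on a flat ground plane, a single 3×3 matrix maps image pixels to world coordinates. The document does not specify which technique IPoIP actually uses, and the matrix below is an invented example (a uniform 0.05 m per pixel with no perspective); a real system would estimate it from surveyed reference points.

```python
# Illustrative ground-plane calibration: map pixel coordinates to world
# coordinates with a 3x3 homography (assumes a flat ground plane).
# H_toy is a made-up example matrix, not a real camera calibration.

def apply_homography(H, u, v):
    """Map image point (u, v) to world point (x, y) via projective H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w   # homogeneous -> Cartesian

# Toy calibration: 0.05 m per pixel in both axes, origin at pixel (0, 0).
H_toy = [
    [0.05, 0.0,  0.0],
    [0.0,  0.05, 0.0],
    [0.0,  0.0,  1.0],
]

def speed_mps(H, track, fps):
    """Estimate ground speed from two tracked pixel positions one frame apart."""
    (u0, v0), (u1, v1) = track
    x0, y0 = apply_homography(H, u0, v0)
    x1, y1 = apply_homography(H, u1, v1)
    return ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5 * fps

# An object moving 8 pixels per frame at 25 fps: 0.4 m per frame, ~10 m/s.
print(speed_mps(H_toy, [(100, 50), (108, 50)], fps=25))
```

Once such a matrix is known per camera, sizes and speeds can be expressed in meters rather than pixels, which is what makes the size/speed classification in task 4 independent of an object's distance from the camera.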
The Combined End-Product

Utilizing the methods described above, IPoIP is able to provide a combination of algorithm complexity and low cost that is unrivaled by any other existing method today, as can be seen in the following comparison table:

[Comparison table not reproduced in this copy of the document.]

Applications in the Physical Security Market

The IPoIP platform is ideally suited for applications needing multiple simultaneous image inputs and processing. The fastest growing market today for such large-scale image processing is the physical security market. Standard security measures today include the rapid deployment of hundreds of thousands of cameras in streets, airports, schools, banks, offices and residences. These cameras are currently used mainly to enable the surveillance of a remote location by a human operator, or to record the occurrences at a certain location for use at a later time should the need arise. The introduction of digital video networking and other new technologies is now enabling the video surveillance industry to move in new directions that significantly enhance the functionality of such systems. As a result, video surveillance is rapidly penetrating organizations needing security monitoring on a very large scale and in widely dispersed areas, such as railway operators, electricity and energy distributors, border and coast guards and many more. Such organizations encounter new problems in operating and handling a huge number of cameras while having to provide for extensive bandwidth requirements.

This is where automatic video-based event detection comes into play. Solutions are currently available for automatic Video Motion Detection (VMD), License Plate Recognition (LPR), Facial Recognition (FR), Behavior Recognition (BR), traffic violation detection and other image processing applications. The output of these detection systems may be used to trigger an alarm and/or initiate a video recording. This can reduce network bandwidth requirements (in situations where constant viewing and recording is not required) and allow human attention to be allocated only to those cameras that contain a special event.

All current implementations of these algorithms suffer from the inherent problems of the existing system architectures described above, and thus are very costly and unable to penetrate the market on a large scale. IPoIP provides the ideal platform for a cost-effective, high-performance and constantly evolving physical security system.

Sample Application - Railway System Protection

In order to demonstrate the practical use and benefits of the IPoIP technology, the following is a description of a typical application: Railway System Protection. This example shares similar requirements with other applications such as border security, pipeline protection and more:
• Poor infrastructure: the power and communication infrastructure along the tracks is not guaranteed. A low-power, low-bandwidth solution is mandatory (transmitting video from hundreds or thousands of cameras is not practical), and a wireless, solar-cell-powered solution is desired.
• Mostly outdoor environment: the system should be immune to typical outdoor phenomena such as rain, snow, clouds, headlights, animals, insects, pole vibration, etc.
• Distributed locations: railway facilities (tracks, stations, bridges, tunnels, service depots, etc.) are distributed over a large geographic area, which forces the use of an IP-network-based system.
• Large scale: a typical railway system would use thousands of cameras to protect the tracks and all facilities. The Nuisance Alarm Rate / False Alarm Rate (NAR/FAR) per channel should be extremely low so that the cumulative system can effectively be monitored by a small number of operators.
• Critical system: the system's availability should be close to 100%. No single point of failure should exist, and it is desirable that the network handle local failures such as cable cuts.
• Variety of event types: the video intelligence system should detect intruders, suspected objects, safety hazards, suspected license plate numbers and other standard and user-specific event types. This can be achieved using multiple high-level algorithms, including several algorithms running simultaneously on a single camera.
• Low cost of ownership: as the protected area is very large, rural and distributed, field visits are very expensive. Therefore, a minimum amount of equipment in the field is vital for low installation and maintenance costs.

Looking at the above list, it is clear that the classic concepts of local processing, whether field-based or center-based, fail to comply with most requirements. Field-based solutions require many computers in the field, resulting in high power requirements and cost of ownership. Server-based solutions require transmission of all the video sources at high quality, all the time, to the center, resulting in very high bandwidth requirements.

Using IPoIP technology, only low-power video encoders with embedded feature extraction capability are required in the field. Furthermore, most of the time there is no need to transmit video, only low-bandwidth feature
stream data, which is a dramatic saving in network bandwidth requirements without compromising on performance.

Poles are installed along the tracks. Each pole carries a FLIR (thermal) camera, a video encoder, an IP network node and power circuitry. The FLIR camera can reliably detect persons up to a few hundred meters away in all weather and illumination conditions, thus avoiding the need for artificial illumination and reducing the FAR/NAR. The camera consumes 2-5 W. The video encoder / feature extractor unit is a low-power module that uses some 10-20 Kbps of feature data on average and transmits video at higher bandwidth (0.5-2 Mbps) only when an event is detected or upon an operator's request. The encoder consumes 3-8 W.

The IP network can be either a wired (copper or fiber) or a wireless solution. For a wired network, fiber is recommended as it is not limited by distance and is immune to EMI/RFI. If cabling is not possible or is too expensive, a wireless solution may be used. A hybrid Wi-Fi and satellite network is recommended, in which the inter-pole communication is Wi-Fi based and the access points use a satellite link. An antenna should be installed on top of each pole. This solution does not require any infrastructure and consumes about 10 W per pole / 40 W per access point. Power may be supplied either by power lines or by a solar cell and battery module. If cabling is used, it makes sense to use power lines; if a wireless network is used, the power should be supplied by solar cells.

On top of the FLIR cameras used for intruder detection, a PTZ color camera is installed every 2-4 km for event monitoring and management.

Two algorithms are used to protect the railroad. A Video Motion Detection (VMD) algorithm detects persons and vehicles approaching the protected area. A Non-Motion Detection (NMD) algorithm detects static changes in the scene, such as objects left on the tracks (a bomb, a fallen tree, a stuck car) or damaged tracks (missing parts). The two algorithms run simultaneously.

The server is located at the backend and is based on a cluster of two or more computers designed as required for critical systems. The server computers may even be geographically distributed over a few locations to increase robustness. The system may be operated from any location on the network, which enables dividing large networks among various users / departments.
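The per-unit figures above are enough for a back-of-the-envelope budget of such a deployment. The sketch below restates the document's own upper-bound numbers (20 Kbps of feature data, 2 Mbps of event video, 5 W camera, 8 W encoder, about 10 W for the wireless link); the 100-pole count is an invented example for illustration only.

```python
# Back-of-the-envelope budget for a protected track segment, using the
# per-unit figures quoted in the text. The 100-pole deployment size is an
# invented example, not a figure from the document.

POLES = 100
FEATURE_KBPS = 20           # upper end of the 10-20 Kbps feature stream
VIDEO_KBPS = 2000           # upper end of the 0.5-2 Mbps event video
CAMERA_W, ENCODER_W = 5, 8  # upper end of the quoted power ranges
WIRELESS_W = 10             # ~10 W per pole for the Wi-Fi link

# Steady-state backhaul under IPoIP: feature data only, no video.
steady_kbps = POLES * FEATURE_KBPS
# What the "server processing" architecture would need: every camera
# streaming full-quality video all the time.
all_video_kbps = POLES * VIDEO_KBPS

print(steady_kbps)                     # 2000 Kbps (2 Mbps) for 100 cameras
print(all_video_kbps)                  # 200000 Kbps (200 Mbps)
print(all_video_kbps // steady_kbps)   # 100x bandwidth saving

# Worst-case per-pole power draw, for sizing the solar/battery module.
pole_watts = CAMERA_W + ENCODER_W + WIRELESS_W
print(pole_watts)                      # 23 W per pole
```

Even at the upper ends of the quoted ranges, the feature-only backhaul for a hundred cameras fits in the bandwidth one camera's video stream would otherwise consume, which is the arithmetic behind the claim that wireless, solar-powered poles become practical.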
