
UNISYS CLOUD 20/20 SIP Y10

HUMAN DETECTION AND RECOGNITION


TEAM MEMBERS

MR. ESHWAR S DEVARAMANE


3RD YEAR, IS&E
ESHWARD95@GMAIL.COM

MR. RAGHAVENDRA N S
3RD YEAR, IS&E
RAGHAVENDRANS1998@GMAIL.COM

MR. SUKRUTH R
3RD YEAR, IS&E
VIVEKSUKRUTH@GMAIL.COM

MR. GURU RAGHAVENDRA R


3RD YEAR, IS&E
GURURAGHAVENDRAR98@GMAIL.COM

PROJECT GUIDE

DR. SANJAY H A
PROF AND HEAD, IS&E
SANYA.HA@NMIT.AC.IN

INSTITUTION

NITTE MEENAKSHI INSTITUTE OF TECHNOLOGY (NMIT) BANGALORE

PROJECT ABSTRACT
In this project we address the problem of human detection, face detection, face
recognition and the tracking of an individual. Our system detects a human and his or her face
in a given video and stores Local Binary Pattern Histogram (LBPH) features of the detected
faces. LBPH features are descriptors extracted from an image that are used to recognize
and categorize images. Once a human is detected in a video, we track that person by
assigning him or her a label. We use the stored LBPH features of individuals to recognize them
in other videos. After scanning through various videos, our program gives output such as: the
person labeled subject1 is seen in the video taken by camera1, and subject1 is seen in the video
taken by camera2. In this way we track an individual by recognizing him or her in the videos taken
by multiple cameras. Our work is based on the application of machine learning and
image processing using OpenCV, an open-source computer vision library.
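As a rough illustration of the LBPH features mentioned above, the sketch below computes a basic 3x3 local binary pattern code for every pixel and collects the codes into a normalised histogram. This is a minimal NumPy version for illustration only; the function name and parameters are our own, and the project itself relies on OpenCV's LBPH face recognizer rather than this code.

```python
import numpy as np

def lbp_histogram(gray, n_bins=256):
    """Basic 3x3 Local Binary Pattern histogram for a grayscale image
    (uint8 array). Each pixel is compared with its eight neighbours;
    a neighbour >= centre sets one bit of an 8-bit code."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                       # centre pixels (border skipped)
    # offsets of the 8 neighbours, clockwise from top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c)
    for bit, (dy, dx) in enumerate(offsets):
        nb = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        code |= ((nb >= c).astype(np.int32) << bit)
    hist, _ = np.histogram(code, bins=n_bins, range=(0, n_bins))
    return hist / hist.sum()                # normalised histogram
```

Comparing two such histograms (e.g. with a chi-square distance) is the basis of LBPH face recognition.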

INTRODUCTION
PROBLEM STATEMENT: HUMAN DETECTION AND RECOGNITION

A surveillance system is a system that observes or monitors activity, behavior and other
information, using several Closed Circuit Television (CCTV) cameras for observation and a set of
algorithms to track a person. Technology has evolved greatly over the last few decades:
previously there were no security cameras in banks, railway stations or other public places,
only security guards to protect them. Once security cameras came into existence, it became
easy to find people passing within the range of a CCTV camera simply by searching through the
recorded videos. Inventions raise people's expectations: although security cameras reduce human
effort, one still has to search for an individual through an entire video, which takes a
considerable amount of time. If the searching task could be accomplished by a machine, it
would save both human effort and time. A combination of machine learning and image processing
is used to teach a machine to recognize a person and track that person in the given footage.
Our project is a system designed to track humans in given videos. We train our system on a
set of people, assigning a label to each, so that whenever one of them appears in one or more
videos the system recognizes him or her and reports the assigned label. In this way a person
is recognized and tracked across the given videos.
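The recognition step described above, matching a detected face against the stored features of labeled individuals, can be sketched as a nearest-neighbour search over stored histograms. This is a simplified stand-in: the function names are our own, and the chi-square measure is one common choice for comparing LBPH histograms, not necessarily the exact comparison our final system uses.

```python
import numpy as np

def chi_square_distance(h1, h2, eps=1e-10):
    """Chi-square distance between two normalised histograms;
    smaller means more similar."""
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

def recognise(query_hist, gallery):
    """Assign the query face the label of the closest stored histogram.
    `gallery` maps labels (e.g. 'subject1') to stored LBPH histograms."""
    return min(gallery,
               key=lambda label: chi_square_distance(query_hist, gallery[label]))
```

A query histogram is thus labeled with whichever trained subject it most resembles, which is how a person seen in a new video is reported under the label assigned during training.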

Human detection and recognition has many advantages:

ADVANTAGE #1: ABNORMAL EVENT DETECTION

The most obvious application of detecting humans in surveillance video is the early detection
of abnormal events. Detecting sudden changes and motion variations at points of interest and
recognizing human actions can be done by constructing a motion similarity matrix or by
adopting a probabilistic method. Methods based on probability statistics model the normal
variation of motion over time and space and flag events whose measured change falls outside
that model.

ADVANTAGE #2: HUMAN GAIT CHARACTERIZATION

One approach detects walking humans by extracting double helical signatures (DHS) from
surveillance video sequences. The authors found that DHS is robust to size, viewing angle,
camera motion and severe occlusion, allowing simultaneous segmentation of humans in periodic
motion and labelling of body parts in cluttered scenes. They used changes in DHS symmetry to
detect humans walking normally, carrying an object in one hand, holding an object in both
hands, attaching an object to the upper body and attaching an object to the legs. Although
DHS is independent of silhouettes and landmark tracking, it is ineffective when the target
walks toward the camera, as the DHS degenerates into a ribbon and no strong symmetry can be
observed; with a static camera they detected the motion of a person walking at approximately
25° offset from the camera's image plane.

ADVANTAGE #3: PERSON DETECTION IN DENSE CROWDS AND PEOPLE COUNTING

Detecting and counting persons in a dense crowd is challenging due to occlusions. One
proposed system detects individual heads in dense crowds of 30 to 40 people against cluttered
backgrounds from a single video frame. However, the performance of such an approach may be
challenged by the colour intensities of the heads to be detected.

ADVANTAGE #4: PERSON TRACKING AND IDENTIFICATION

A person in a visual surveillance system can be identified using face recognition and gait
recognition techniques. The detection and tracking of multiple people in cluttered scenes at
public places is difficult because of partial or full occlusions lasting for either short or
long periods of time. The wider application of human detection is not limited to analysing
surveillance videos; it also extends to player tracking and identification in sports videos.
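Once the recogniser has labeled the people appearing in each camera's footage, the cross-camera tracking report described in the abstract ("subject1 is seen in the video taken by camera1, subject1 is seen in the video taken by camera2") reduces to a simple aggregation. The sketch below assumes the per-camera label sets have already been produced; the function and variable names are hypothetical.

```python
def appearance_report(detections):
    """detections: dict mapping a camera name to the set of subject
    labels recognised in that camera's video. Returns, for each subject,
    the sorted list of cameras that saw him or her."""
    report = {}
    for camera, labels in detections.items():
        for label in labels:
            report.setdefault(label, []).append(camera)
    return {label: sorted(cams) for label, cams in report.items()}
```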

TECHNIQUES
Human detection in a smart surveillance system aims at distinguishing among moving
objects in a video sequence. The successful interpretation of higher-level human motion
relies greatly on the precision of human detection. The detection process occurs in two
steps: object detection and object classification.

1.2.1 OBJECT DETECTION

An object is generally detected by segmenting motion in a video image. The most conventional
approaches to object detection are background subtraction, optical flow and spatio-temporal
filtering. They are outlined in the following subsections.

1.2.1.1 BACKGROUND SUBTRACTION

Background subtraction is a popular method to detect an object as foreground by segmenting
it out of the scene viewed by a surveillance camera. The camera may be fixed, purely
translational or mobile in nature. Background subtraction detects moving objects from the
difference between the current frame and a reference frame, in a pixel-by-pixel or
block-by-block fashion. The reference frame is commonly known as the 'background image',
'background model' or 'environment model'. A good background model needs to adapt to changes
in dynamic scenes. Updating the background information at regular intervals can achieve this,
although it can also be done without updating the background. A few available approaches are
discussed below:

 Mixture of Gaussians model. Stauffer and Grimson introduced an adaptive Gaussian mixture
model that is sensitive to changes in dynamic scenes caused by illumination changes,
extraneous events, etc. Rather than modelling the values of all the pixels of an image as one
particular type of distribution, they modelled the values of each pixel as a mixture of
Gaussians. Over time, new pixel values update the mixture of Gaussians (MoG) using an online
K-means approximation. Many approaches have since been proposed in the literature to improve
the MoG: an effective learning algorithm that removes the need for prior knowledge about the
foreground-to-background ratio; an algorithm that adaptively controls the number of Gaussians
to improve computation time without sacrificing background-modelling quality; modelling each
pixel by support vector regression; Kalman filtering for adaptive background estimation; a
framework for hidden Markov model (HMM) topology and parameter estimation; fusing colour and
edge information to detect foreground regions; using normalized coefficients of five
orthogonal transforms (the discrete cosine transform, the discrete Fourier transform (DFT),
the Haar transform, singular value decomposition and the Hadamard transform) to detect moving
regions; and modelling each pixel as a group of adaptive local binary pattern histograms
calculated over a circular region around the pixel.

 Non-parametric background model. Sometimes the optimization of parameters for a specific
environment is a difficult task, so a number of researchers have introduced non-parametric
background modelling techniques. Non-parametric background models consider the statistical
behaviour of image features to segment the foreground from the background. One such model
employs a kernel-based function to represent the colour distribution of each background
pixel; the kernel-based distribution is a generalization of the MoG that does not require
parameter estimation, although its computational requirement is high. Kim and Kim proposed a
non-parametric method that was found effective for background subtraction in dynamic texture
scenes (e.g. waving leaves, a spouting fountain or rippling water). They proposed a
clustering-based feature called the fuzzy colour histogram (FCH) to construct the background
model by computing the similarity between local FCH features, with an online update
procedure. Although its processing time was high in comparison with the adaptive Gaussian
mixture model, its false positive rate is significantly low at high true positive rates.
 Temporal differencing. The temporal differencing approach involves three important modules:
a block alarm module, a background modelling module and an object extraction module. The
block alarm module efficiently checks each block for the presence of either a moving object
or background information. This is accomplished using temporal differencing of pixels under a
Laplacian distribution model, and it allows the subsequent background modelling module to
process only those blocks found to contain background pixels. Next, the background modelling
module generates a high-quality adaptive background model using a unique two-stage training
procedure and a mechanism for recognizing changes in illumination. As the final step, the
object extraction module computes the binary object detection mask by applying suitable
threshold values obtained from a threshold training procedure.
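The core mechanics shared by the approaches above, differencing each frame against a reference and updating that reference over time, can be sketched with a simple running-average background model. This is a deliberately simplified stand-in for the models discussed: the class name, learning rate and threshold are illustrative choices, and OpenCV ships full implementations such as `createBackgroundSubtractorMOG2`.

```python
import numpy as np

class RunningAverageBackground:
    """Adaptive background model: the background image is a running
    average of past frames, and the foreground mask marks pixels that
    differ from it by more than `threshold`."""

    def __init__(self, first_frame, alpha=0.05, threshold=25):
        self.bg = first_frame.astype(np.float64)
        self.alpha = alpha          # learning rate of the background update
        self.threshold = threshold  # minimum intensity change for foreground

    def apply(self, frame):
        diff = np.abs(frame.astype(np.float64) - self.bg)
        mask = (diff > self.threshold).astype(np.uint8) * 255
        # nudge the stored background toward the current frame
        self.bg = (1 - self.alpha) * self.bg + self.alpha * frame
        return mask
```

Each call to `apply()` returns a binary foreground mask and updates the stored background, which is the simplest form of the regular-interval background update mentioned earlier.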

PROJECT FLOW DESIGN

SYSTEM REQUIREMENTS:

SOFTWARE REQUIREMENTS

1) Python
2) OpenCV
3) NodeJS
4) Scikit-learn

HARDWARE REQUIREMENTS:

1. GPU: Intel HD Graphics


2. CPU: Intel Celeron
3. Camera: Minimum 2MP Camera
4. USB port: 1x USB 2.0 or better port
5. Operating system: Windows XP or better

EXPECTED RESULTS:
Several videos containing different actions were taken under varied conditions of
illumination, location and background. Frames are extracted from these videos one by one,
and each frame is first labeled manually according to its semantic content. Each stream of
frames belonging to a given class is separated out and kept in a folder, preserving its
sequence; in this way several samples of each action are segmented from the videos manually,
each sample being a stream of frames belonging to a specific action. In the next step the
background, or the effect of the background, is removed from the frame. Two different
strategies are followed for this purpose. In the first method, the background is removed by
taking a blank frame that contains only the background and subtracting it from a frame
containing the foreground; as a result the background is eliminated. The other method takes
the difference of two successive frames, so the resultant frame contains just the change
caused by motion dynamics. Once the effect of the background has been countered, a
corresponding feature vector is formed for every resultant frame. The feature vector of a
frame contains the raw, central, scale-invariant and rotation-invariant moments of the
image, together with its centroid and eccentricity.
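A minimal version of this moment-based feature vector can be computed directly from a binary silhouette mask. The helper below is illustrative and assumes a non-empty mask; it covers the area (raw moment), centroid, second-order central moments and the eccentricity derived from them, while the scale- and rotation-invariant moments would be appended the same way.

```python
import numpy as np

def shape_features(mask):
    """Feature vector from a non-empty binary silhouette mask:
    [area, centroid x, centroid y, mu20, mu02, mu11, eccentricity]."""
    ys, xs = np.nonzero(mask)
    m00 = len(xs)                           # area = raw moment m00
    cx, cy = xs.mean(), ys.mean()           # centroid
    mu20 = ((xs - cx) ** 2).mean()          # normalised central moments
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    # eccentricity from the eigenvalues of the 2x2 covariance matrix
    common = np.sqrt(4 * mu11 ** 2 + (mu20 - mu02) ** 2)
    lam1 = (mu20 + mu02 + common) / 2
    lam2 = (mu20 + mu02 - common) / 2
    ecc = np.sqrt(1 - lam2 / lam1) if lam1 > 0 else 0.0
    return np.array([m00, cx, cy, mu20, mu02, mu11, ecc])
```

A solid square gives eccentricity 0 and a thin horizontal line gives eccentricity 1, matching the usual geometric intuition for these moments.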

PROJECT TIMELINES:

The project would take around 3 months to complete, covering all the modules.

Time        Project Part
1st Month   Training the data set
2nd Month   Applying the video recognition technique
3rd Month   Tracking the person captured in the video

REFERENCES

 Ahonen, T., Hadid, A., and Pietikainen, M. Face Recognition with Local
Binary Patterns. Computer Vision - ECCV 2004 (2004), 469-481.

 https://en.wikipedia.org/wiki/Support_vector_machine#/media/File:Kernel_Machine.svg
