Abstract—Radar detection of moving objects is vulnerable to the external environment. By introducing an object confirmation algorithm, the false alarm rate of the intelligent radar perimeter security system can be reduced significantly. The object confirmation algorithm is essentially an object detection algorithm. Because of the poor generalization of hand-crafted feature extraction algorithms, we use a deep convolutional neural network to extract deep features automatically for object confirmation. To meet the real-time requirement of engineering practice, our algorithm uses the YOLOv2 system as a basis and selects anchor boxes that match the object scales of our training data set by k-means++ clustering. To improve the YOLOv2 network structure, low-layer deep features, which encode texture information, and high-layer deep features, which encode semantic information, are combined layer by layer to make object detection more accurate. The experimental results show that the false alarm rate of the intelligent radar perimeter security system is further reduced by introducing the object confirmation algorithm. In extreme weather, in particular, false alarms from radar detection increase greatly, but most of them are eliminated by the object confirmation algorithm, so the warning accuracy of the entire system can be guaranteed. The detection speed of the object confirmation algorithm is 33 FPS, which meets the real-time requirement of engineering practice.

Keywords-YOLOv2; deep convolutional neural network; feature fusion; perimeter security

I. INTRODUCTION

Perimeter security systems are widely used in airports, nuclear power plants, oil fields, prisons, etc., to prevent illegal intrusion. The intelligent radar perimeter security system is a next-generation product developed to reduce the high false alarm rate of traditional perimeter security systems. Because of external interference signals caused by swaying trees and moving animals, radar detection still produces some false alarms. By introducing the object confirmation algorithm, false alarms detected by the radar can be eliminated and the warning accuracy can be further improved.

Object confirmation is essentially object detection on an image. Traditional object detection algorithms include three steps: region selection, feature extraction, and classification. Feature extraction is the key factor affecting system performance. Because of the poor generalization of widely used features such as HOG and SIFT, features applied in engineering practice must be designed specifically for each task, so the detection accuracy depends heavily on the experience of developers. To obtain higher accuracy, multiple features are combined, and the feature dimension grows larger and larger, which greatly reduces the real-time performance of the algorithms. As a result, it is difficult for traditional object detection algorithms to make a breakthrough in practice.

With the continuous construction of large-scale data sets and ever-increasing hardware computing power, the theory and practice of deep learning have developed rapidly in recent years, and object detection algorithms based on deep convolutional neural networks have achieved a qualitative improvement in performance. In 2014, deep features were applied to object detection for the first time in R-CNN [1], which pioneered deep-learning object detection based on region proposals. SPP-net [2], Fast R-CNN [3], Faster R-CNN [4], and R-FCN [5] are all improvements of this kind of method. With the optimization of deep convolutional neural networks, these methods achieved higher accuracy and faster detection, but none of them ran in real time. Since real-time operation is essential in engineering practice, object detection algorithms based on regression were proposed, with YOLO [6], SSD [7], and YOLOv2 [8] as typical representatives. Without extracting region proposals, such methods obtain the location and category of objects directly by regression, achieving real-time detection.

Compared with traditional feature extraction methods, a deep convolutional neural network simulates the human brain: through continuous learning, deep features are extracted from images automatically, which not only avoids the complicated feature design step but also provides powerful generalization. To satisfy the real-time requirement of the object confirmation module, our object confirmation algorithm is based on YOLOv2. We construct a training data set and select anchor boxes by clustering on this set. In the network structure, the detailed texture information of low-layer deep features and the semantic information of high-layer deep features are combined to make object detection more accurate. The experimental results show that most false alarms captured by the radar are eliminated by introducing our object confirmation algorithm, and the false alarm rate of the system is reduced.
IV. TRAINING DATA SET

A. Training Data Set Construction

The training data set is a decisive factor in whether deep learning can obtain good results, so constructing it is a key task in the training step. In our project, we collected a large number of images as training samples, containing objects of six classes: people, car, motorbike, minibus, truck, and bus. The location and class of each object in the images are labeled. To ensure detection accuracy, training samples should be as diverse as possible. The images in our training data set not only include different outdoor scenes such as security monitoring and traffic, but also cover different lighting conditions such as sunny days, rain, snow, and night. In addition, the scales and deformations of the objects should also be as diverse as possible.

B. Anchor Box Selection

It is easier to achieve stable training of a deep model by predicting the offset of each bounding box relative to its detection cell than by calculating the coordinates of each bounding box directly, so it is necessary to provide anchor boxes with specified scales as references for predicting bounding boxes.

YOLOv2 provides anchor boxes for training sets such as VOC and COCO (Common Objects in Context). Because object classes and scenarios differ greatly between those sets and our training data set, we must calculate anchor boxes suitable for our own data. We collect the width and height of every object in our training data set and run k-means++ clustering on them to obtain anchor boxes that best fit the aspect ratios of our data. Since detection accuracy is measured by the IOU (Intersection over Union) between the bounding box and the ground truth, we use the distance metric

d(box, centroid) = 1 − IOU(box, centroid)

The number of anchor boxes is equal to the number of clustering centers. For various numbers of clustering centers, the average IOU between each ground truth and its closest centroid is shown in Fig. 5. With 9 clustering centers, the upward trend of the average IOU has leveled off, so we use 9 different anchor boxes to balance detection accuracy against model complexity. Table I shows the width scale and height scale of each anchor box, i.e., the ratio of its width and height to the width and height of the associated detection cell, which are both 32 pixels.

TABLE I. ANCHOR BOXES WITH 9 CLUSTERING CENTERS

Anchor box   Width scale   Height scale
1            0.63          1.22
2            0.92          3.12
3            1.68          2.00
4            1.73          5.21
5            2.96          4.32
6            3.00          8.59
7            4.87          6.80
8            5.77          10.38
9            10.36         10.70

V. NETWORK STRUCTURE

The deep convolutional neural network is the soul of deep learning: a well-designed network can efficiently extract deep features of objects. Because low-layer filters extract the detailed texture information of objects while high-layer filters extract semantic information, multi-feature fusion has become a new trend in deep convolutional neural network design in recent years [10]. Combining semantic information with texture information for object detection can achieve higher detection accuracy.

Figure 4. Feature Combined module.

A. Feature Combined Module

YOLOv2 combines the 26 × 26 × 256 features before the last pooling layer with the subsequent 13 × 13 × 1024 features to obtain 13 × 13 × 3072 high-dimensional features for the final detection. To mine deeper features, the feature combined module shown in Fig. 4 is used to merge the low-layer deep features one by one with the subsequent high-layer deep features; the fusion is carried out layer by layer along the deep convolutional neural network.

As shown in Fig. 4, we first apply batch normalization to the previous deep feature, named feature_N, and feed the output to conv1. Second, we apply batch normalization to the output of conv1 and feed the result to conv2. Finally, we combine the output of conv2 with feature_N to obtain the combined feature, named feature_N+1.
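The BN → conv1 → BN → conv2 → combine pipeline just described can be sketched as follows. This is a minimal NumPy illustration under two assumptions the text does not state: the convolutions are 3 × 3 with same padding, and the final combination step is element-wise addition (a concatenation along channels would be the other plausible reading of Fig. 4).

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each channel of x with shape (C, H, W)."""
    mean = x.mean(axis=(1, 2), keepdims=True)
    var = x.var(axis=(1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def conv3x3(x, w):
    """Stride-1, same-padding 3x3 convolution.
    x: (C_in, H, W); w: (C_out, C_in, 3, 3)."""
    c_out = w.shape[0]
    _, h, wd = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((c_out, h, wd))
    for o in range(c_out):
        for i in range(x.shape[0]):
            for dy in range(3):
                for dx in range(3):
                    out[o] += w[o, i, dy, dx] * xp[i, dy:dy + h, dx:dx + wd]
    return out

def feature_combined(feature_n, w1, w2):
    """Sketch of the feature combined module of Fig. 4."""
    x = conv3x3(batch_norm(feature_n), w1)  # first BN, then conv1
    x = conv3x3(batch_norm(x), w2)          # second BN, then conv2
    return feature_n + x                    # combine with feature_N -> feature_{N+1}
```

With addition as the combination step, conv2 must produce the same number of channels as feature_N, and the module behaves like a pre-activation residual block in the spirit of [10].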
B. Network Structure

The structure of our deep convolutional neural network is shown in Fig. 5. Before each pooling layer, the low-layer features are superposed on the high-layer deep features by multiple feature combined modules and then fed to a 1 × 1 convolution layer whose number of filters is gradually doubled. We detect objects of six different classes. The input image is divided into 13 × 13 detection cells, and 9 bounding boxes are predicted for each cell, so our final output is 13 × 13 × (9 × (5 + 6)) = 13 × 13 × 99.
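The output-tensor bookkeeping above, together with the anchor-selection metric of Sec. IV.B, can be sketched as follows. The clustering loop is a plain Lloyd-style illustration with random initialization; the paper uses k-means++ seeding, which is omitted here for brevity, and the function names are ours.

```python
import numpy as np

S, B, C = 13, 9, 6           # cells per side, anchors per cell, classes
depth = B * (5 + C)          # 5 = (x, y, w, h, objectness), as in YOLOv2
assert depth == 99           # final output tensor: 13 x 13 x 99

def iou_wh(box, centroid):
    """IOU of two (w, h) boxes aligned at a common corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union

def distance(box, centroid):
    # d(box, centroid) = 1 - IOU(box, centroid), the metric of Sec. IV.B
    return 1.0 - iou_wh(box, centroid)

def cluster_anchors(boxes, k, iters=50, seed=0):
    """boxes: (N, 2) array of object (w, h); returns k anchor (w, h)."""
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)].copy()
    for _ in range(iters):
        # assign each box to the centroid with the smallest 1 - IOU
        d = np.array([[distance(b, c) for c in centroids] for b in boxes])
        labels = d.argmin(axis=1)
        for j in range(k):
            members = boxes[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids
```

Dividing each resulting (w, h) by the 32-pixel cell size gives the scales of Table I; anchor 1 (scales 0.63 × 1.22), for instance, corresponds to a box of roughly 20 × 39 pixels on the input image.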
Date       Radar warnings   Object confirmation warnings
20180105   2860             2722
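The table row can be checked against the counts quoted in the discussion that follows; a quick sketch (the variable names are ours, not from the paper):

```python
# January 5, 2018: 2860 radar warnings contained 148 false alarms,
# of which 138 were removed by object confirmation.
radar_warnings = 2860
false_alarms = 148
removed = 138

# warnings left after object confirmation
confirmed_warnings = radar_warnings - removed
print(confirmed_warnings)  # -> 2722, matching the table row

# fraction of false alarms eliminated
elimination_rate = removed / false_alarms * 100
print(f"{elimination_rate:.1f}%")
```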
Because the weather on January 5, 2018 was stable, there were 148 false alarms among 2860 warnings. After object confirmation, 138 false alarms were removed; that is, 93.3% of the false alarms were eliminated. Because of frequent swaying of trees due to snow melting on January 6, 2018, false alarms increased significantly: there were 304 false alarms among 1177 warnings. After object confirmation, 274 false alarms were removed; that is, 90.1% of the false alarms were eliminated. There were 719 false alarms among 1768 warnings on January 7, 2018. After object confirmation, 651 false alarms were removed; that is, 90.5% of the false alarms were eliminated. In conclusion, under stable weather conditions the intelligent radar perimeter security system can provide effective warning even in complex scenarios, and its false alarm rate can be further reduced by introducing the object confirmation module. Under extreme weather conditions, the false alarms detected by the radar increase substantially; with the object confirmation module, most of these false alarms can be eliminated and the warning accuracy of the entire system can be guaranteed.

VII. CONCLUSION

By introducing a real-time object confirmation algorithm based on a deep convolutional neural network, the intelligent radar perimeter security system combines the high sensitivity of radar detection with the high accuracy of object confirmation. By breaking the limitation of a single technology, the system's false alarm rate is reduced while its accuracy is improved. Using the object confirmation algorithm, the intelligent radar perimeter security system significantly reduces its dependence on the weather environment and can still provide effective warning under adverse weather conditions.

ACKNOWLEDGMENT

This work is financially supported by the National Key Research and Development Project of China through grant 2017YFC0804900 and the National Natural Science Foundation of China through grant 61503352.

REFERENCES

[1] R. Girshick, J. Donahue, T. Darrell and J. Malik, "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR 14), IEEE Press, Jun. 2014, pp. 580-587, doi: 10.1109/CVPR.2014.81.
[2] K. He, X. Zhang, S. Ren and J. Sun, "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, Sept. 2015, pp. 1904-1916, doi: 10.1109/TPAMI.2015.2389824.
[3] R. Girshick, "Fast R-CNN," Proc. IEEE Int. Conf. Computer Vision (ICCV 15), IEEE Press, Dec. 2015, pp. 1440-1448, doi: 10.1109/ICCV.2015.169.
[4] S. Ren, K. He, R. Girshick and J. Sun, "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, Jun. 2017, pp. 1137-1149, doi: 10.1109/TPAMI.2016.2577031.
[5] J. Dai, Y. Li, K. He and J. Sun, "R-FCN: Object Detection via Region-based Fully Convolutional Networks," arXiv:1605.06409, Jun. 2016, unpublished.
[6] J. Redmon, S. Divvala, R. Girshick and A. Farhadi, "You Only Look Once: Unified, Real-Time Object Detection," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR 16), IEEE Press, Jun. 2016, pp. 779-788, doi: 10.1109/CVPR.2016.91.
[7] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed et al., "SSD: Single Shot MultiBox Detector," Proc. 14th European Conference on Computer Vision (ECCV 2016), Springer Press, Sept. 2016, pp. 21-37, doi: 10.1007/978-3-319-46448-0_2.
[8] J. Redmon and A. Farhadi, "YOLO9000: Better, Faster, Stronger," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR 17), IEEE Press, Jul. 2017, pp. 6517-6525, doi: 10.1109/CVPR.2017.690.
[9] M. Lin, Q. Chen and S. Yan, "Network In Network," Proc. International Conference on Learning Representations (ICLR 2014), Apr. 2014.
[10] K. He, X. Zhang, S. Ren and J. Sun, "Deep Residual Learning for Image Recognition," Proc. IEEE Conf. Computer Vision and Pattern Recognition (CVPR 16), IEEE Press, Jun. 2016, pp. 770-787, doi: 10.1109/CVPR.2016.90.