OpenCV intro guide

OpenCV intro
References:
1. "Learning OpenCV: Computer Vision with the OpenCV Library", Bradski & Kaehler (O'Reilly, 2008)
2. http://opencv.willowgarage.com/wiki/
What is Computer Vision?

[Figure: a scene description ("Apple on a table": triangle mesh, virtual lights, camera) and the 2D image of it.]
[Figure: a 2D image ("Man in front of building") and the scene "information" to recover from it: skin tones, parallel lines (building), depth calculations, ...]
What is OpenCV?
~2500 computer vision algorithms
Highly optimized (originally for Intel)
BSD license
Languages:
  C (function-based): OpenCV 1.x
  C++ (class-based): OpenCV 2.x
  Python (2.6 and 2.7)
    Import cv2 for the OpenCV 2.x style bindings
    Import cv2.cv for the OpenCV 1.x style bindings
    Very poor documentation!
  Java (not yet, though)
Ported to Windows, OSX, Linux, iOS, Android
Links
Download: http://opencv.org/
C / C++ documentation: http://docs.opencv.org/
Python tutorials: https://opencv-python-tutroals.readthedocs.org/en/latest/
  (There is not really any Python documentation; read the C++ docs and translate them yourself.)
Python setup
Copy cv2.pyd (a .dll file) from [OpenCVdir]\build\Python\2.7\Lib\site-packages
Paste it in the same directory as your script (or put it in your Python install folder)
Example01: Absolute basics

    import cv2

    # Creates a numpy.ndarray object (basically a fast, C-based
    # array of numeric values)
    img = cv2.imread("apple.jpg", cv2.IMREAD_COLOR)
    # Creates a window (title = an apple!) and displays img in it.
    cv2.imshow('an apple!', img)
    # Waits for any key to be pressed.
    cv2.waitKey(0)
    # Destroys all windows.
    cv2.destroyAllWindows()
Example02: Video reading / game loop

    from cv2 import cv
    import time
    #import cv2

    # Create a new window
    cv.NamedWindow("main")

    writing = 0
    if writing:
        # Start capturing from camera
        cam = cv.CaptureFromCAM(-1)
        # Create a writer for offline processing (without
        # a webcam)
        width = int(cv.GetCaptureProperty(cam, cv.CV_CAP_PROP_FRAME_WIDTH))
        height = int(cv.GetCaptureProperty(cam, cv.CV_CAP_PROP_FRAME_HEIGHT))
        # Note: Indeo video 5.10 is the only codec I could
        # read and write to.
        # Assumption: the slide does not show how `code` is set;
        # 'IV50' is the FOURCC of Indeo video 5.
        code = cv.CV_FOURCC('I', 'V', '5', '0')
        writer = cv.CreateVideoWriter("example01.avi", code, 30.0, (width, height), 0)
    else:
        # Open a video file for reading (treat it as if it
        # were a camera)
        cam = cv.CaptureFromFile("example01.avi")

    # The current "window grab"
    captureNum = 0

    # "Game" Loop
    while True:
        # Capture the current frame and show
        # it in the window
        img = cv.QueryFrame(cam)
        if img is None:  # End of the video file
            break
        cv.ShowImage("main", img)
        if writing:
            # Save to the avi file
            cv.WriteFrame(writer, img)
        # Get Keyboard events
        keyCode = cv.WaitKey(5)
        if keyCode == 27:  # Escape
            break
        elif keyCode == ord("s"):
            # Save the current image to a file
            cv.SaveImage("F" + str(captureNum) + ".jpg", img)
            captureNum += 1

    # End capturing
    del cam  # Should release the camera / file
    if writing:
        del writer
    # Destroy the window
    cv.DestroyWindow("main")
A tour of CV algorithms
1. Noise reduction:
   a. Blurring
   b. Thresholding
   c. Erode / Dilate
2. Edge / shape detection
   a. (Hu) moments
3. Histograms
   a. Back-projection
4. Background-subtraction
1a. Noise Reduction (Blur)
    im = cv.LoadImage("apple.jpg", cv.CV_LOAD_IMAGE_GRAYSCALE)
    cv.NamedWindow("orig")
    cv.ShowImage("orig", im)
    print(dir(im))
    # Destination image: same size and depth as im, 1 channel
    blurred = cv.CreateImage((im.width, im.height), im.depth, 1)
    # Gaussian blur with a 9x9 kernel
    cv.Smooth(im, blurred, cv.CV_GAUSSIAN, 9, 9)
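To make the blur step concrete, here is a minimal plain-Python sketch of a separable Gaussian blur on a small list-of-lists image. It only illustrates the idea behind cv.Smooth with CV_GAUSSIAN and is not OpenCV's implementation; the helper names (gaussian_kernel, blur_rows, gaussian_blur), the 3-tap kernel, and the replicate-the-edge border handling are assumptions made for the example.

```python
import math

def gaussian_kernel(size, sigma):
    """Return a normalized 1D Gaussian kernel with `size` (odd) taps."""
    half = size // 2
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Convolve every row with the kernel, replicating edge pixels."""
    half = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        new_row = []
        for x in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - half, 0), n - 1)  # clamp to the image
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, size=3, sigma=1.0):
    """A 2D Gaussian is separable: blur the rows, then the columns."""
    kernel = gaussian_kernel(size, sigma)
    horizontal = blur_rows(img, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

# A single bright noise pixel on a dark background
noisy = [[0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 255, 0, 0],
         [0, 0, 0, 0, 0],
         [0, 0, 0, 0, 0]]
blurred = gaussian_blur(noisy, size=3, sigma=1.0)
```

The spike is spread over its 3x3 neighborhood, so its peak drops well below 255 while the total brightness stays the same. That is why a blur followed by a threshold tends to suppress isolated noise pixels.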
1b. Noise Reduction (Threshold)
    edImg = cv.CreateImage((im.width, im.height), im.depth, 1)
    # Pixels brighter than 128 become 255 (white); the rest become 0 (black)
    cv.Threshold(blurred, edImg, 128, 255.0, cv.CV_THRESH_BINARY)
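The binary threshold rule is small enough to write out by hand. The sketch below is a plain-Python illustration of what CV_THRESH_BINARY and CV_THRESH_BINARY_INV compute per pixel; the function name threshold_binary and the sample values are made up for the example.

```python
def threshold_binary(img, thresh, max_val, invert=False):
    """Pixels above `thresh` become max_val and the rest 0 (swapped if invert)."""
    out = []
    for row in img:
        if invert:
            out.append([0 if p > thresh else max_val for p in row])
        else:
            out.append([max_val if p > thresh else 0 for p in row])
    return out

gray = [[10, 130, 200],
        [128, 129, 0]]
binary = threshold_binary(gray, 128, 255)
inverted = threshold_binary(gray, 128, 255, invert=True)
```

Note that a pixel exactly equal to the threshold (128 here) is treated as "not above" and ends up black in the non-inverted result.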
1c. Noise Reduction (Erode / Dilate)
    cv.Erode(edImg, edImg, None, 20)   # Erode (20 iterations)
    cv.Dilate(edImg, edImg, None, 20)  # Dilate (20 iterations)
[Figures: just erode; just dilate; erode, then dilate]
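As a rough illustration of why "erode, then dilate" removes specks, here is a plain-Python sketch using a 3x3 neighborhood on a 0/1 image. It is a simplified stand-in for cv.Erode and cv.Dilate, not their implementation; the helper names and the choice to ignore neighbors that fall outside the image are assumptions.

```python
def _neighbors(img, y, x):
    """Values in the 3x3 neighborhood of (x, y) that lie inside the image."""
    h, w = len(img), len(img[0])
    return [img[yy][xx]
            for yy in range(max(y - 1, 0), min(y + 2, h))
            for xx in range(max(x - 1, 0), min(x + 2, w))]

def erode(img):
    """A pixel stays white only if its whole neighborhood is white."""
    return [[min(_neighbors(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]

def dilate(img):
    """A pixel becomes white if any pixel in its neighborhood is white."""
    return [[max(_neighbors(img, y, x)) for x in range(len(img[0]))]
            for y in range(len(img))]

# A 3x3 white square plus one isolated noise pixel in the corner
clean = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        clean[y][x] = 1
noisy = [row[:] for row in clean]
noisy[0][6] = 1

opened = dilate(erode(noisy))
```

Erosion deletes the lone pixel and shrinks the square to its center; dilation then grows the square back to its original size, but the speck is gone for good.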
2a. Moments
An easy way to analyze a shape
Assumptions (here):
  Binary image (0 = black, 1 = white)
  Mainly one shape: the white part (pass CV_THRESH_BINARY_INV instead of CV_THRESH_BINARY to Threshold)
Notation:
  I(x, y): intensity of pixel (x, y) (a 0 or 1)
  $$M_{pq} = \sum_{x=0}^{w-1} \sum_{y=0}^{h-1} x^p \, y^q \, I(x, y)$$
  (w and h are the image width and height)
2a. Moments, cont.

$$M_{pq} = \sum_{x=0}^{w-1} \sum_{y=0}^{h-1} x^p \, y^q \, I(x, y)$$
Note:
  M00 is the number of pixels (area)
  Centroid is (M10/M00, M01/M00)
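The area and centroid notes can be checked on a tiny binary image. The following plain-Python sketch evaluates the raw moment as the sum of x^p * y^q * I(x, y) over all pixels; raw_moment and the sample shape are names and data invented for the illustration.

```python
def raw_moment(img, p, q):
    """M_pq = sum over all pixels of x**p * y**q * I(x, y)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))

# A 3x2 white rectangle on a black background (0 = black, 1 = white)
shape = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]

m00 = raw_moment(shape, 0, 0)                    # number of white pixels
centroid_x = raw_moment(shape, 1, 0) / float(m00)
centroid_y = raw_moment(shape, 0, 1) / float(m00)
```

For this rectangle the area is 6 pixels and the centroid lands at (2.0, 1.5), the geometric center of the white block.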
2a. Hu Moments
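Invariant (mostly) to:
  Scale
  Rotation
  Reflection (the seventh has a different sign under reflection)
With $\eta_{pq}$ the normalized central moment of order (p, q):
$$\begin{aligned}
Hu_1 &= \eta_{20} + \eta_{02} \\
Hu_2 &= (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2 \\
Hu_3 &= (\eta_{30} - 3\eta_{12})^2 + (3\eta_{21} - \eta_{03})^2 \\
Hu_7 &= (3\eta_{21} - \eta_{03})(\eta_{30} + \eta_{12})\left[(\eta_{30} + \eta_{12})^2 - 3(\eta_{21} + \eta_{03})^2\right] \\
     &\quad - (\eta_{30} - 3\eta_{12})(\eta_{21} + \eta_{03})\left[3(\eta_{30} + \eta_{12})^2 - (\eta_{21} + \eta_{03})^2\right]
\end{aligned}$$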
If you were to treat this as a Vector7, you could compare it to a database of other Vector7's to do simple shape-matching
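A minimal plain-Python sketch of this matching idea follows. It computes only the first three Hu moments from normalized central moments and compares shapes by the sum of squared differences. It is an illustration under those assumptions rather than what cv.GetHuMoments does internally, and every function name and test shape in it is hypothetical.

```python
def central_moment(img, p, q):
    """Moment of order (p, q) measured about the centroid."""
    m00 = float(sum(v for row in img for v in row))
    cx = sum(x * v for row in img for x, v in enumerate(row)) / m00
    cy = sum(y * v for y, row in enumerate(img) for v in row) / m00
    return sum(((x - cx) ** p) * ((y - cy) ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))

def eta(img, p, q):
    """Normalized central moment: mu_pq / mu_00 ** (1 + (p + q) / 2)."""
    return central_moment(img, p, q) / central_moment(img, 0, 0) ** (1 + (p + q) / 2.0)

def hu_first_three(img):
    n20, n02, n11 = eta(img, 2, 0), eta(img, 0, 2), eta(img, 1, 1)
    n30, n03 = eta(img, 3, 0), eta(img, 0, 3)
    n21, n12 = eta(img, 2, 1), eta(img, 1, 2)
    return [n20 + n02,
            (n20 - n02) ** 2 + 4 * n11 ** 2,
            (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2]

def squared_distance(a, b):
    """Sum of squared differences between two moment vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def rotate90(img):
    """Rotate a list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

l_shape = [[1, 0, 0],
           [1, 0, 0],
           [1, 1, 1]]
square = [[1, 1],
          [1, 1]]

same = squared_distance(hu_first_three(l_shape), hu_first_three(rotate90(l_shape)))
different = squared_distance(hu_first_three(l_shape), hu_first_three(square))
```

The rotated L-shape is at (numerically) zero distance from the original, while the square is clearly farther away. Scale invariance is only approximate on pixel grids, so a real matcher would compare distances against a tolerance rather than expect exact zeros.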
2a. Hu Moments, cont.

    cv.Threshold(blurred, edImg, 128, 255.0, cv.CV_THRESH_BINARY_INV)
    cv.Dilate(edImg, edImg, None, 10)
    cv.Erode(edImg, edImg, None, 10)
    edMat = cv.GetMat(edImg)
    moments = cv.Moments(edMat, 1)
    hu = cv.GetHuMoments(moments)

Hu moments of three shapes (diff2 = squared difference from the reference shape's value):

       Reference shape     Second shape                             Third shape
Hu1    0.1730651754        0.1606958551 (diff2 = 0.000153)          0.3608422951 (diff2 = 0.0353)
Hu2    0.0002714368        0.0000105604 (diff2 = 0.00000731)        0.0568850707 (diff2 = 0.003)
Hu3    0.0000133760        0.0000937233 (diff2 = 6.455e-9)          0.0170774107 (diff2 = 0.0003)
Hu4    0.0000289668        0.0000001183 (diff2 = 8.322e-10)         0.0026356011 (diff2 = 6.8e-6)
Hu5    0.00000000056837    0.000000000000341 (diff2 = 3.227e-19)    -0.00001702199712 (diff2 = 2.9e-7)
Hu6    -0.0000004641061    0.000000000269809 (diff2 = 2.15e-13)     -0.00062800188755 (diff2 = 3.94e-7)
Hu7    0.00000000004541    -0.000000000000197 (diff2 = 2e-21)       0.000004785759613 (diff2 = 2.3e-11)
Total                      0.0016                                   0.0385 (~30x "farther")
3. Histograms
Basically, an n-dimensional plot
Bins (buckets)
Examples:
  1D: in a grayscale image, the number of pixels in each bin (0-5 intensity, 5-10 intensity, ..., 250-255 intensity)
  2D: Hue-Saturation graph (x-axis = hue, y-axis = saturation)
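The 1D case can be sketched in a few lines of plain Python. This is only an illustration of binning, not OpenCV's histogram code; gray_histogram, its parameters, and the choice of 4 bins over the range [0, 256) are assumptions made for the example.

```python
def gray_histogram(img, num_bins, lo=0, hi=256):
    """Count how many pixels fall into each of num_bins equal-width bins."""
    bin_width = (hi - lo) / float(num_bins)
    hist = [0] * num_bins
    for row in img:
        for p in row:
            if lo <= p < hi:
                # min() guards against floating-point rounding at the top edge
                b = min(int((p - lo) / bin_width), num_bins - 1)
                hist[b] += 1
    return hist

gray = [[0, 10, 70],
        [130, 200, 255]]
hist = gray_histogram(gray, num_bins=4)  # bins: 0-63, 64-127, 128-191, 192-255
```

Here the two darkest pixels share the first bin and the two brightest share the last, so the result is [2, 1, 1, 2].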
3. Histograms, cont.
3. Histograms (creating, 1D)

Assumption: img is a grayscale image (1 channel)
    # Create the histogram
    num_bins = 25
    hist = cv.CreateHist([num_bins,], cv.CV_HIST_ARRAY, [[0,255],], 1)
    cv.CalcHist((img,), hist)

    # Create the image to visualize it (optional)
    scale = 15  # Size of each "bar" in the plot
    hist_img = cv.CreateImage((num_bins * scale, 256), 8, 3)
    (_, max_val, _, _) = cv.GetMinMaxHistValue(hist)
    # Green background
    cv.Rectangle(hist_img, (0,0), (num_bins * scale - 1, 255), cv.RGB(0,255,0), cv.CV_FILLED)
    # One red bar per bin, scaled so the fullest bin is 255 pixels tall
    for b in range(num_bins):
        # Pixel coordinates must be integers
        val = int(255.0 * cv.QueryHistValue_1D(hist, b) / max_val)
        cv.Rectangle(hist_img, (b*scale, 255), ((b+1)*scale-1, 255-val), cv.RGB(255,0,0), cv.CV_FILLED)
3. Histograms (creating, 2D)

Assumption:
  img is an RGB image
  mask is a gray image (black = don't count, white = do)
    # Convert from RGB to HSV (storing the hue and sat in different
    # (grayscale) images)
    hsv = cv.CreateImage(cv.GetSize(img), 8, 3)
    cv.CvtColor(img, hsv, cv.CV_BGR2HSV)
    h_plane = cv.CreateMat(img.height, img.width, cv.CV_8UC1)
    s_plane = cv.CreateMat(img.height, img.width, cv.CV_8UC1)
    cv.Split(hsv, h_plane, s_plane, None, None)
    # The histogram is built from the two planes (CvMat => CvImg)
    planes = [cv.GetImage(h_plane), cv.GetImage(s_plane)]

    # Create the histogram
    h_bins = 30
    s_bins = 32
    hist_size = [h_bins, s_bins]
    h_ranges = [0, 180]   # Red is ~0 degrees
    s_ranges = [15, 255]  # 0=grayscale, 255=pure color
    ranges = [h_ranges, s_ranges]
    hist = cv.CreateHist(hist_size, cv.CV_HIST_ARRAY, ranges, 1)
    cv.CalcHist(planes, hist, 0, mask)
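For readers without the legacy bindings, the same idea can be sketched with only the Python standard library. The example below builds a small hue-saturation histogram from a list of RGB pixels using colorsys. It is a simplified illustration: it uses 4x4 bins instead of 30x32, counts every pixel (no mask), and covers the full saturation range. The function name and the sample pixels are assumptions.

```python
import colorsys

def hue_sat_histogram(pixels, h_bins, s_bins):
    """2D histogram indexed as hist[hue_bin][sat_bin]."""
    hist = [[0] * s_bins for _ in range(h_bins)]
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        hue = h * 180.0  # 0..180, the 8-bit hue scale used above
        sat = s * 255.0  # 0..255
        hb = min(int(hue * h_bins / 180.0), h_bins - 1)
        sb = min(int(sat * s_bins / 256.0), s_bins - 1)
        hist[hb][sb] += 1
    return hist

pixels = [(255, 0, 0), (255, 0, 0),  # pure red, twice
          (255, 128, 128),           # pale red (same hue, lower saturation)
          (0, 255, 0),               # pure green
          (0, 0, 255),               # pure blue
          (128, 128, 128)]           # gray (saturation 0)
hist = hue_sat_histogram(pixels, h_bins=4, s_bins=4)
```

Both pure reds land in the same bin (hue bin 0, highest saturation bin), while the pale red shares their hue bin but falls into a lower saturation bin.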
A histogram tells us how often a value (color) appears in the image the histogram was built from.
  "Fuller" bin = more prevalent
  "Emptier" bin = less prevalent
3a. Back-projection
Back-Projection example:
Create an image with flesh tones. Create a histogram from it
Hue-Saturation generally ignores race
Now, given a new color, we can determine how likely it is to be flesh-toned by looking up that spot in the histogram.
  If it's a full bin, it's probably a flesh-tone
  If it's an empty bin, it's probably not a flesh-tone
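The lookup described above can be sketched in plain Python. The example builds a coarse hue-saturation histogram from a few sample colors and then maps each pixel of a new image to the fullness of its bin, scaled to 0..255. The sample RGB values, the 4x4 binning, and the function names are all assumptions chosen for the illustration; this is not how cv.CalcBackProject is implemented.

```python
import colorsys

H_BINS, S_BINS = 4, 4

def bin_of(pixel):
    """Return the (hue_bin, sat_bin) of an (r, g, b) pixel."""
    r, g, b = pixel
    h, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return (min(int(h * H_BINS), H_BINS - 1),
            min(int(s * S_BINS), S_BINS - 1))

def build_histogram(sample_pixels):
    """Count the sample pixels per bin (sparse: empty bins are absent)."""
    hist = {}
    for p in sample_pixels:
        key = bin_of(p)
        hist[key] = hist.get(key, 0) + 1
    return hist

def back_project(img, hist):
    """Replace each pixel by how full its bin is, scaled to 0..255."""
    peak = max(hist.values())
    return [[255 * hist.get(bin_of(p), 0) // peak for p in row]
            for row in img]

# Hypothetical sample colors standing in for the "flesh tone" image
samples = [(224, 172, 105), (198, 134, 66), (241, 194, 125), (141, 85, 36)]
hist = build_histogram(samples)

new_img = [[(210, 150, 90), (0, 0, 255), (241, 194, 125)]]
likelihood = back_project(new_img, hist)
```

The first pixel falls in the fullest bin and gets 255, the blue pixel falls in an empty bin and gets 0, and the third pixel falls in a bin holding one of the four samples, so it gets an intermediate value.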
3a. Back-projection
    # Convert img (RGB) to HS(V)
    hsv = cv.CreateImage(cv.GetSize(img), 8, 3)
    cv.CvtColor(img, hsv, cv.CV_BGR2HSV)

    # Get images for the hue and sat "planes" of hsv
    h_plane = cv.CreateMat(img.height, img.width, cv.CV_8UC1)
    s_plane = cv.CreateMat(img.height, img.width, cv.CV_8UC1)
    cv.Split(hsv, h_plane, s_plane, None, None)
    h_plane = cv.GetImage(h_plane)  # CvMat => CvImg
    s_plane = cv.GetImage(s_plane)
    hsPlanes = [h_plane, s_plane]

    # Do the back-projection. Note: hist was as created on a
    # previous slide
    backPropImg = cv.CreateImage((img.width, img.height), 8, 1)
    cv.CalcBackProject(hsPlanes, backPropImg, hist)
3a. Patch-based Back-projection

Similar to regular back-projection, but uses a (w x h) "window" (I used 5 for each).
The window "slides" over each pixel in the image. Let's say it's at pixel (i, j):
  Look in the w x h neighborhood of (i, j)
  Calculate a new histogram
  Compare it to a reference histogram. The degree of similarity is the value to set (i, j) to in the "probability" image

    # hist and hsPlanes are computed as before.
    patchW = patchH = 5
    backPropImg = cv.CreateImage((img.width - patchW + 1, img.height - patchH + 1), cv.IPL_DEPTH_32F, 1)
    cv.CalcBackProjectPatch(hsPlanes, backPropImg, (patchW, patchH), hist, cv.CV_COMP_CORREL, 1)
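The sliding-window procedure can be sketched in plain Python on a grayscale image. The example below uses a 1D intensity histogram instead of hue-saturation, and the correlation measure that CV_COMP_CORREL names. The function names, the 4-bin histogram, and returning 0.0 when a histogram is completely flat are assumptions for the illustration, not details of cv.CalcBackProjectPatch.

```python
import math

def histogram(values, bins, hi=256):
    """Equal-width histogram of integer intensities in [0, hi)."""
    hist = [0] * bins
    for v in values:
        hist[min(v * bins // hi, bins - 1)] += 1
    return hist

def correlation(h1, h2):
    """Correlation between two histograms (1.0 = same shape)."""
    m1 = sum(h1) / float(len(h1))
    m2 = sum(h2) / float(len(h2))
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(sum((a - m1) ** 2 for a in h1) *
                    sum((b - m2) ** 2 for b in h2))
    return num / den if den else 0.0  # flat histogram: no defined correlation

def back_project_patch(img, ref_hist, patch_w, patch_h):
    """Slide a patch over img and score each position against ref_hist."""
    bins = len(ref_hist)
    out = []
    for y in range(len(img) - patch_h + 1):
        row = []
        for x in range(len(img[0]) - patch_w + 1):
            patch = [img[yy][xx]
                     for yy in range(y, y + patch_h)
                     for xx in range(x, x + patch_w)]
            row.append(correlation(histogram(patch, bins), ref_hist))
        out.append(row)
    return out

# Left half dark, right half bright; the reference describes a bright patch
img = [[20, 20, 220, 220] for _ in range(4)]
ref_hist = histogram([220, 220, 220, 220], bins=4)
scores = back_project_patch(img, ref_hist, patch_w=2, patch_h=2)
```

As in the snippet above, the result is smaller than the input by (patch size - 1) in each dimension: a 4x4 image with a 2x2 patch gives a 3x3 score image. The score rises from left to right as the patch moves from the dark half into the bright half.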
4. Background-Subtraction
Goal:
  Mark non-background pixels in a mask (1 = non-background, 0 = background)
  Analyze the shape of the non-background pixels.
4. Background Subtraction
Naive approach:
    cv.AbsDiff(curFrame, bgOnlyFrame, diffImg)
    # Maybe a threshold now, erosion, dilate, etc.
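The naive approach amounts to a per-pixel absolute difference followed by a threshold. A plain-Python sketch on tiny grayscale frames is shown below; the function name, the threshold of 30, and the pixel values are made up for the illustration.

```python
def foreground_mask(frame, background, thresh):
    """1 where the frame differs from the background by more than thresh."""
    return [[1 if abs(f - b) > thresh else 0
             for f, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]

background = [[50, 50, 50],
              [50, 50, 50]]
frame = [[52, 49, 200],
         [50, 180, 51]]
mask = foreground_mask(frame, background, thresh=30)
```

Small differences (sensor noise of a few gray levels) stay below the threshold, while the two pixels that changed substantially are marked as non-background.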
Problems:
  A lot of frame-to-frame noise
  Webcam auto-adjusting intensity (@#$! Logitechs)
  Clouds passing by, trees waving in wind, ...
A better approach [see example04]
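The contents of example04 are not shown here. As one commonly used way to cope with slow intensity drift, and not necessarily the approach taken in example04, the background can be kept as a running average that is updated with every frame. The plain-Python sketch below compares a fixed background with an adaptive one; the function names, alpha = 0.5, and the threshold of 8 are assumptions.

```python
def foreground_mask(frame, background, thresh):
    """1 where the frame differs from the background by more than thresh."""
    return [[1 if abs(f - b) > thresh else 0
             for f, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]

def update_background(background, frame, alpha):
    """Running average: move the background a fraction alpha toward the frame."""
    return [[(1 - alpha) * b + alpha * f
             for b, f in zip(bg_row, frame_row)]
            for bg_row, frame_row in zip(background, frame)]

# The whole scene slowly brightens from 50 to 60 (e.g. webcam auto-exposure)
frames = [[[level, level, level]] for level in (52, 54, 56, 58, 60)]

fixed_bg = [[50, 50, 50]]
adaptive_bg = [[50.0, 50.0, 50.0]]
for frame in frames:
    fixed_mask = foreground_mask(frame, fixed_bg, thresh=8)
    adaptive_mask = foreground_mask(frame, adaptive_bg, thresh=8)
    adaptive_bg = update_background(adaptive_bg, frame, alpha=0.5)
```

After the last frame the fixed background flags every pixel as foreground, because the drift has accumulated to 10 gray levels. The running average has followed the drift, so its mask stays empty. The trade-off is that an object which stops moving is gradually absorbed into the background.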
