
OpenCV intro

References:
1. "Learning OpenCV: Computer Vision with the OpenCV Library", Bradski & Kaehler (O'Reilly, 2008)
2. http://opencv.willowgarage.com/wiki/

What is Computer Vision?


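[Figure: computer graphics starts from scene "information" (triangle mesh, virtual lights, camera; e.g. "Apple on a table") and produces a 2D image. Computer vision starts from a 2D image and recovers scene "information" (skin tones, parallel lines of a building, depth calculations, ...; e.g. "Man in front of building").]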

What is OpenCV?
~2500 computer vision algorithms
Highly optimized (originally for Intel)
BSD license
Languages:
  C (function-based): OpenCV 1.x
  C++ (class-based): OpenCV 2.x
  Python (2.6 and 2.7)
    Import cv2 for the OpenCV 2.x style bindings
    Import cv2.cv for the OpenCV 1.x style bindings
    Very poor documentation!
  Java (not yet, though)

Ported to Windows, OSX, Linux, iOS, Android

Links
Download:
http://opencv.org/

C / C++
http://docs.opencv.org/

Python
Tutorials: https://opencv-python-tutroals.readthedocs.org/en/latest/
[There is not really any Python documentation; read the C++ docs and translate them yourself.]

Python setup
Copy cv2.pyd (a .dll file) from [OpenCVdir]\build\Python\2.7\Lib\site-packages
Paste it in the same directory as your script (or put it in your Python install folder)

Example01: Absolute basics


import cv2

# Creates a numpy.ndarray object (basically a fast, C-based
# array of numeric values)
img = cv2.imread("apple.jpg", cv2.IMREAD_COLOR)

# Creates a window (title = an apple!) and displays img in it.
cv2.imshow('an apple!', img)

# Waits for any key to be pressed.
cv2.waitKey(0)

# Destroys all windows.
cv2.destroyAllWindows()

Example02: Video reading / game loop


from cv2 import cv

# Create a new window
cv.NamedWindow("main")

writing = 0
if writing:
    # Start capturing from camera
    cam = cv.CaptureFromCAM(-1)
    # Create a writer for offline processing (without a webcam)
    width = int(cv.GetCaptureProperty(cam, cv.CV_CAP_PROP_FRAME_WIDTH))
    height = int(cv.GetCaptureProperty(cam, cv.CV_CAP_PROP_FRAME_HEIGHT))
    # Note: Indeo video 5.10 is the only codec I could
    # read and write to.
    # Assumption: the slide never defines `code`; "IV50" is used
    # here as the FOURCC for Indeo 5.
    code = cv.CV_FOURCC("I", "V", "5", "0")
    writer = cv.CreateVideoWriter("example01.avi", code, 30.0, (width, height), 0)
else:
    # Open a video file for reading (treat it as if it
    # were a camera)
    cam = cv.CaptureFromFile("example01.avi")

# The current "window grab"
captureNum = 0

# "Game" loop
while True:
    # Capture the current frame and show it in the window
    img = cv.QueryFrame(cam)
    if img is None:  # No more frames (end of the video file)
        break
    cv.ShowImage("main", img)
    if writing:
        # Save to the avi file
        cv.WriteFrame(writer, img)

    # Get keyboard events
    keyCode = cv.WaitKey(5)
    if keyCode == 27:  # Escape
        break
    elif keyCode == ord("s"):
        # Save the current image to a file
        cv.SaveImage("F" + str(captureNum) + ".jpg", img)
        captureNum += 1

# End capturing
del cam  # Should release the camera / file
if writing:
    del writer

# Destroy the window
cv.DestroyWindow("main")

A tour of CV algorithms
1. Noise reduction
a. Blurring
b. Thresholding
c. Erode / Dilate

2. Edge / shape detection
a. (Hu) moments

3. Histograms
a. Back-projection

4. Background subtraction

1a. Noise Reduction (Blur)

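from cv2 import cv

# Load the image as a single-channel (grayscale) image and show it
im = cv.LoadImage("apple.jpg", cv.CV_LOAD_IMAGE_GRAYSCALE)
cv.NamedWindow("orig")
cv.ShowImage("orig", im)
print dir(im)

# Allocate the output image, then apply a 9x9 Gaussian blur
blurred = cv.CreateImage((im.width, im.height), im.depth, 1)
cv.Smooth(im, blurred, cv.CV_GAUSSIAN, 9, 9)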
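To show what a blur does to each pixel, here is a minimal pure-Python sketch, assuming a grayscale image stored as a list of rows. It uses a plain mean (box) filter rather than the Gaussian kernel used with cv.Smooth, and the names box_blur and radius are illustrative, not part of OpenCV.

```python
def box_blur(img, radius=1):
    """Replace each pixel by the mean of its (2*radius+1)^2 neighbourhood.

    Pixels near the border average only the neighbours that exist.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        total += img[ny][nx]
                        count += 1
            out[y][x] = total / count
    return out


# A single bright "noise" pixel gets spread out and flattened.
noisy = [[0, 0, 0],
         [0, 90, 0],
         [0, 0, 0]]
smoothed = box_blur(noisy)
```

The isolated spike of 90 drops to the neighbourhood mean, which is why blurring is a common first step before thresholding.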

1b. Noise Reduction (Threshold)

# Pixels brighter than 128 become 255 (white), all others 0 (black)
edImg = cv.CreateImage((im.width, im.height), im.depth, 1)
cv.Threshold(blurred, edImg, 128, 255.0, cv.CV_THRESH_BINARY)
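The per-pixel rule behind a binary threshold can be sketched in a few lines of plain Python. This is an illustration only, assuming a grayscale image as a list of rows; threshold_binary and its invert flag are hypothetical names that mimic the CV_THRESH_BINARY and CV_THRESH_BINARY_INV modes.

```python
def threshold_binary(img, thresh, max_val=255, invert=False):
    """Map pixels above `thresh` to max_val and the rest to 0 (swapped if invert)."""
    above, below = (0, max_val) if invert else (max_val, 0)
    return [[above if p > thresh else below for p in row] for row in img]


row = [[100, 128, 129]]
binary = threshold_binary(row, 128)                 # only values > 128 turn white
inverted = threshold_binary(row, 128, invert=True)  # the opposite mapping
```

Note that a pixel exactly equal to the threshold counts as "not above" it.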

1c. Noise Reduction (Erode / Dilate)

# Erode 20 times, then dilate 20 times (None = default 3x3 kernel)
cv.Erode(edImg, edImg, None, 20)
cv.Dilate(edImg, edImg, None, 20)

[Figures: just erode; just dilate; erode, then dilate]
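A minimal sketch of why "erode, then dilate" removes speckles, assuming a binary image as a list of rows and a 3x3 neighbourhood. The helper names are illustrative, and border pixels simply ignore neighbours that fall outside the image.

```python
def _morph(img, pick):
    """Apply `pick` (min or max) over each pixel's 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[ny][nx]
                      for ny in range(max(0, y - 1), min(h, y + 2))
                      for nx in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = pick(window)
    return out


def erode(img, iterations=1):
    """Shrink white regions: a pixel stays white only if all neighbours are white."""
    for _ in range(iterations):
        img = _morph(img, min)
    return img


def dilate(img, iterations=1):
    """Grow white regions: a pixel turns white if any neighbour is white."""
    for _ in range(iterations):
        img = _morph(img, max)
    return img


# A 3x3 white block plus one isolated white speckle in the corner.
img = [[0] * 7 for _ in range(6)]
for y in range(1, 4):
    for x in range(1, 4):
        img[y][x] = 255
img[5][6] = 255

opened = dilate(erode(img))
white_pixels = sum(v == 255 for row in opened for v in row)
```

Erosion wipes out the speckle and shrinks the block to its centre; dilation then grows the block back, but the speckle has nothing left to grow from.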

2a. Moments
An easy way to analyze a shape
Assumptions (here):
  Binary image (0 = black, 1 = white)
  Mainly one shape: the white part (pass CV_THRESH_BINARY_INV instead of CV_THRESH_BINARY to Threshold)

Notation:
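I(x, y): intensity of pixel (x, y) (a 0 or 1)
W, H: image width and height in pixels

M_{pq} = \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} x^p y^q I(x, y)

2a. Moments, cont.

M_{00} = \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} I(x, y)

Note:
M00 is the number of white pixels (area)
Centroid is (M10/M00, M01/M00)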
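The area and centroid facts can be checked with a small pure-Python sketch, assuming a binary image stored as a list of rows with x as the column index and y as the row index. raw_moment is an illustrative helper, not an OpenCV call.

```python
def raw_moment(img, p, q):
    """Sum of x^p * y^q * I(x, y) over all pixels of a 2D list."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


# A 2x2 white block covering x = 1..2, y = 2..3.
img = [[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 1, 1, 0],
       [0, 1, 1, 0]]

m00 = raw_moment(img, 0, 0)          # area in pixels
m10 = raw_moment(img, 1, 0)
m01 = raw_moment(img, 0, 1)
centroid = (m10 / float(m00), m01 / float(m00))
```

Here the centroid lands in the middle of the block, between the pixel centres.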

2a. Hu Moments
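Invariant (mostly) to:
  Scale
  Rotation
  Reflection (the seventh has a different sign under reflection)

In these formulas M denotes the normalized central moments, not the raw moments.

Hu_1 = M_{20} + M_{02}
Hu_2 = (M_{20} - M_{02})^2 + 4 M_{11}^2
Hu_3 = (M_{30} - 3 M_{12})^2 + (3 M_{21} - M_{03})^2
...
Hu_7 = (3 M_{21} - M_{03})(M_{21} + M_{03})[3 (M_{30} + M_{12})^2 - (M_{21} + M_{03})^2] - (M_{30} - 3 M_{12})(M_{21} + M_{03})[3 (M_{30} + M_{12})^2 - (M_{21} + M_{03})^2]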
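As a rough check of the rotation claim, the first two Hu moments can be computed by hand. This sketch assumes the standard definitions of central moments (taken about the centroid) and normalized central moments (divided by a power of the area); all function names are illustrative.

```python
def central_moment(img, p, q):
    """Moment of order (p, q) taken about the shape's centroid."""
    m00 = float(sum(v for row in img for v in row))
    xc = sum(x * v for row in img for x, v in enumerate(row)) / m00
    yc = sum(y * v for y, row in enumerate(img) for v in row) / m00
    return sum(((x - xc) ** p) * ((y - yc) ** q) * v
               for y, row in enumerate(img)
               for x, v in enumerate(row))


def normalized_moment(img, p, q):
    """Central moment divided by area^(1 + (p+q)/2), which removes scale."""
    mu00 = central_moment(img, 0, 0)
    return central_moment(img, p, q) / mu00 ** (1 + (p + q) / 2.0)


def hu1_hu2(img):
    """First two Hu moments of a binary image."""
    n20 = normalized_moment(img, 2, 0)
    n02 = normalized_moment(img, 0, 2)
    n11 = normalized_moment(img, 1, 1)
    return (n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2)


bar = [[1, 1, 1, 1],
       [1, 1, 1, 1]]                       # 4 wide, 2 tall
rotated = [list(col) for col in zip(*bar)]  # same bar turned by 90 degrees

hu_bar = hu1_hu2(bar)
hu_rotated = hu1_hu2(rotated)
```

Both orientations give the same pair of values, because the rotation only swaps the roles of the 20 and 02 moments.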

If you were to treat this as a Vector7, you could compare it to a database of other Vector7's to do simple shape-matching
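A minimal sketch of that lookup, assuming the distance is the sum of squared differences between the seven components. The database contents and shape names below are made-up placeholders, not measurements from these slides.

```python
def hu_distance(a, b):
    """Sum of squared differences between two 7-element Hu vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_match(query, database):
    """Name of the database entry whose Hu vector is closest to `query`."""
    return min(database, key=lambda name: hu_distance(query, database[name]))


# Hypothetical database: values are illustrative only.
database = {
    "round shape": [0.16, 0.00001, 0.0, 0.0, 0.0, 0.0, 0.0],
    "long shape":  [0.36, 0.057, 0.017, 0.0026, 0.0, 0.0, 0.0],
}
query = [0.17, 0.0003, 0.0, 0.0, 0.0, 0.0, 0.0]
match = best_match(query, database)
```

Because the first Hu moments are much larger than the later ones, they dominate a plain squared-difference distance; a real matcher might rescale the components first.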

2a. Hu Moments, cont.


cv.Threshold(blurred, edImg, 128, 255.0, cv.CV_THRESH_BINARY_INV)
cv.Dilate(edImg, edImg, None, 10)
cv.Erode(edImg, edImg, None, 10)
edMat = cv.GetMat(edImg)
moments = cv.Moments(edMat, 1)
hu = cv.GetHuMoments(moments)

Hu moments of a reference shape compared with two other shapes (diff2 = squared difference from the reference):

       Reference          Shape 1             diff2        Shape 2             diff2
Hu1    0.1730651754       0.1606958551        0.000153     0.3608422951        0.0353
Hu2    0.0002714368       0.0000105604        0.00000731   0.0568850707        0.003
Hu3    0.0000133760       0.0000937233        6.455e-9     0.0170774107        0.0003
Hu4    0.0000289668       0.0000001183        8.322e-10    0.0026356011        6.8e-6
Hu5    0.00000000056837   0.000000000000341   3.227e-19    -0.00001702199712   2.9e-7
Hu6    -0.0000004641061   0.000000000269809   2.15e-13     -0.00062800188755   3.94e-7
Hu7    0.00000000004541   -0.000000000000197  2e-21        0.000004785759613   2.3e-11
Total                                         0.0016                           0.0385 (~30x "farther")

3. Histograms
Basically, an n-dimensional plot
Bins (buckets)
Examples:
  1D: In a grayscale image, the number of pixels in each bin (0-5 intensity, 5-10 intensity, ..., 250-255 intensity)
  2D: Hue-Saturation graph (x-axis = hue, y-axis = saturation)
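The 1D case boils down to counting pixels per bucket. A minimal pure-Python sketch, assuming 8-bit intensities and equally wide bins (histogram_1d is an illustrative name):

```python
def histogram_1d(pixels, num_bins, lo=0, hi=256):
    """Count how many values fall into each of `num_bins` equal-width bins on [lo, hi)."""
    counts = [0] * num_bins
    width = (hi - lo) / float(num_bins)
    for p in pixels:
        if lo <= p < hi:
            counts[int((p - lo) / width)] += 1
    return counts


# Four bins of width 64: [0, 64), [64, 128), [128, 192), [192, 256)
counts = histogram_1d([0, 10, 63, 64, 200, 255], 4)
```

Fewer, wider bins give a smoother but coarser picture; more bins keep detail but need more pixels to fill them.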


3. Histograms (creating, 1D)


Assumption: img is a grayscale image (1 channel)
# Create the histogram
num_bins = 25
hist = cv.CreateHist([num_bins], cv.CV_HIST_ARRAY, [[0, 255]], 1)
cv.CalcHist((img,), hist)

# Create the image to visualize it (optional)
scale = 15  # Size of each "bar" in the plot
hist_img = cv.CreateImage((num_bins * scale, 256), 8, 3)
(_, max_val, _, _) = cv.GetMinMaxHistValue(hist)
# Green background
cv.Rectangle(hist_img, (0, 0), (num_bins * scale - 1, 255), cv.RGB(0, 255, 0), cv.CV_FILLED)
for b in range(num_bins):
    # Bar height in pixels (the fullest bin spans the whole image);
    # converted to int because pixel coordinates must be integers
    val = int(255.0 * cv.QueryHistValue_1D(hist, b) / max_val)
    cv.Rectangle(hist_img, (b * scale, 255), ((b + 1) * scale - 1, 255 - val), cv.RGB(255, 0, 0), cv.CV_FILLED)

3. Histograms (creating, 2D)


Assumptions:
  img is a color image (BGR channel order, as loaded by OpenCV)
  mask is a gray image (black = don't count, white = do)

# Convert from BGR to HSV (storing the hue and sat in different
# grayscale images)
hsv = cv.CreateImage(cv.GetSize(img), 8, 3)
cv.CvtColor(img, hsv, cv.CV_BGR2HSV)
h_plane = cv.CreateMat(img.height, img.width, cv.CV_8UC1)
s_plane = cv.CreateMat(img.height, img.width, cv.CV_8UC1)
cv.Split(hsv, h_plane, s_plane, None, None)
planes = [cv.GetImage(h_plane), cv.GetImage(s_plane)]  # CvMat => IplImage

# Create the histogram
hue_bins = 30
sat_bins = 32
hist_size = [hue_bins, sat_bins]
h_ranges = [0, 180]   # Red is ~0 degrees
s_ranges = [15, 255]  # 0 = grayscale, 255 = pure color
ranges = [h_ranges, s_ranges]
hist = cv.CreateHist(hist_size, cv.CV_HIST_ARRAY, ranges, 1)
cv.CalcHist(planes, hist, 0, mask)

A histogram tells us how often a value (color) appears in the image the histogram was built from.
  "Fuller" bin = more prevalent
  "Emptier" bin = less prevalent

3a. Back-projection

Back-Projection example:
Create an image with flesh tones.
Create a histogram from it.
  Hue and saturation are largely the same across skin colors, so a hue-saturation histogram generally ignores race.

Now, given a new color, we can determine how likely it is to be flesh-toned by looking up that spot in the histogram.
  If it's a full bin, it's probably a flesh tone.
  If it's an empty bin, it's probably not a flesh tone.
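The lookup described above can be sketched in plain Python. This assumes pixels are already (hue, saturation) pairs with hue in 0..179 and saturation in 0..255, and it scales the result so the fullest bin maps to 255; the function names and the sample values are illustrative only.

```python
def hs_histogram(samples, h_bins=30, s_bins=32):
    """Count (hue, sat) samples into an h_bins x s_bins grid."""
    hist = [[0] * s_bins for _ in range(h_bins)]
    for h, s in samples:
        hist[h * h_bins // 180][s * s_bins // 256] += 1
    return hist


def back_project(image_hs, hist):
    """Replace each (hue, sat) pixel by how full its bin is, scaled to 0..255."""
    h_bins, s_bins = len(hist), len(hist[0])
    peak = max(max(row) for row in hist) or 1
    return [[255 * hist[h * h_bins // 180][s * s_bins // 256] // peak
             for (h, s) in row]
            for row in image_hs]


# Made-up "flesh tone" samples: three fall in one bin, one in another.
samples = [(10, 150), (11, 151), (10, 150), (20, 200)]
hist = hs_histogram(samples)

image_hs = [[(10, 150), (100, 40)],
            [(20, 200), (11, 151)]]
probability = back_project(image_hs, hist)
```

Bright pixels in the result are colors that were common in the samples; the pixel at hue 100 was never seen, so it comes out black.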


3a. Back-projection
# Convert img (BGR) to HS(V)
hsv = cv.CreateImage(cv.GetSize(img), 8, 3)
cv.CvtColor(img, hsv, cv.CV_BGR2HSV)

# Get images for the hue and sat "planes" of hsv
h_plane = cv.CreateMat(img.height, img.width, cv.CV_8UC1)
s_plane = cv.CreateMat(img.height, img.width, cv.CV_8UC1)
cv.Split(hsv, h_plane, s_plane, None, None)
h_plane = cv.GetImage(h_plane)  # CvMat => IplImage
s_plane = cv.GetImage(s_plane)
hsPlanes = [h_plane, s_plane]

# Do the back-projection. Note: hist was created on a
# previous slide
backPropImg = cv.CreateImage((img.width, img.height), 8, 1)
cv.CalcBackProject(hsPlanes, backPropImg, hist)

3a. Patch-based Back-projection


Similar to regular back-projection, but uses a (w x h) "window" (I used 5 for each)
The window "slides" over each pixel in the image. Let's say it's at pixel (i, j):
  Look at the w x h patch at (i, j)
  Calculate a new histogram from that patch
  Compare it to a reference histogram. The degree of similarity is the value to set (i, j) to in the "probability" image
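The sliding-window idea can be sketched in plain Python. For brevity this assumes a single grayscale channel instead of hue-saturation planes, compares histograms with a correlation score, and returns 0.0 when a histogram has no variation; every name here is illustrative.

```python
import math


def histogram(values, num_bins, hi=256):
    """Count values in [0, hi) into num_bins equal-width bins."""
    counts = [0] * num_bins
    for v in values:
        counts[v * num_bins // hi] += 1
    return counts


def correlation(h1, h2):
    """Correlation between two histograms: 1.0 means identical shape."""
    m1 = sum(h1) / float(len(h1))
    m2 = sum(h2) / float(len(h2))
    num = sum((a - m1) * (b - m2) for a, b in zip(h1, h2))
    den = math.sqrt(sum((a - m1) ** 2 for a in h1) *
                    sum((b - m2) ** 2 for b in h2))
    return num / den if den else 0.0


def back_project_patch(img, ref_hist, patch_w, patch_h, num_bins):
    """Score every patch position; output is (W - w + 1) x (H - h + 1)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h - patch_h + 1):
        row = []
        for x in range(w - patch_w + 1):
            patch = [img[y + dy][x + dx]
                     for dy in range(patch_h) for dx in range(patch_w)]
            row.append(correlation(histogram(patch, num_bins), ref_hist))
        out.append(row)
    return out


# Left half is bright, right half is dark; the reference is "all bright".
img = [[200, 200, 10, 10],
       [200, 200, 10, 10]]
ref_hist = histogram([200] * 4, 4)
scores = back_project_patch(img, ref_hist, 2, 2, 4)
```

The score falls from 1.0 over the bright region to a negative value over the dark region, with the mixed patch in between.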

3a. Patch-based Back-projection


# hist, and hsPlanes are computed as before.
patchW = patchH = 5
# The output is smaller than the input: one value per patch position
backPropImg = cv.CreateImage((img.width - patchW + 1, img.height - patchH + 1), cv.IPL_DEPTH_32F, 1)
cv.CalcBackProjectPatch(hsPlanes, backPropImg, (patchW, patchH), hist, cv.CV_COMP_CORREL, 1)


4. Background-Subtraction
Goal:
  Mark non-background pixels in a mask (1 = non-background, 0 = background)
  Analyze the shape of the non-background pixels.

4. Background Subtraction
Naive approach:
cv.AbsDiff(curFrame, bgOnlyFrame, diffImg)
# Maybe a threshold now, erosion, dilate, etc.

Problems:
  A lot of frame-to-frame noise
  Webcam auto-adjusting intensity (@#$! Logitechs)
  Clouds passing by, trees waving in wind, ...
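A small pure-Python sketch of the naive approach and one of its failure modes, assuming grayscale frames stored as lists of rows. foreground_mask and the threshold of 30 are illustrative choices, not values from these slides.

```python
def foreground_mask(cur_frame, bg_frame, thresh=30):
    """1 where the current frame differs from the background by more than thresh."""
    return [[1 if abs(c - b) > thresh else 0
             for c, b in zip(cur_row, bg_row)]
            for cur_row, bg_row in zip(cur_frame, bg_frame)]


background = [[50, 50, 50],
              [50, 50, 50]]

# An object in the middle column, plus a little sensor noise elsewhere.
with_object = [[52, 200, 49],
               [50, 210, 51]]
mask_object = foreground_mask(with_object, background)

# Same empty scene, but the camera raised its exposure by 40 levels.
brighter = [[p + 40 for p in row] for row in background]
mask_brighter = foreground_mask(brighter, background)
```

The threshold absorbs the small noise, but a global brightness change marks every pixel as foreground even though nothing moved.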

A better approach: see example04
