
Sketching in the Air: A Vision-Based System for 3D Object Design

Yu Chen1, Jianzhuang Liu1, Xiaoou Tang1,2

1 Department of Information Engineering, The Chinese University of Hong Kong
2 Microsoft Research Asia, Beijing, China

Abstract

3D object design has many applications, including flexible 3D sketch input in CAD, computer games, webpage content design, image-based object modeling, and 3D object retrieval. Most current 3D object design tools work on a 2D drawing plane such as a computer screen or tablet, which is often inflexible, with one dimension lost. On the other hand, virtual-reality-based methods have the drawbacks that awkward devices must be worn by the user and the virtual environment systems are expensive. In this paper, we propose a novel vision-based approach to 3D object design. Our system consists of a PC, a camera, and a mirror. We use the camera and mirror to track a wand so that the user can design 3D objects by sketching in 3D free space directly, without having to wear any cumbersome devices. A number of new techniques are developed for working in this system, including input of object wireframes, gestures for editing and drawing objects, and optimization-based planar and curved surface generation. Our system provides designers a new user interface for designing 3D objects conveniently.

Fig. 1. The sketching system.

1. Introduction

Despite great progress of 3D modeling in current computer-aided design (CAD) tools, creating 3D objects with these tools is still a tedious job, since they require users to work on a 2D drawing plane. Design in virtual 3D environments enables users to draw objects in 3D space, but this method has the drawbacks that awkward devices must be worn by the user and the virtual environments are expensive.

In this paper, we propose a novel vision-based approach to 3D object design. Different from current techniques, it works in 3D space without any devices connected to the user. Our target is to develop an inexpensive system that allows the user to design 3D objects conveniently. Our system consists of a PC, a camera, and a mirror, as shown in Fig. 1. In the system, the user designs 3D objects by sketching the wireframes of the objects in the air with an easily tracked wand. A number of sketching and editing operations are developed to facilitate object design. In real time, the 3D positions of the strokes of the wand are captured, and the wireframes and surfaces being developed are displayed on the PC screen to guide the user in drawing more and more complex objects.

The system provides a whole new way of 3D object design. It requires no special equipment and is easy to set up and use. Its applications include flexible 3D sketch input in CAD, games, education, and webpage content design, generation of 3D objects from 2D images, and a user-friendly query interface for 3D object retrieval.

2. Related Work

Great effort has been made to develop CAD systems for 3D model design in the past three decades. Current techniques can be classified into the following four categories:

1) Traditional CAD tools such as AutoCAD [1] and SolidWorks [2]. These are sophisticated systems suitable for engineers to input the precise geometry of models, but they are not suitable for designers who need to rapidly express their ideas at the initial stage of model development.

2) Automatic 3D object reconstruction from 2D line drawings. This is one of the main research topics in computer vision and graphics. The methods are mainly based on line labeling, algebra, image regularities, and optimization [18], [11], [17], [12], [5]. The critical problem of these methods is that they can handle only relatively simple planar objects at the current stage.

978-1-4244-2243-2/08/$25.00 ©2008 IEEE
3) Sketch-based modeling user interfaces. Most traditional designers still prefer pencil and paper to the mouse and keyboard of current CAD systems for sketching their ideas of shapes. To bridge the gap between flexible 2D sketches and rigid CAD systems, researchers have developed tools that try to convert 2D sketches into 3D models [17], [19], [8], [3], [9]. However, one physical limitation that cannot be overcome by these tools is that the sketching and editing operations are performed on a 2D plane (tablet or screen). With one dimension missing, the 3D positions of the strokes, surfaces, and objects drawn on a 2D plane are often ambiguous.

4) Design in virtual 3D environments. Virtual reality (VR) has been thought to be the perfect CAD system, because the designer can work naturally and intuitively in a real 3D environment. However, such systems face problems in the cost of the equipment, inflexibility of use, and slow frame update rates. Researchers are trying to develop better techniques for 3D design in VR [16], [10], [4], [7], but these need special and awkward devices operated by, or connected to, the user, making the design an unnatural process.

From the discussion above, we can see that the current methods for 3D model design are not good enough. Researchers still need to develop more friendly and inexpensive interfaces with better design methodology.

3. The Sketching System

As shown in Fig. 1, our system consists of only a video camera, a mirror, and a PC. The user draws an object with a wand in 3D free space. The tip of the wand is colored so that it is easy to track. The basic idea of 3D design in this system is that a 3D wireframe of an object is obtained by tracking the movement of the wand in 3D space, and then an automatic filling-in process generates a surface from the wireframe. We propose this system based on the observation that a designer thinks not in terms of surfaces, but rather in terms of the feature curves of an object, which construct the wireframe. The whole flow chart of our system is shown in Fig. 2.

Fig. 2. Flow chart of 3D object design in the sketching system.

3.1. 3D Geometry of the System

Being able to find the 3D position of the tip of the wand is the first step. In the system, the world frame (X, Y, Z) is defined as in Fig. 3, where the XY plane is parallel to the image plane xy of the camera and the distance between the origin O and the image plane is equal to the focal length f. With a simple calibration, the YZ plane can be set orthogonal to the mirror. The angle θ between the Z axis and the mirror is less than 90 degrees, so that the tip of the wand P1 = (X1, Y1, Z1)^T and its image P2 = (X2, Y2, Z2)^T in the mirror always project to two different points p1 = (x1, y1, f)^T and p2 = (x2, y2, f)^T on the image plane. To find the 3D coordinates of P1, we need to know θ and Z0, where Z0 is the distance from O to A and A is the intersection of the Z axis and the mirror. The two parameters θ and Z0 can be obtained by the calibration scheme discussed in Section 3.3. The 3D position P1 = (X1, Y1, Z1)^T can then be determined from the known θ, Z0, p1, p2, and the geometric relationship shown in Fig. 3. The formula for calculating P1 is derived in Section 3.2.

Fig. 3. Geometry of the system.

Our system is able to determine the 3D positions of the wand with sufficient accuracy. We have also tested the traditional stereo method, which uses two cameras to find the depth of a spatial point. From our experiments, we have found that the new method has the following advantages: (a) it is easier to calibrate; (b) it takes less time to track the wand, since only one video sequence has to be handled; and (c) it provides a larger 3D working space for object drawing if the volumes needed to set up the two systems are the same. The last advantage comes from the fact that with the traditional stereo method, the tip of the wand must appear in the image sequences of both cameras, which limits the 3D drawing space.

3.2. Locating the Wand

Locating the 3D position P1 of the wand is the first step for the system to work. We can represent P1 = (X1, Y1, Z1)^T in terms of the points p1 and p2 on the image plane and the parameters f, θ, and Z0. First, from the geometric relationship in Fig. 3, it is straightforward to relate the spatial positions to the positions in the image plane by

Xi = xi Zi / f,   Yi = yi Zi / f,   i = 1, 2.   (1)
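As a concrete illustration, the back-projection in Eq. (1) combined with the closed-form depth Z1 of Eq. (6), derived below, recovers P1 from the two image points. A minimal Python sketch (synthetic values, assuming an ideal pin-hole camera and already-calibrated θ and Z0; not the authors' implementation):

```python
import numpy as np

def wand_position(p1, p2, f, theta, Z0):
    """Recover P1 = (X1, Y1, Z1) from the image p1 of the real tip and
    the image p2 of its mirror reflection: Eq. (6) for Z1, Eq. (1) for
    X1 and Y1."""
    x1, y1 = p1
    y2 = p2[1]
    s, c = np.sin(theta), np.cos(theta)
    s2, c2 = np.sin(2 * theta), np.cos(2 * theta)
    Z1 = (2 * f * Z0 * s * (y2 * s - f * c)) / (
        (y1 * y2 - f ** 2) * s2 - f * (y1 + y2) * c2)   # Eq. (6)
    return np.array([x1 * Z1 / f, y1 * Z1 / f, Z1])     # Eq. (1)

# Self-consistency check: reflect a known tip P1 across the mirror
# plane (through A = (0, 0, Z0), unit normal n = (0, cos θ, sin θ)),
# project both points through a pin-hole camera, and recover P1 again.
theta, Z0, f = np.deg2rad(70.0), 1.0, 800.0
P1 = np.array([0.1, 0.2, 0.6])
n = np.array([0.0, np.cos(theta), np.sin(theta)])
A = np.array([0.0, 0.0, Z0])
P2 = P1 - 2 * np.dot(P1 - A, n) * n   # virtual (mirrored) tip
p1 = f * P1[:2] / P1[2]               # image of the real tip
p2 = f * P2[:2] / P2[2]               # image of the reflection
assert np.allclose(wand_position(p1, p2, f, theta, Z0), P1, atol=1e-6)
```

Note that only the y coordinate of p2 is needed: the reflection leaves X unchanged, so x2 carries no extra depth information.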
Fig. 4. Geometry of the system on the YZ plane.

We now consider the geometry of the system on the YZ plane, as shown in Fig. 4, where P and P′ are the projections of P1 and P2 onto the YZ plane, respectively. Let B be the midpoint of PP′, and let the three lines PQ, P′Q′, and BC be perpendicular to the axis OZ. Then |PQ| + |P′Q′| = 2|BC| and |OQ| + |OQ′| = 2|OC|. Hence, we have

Y2 = |P′Q′| = 2|BC| − |PQ| = 2H sin θ − y1 Z1 / f,   (2)
Z2 = |OQ′| = 2|OC| − |OQ| = 2(Z0 − H cos θ) − Z1.   (3)

On the other hand,

H = |AB| = |PQ| sin θ + |AQ| cos θ = y1 Z1 sin θ / f − (Z1 − Z0) cos θ.   (4)

From (1), (2), (3), and (4), we have

y2 / f = Y2 / Z2 = (2H sin θ − y1 Z1 / f) / (2(Z0 − H cos θ) − Z1)
       = (2 y1 Z1 sin²θ − (Z1 − Z0) f sin 2θ − y1 Z1) / (2 Z0 f sin²θ − y1 Z1 sin 2θ + Z1 f cos 2θ).   (5)

Finally, Z1 is obtained from (5) as

Z1 = 2 f Z0 sin θ (y2 sin θ − f cos θ) / ((y1 y2 − f²) sin 2θ − f (y1 + y2) cos 2θ).   (6)

Given (1), we immediately have the other two coordinates of P1: X1 = x1 Z1 / f and Y1 = y1 Z1 / f. The position of the wand P1 in 3D space is hence determined once f, θ, and Z0 are known.

3.3. Calibration

The calibration process finds the parameters θ and Z0. It is reasonable to assume that the Z axis in Fig. 3 is orthogonal to the image plane and passes through the center of the image displayed on the screen. The focal length f is known from the camera and is fixed in the system. To calibrate the system, we print a rectangle (Fig. 5(a)) on a white page and place this page on the central part of the mirror with the side RS approximately parallel to the ground. The rectangle is captured by the camera and displayed on the screen as shown in Fig. 5(b). Then we adjust the position of the camera so that the point a coincides with the image center o, the side rs is horizontal, and w1 = w2 in the image. This simple adjustment makes the YZ plane orthogonal to the mirror. With the known side lengths W and H of the rectangle in Fig. 5(a), and h and w1 measured in the image, we can find θ and Z0 from

w1 / W = h / (H sin θ) = f / (Z0 − H cos θ),   (7)

which is derived from the geometry shown in Fig. 6.

Fig. 5. (a) A rectangle for calibration. (b) The rectangle in the image.

Fig. 6. Geometry of the YZ plane in calibration.

4. Wireframe Input and Object Editing

One difficulty in generating a wireframe of an object lies in the fact that the strokes drawn are invisible to the user. However, they are visible to the camera, and the trace of the moving wand and what has been drawn can be displayed on the screen as feedback to the user, which can be used to guide the sketching process.

First, we propose a solution for locating a position on an unfinished wireframe in order to continue drawing. While the user moves the wand in space, the closest point on the unfinished wireframe to the tip is computed, and a different color is shown on the screen to indicate this point. In this way, the user can find a position to draw from without difficulty. When this position is found, the user presses a key to inform the system, and the subsequent movement of the tip is taken as a new edge of the wireframe.

Second, to distinguish a drawing stroke from non-drawing movement of the wand, we use the keyboard to let the system know when a stroke begins and ends. Besides, we have also developed a number of sketching and editing operations to facilitate object design, such as extrusion, moving, copying, rotation, and zooming. These 3D operations and gestures are summarized in Table 1. The keyboard is used to control the start and stop of a 3D gesture shown by the movement of the wand.
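The closest-point feedback of Section 4 can be implemented per edge. A minimal sketch, assuming the wireframe is stored as straight 3D segments (curved edges would first be sampled into short segments); the data here are illustrative, not from the paper:

```python
import numpy as np

def closest_point_on_wireframe(tip, segments):
    """Return the point on the wireframe closest to the wand tip.

    `segments` is a list of (a, b) pairs, each a 3D endpoint of one
    straight edge."""
    best, best_d2 = None, np.inf
    for a, b in segments:
        a, b = np.asarray(a, float), np.asarray(b, float)
        ab = b - a
        # Parameter of the orthogonal projection, clamped to the segment.
        t = np.clip(np.dot(tip - a, ab) / np.dot(ab, ab), 0.0, 1.0)
        q = a + t * ab
        d2 = np.dot(tip - q, tip - q)
        if d2 < best_d2:
            best, best_d2 = q, d2
    return best

# Example: a unit square in the XY plane as an unfinished wireframe.
square = [((0, 0, 0), (1, 0, 0)), ((1, 0, 0), (1, 1, 0)),
          ((1, 1, 0), (0, 1, 0)), ((0, 1, 0), (0, 0, 0))]
tip = np.array([0.5, -0.3, 0.2])
print(closest_point_on_wireframe(tip, square))  # a point on the bottom edge
```

In the running system this search would be repeated every frame against the tracked tip position, and the returned point highlighted on screen.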
Table 1. Gestures and operations defined in the system.

  Keyboard     Function
  m / r        move / copy the selected part
  Left/Right   rotate about the Y axis
  Up/Down      rotate about the X axis
  + / -        zoom in / out
  d            mode 1: move the position of a vertex or a control point
               mode 2: sketch a patch by dragging a straight/curved edge
               mode 3: sketch a pyramid/cone structure by dragging a patch
               mode 4: sketch a rectangular/cylindrical volume by dragging a patch
  a            draw a straight line / curve
  c / e        mode 1: draw a circle / ellipse
               mode 2: draw a body of revolution from a straight/curved edge
  q / w        adjust the scaling of the selected part
  z / x        adjust the curvature of curved strokes
  1-4          change editing modes
  F1           switch between the curve mode and the straight-line mode

Fig. 7. Extrusion of the circle along the curve.

All the operations can be performed with the wand and the keyboard, without resorting to the mouse, thus allowing continuous design: the user moves the wand with one hand and hits the keyboard with the other. The system also allows the user to switch between two drawing modes: the curve mode and the straight-line mode. In the curve mode, smooth Bezier curves are generated to fit the path of the wand movement when forming the wireframe. In the straight-line mode, the path information is discarded, and only the start and end positions of each stroke are used to generate straight lines.

Compared with traditional sketch-based editing operations on a 2D plane, many of these 3D operations have clear advantages. For example, if we want to draw a duct by extruding a closed circle along an arbitrary open curve (see Fig. 7), a 2D system encounters two problems: (a) whether the closed curve is a circle or an ellipse, and its orientation in 3D space, are unclear; (b) the 3D trail of the open curve is impossible to determine. These problems do not exist in our system.

5. Surface Generation

Once we have obtained the wireframe of an object in 3D space using the sketching scheme described in the previous section, the next step is to generate the 3D surface from the wireframe to finally reconstruct the object. Surface generation can be divided into two steps. The first is to identify all the faces, i.e., the circuits in the wireframe that represent the patches constituting the whole surface; the second step is to generate these patches, either planar or curved, from their boundaries.

5.1. Face Identification

Given a wireframe, before filling it in with surface patches, we have to identify the circuits that represent these patches (faces). Since the wireframe may represent a manifold or non-manifold solid, a sheet of surface, or a combination of them, with or without holes, identifying the faces is not a trivial problem, due to the combinatorial explosion of the number of circuits in the wireframe [13], [14], [15]. To solve this problem, we use the algorithm proposed in [15] to detect the faces of a wireframe. In our interactive system, the user can also select the edges of a face manually for face identification, which helps fix faces occasionally detected wrongly by the algorithm.

5.2. Planar Surface Generation

An object may have many planar faces. In the system, a straight line replaces a stroke in the straight-line mode. It is reasonable to consider a face planar if all its edges are straight lines. However, for a planar face with more than three vertices, it is unlikely that all the vertices lie exactly on a plane in 3D space, due to the inaccuracy of the measurement and the input during the sketching process. Filling in these circuits with triangular patches would make the object look distorted. To solve this problem, we propose an automatic correction algorithm.

After face identification from a (partial) wireframe, we know the circuits representing planar faces. From the vertices of these circuits, a fitting algorithm is used to find a set of planes that best fit the planar circuits. We represent a plane passing through face j by its normal vector fj = (aj, bj, cj)^T and a scalar dj. Then any vertex v = (x, y, z)^T on this plane satisfies the linear equation aj x + bj y + cj z − dj = 0, or v^T fj = dj.

We expect that for each identified face, the corrected positions of its vertices are as close to the fitting plane as possible. Besides, for each vertex, its corrected position should not deviate too much from its initial position. Let Vj be the set of the vertices of face j. The objective function to be minimized is defined as

Q(v1, · · · , vN, f1, · · · , fM, d1, · · · , dM) = Σ_{i=1}^{N} ||vi − v0i||² + β Σ_{j=1}^{M} Σ_{i∈Vj} ||vi^T fj − dj||² / ||fj||²,   (8)
where v1, v2, · · · , vN are the corrected positions of the N vertices; (f1, d1), (f2, d2), · · · , (fM, dM) are the parameters of the M planar faces; v01, v02, · · · , v0N are the positions of the N vertices in the original sketch; and β is a weighting factor. The goal of the optimization is to find the corrected positions vi, i = 1, 2, · · · , N, and the fitting planes (fj, dj), j = 1, 2, · · · , M, such that Q is minimized.

We solve this optimization problem iteratively. Let V = {vi}_{i=1}^{N}, F = {fj}_{j=1}^{M}, and D = {dj}_{j=1}^{M}. Also let V^n = {vi^n}, F^n = {fj^n}, and D^n = {dj^n} be the optimization results after the nth iteration. The optimization problem is divided into two iterative minimization steps, and a closed-form solution is available in each step.

Step 1: Face fitting.

(F^(n+1), D^(n+1)) = arg min_{F,D} Q(V^n, F, D).   (9)

Step 2: Vertex correction.

V^0 = {v0i}_{i=1}^{N},   (10)
V^(n+1) = arg min_V Q(V, F^(n+1), D^(n+1)).   (11)

In Step 1, we fit planes to all the identified planar faces using the updated vertex positions obtained in the previous iteration. The optimal fitting is obtained as follows. First, by solving ∂Q/∂dj = 0, we have

dj = (1/|Vj|) Σ_{i∈Vj} vi^T fj,   j = 1, 2, · · · , M.   (12)

Substituting (12) into (8), after some algebraic manipulation, we can transform the problem in (9) into minimizing the Rayleigh quotient fj^T Sj fj / (fj^T fj) with respect to each fj, j = 1, 2, · · · , M, where Sj = Σ_{i∈Vj} (vi − v̄j)(vi − v̄j)^T is the covariance matrix and v̄j = (1/|Vj|) Σ_{i∈Vj} vi. Furthermore, minimizing this Rayleigh quotient reduces to the eigen-problem

Sj fj = λ_{j,min} fj,   j = 1, 2, · · · , M,   (13)

with fj being the eigenvector corresponding to the minimum eigenvalue λ_{j,min} of Sj. From (12) and (13), we obtain the closed-form solutions fj^(n+1) and dj^(n+1), j = 1, 2, · · · , M, in terms of vi^n, i = 1, 2, · · · , N.

Step 2 minimizes Q given the fitting planes obtained in Step 1. From ∂Q/∂vi = 0, i = 1, 2, · · · , N, we have

∂Q/∂vi = 2(vi − v0i) + 2β Σ_{j∈Fi} (vi^T fj − dj) fj / ||fj||² = 0,   (14)

where Fi is the set of the faces containing vertex i. The closed-form solution to (14) yields the following equation for computing vi^(n+1):

vi^(n+1) = (R^(n+1))^(−1) v̂i^(n+1),   i = 1, 2, · · · , N,   (15)

where

R^(n+1) = I + β Σ_{j∈Fi} fj^(n+1) (fj^(n+1))^T / ||fj^(n+1)||²,   (16)
v̂i^(n+1) = v0i + β Σ_{j∈Fi} dj^(n+1) fj^(n+1) / ||fj^(n+1)||².   (17)

Our experiments have shown that the algorithm is effective and converges within several iterations.

5.3. Smooth Curved Surface Generation

After a curved stroke is finished, we approximate it with a Bezier curve to obtain a smooth curve. Once we have a (partial) wireframe with identified faces (circuits) and curves represented by Bezier curves, we fill in the circuits denoting curved faces with smooth surface patches. Bilinearly blended Coons patches [6] are used for this purpose.

6. Experiments

In this section, we show a number of examples to demonstrate the performance of our system. The system is implemented in Visual C++, running on a PC with a 3.4 GHz Pentium IV CPU. The parameter β in (8) is chosen to be 10. Our experiments show that the system is insensitive to this parameter; very similar results are obtained when β varies in [5, 20]. The wand tracking and display modules work in real time at a rate of 10 frames per second. The system can track the wand moving at a maximum speed of about one meter per second. A new user usually needs two or three hours of training to get adapted to simultaneous keyboard and wand operation. Strokes are preprocessed, and the displayed wireframes are composed of straight lines and smooth curves; the jittering of the strokes caused by natural hand tremor is smoothed out.

Fig. 8 shows a set of wireframes created with our system. For each wireframe in the first column, we give the corrected wireframe in the second column and the 3D reconstruction result, displayed in two views, in the third and fourth columns. The results show that the correction step (Section 5.2) is effective and that the faces of the reconstructed objects are planar.

Fig. 9 shows a set of more complex objects that include both straight-line and curved strokes. We can see that our system can handle complex wireframes sketched in the air and generate the expected 3D objects.

The time used for the face identification and the generation of planar and curved surfaces of a scene is between 3 and 19 seconds, depending on the complexity of the scene.
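The two alternating steps of the planar correction in Section 5.2 map directly onto a covariance eigen-decomposition (Eqs. (12)-(13)) and a small linear solve per vertex (Eqs. (15)-(17)). A minimal Python sketch with β = 10, as in the experiments; the cube wireframe below is illustrative, not one of the paper's test objects:

```python
import numpy as np

def correct_wireframe(v0, faces, beta=10.0, iters=10):
    """Planar correction: alternately fit planes (Eqs. 12-13) and
    re-solve vertex positions in closed form (Eqs. 15-17).

    v0    : (N, 3) array of sketched vertex positions
    faces : list of vertex-index lists, one per planar face
    """
    v = v0.copy()
    for _ in range(iters):
        # Step 1: plane normal f_j = smallest-eigenvalue direction of
        # the face's vertex covariance; offset d_j from Eq. (12).
        f, d = [], []
        for face in faces:
            pts = v[face]
            m = pts.mean(axis=0)
            S = (pts - m).T @ (pts - m)   # covariance matrix S_j
            _, U = np.linalg.eigh(S)
            fj = U[:, 0]                  # eigenvector of lambda_min
            f.append(fj)
            d.append(float(m @ fj))       # Eq. (12); ||f_j|| = 1 here
        # Step 2: vertex correction, Eqs. (15)-(17).
        for i in range(len(v)):
            R = np.eye(3)
            rhs = v0[i].copy()
            for j, face in enumerate(faces):
                if i in face:
                    R += beta * np.outer(f[j], f[j])
                    rhs += beta * d[j] * f[j]
            v[i] = np.linalg.solve(R, rhs)
    return v

# Illustrative input: a unit cube with noisy vertices, six planar faces.
rng = np.random.default_rng(0)
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)],
                dtype=float)
faces = [[0, 1, 3, 2], [4, 5, 7, 6], [0, 1, 5, 4],
         [2, 3, 7, 6], [0, 2, 6, 4], [1, 3, 7, 5]]
noisy = cube + 0.05 * rng.standard_normal(cube.shape)
corrected = correct_wireframe(noisy, faces)

def planarity(v):
    # Largest distance of a face vertex from its best-fit plane.
    errs = []
    for face in faces:
        pts = v[face]
        m = pts.mean(axis=0)
        _, U = np.linalg.eigh((pts - m).T @ (pts - m))
        errs.append(np.abs((pts - m) @ U[:, 0]).max())
    return max(errs)

assert planarity(corrected) < planarity(noisy)
```

Because each half-step minimizes Q over one block of variables with the other fixed, Q is non-increasing across iterations, which is consistent with the fast convergence reported above.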
For instance, the system takes 3 and 19 seconds, respectively, to reconstruct the second object in Fig. 8 and the fourth object in Fig. 9, after the wireframes are obtained.

Fig. 8. Experimental results. The axes of the 3D coordinate system are also shown.

Fig. 9. More experimental results.

7. Conclusion and Future Work

We have developed a novel 3D vision-based sketching system with a simple and inexpensive interface that allows users to sketch objects directly in 3D space. A number of new techniques are proposed for working in this system, including input of object wireframes, gestures for editing and drawing objects, and optimization-based planar and curved surface generation. Experiments have verified its efficacy in designing 3D objects. The system is still being improved; at the current stage, a new user usually needs two or three hours of training to get adapted to simultaneous keyboard and wand operation. Our future work includes 1) developing more gestures and operations to handle complex objects, and 2) improving the tracking algorithm to support more accurate 3D positioning of the wand movement.

8. Acknowledgements

The work described in this paper was fully supported by a grant from the Research Grants Council of the Hong Kong SAR, China (Project No. CUHK 414306).

References

[1] Autodesk Inc. AutoCAD.
[2] SolidWorks Corporation. SolidWorks.
[3] A. Alexe, L. Barthe, M. Cani, and V. Gaildrat. Shape modeling by sketching using convolution surfaces. Proc. Pacific Graphics, 2005.
[4] R. Amicis, F. Bruno, A. Stork, and M. Luchi. The eraser pen: a new interaction paradigm for curve sketching in 3D. Proc. 7th Int'l Design Conference, 1:465-470, 2002.
[5] Y. Chen, J. Liu, and X. Tang. A divide-and-conquer approach to 3D object reconstruction from line drawings. ICCV, 2007.
[6] G. Farin. Curves and Surfaces for Computer Aided Geometric Design: A Practical Guide. Academic Press, 1997.
[7] T. Grossman, R. Balakrishnan, and K. Singh. An interface for creating and manipulating curves using a high degree-of-freedom curve input device. SIGCHI Conference, pages 185-192, 2003.
[8] T. Igarashi, S. Matsuoka, and H. Tanaka. Teddy: a sketching interface for 3D freeform design. SIGGRAPH, pages 406-416, 1999.
[9] O. Karpenko and J. Hughes. SmoothSketch: 3D free-form shapes from complex sketches. SIGGRAPH, pages 589-598, 2006.
[10] D. Keefe, D. Feliz, T. Moscovich, D. Laidlaw, and J. LaViola. CavePainting: a fully immersive 3D artistic medium and interactive experience. Symposium on Interactive 3D Graphics, pages 85-93, 2001.
[11] H. Lipson and M. Shpitalni. Optimization-based reconstruction of a 3D object from a single freehand line drawing. Computer-Aided Design, 28(8):651-663, 1996.
[12] J. Liu, L. Cao, Z. Li, and X. Tang. Plane-based optimization for 3D object reconstruction from single line drawings. IEEE Trans. PAMI, 30(2):315-327, 2008.
[13] J. Liu and Y. Lee. A graph-based method for face identification from a single 2D line drawing. IEEE Trans. PAMI, 23(10):1106-1119, 2001.
[14] J. Liu, Y. Lee, and W. K. Cham. Identifying faces in a 2D line drawing representing a manifold object. IEEE Trans. PAMI, 24(12):1579-1593, 2002.
[15] J. Liu and X. Tang. Evolutionary search for faces from line drawings. IEEE Trans. PAMI, 27(6):861-872, 2005.
[16] S. Schkolne, M. Pruett, and P. Schroder. Surface drawing: creating organic 3D shapes with the hand and tangible tools. SIGCHI Conference, pages 261-268, 2001.
[17] A. Shesh and B. Chen. Smartpaper: An interactive and user friendly sketching system. Proc. Eurographics, 2004.
[18] K. Sugihara. Machine Interpretation of Line Drawings. MIT Press, 1986.
[19] R. Zeleznik, K. Herndon, and J. Hughes. SKETCH: an interface for sketching 3D scenes. SIGGRAPH, pages 163-170, 1996.