


3D object reconstruction has been an active research area for a long time. Over the years, many methods have been devised to make the reconstructed 3D model as accurate as possible. In this project report, we present a method to obtain a 3D model from a point cloud, which is generated from a set of 2D images. The software packages used for this purpose are VisualSFM, MeshLab, and Blender.

A point cloud is a set of data points in some coordinate system.
In a three-dimensional coordinate system, these points are usually
defined by X, Y and Z coordinates, and often are intended to represent
the external surface of an object.
Point clouds may be created by 3D scanners. These devices
measure a large number of points on an object's surface, and often
output a point cloud as a data file. The point cloud represents the set of
points that the device has measured.
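As an illustration, a point cloud at its simplest is just a list of coordinates. The sketch below, using made-up sample points, computes two quantities that scanning tools routinely report: the centroid and the axis-aligned bounding box.

```python
# A point cloud at its simplest: a list of (x, y, z) tuples.
# Hypothetical sample data; a real scanned cloud has millions of points.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]

def centroid(points):
    """Mean position of all points in the cloud."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def bounding_box(points):
    """Axis-aligned min/max corners enclosing the cloud."""
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    return mins, maxs

print(centroid(cloud))      # (0.25, 0.5, 0.75)
print(bounding_box(cloud))  # ((0.0, 0.0, 0.0), (1.0, 2.0, 3.0))
```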
As the output of a 3D scanning process, point clouds are used for many purposes, including creating 3D CAD models for manufactured parts, metrology and quality inspection, and a multitude of visualization, animation, rendering, and mass customization applications.
While point clouds can be directly rendered and inspected, they are generally not directly usable in most 3D applications, and are therefore usually converted to polygon mesh or triangle mesh models, NURBS surface models, or CAD models through a process commonly referred to as surface reconstruction.
There are many techniques for converting a point cloud to a 3D
surface. Some approaches, like Delaunay triangulation, alpha shapes,
and ball pivoting, build a network of triangles over the existing vertices
of the point cloud, while other approaches convert the point cloud into a
volumetric distance field and reconstruct the implicit surface so defined
through a marching cubes algorithm.
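The volumetric route can be sketched in a few lines: sample, on a small grid, the distance to the nearest cloud point; marching cubes would then extract triangles from grid cells whose corner values straddle a chosen threshold. A minimal sketch, assuming a brute-force nearest-point search and a single made-up cloud point:

```python
import math

# Hypothetical tiny cloud and a coarse grid; real reconstructions use
# signed distances and far finer grids before running marching cubes.
cloud = [(0.5, 0.5, 0.5)]

def distance_field(points, res=3, extent=1.0):
    """Sample the distance to the nearest cloud point on a res^3 grid."""
    step = extent / (res - 1)
    field = {}
    for i in range(res):
        for j in range(res):
            for k in range(res):
                g = (i * step, j * step, k * step)
                field[(i, j, k)] = min(math.dist(g, p) for p in points)
    return field

field = distance_field(cloud)
# The implicit surface is the set where the field equals some radius r;
# marching cubes extracts triangles from cells whose corners straddle r.
print(field[(1, 1, 1)])  # 0.0 -- this grid node coincides with the point
```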

Point clouds can also be used to represent volumetric data, as used, for example, in medical imaging, where they enable multi-sampling and data compression.
In geographic information systems, point clouds are one of the sources used to make digital elevation models of terrain. Point clouds are also employed to generate 3D models of urban environments.
There are two kinds of point cloud: sparse and dense. In this project, we use a dense cloud for the 3D reconstruction.

A polygon mesh is a collection of vertices, edges and faces that
defines the shape of a polyhedral object in 3D computer graphics and
solid modeling. The faces usually consist of triangles (triangle mesh),
quadrilaterals, or other simple convex polygons, since this simplifies
rendering, but may also be composed of more general concave
polygons, or polygons with holes.
The study of polygon meshes is a large sub-field of computer
graphics and geometric modeling. Different representations of polygon
meshes are used for different applications and goals.
As polygonal meshes are extensively used in computer graphics,
algorithms also exist for ray tracing, collision detection, and rigid-body
dynamics of polygon meshes.
A vertex is a position along with other information such as color,
normal vector and texture coordinates. An edge is a connection between
two vertices. A face is a closed set of edges, in which a triangle face has
three edges, and a quad face has four edges. A polygon is a coplanar set
of faces. In systems that support multi-sided faces, polygons and faces
are equivalent. However, most rendering hardware supports only 3- or
4-sided faces, so polygons are represented as multiple faces.
Mathematically a polygonal mesh may be considered an unstructured
grid, or undirected graph, with additional properties of geometry, shape
and topology.
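These definitions can be checked on a toy mesh. The sketch below, using a tetrahedron as an assumed example, derives the edge set from the face list and verifies Euler's formula V - E + F = 2, which holds for any closed genus-0 mesh:

```python
# Minimal mesh structure: vertex list plus faces as vertex-index triples.
# A tetrahedron is used as an assumed toy example.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]

def edges_of(faces):
    """Derive the undirected edge set from the face list."""
    edges = set()
    for f in faces:
        # Walk each face boundary, pairing consecutive vertices.
        for a, b in zip(f, f[1:] + f[:1]):
            edges.add(frozenset((a, b)))
    return edges

E = edges_of(faces)
print(len(E))                                   # 6 edges
print(len(vertices) - len(E) + len(faces))      # 2, per Euler's formula
```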


3D reconstruction from multiple images is the creation of three-dimensional models from a set of images. It is the reverse process of
obtaining 2D images from 3D scenes.

The essence of an image is a projection from a 3D scene onto a 2D

plane, during which process the depth is lost. The 3D point
corresponding to a specific image point is constrained to be on the line
of sight. From a single image, it is impossible to determine which point
on this line corresponds to the image point. If two images are available,
then the position of a 3D point can be found as the intersection of the
two projection rays. This process is referred to as triangulation. The key to this process is the relations between multiple views, which convey the information that corresponding sets of points must contain some structure, and that this structure is related to the poses and the calibration of the cameras.
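Triangulation can be sketched directly: given each camera's centre and the viewing ray through the matched image point, the 3D point is recovered as the midpoint of the closest points on the two rays (with noisy measurements the rays rarely intersect exactly). All values below are assumed toy data:

```python
# Triangulation sketch: recover a 3-D point from two viewing rays.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate(o1, d1, o2, d2):
    """Midpoint of the closest points on two rays (origin o, direction d)."""
    w0 = tuple(a - b for a, b in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b          # zero only for parallel rays
    s = (b * e - c * d) / denom    # parameter of closest point on ray 1
    t = (a * e - b * d) / denom    # parameter of closest point on ray 2
    p1 = tuple(o + s * x for o, x in zip(o1, d1))
    p2 = tuple(o + t * x for o, x in zip(o2, d2))
    return tuple((u + v) / 2 for u, v in zip(p1, p2))

# Two cameras at (0,0,0) and (2,0,0) both observing the point (1, 1, 5):
print(triangulate((0, 0, 0), (1, 1, 5), (2, 0, 0), (-1, 1, 5)))
# (1.0, 1.0, 5.0)
```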
In recent decades, there has been an important demand for 3D content for computer graphics, virtual reality, and communication, triggering a change in emphasis for the requirements. Many existing systems for constructing 3D models are built around specialized hardware (e.g. stereo rigs), resulting in a high cost that cannot satisfy the requirements of these new applications. This gap stimulates the use of ordinary digital imaging facilities (such as a camera). Moore's law also tells us that more work can be done in software.
The task of converting multiple 2D images into a 3D model consists of a series of processing steps:
Camera calibration consists of recovering the intrinsic and extrinsic parameters, without which no arrangement of algorithms can work. The dotted line between calibration and depth determination represents that camera calibration is usually required for determining depth.
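The role of the intrinsic parameters can be illustrated with a pinhole projection: focal lengths and principal point map a point in camera coordinates to pixel coordinates. The extrinsic parameters (rotation and translation), omitted here for brevity, would first transform world coordinates into camera coordinates. The numbers below are assumed example values:

```python
# Pinhole projection using assumed example intrinsics.
fx, fy = 800.0, 800.0      # focal lengths in pixels
cx, cy = 320.0, 240.0      # principal point (image centre)

def project(point):
    """Project a 3-D point given in camera coordinates to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)

print(project((0.0, 0.0, 2.0)))  # (320.0, 240.0): optical axis hits centre
print(project((0.5, 0.0, 2.0)))  # (520.0, 240.0)
```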
Depth determination is the most challenging part of the whole process, as it calculates the depth, the 3D component missing from any given image. The key issue here is the correspondence problem: finding matches between two images so that the positions of the matched elements can be triangulated in 3D space.
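The correspondence problem can be sketched in one dimension: slide a small patch from the left image along the matching scanline of the right image, and keep the offset with the smallest sum of squared differences (SSD). The intensity rows below are made-up toy values; the offset of the best match is the disparity used for triangulation.

```python
# Toy scanlines: the bright feature [50, 90, 50] appears at index 2 on the
# left and index 4 on the right (hypothetical intensity values).
left  = [10, 10, 50, 90, 50, 10, 10, 10]
right = [10, 10, 10, 10, 50, 90, 50, 10]

def best_match(patch, row):
    """Index in `row` whose window minimises the SSD against `patch`."""
    w = len(patch)
    scores = []
    for i in range(len(row) - w + 1):
        window = row[i:i + w]
        scores.append(sum((a - b) ** 2 for a, b in zip(patch, window)))
    return scores.index(min(scores))

patch = left[2:5]            # the feature [50, 90, 50]
match = best_match(patch, right)
print(match, match - 2)      # 4 2 -- feature shifted 2 pixels: disparity 2
```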
Once you have the multiple depth maps, you have to combine them to create the final mesh by calculating depth and projecting out of the camera (registration). Camera calibration is used to identify where the many meshes created from the depth maps can be combined to develop a larger one, providing more than one view of the scene.
By the stage of material application, you have a complete 3D mesh, which may be the final goal, but usually you will want to apply the color from the original photographs to the mesh. This can range from projecting the images onto the mesh randomly, through approaches that combine the textures for super-resolution, and finally to segmenting the mesh by material, such as specular and diffuse properties.


We can thus improve the efficiency of 3D reconstruction on the software side itself. The software used is VisualSFM for point cloud generation, MeshLab for mesh creation and 3D surface construction, and finally Blender for rendering of the 3D surface.
VISUALSFM: VisualSFM is a GUI application for 3D reconstruction using structure from motion (SfM). The reconstruction system integrates several of its author Changchang Wu's previous projects: SIFT on GPU (SiftGPU), Multicore Bundle Adjustment, and Towards Linear-time Incremental Structure from Motion.
VisualSFM runs fast by exploiting multicore parallelism for feature
detection, feature matching, and bundle adjustment.

For dense reconstruction, this program integrates the execution of

Yasutaka Furukawa's PMVS/CMVS tool chain. The SfM output of
VisualSFM works with several additional tools, including CMP-MVS by
Michal Jancosek, MVE by Michael Goesele's research group, SURE by
Mathias Rothermel and Konrad Wenzel, and MeshRecon by Zhuoliang Kang.
MESHLAB: MeshLab is an open-source, portable, and extensible system for the processing and editing of unstructured 3D triangular meshes. The system aims to help with the processing of the typical not-so-small unstructured models arising in 3D scanning, providing a set of tools for editing, cleaning, healing, inspecting, rendering, and converting this kind of mesh. The system is heavily based on the VCG library developed at the Visual Computing Lab of ISTI - CNR for all the core mesh-processing tasks, and it is available for Windows, Mac OS X, and Linux. The MeshLab system started in late 2005 as part of the FGT course of the Computer Science department of the University of Pisa, and most of the code (~15k lines) of the first versions was written by a handful of willing students.
BLENDER: Blender is a professional free and open-source 3D
computer graphics software product used for creating animated films,
visual effects, art, 3D printed models, interactive 3D applications and
video games. Blender's features include 3D modeling, UV unwrapping,
texturing, raster graphics editing, rigging and skinning, fluid and smoke
simulation, particle simulation, soft body simulation, sculpting,
animating, match moving, camera tracking, rendering, video editing and
compositing. Alongside the modeling features it also has an integrated
game engine.


The set of 2D images is first fed into VisualSFM. The software creates a sparse/dense cloud for the set of input images, along with a polygon mesh file, i.e. a .ply file.
This .ply file obtained from VisualSFM is fed into MeshLab, which processes the polygon mesh and gives a 3D model as output.
Finally, the output from MeshLab is fed into Blender, which applies texturing to the 3D model to make it seem realistic.
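In its ASCII form, the .ply file exchanged between VisualSFM and MeshLab is just a short header followed by one line per vertex. A minimal point-only writer is sketched below; real files from the pipeline also carry colour, normals, and faces.

```python
# Minimal ASCII PLY writer for a point-only cloud (assumed toy data).
def write_ply(points, path):
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write("element vertex %d\n" % len(points))
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("end_header\n")
        for x, y, z in points:
            f.write("%f %f %f\n" % (x, y, z))

write_ply([(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)], "cloud.ply")
```

A file written this way opens directly in MeshLab as a point cloud.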

Screenshots & Images

The proposed method provides very good results for 3D reconstruction even though no additional hardware is used. The 3D model is reconstructed from the dense point cloud generated by VisualSFM from the set of input 2D images. MeshLab is used to process the mesh and provide the 3D model as output. Blender adds the finishing touch by applying texturing to the output provided by MeshLab.