
Lovely

Professional
University
[Term paper of Artificial Intelligence (CAP- 402)]

Topic:- Image Recognition in Artificial Intelligence.

Submitted To: Mrs. Charu Sharma, Dept. of CA
Submitted By: Aradhana Katoch

Class: BCA-MCA
Roll No.: 07
Reg. No.: 3010060007
Synopsis Course Code: CAP402
Course Instructor: Ms. Navdeep
Course Tutor: _________
Section: E3601

Declaration:
I declare that this Term paper is my individual work. I have not copied from any other
student's work or from any other source except where due acknowledgement is made
explicitly in the text, nor has any part been written for me by another person.

Aradhana Katoch

(Student Signature)

Evaluator’s Comment:
__________________________________________________________________________________
__________________________________________________________________________________
__________________________________________________________________________________
_________

Marks Obtained: ____________ Out Of: ______________________


Contents

Introduction to the topic
Image Processing
Example
Example of Image recognition
Model
Properties of the model
Motion analysis

Introduction to the topic:


"Image recognition is the research area that studies the operation and design of a picture that
recognize patterns in it. Image recognition is a long-standing challenge in science. Important
application areas are image analysis through which we can try to make our computers to
recognize the images as a human mind recognise.
"For example, when we see a dog, first we recognize that it's an animal....This recognition
concept is simple and familiar to everybody in the real world environment, but in the world
of artificial intelligence, recognizing such objects is an amazing feat. The functionality of the
human brain is amazing; it is not comparable with any artificial machines or software. It is
done through machines Applications include finger print identification, face recognition,
character recognition, signature recognition and classification of objects in scientific/research
areas such as astronomy, engineering, statistics, medical, machine learning and neural
networks."

Related research topics include statistical and structural pattern recognition;
image analysis; computational models of vision; enhancement, restoration,
segmentation, feature extraction, shape and texture analysis; and character and
text recognition.

Current research on image recognition for AI: Researchers have been trying to
find out the smallest amount of information (that is, the shortest numerical
representation) that can be derived from an image and still provide a useful
indication of its content. Such short codes could lead to great advances in the
automated identification of online images and, ultimately, provide a basis for
computers to see as humans do.

At present, the only ways to search for images are based on text captions that people
have entered by hand for each picture, and many images lack such information. Automatic
identification would also provide a way to index pictures people download from digital
cameras onto their computers, without having to go through and caption each one by hand.
And ultimately it could lead to true machine vision, which could someday allow robots to
make sense of the data coming from their cameras and figure out where they are.

We will try to find very short codes for images, so that if two images have a
similar sequence, they are probably similar: composed of roughly the same object,
in roughly the same configuration. If one image has been identified with a caption
or title, then other images that match its numerical code would likely show the
same object, and so the name associated with one picture can be transferred to the
others.
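A minimal sketch of such a short numerical code, assuming NumPy is available, is an "average hash": shrink the image to an 8x8 grid of block averages and threshold at the mean, giving a 64-bit code that can be compared with the Hamming distance. The function names and random test images below are illustrative assumptions, not the researchers' actual method:

```python
import numpy as np

def short_code(image, size=8):
    """Shrink an image to a size x size grid of block averages and
    threshold at the mean, giving a size*size-bit binary code."""
    h, w = image.shape
    cropped = image[:h // size * size, :w // size * size]
    small = cropped.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
    return (small > small.mean()).astype(np.uint8).ravel()

def hamming(code_a, code_b):
    """Number of differing bits; a small distance suggests similar images."""
    return int(np.sum(code_a != code_b))

rng = np.random.default_rng(0)
img = rng.random((64, 64))
noisy = img + 0.01 * rng.standard_normal((64, 64))   # near-duplicate
other = rng.random((64, 64))                          # unrelated image
d_similar = hamming(short_code(img), short_code(noisy))
d_different = hamming(short_code(img), short_code(other))
```

A near-duplicate image yields a much smaller code distance than an unrelated one, which is the property that lets a caption transfer from one image to its code-neighbours.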

Psychologists have proposed that many human-object interaction activities form unique
classes of scenes. Recognizing these scenes is important for many social functions. To enable
a computer to do this is, however, a challenging task. Much of artificial intelligence deals with
autonomous planning or deliberation for robotic systems to navigate through an
environment. A detailed understanding of these environments is required to navigate through
them. Information about the environment could be provided by a computer vision system,
acting as a vision sensor and providing high-level information about the environment.
Image Processing
Image recognition is closely related to image processing, image analysis and
machine vision, and there is a significant overlap in the range of techniques and
applications that these cover. This implies that the basic techniques that are
used and developed in these fields are more or less identical, which can be
interpreted as meaning there is only one field with different names. On the other
hand, it appears to be necessary for research groups, scientific journals,
conferences and companies to present or market themselves as belonging
specifically to one of these fields, and hence various characterizations which
distinguish each of the fields from the others have been presented.

The following characterizations appear relevant but should not be taken as universally
accepted:

• Image processing and image analysis tend to focus on 2D images, how to transform
one image to another, e.g., by pixel-wise operations such as contrast enhancement,
local operations such as edge extraction or noise removal, or geometrical
transformations such as rotating the image. This characterisation implies that image
processing/analysis neither requires assumptions nor produces interpretations about
the image content.
• Computer vision tends to focus on the 3D scene projected onto one or several images, e.g., how
to reconstruct structure or other information about the 3D scene from one or several
images. Computer vision often relies on more or less complex assumptions about the
scene depicted in an image.
• Machine vision tends to focus on applications, mainly in manufacturing, e.g., vision based
autonomous robots and systems for vision based inspection or measurement. This
implies that image sensor technologies and control theory often are integrated with the
processing of image data to control a robot and that real-time processing is
emphasised by means of efficient implementations in hardware and software. It also
implies that the external conditions such as lighting can be and are often more
controlled in machine vision than they are in general computer vision, which can
enable the use of different algorithms.
• There is also a field called imaging which primarily focuses on the process of
producing images, but sometimes also deals with the processing and analysis of
images. For example, medical imaging includes a great deal of work on the
analysis of image data in medical applications.
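Two of the operation classes in the first characterisation above, a pixel-wise contrast stretch and a geometrical rotation, can be sketched in a few lines (an illustrative Python/NumPy example; the helper names are our own):

```python
import numpy as np

def stretch_contrast(image):
    """Pixel-wise operation: linearly rescale intensities to [0, 1]."""
    lo, hi = image.min(), image.max()
    return (image - lo) / (hi - lo)

def rotate_90(image):
    """Geometrical transformation: rotate the pixel grid by 90 degrees."""
    return np.rot90(image)

img = np.array([[0.2, 0.4],
                [0.6, 0.8]])
stretched = stretch_contrast(img)   # darkest pixel -> 0, brightest -> 1
rotated = rotate_90(img)
```

Note that neither operation needs any assumption about what the image depicts, which is exactly the point of the characterisation.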

Typical tasks of image recognition:-


Determining whether or not the image data contains some specific object, feature, or activity.
This task can normally be solved robustly and without effort by a human, but is still not
satisfactorily solved in computer vision for the general case: arbitrary objects in arbitrary
situations. The existing methods for dealing with this problem can at best solve it only for
specific objects, such as simple geometric objects (e.g., polyhedra), human faces, printed or
hand-written characters, or vehicles, and in specific situations, typically described in terms of
well-defined illumination, background, and pose of the object relative to the camera.

Different varieties of the recognition problem are described in the literature:

• Object recognition: one or several pre-specified or learned objects or object
classes can be recognized, usually together with their 2D positions in the image
or 3D poses in the scene.
• Identification: An individual instance of an object is recognized. Examples:
identification of a specific person's face or fingerprint, or identification of a specific
vehicle.
• Detection: the image data is scanned for a specific condition. Examples: detection of
possible abnormal cells or tissues in medical images or detection of a vehicle in an
automatic road toll system. Detection based on relatively simple and fast
computations is sometimes used for finding smaller regions of interesting image data
which can be further analysed by more computationally demanding techniques to
produce a correct interpretation.
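The last point, a simple and fast first-pass computation that flags small regions for more expensive analysis, can be sketched as follows (an illustrative NumPy example; the window size, threshold, and synthetic image are arbitrary assumptions):

```python
import numpy as np

def detect_candidates(image, window=4, threshold=0.8):
    """Cheap first-pass detection: slide a window over the image and flag
    windows whose mean intensity exceeds a threshold; flagged regions
    would then go to a more computationally demanding analysis stage."""
    hits = []
    h, w = image.shape
    for y in range(0, h - window + 1, window):
        for x in range(0, w - window + 1, window):
            if image[y:y + window, x:x + window].mean() > threshold:
                hits.append((y, x))
    return hits

img = np.zeros((16, 16))
img[4:8, 8:12] = 1.0   # a synthetic bright "object"
candidates = detect_candidates(img)   # corner coordinates of flagged windows
```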

Several specialized tasks based on recognition exist, such as:

• Content-based image retrieval: finding all images in a larger set of images which
have a specific content. The content can be specified in different ways, for example in
terms of similarity relative to a target image (give me all images similar to image X), or
in terms of high-level search criteria given as text input (give me all images which
contain many houses, were taken during winter, and have no cars in them).
• Pose estimation: estimating the position or orientation of a specific object relative to
the camera. An example application for this technique would be assisting a robot arm
in retrieving objects from a conveyor belt in an assembly line situation.
• Optical character recognition (OCR): identifying characters in images of printed or
handwritten text, usually with a view to encoding the text in a format more amenable
to editing or indexing.
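The first specialized task, content-based retrieval by similarity to a target image, can be illustrated with a simple intensity-histogram comparison; this is only an illustrative baseline, assuming NumPy, not a state-of-the-art retrieval method:

```python
import numpy as np

def histogram(image, bins=8):
    """Normalised intensity histogram over [0, 1]."""
    h, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    return h / h.sum()

def similarity(a, b):
    """Histogram intersection: 1.0 for identical intensity distributions."""
    return float(np.minimum(histogram(a), histogram(b)).sum())

def retrieve(query, database, top_k=1):
    """Indices of the database images most similar to the query."""
    scores = [similarity(query, img) for img in database]
    return sorted(range(len(database)), key=lambda i: -scores[i])[:top_k]

rng = np.random.default_rng(1)
query = rng.random((32, 32)) * 0.5          # a dark query image
database = [rng.random((32, 32)),           # mid-range intensities
            rng.random((32, 32)) * 0.5,     # dark, like the query
            np.full((32, 32), 0.9)]         # uniformly bright
best = retrieve(query, database)            # index of the best match
```

Real systems replace the intensity histogram with richer colour, texture, or learned features, but the query-versus-database structure is the same.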

The Department of Computer Science and Engineering, Michigan State University,
describes its work as follows: "The Pattern Recognition and Image Processing
(PRIP) Lab faculty and students investigate the use of machines to recognize
patterns or objects. Methods are
investigate the use of machines to recognize patterns or objects. Methods are
developed to sense objects, to discover which of their features distinguish them from
others, and to design algorithms which can be used by a machine to do the
classification. ... Important applications include face recognition, fingerprint
identification, document image analysis, 3D object model construction, robot
navigation, and visualization/exploration of 3D volumetric data. Current research
problems include biometric authentication, automatic surveillance and tracking,
handless HCI, face modeling, digital watermarking and analyzing structure of online
documents. Recent graduates of the lab have worked on handwriting recognition,
signature verification, visual learning, and image retrieval."
Example:-
It takes surprisingly few pixels of information to be able to identify the subject of an image, a
team led by an MIT researcher has found. The discovery could lead to great advances in the
automated identification of online images and, ultimately, provide a basis for computers to
see like humans do.

Deriving such a short representation would be an important step toward making it
possible to catalog the billions of images on the Internet automatically.

The aim is to find very short codes for images, so that if two images have a
similar sequence of numbers, they are probably similar: composed of roughly the
same object, in roughly the same configuration. If one image has been identified
with a caption or title, then other images that match its numerical code would
likely show the same object (such as a car, tree, or person), and so the name
associated with one picture can be transferred to the others. With very large
amounts of images, even relatively simple algorithms are able to perform fairly
well at identifying images this way.
Face recognition (a part of image recognition):-
Face recognition systems are progressively becoming popular as a means of extracting
biometric information. Face recognition has a critical role in biometric systems and is
attractive for numerous applications including visual surveillance and security. Because of
the general public acceptance of face images on various documents, face recognition has a
great potential to become the next generation biometric technology of choice. Face images
are also the only biometric information available in some legacy databases and international
terrorist watch-lists and can be acquired even without subjects' cooperation.

Example of Image recognition:-


Detecting objects in cluttered scenes and estimating articulated human body parts are two
challenging problems in computer vision. The difficulty is particularly pronounced in
activities involving human-object interactions (e.g. playing tennis), where the relevant object
tends to be small or only partially visible, and the human body parts are often self-occluded.
We observe, however, that objects and human poses can serve as mutual context to
each other: recognizing one facilitates the recognition of the other. The model
learning task can then be cast as a structure learning problem, in which the
structural connectivity between the object, the overall human pose, and the
different body parts is estimated through a structure search approach, and the
parameters of the model are estimated by a new max-margin algorithm, evaluated on
a sports data set of six classes of human-object interactions.

Introduction:-
Using context to aid visual recognition has recently been receiving more and more
attention. Psychology experiments show that context plays an important role in
recognition in the human visual system, and context has been applied to object
detection and recognition, scene recognition, action classification, and
segmentation. While the idea of using context is clearly a good one, a curious
observation is that context information has contributed relatively little to
boosting performance in recognition tasks, when context-based methods are
compared with sliding-window-based methods for object detection.
MODEL:-

Objects and human poses can serve as mutual context to facilitate the recognition
of each other. For example, the human pose is better estimated by seeing the
cricket bat, which gives a strong prior on the pose of the human, while the
cricket ball is detected by understanding the human pose of throwing the ball.
One reason for the relatively small margin that context has provided so far is,
in our opinion, the lack of strong context: while it is nice to detect cars in
the context of roads, powerful car detectors can nevertheless detect cars with
high accuracy whether they are on the road or not. Indeed, for the human visual
system, detecting visual abnormality out of context is crucial for survival and
social activities. Many important image recognition tasks, however, rely
critically on context. One such scenario is the problem of human pose estimation
and object detection in human-object interaction (HOI) activities, where the two
difficult tasks can benefit greatly from serving as context for each other. The
goal of this paper is to model the mutual context of objects and human poses in
HOI activities so that each can facilitate the recognition of the other. Given a
set of training images, the model automatically discovers the relevant poses for
each type of HOI activity, and furthermore the connectivity and spatial
relationships between the objects and body parts. This task is formulated as a
structure learning problem, in which the connectivity is learned by a structure
search approach, and the model parameters are discriminatively estimated by a
novel max-margin approach. By modeling the mutual co-occurrence and spatial
relations of objects and human poses, the algorithm significantly improves the
performance of both object detection and pose estimation on a dataset of sports
images.

Some techniques have been proposed to avoid exhaustively searching the image,
which makes the algorithms more efficient. While the most popular detectors are
still based on sliding windows, more recent work has tried to integrate context
to obtain better performance; however, in most of these works the performance is
improved by only a relatively small margin. It is beyond the scope of this paper
to develop an object detection or pose estimation method that applies generally
to all situations. Instead, we focus on the role of context in these problems.
In most of these works, one type of scene information serves as contextual
facilitation to a main recognition problem. For example, ground planes and
horizons can help to refine pedestrian detections.

Properties of the model:-


Co-occurrence context links the activity class, the object, and the human pose.
Given the presence of a tennis racket, the human pose is more likely to be
playing tennis than playing croquet. That is to say, co-occurrence information
can be beneficial for coherently modeling the object, the human pose, and the
activity class. The model also allows multiple types of human poses for each
activity: each activity may consist of more than one human pose. Treating the
pose as a hidden variable, the model automatically discovers the possible poses
from training images. This gives more flexibility to deal with situations where
the human poses within the same activity are inconsistent.

Image Recognition Systems:-


Motion analysis

Several tasks relate to motion estimation, where an image sequence is processed to
produce an estimate of the velocity either at each point in the image or in the 3D
scene, or even of the camera that produces the images. Examples of such tasks are:

• Egomotion: determining the 3D rigid motion (rotation and translation) of the camera
from an image sequence produced by the camera.
• Tracking: following the movements of a (usually) smaller set of interest points or
objects (e.g., vehicles or humans) in the image sequence.
• Optical flow: to determine, for each point in the image, how that point is moving
relative to the image plane, i.e., its apparent motion. This motion is a result both of
how the corresponding 3D point is moving in the scene and how the camera is
moving relative to the scene.
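For the simplest case of the tasks above, a pure global translation between two frames, the shift can be estimated by phase correlation; the following NumPy sketch assumes noiseless translation and is illustrative only, not a full egomotion or optical-flow method:

```python
import numpy as np

def estimate_shift(frame_a, frame_b):
    """Estimate the dominant translation between two frames by phase
    correlation: the normalised cross-power spectrum has its inverse-FFT
    peak at the displacement."""
    fa = np.fft.fft2(frame_a)
    fb = np.fft.fft2(frame_b)
    cross = fa * np.conj(fb)
    cross /= np.abs(cross) + 1e-12          # keep only the phase information
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices past the midpoint correspond to negative shifts (wrap-around).
    return tuple(p if p <= s // 2 else p - s for p, s in zip(peak, corr.shape))

rng = np.random.default_rng(2)
frame = rng.random((64, 64))
moved = np.roll(frame, shift=(3, -5), axis=(0, 1))  # simulated camera motion
dy, dx = estimate_shift(moved, frame)
```

Dense optical flow generalises this idea by estimating a separate displacement for each point rather than one global shift.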

Scene reconstruction

Given one or (typically) more images of a scene, or a video, scene reconstruction aims at
computing a 3D model of the scene. In the simplest case the model can be a set of 3D points.
More sophisticated methods produce a complete 3D surface model.

Image restoration

The aim of image restoration is the removal of noise (sensor noise, motion blur,
etc.) from images. The simplest approaches to noise removal are various types of
filters, such as low-pass filters or median filters. More sophisticated methods
assume a model of what the local image structures look like, a model which
distinguishes them from the noise. By first analysing the image data in terms of
local image structures, such as lines or edges, and then controlling the
filtering based on local information from the analysis step, a better level of
noise removal is usually obtained compared to the simpler approaches. An example
in this field is inpainting.
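The median filter mentioned above can be sketched directly; this naive NumPy implementation loops over pixels and is meant for illustration, not efficiency:

```python
import numpy as np

def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighbourhood,
    removing isolated salt-and-pepper noise while preserving edges."""
    pad = size // 2
    padded = np.pad(image, pad, mode='edge')   # replicate borders
    out = np.empty_like(image)
    h, w = image.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(padded[y:y + size, x:x + size])
    return out

clean = np.full((8, 8), 0.5)
noisy = clean.copy()
noisy[2, 2] = 1.0   # a salt-noise pixel
noisy[5, 6] = 0.0   # a pepper-noise pixel
restored = median_filter(noisy)
```

Because each isolated outlier is outvoted by its eight neighbours, the filter restores the constant image exactly, something a linear low-pass filter would not do.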

Some systems are stand-alone applications which solve a specific measurement or detection
problem, while others constitute a sub-system of a larger design which, for example, also
contains sub-systems for control of mechanical actuators, planning, information databases,
man-machine interfaces, etc. The specific implementation of a computer vision system also
depends on whether its functionality is pre-specified or whether some part of it can be learned or modified
during operation. There are, however, typical functions which are found in many computer
vision systems.
• Image acquisition: A digital image is produced by one or several image
sensors, which, besides various types of light-sensitive cameras, include range
sensors, tomography devices, radar, ultra-sonic cameras, etc. Depending on the type
of sensor, the resulting image data is an ordinary 2D image, a 3D volume, or an image
sequence. The pixel values typically correspond to light intensity in one or several
spectral bands (gray images or colour images), but can also be related to various
physical measures, such as depth, absorption or reflectance of sonic or
electromagnetic waves, or nuclear magnetic resonance.
• Pre-processing: Before a computer vision method can be applied to image
data in order to extract some specific piece of information, it is usually necessary to
process the data in order to assure that it satisfies certain assumptions implied by the
method. Examples are
o Re-sampling in order to assure that the image coordinate system is correct.
o Noise reduction in order to assure that sensor noise does not introduce false
information.
o Contrast enhancement to assure that relevant information can be detected.
o Scale-space representation to enhance image structures at locally appropriate
scales.
• Feature extraction: Image features at various levels of complexity are
extracted from the image data. Typical examples of such features are
o Lines, edges and ridges.
o Localized interest points such as corners, blobs or points.

More complex features may be related to texture, shape or motion.
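Edge features of the kind listed above can be sketched with a simple gradient-magnitude computation (central differences via NumPy's `np.gradient`; the threshold value is an arbitrary illustrative choice):

```python
import numpy as np

def gradient_magnitude(image):
    """Central-difference image gradients; large magnitude marks edges."""
    gy, gx = np.gradient(image)    # derivatives along rows and columns
    return np.hypot(gx, gy)

# A vertical step edge: left half dark, right half bright.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = gradient_magnitude(img)
strong = edges > 0.25              # simple edge map by thresholding
```

Corner and blob detectors build on the same gradient information, combining it over small neighbourhoods instead of thresholding it pointwise.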

• Detection/segmentation: At some point in the processing a decision is made
about which image points or regions of the image are relevant for further
processing. Examples are
o Selection of a specific set of interest points
o Segmentation of one or multiple image regions which contain a specific object
of interest.
• High-level processing:

At this step the input is typically a small set of data, for example a set of points or an
image region which is assumed to contain a specific object. The remaining processing
deals with, for example:

o Verification that the data satisfy model-based and application-specific
assumptions.
o Estimation of application specific parameters, such as object pose or object
size.
o Classifying a detected object into different categories.
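The detection/segmentation step in the pipeline above can be illustrated with Otsu's classic thresholding method, which picks the intensity threshold that best separates the histogram into two classes; the synthetic image below is an illustrative assumption:

```python
import numpy as np

def otsu_threshold(image, bins=256):
    """Otsu's method: pick the threshold that maximises the between-class
    variance of the intensity histogram, splitting it into two classes."""
    hist, edges = np.histogram(image, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                 # class-0 weight up to each bin
    mu = np.cumsum(p * centers)       # class-0 intensity mass up to each bin
    best_t, best_var = 0.0, -1.0
    for i in range(bins - 1):
        w1 = 1.0 - w0[i]
        if w0[i] == 0.0 or w1 == 0.0:
            continue                  # one class empty: no valid split
        mu0 = mu[i] / w0[i]
        mu1 = (mu[-1] - mu[i]) / w1
        var = w0[i] * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, centers[i]
    return best_t

img = np.full((16, 16), 0.2)   # dark background
img[4:12, 4:12] = 0.8          # bright object of interest
t = otsu_threshold(img)
mask = img > t                 # segmented object region for further processing
```

The resulting mask is exactly the kind of "region of interest" that the high-level processing stage would then verify and classify.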

In this way, image processing helps AI to identify images and to respond
according to the image identification.
