CONTENTS

S.No.  TOPIC
1.  INTRODUCTION
2.  TYPES OF MICROARRAYS
    2.1 Spotted microarrays
    2.2 Affymetrix GeneChips
    2.3 Other in situ synthesis platforms
3.  MICROARRAY DATA PROCESSING
    3.1 Scanning
    3.2 Addressing
    3.3 Segmentation
    3.4 Background correction
    3.5 Normalization
    3.6 Gene expression data analysis
4.  MICROARRAY IMAGE SEGMENTATION
5.  APPLICATIONS OF MICROARRAYS
6.  USES OF MICROARRAYS
7.  CONCLUSION
8.  BIBLIOGRAPHY

1. INTRODUCTION
Microarray technology makes use of the sequence resources created by current genome projects and other sequencing efforts to identify the genes that are expressed in a particular cell type or organism. Measuring gene expression levels under varying conditions gives biologists a better understanding of gene function and has wide applications in the life sciences. For example, microarrays allow comparison of gene expression between normal and cancerous cells. The technology has been referred to by various names: DNA microarrays, DNA arrays, DNA chips, and gene chips. A microarray is typically a glass slide onto which DNA molecules are attached at fixed locations, called spots, each related to a single gene. Most microarray experiments compare gene expression from two samples, one called the target (or experimental) sample and the other the control. The two samples are labeled by synthesizing single-stranded cDNAs that are complementary to the extracted mRNA. A typical microarray experiment is depicted in the figure. The spots are either printed on the microarray by a robot or synthesized by photolithography or ink-jet printing. After the target genes are generated and laid out on the chip surface at defined positions, the cDNA from the two samples, labeled with fluorescent dyes, is hybridized to the chip. Binding of the cDNA is detected by fluorescence following laser excitation. To obtain the intensity values, the microarray image is processed so that each gene in the microarray is identified as an individual spot, and the intensities of the signal and of its surrounding area are calculated. The ratio between the signals in the two channels (dyes) is then calculated for each spot. The result of a microarray experiment is represented as a vector, each element corresponding to a spot.

The picture shows a portion of a microarray chip, each spot standing for a gene. The color of each spot reflects the relative abundance of the two fluorescence intensities. The raw data produced by microarray experiments are called the hybridized microarray images. To obtain information about gene expression levels, these images have to be analyzed: each spot on the array is identified, and its intensity is then measured and compared to the background. This process is called image quantitation. These data can help biologists gain insights into underlying biological processes only if they are carefully extracted and stored in databases, where they can subsequently be retrieved and analyzed. The extensive use of bioinformatics tools for microarray analysis is a significant achievement in biology, because no other technology has used such sophisticated tools, combined expertise from so many different disciplines, and provided such detailed information about biological sequences. Although microarrays are a newly emerging technology, they have already been widely adopted, and many users are now going beyond exploratory studies. Microarrays are being exploited in the study of human diseases, drug discovery, and genetic screening and diagnostics. The most promising commercial application of microarrays is their potential use in clinical diagnostics. Potential applications range from drug discovery to gene-based diagnostics and gene-based treatments. The most appropriate treatments can be identified by studying the dynamics of gene expression over time, among tissues, and across disease states. In addition, microarrays have a huge potential impact in preventative medicine, in the accurate diagnosis of disease, and in the design and screening of drugs that can be used to treat certain disease states.

What Exactly Is a DNA Microarray?
DNA microarrays are small, solid supports onto which the sequences from thousands of different genes are immobilized, or attached, at fixed locations. The supports themselves are usually glass microscope slides, the size of two side-by-side pinky fingers, but can also be silicon chips or nylon membranes. The DNA is printed, spotted, or actually synthesized directly onto the support. With the aid of a computer, the amount of mRNA bound to the spots on the microarray is precisely measured, generating a profile of gene expression in the cell. The American Heritage Dictionary defines "array" as "to place in an orderly arrangement". It is important that the gene sequences in a microarray are attached to their support in an orderly or fixed way.

2. TYPES OF MICROARRAYS
There are three microarray technologies in widespread use among most laboratories: spotted microarrays, consisting of presynthesized oligos or PCR products robotically deposited onto a surface; Affymetrix GeneChips, composed of relatively short oligonucleotides synthesized on a chip surface; and other in situ synthesis platforms, such as arrays made by Agilent and NimbleGen. Although each technology effectively serves as a genomic readout, each has unique characteristics that offer advantages or disadvantages in a given context. Parallel forms of measuring DNA and RNA will continue to change and evolve; however, these three platforms are currently the most ubiquitous.

2.1 Spotted Microarrays
Spotted microarrays were the first widely available array platform and continue to enjoy broad use. Originating in the laboratory of Pat Brown, they consist of glass microscope slides onto which libraries of PCR products or long oligonucleotides are printed using a robot equipped with nibs capable of wicking up DNA from microtiter plates and depositing it onto the glass surface with micron precision [13,15]. Since their inception, demand for microarrays has exceeded availability. Because the Brown laboratory expended effort in every aspect of distributing the technology, including plans to build the robot and all protocols required for array manufacture and use, many academic laboratories invest resources into producing these arrays locally. This includes building or purchasing a robot, as well as performing PCR or oligo design and synthesis to create probes for spotting onto glass. The basic principle by which the arrays function is fairly simple, and all the reagents required are available to most researchers with some initial investment. However, apart from praising the benefits of putting technology into the hands of researchers, the reason for highlighting this aspect of spotted arrays is to point out their nonuniform nature.
Because there is not one manufacturer, one source of materials, or a uniform method of production, variability exists among batches of microarrays and must be considered when planning experiments or when comparing experiments from different array sources. Spotted microarrays are primarily a comparative technology. They are used to examine relative concentrations of targets between two samples. Complex samples to be compared are labeled with uniquely colored fluorescent tags before being mixed together and allowed to compete for hybridization to the microarray spots. In this way, differences between the samples are observed on a per-spot basis, because the fractional occupancy of the spot hybridized by each sample reflects the relative concentration of that gene or target in the original complex mixture. Thus, for any probe on the microarray, one gets a readout of the relative concentrations of the target between the two input samples. For this reason, spotted microarrays are often called two-color or two-sample arrays.

2.2 Affymetrix GeneChips
Affymetrix GeneChips are the most ubiquitous and long-standing commercial array platform in use. The arrays consist of 25-mer oligonucleotides synthesized in situ on the surface of a glass chip. A photolithography mask, similar to that used to construct semiconductor chips, is used to control light-directed DNA synthesis chemistry such that oligo sequences are built up one nucleotide at a time at defined locations on a solid substrate or glass chip [18,19]. Current chips contain 6.5 million unique probes in an area of 1.28 cm². The highly precise nature of the lithographic method allows the construction of compact matrices of square patches of probes. Instead of using a single sequence to probe expression of each gene, as would be common for a spotted array, Affymetrix employs a set of probes to measure expression of a gene. Probe sets contain two types of probes to measure the gene of interest: perfect match (PM) and mismatch (MM) probes. Perfect match probes are chosen to match the gene exactly and are designed against an exemplar sequence representing the gene. Although each probe is unique, probes may occasionally overlap. Mismatch probes are identical to the perfect match probes except that they contain a single base mismatch in the center of the probe. A single mismatch in a short sequence such as a 25-mer is very disruptive to hybridization. The purpose of the mismatch probe is to serve as a negative control for background hybridization.
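One simple way to see how mismatch probes can act as a background control is to subtract each MM intensity from its paired PM intensity and average the differences over the probe set. The Python sketch below does this. The function name `average_difference` and the intensity values are illustrative assumptions; the text does not say which summary statistic is used in practice, and other summaries are possible.

```python
def average_difference(pm, mm):
    """Mean of the paired PM - MM differences for one probe set.

    pm, mm: equal-length lists of intensities, where mm[i] is the
    mismatch partner of pm[i]. This summary is an assumption made
    for illustration, not a method stated in the text.
    """
    if len(pm) != len(mm):
        raise ValueError("each PM probe needs exactly one MM partner")
    differences = [p - m for p, m in zip(pm, mm)]
    return sum(differences) / len(differences)


# Hypothetical intensities for a three-pair probe set.
perfect_match = [500, 600, 700]
mismatch = [100, 100, 100]
print(average_difference(perfect_match, mismatch))  # 500.0
```

A gene with real signal gives PM values well above their MM partners, while a probe set dominated by nonspecific hybridization gives differences near zero.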
A typical probe set contains 11 perfect match probes and 11 mismatch probes. The positioning of probes for a single gene on the array is chosen by a random process to protect against local hybridization artifacts that could otherwise affect all the probes for a gene if they were clustered together. As most spotted arrays use only one probe per gene, local hybridization artifacts can be a problem there. Affymetrix GeneChips are single-sample microarrays (also known as one-color or one-channel arrays). These arrays measure the relative abundance of every gene in a single sample. In this way, one can examine whether one gene is expressed at a higher or lower level than some other gene in the same sample. If samples are to be compared, a separate chip must be run for each sample, and the data adjusted by scaling or normalization before comparison.

2.3 Other In Situ Synthesis Platforms
Apart from Affymetrix, two alternative in situ synthesis methods exist by which oligonucleotides are built up one nucleotide at a time in successive steps to create probes 25 to 60 nucleotides long [108]. These methods are almost exclusively commercial, and different companies take different approaches. Although Affymetrix uses a mask-based photolithographic process to control light-directed DNA synthesis, an alternative method employed by NimbleGen makes use of small rotating mirrors to control light and accomplish a similar task [22]. This approach is called maskless photolithography, and uses technology developed by Texas Instruments for projection televisions, in which arrays of digitally controlled micromirrors direct light. In combination with light-activated chemistry, light of the appropriate intensity and wavelength can be actuated in the patterns required to build up any series of nucleotides into an oligonucleotide on a solid surface [23,24]. The NimbleGen approach has two great advantages over the method worked out by Affymetrix. The first is that it does not require a mask. Building an array of different oligonucleotides of length N requires a series of 4N synthesis steps, since each of the N positions needs one step per base (A, C, G, and T). Thus, building a library of unique 25-mers on a surface requires 100 chemical synthesis steps. For Affymetrix, a unique photolithography mask is required to control the chemistry at each step. These masks are expensive to construct; thus, the arrays are very costly. In addition, once a set of masks is constructed, it describes only a single array design. Changing the design requires a whole new set of masks. However, changing a pattern of micromirrors under electronic control is very easy; thus, each array produced by NimbleGen can have a different design.
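The 4N step count can be illustrated with a small Python sketch that plans which array features are exposed at each step. The function name `synthesis_schedule` and the three toy probes are hypothetical. The sketch keeps all four base steps at every position, which is the general case for an array holding many different sequences.

```python
BASES = "ACGT"


def synthesis_schedule(probes):
    """Plan the light-directed steps needed to build equal-length probes.

    Returns a list of (position, base, feature_indices) tuples: at each
    step, the listed features are exposed and extended by `base`. Each
    tuple corresponds to one mask (or one micromirror pattern).
    """
    length = len(probes[0])
    if any(len(p) != length for p in probes):
        raise ValueError("all probes must have the same length")
    schedule = []
    for position in range(length):
        for base in BASES:
            features = [i for i, p in enumerate(probes) if p[position] == base]
            schedule.append((position, base, features))
    return schedule


# Three hypothetical 3-mer probes: 4 * 3 = 12 steps.
toy_probes = ["ACG", "AGG", "TCA"]
steps = synthesis_schedule(toy_probes)
print(len(steps))  # 12

# Replaying the schedule rebuilds every probe one nucleotide at a time.
built = [""] * len(toy_probes)
for _, base, features in steps:
    for i in features:
        built[i] += base
print(built == toy_probes)  # True
```

For 25-mers the same function returns 100 steps, matching the figure quoted above.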
The second alternative in situ synthesis approach to array construction uses traditional oligo synthesis chemistry, but the method of controlling base addition is novel. Ink-jet technology, developed by Hewlett Packard for consumer printers, has been adapted to control the liquid precursors of DNA synthesis. Agilent, a spinoff of Hewlett Packard, uses this technology to synthesize 60-mer oligos on glass slides [22,25]. Like ink-jet printing itself, this technology is very flexible. Every array can be customized and thus possess unique content. The flexibility to change the design of an array easily is both a blessing and a curse. The positive aspect is that one can easily change the array design to explore the genome or expression space as required by the experiment. The negative aspect is that data analysis generally becomes more cumbersome, because the probes used to represent a gene can change from array to array, as can the content of the array from experiment to experiment. When everyone is using the same chip, as with chips mass-produced by Affymetrix, comparisons between data sets are fairly easy. If every chip is unique, comparison between data sets becomes difficult. Another positive aspect of arrays created using in situ synthesis methods is that they do not depend on libraries of clones or molecules created elsewhere. Instead, the content is freshly created with each array. This is an advantage because, with spotted arrays, one never knows the history of the library or how many times it has been used to create arrays. However, quality control for in situ synthesized arrays remains obscure.

3. MICROARRAY DATA PROCESSING


Measuring gene expression levels under different conditions gives biologists a better understanding of cellular processes. In general, the analysis of DNA microarray gene expression data involves two steps. The first step is image quantitation, i.e. the extraction of gene expression data. The microarray image is processed to obtain the ratios of the intensities of the two probes for each gene. This is a very important step, since the accuracy of the resulting data is essential to all subsequent analyses. The second step is gene expression data analysis. After the intensity ratios are obtained, various methods can be applied to cluster the genes into different functional groups based on the ratios retrieved in the first step.

3.1 Scanning
The hybridized arrays are scanned to measure the fluorescence intensities of each sample for every spot in the gene array. The scanned original images are stored as a pair of 16-bit TIFF files, typically from 2.5 to 20 MB in size, where one file corresponds to the testing sample and the other to the reference sample.

[Figure: a portion of a sample microarray image obtained from the microarray data.] The sizes and intensities of the spots typically vary over a wide range within a microarray image. Measured from the fluorescent signal, the diameter of the spots printed on a glass slide ranges from 25 to 100 µm (1000 µm = 1 mm). The typical resolution is 3 to 5 µm per pixel. A spot can therefore span as few as 8 pixels per dimension, i.e. 64 pixels per spot.
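The pixel counts quoted above follow from dividing the spot diameter by the scanner resolution. A minimal Python sketch of that arithmetic, with `pixels_per_spot` as a hypothetical helper name and the spot approximated by its bounding square:

```python
import math


def pixels_per_spot(spot_diameter_um, pixel_size_um):
    """Whole pixels spanned by a spot, per dimension and in total.

    The total counts the square that bounds the spot, which is how
    8 pixels per dimension becomes 64 pixels per spot.
    """
    per_dimension = math.floor(spot_diameter_um / pixel_size_um)
    return per_dimension, per_dimension ** 2


print(pixels_per_spot(25, 3))   # (8, 64): smallest spot, finest resolution
print(pixels_per_spot(100, 5))  # (20, 400)
```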

3.2 Addressing
The process of addressing, also called gridding, refers to the identification of the center coordinates of each spot. The basic structure of a microarray image is often provided by the manufacturer. This includes the number of sub-grids in the image, the number of rows and columns in each sub-grid, and often the coordinates of a marker position. After gridding, each spot is identified and linked to a unique identifier.

3.3 Segmentation
In general, segmentation of an image refers to the process of partitioning the image into several regions, each having its own properties. In microarray image processing, segmentation refers to the classification of pixels as either signal or surrounding area, i.e. foreground or background. The microarray image contains noisy pixels that come from contamination at different stages of chip production or during hybridization. The segmentation process should be able to distinguish noisy pixels from true foreground pixels.

3.4 Background Correction
After the identification of the spots, or the foreground, the next step in processing a microarray image is background correction. A common procedure is to subtract the background intensity from that of the foreground for each channel before calculating the ratio of the two channels, using the following equation: I_t = I_f - I_b, where I_t is the true intensity of the spot, i.e. the intensity after eliminating the influence of contamination, scan effects, and hybridization effects; I_f is the foreground intensity, measured in the foreground region; and I_b is the background intensity, measured in the background region.

3.5 Normalization
Normalization is the process of removing the systematic variations that are introduced in microarray experiments, such as differences in the labeling efficiency of the two dyes. It has been conjectured that normalization contributes more than background correction to the estimation of gene expression levels. Normalization is conducted either within a single microarray slide or among multiple slides.
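The background correction and within-slide normalization steps can be sketched together in Python. Two choices here are assumptions, not methods named in the text: corrected intensities are floored at 1.0 so that the logarithm stays defined, and normalization is done by subtracting the slide-wide median log ratio (a simple global scheme). The function names and spot values are hypothetical.

```python
import math
from statistics import median


def background_correct(foreground, background, floor=1.0):
    """I_t = I_f - I_b, floored (an assumption) to keep log ratios defined."""
    return max(foreground - background, floor)


def normalized_log_ratios(spots):
    """Background-correct both channels, then median-center the log2 ratios.

    spots: list of (fg_target, bg_target, fg_control, bg_control) tuples.
    Median centering assumes most genes are not differentially expressed,
    so a nonzero median reflects dye bias rather than biology.
    """
    log_ratios = []
    for fg_t, bg_t, fg_c, bg_c in spots:
        target = background_correct(fg_t, bg_t)
        control = background_correct(fg_c, bg_c)
        log_ratios.append(math.log2(target / control))
    shift = median(log_ratios)
    return [value - shift for value in log_ratios]


# Hypothetical slide: the target dye reads twice as bright everywhere,
# and only the third gene is truly up-regulated (a further 4-fold).
example_spots = [
    (1100, 100, 600, 100),
    (900, 100, 500, 100),
    (2100, 100, 350, 100),
]
print(normalized_log_ratios(example_spots))  # [0.0, 0.0, 2.0]
```

After centering, the two unchanged genes sit at a log ratio of 0 and the third shows its 4-fold change, which the raw ratios (2, 2, and 8) would have overstated.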

3.6 Gene Expression Data Analysis
Common features, or patterns, are regarded as important for the biological functions of macromolecules, and a wide range of fields is involved in analyzing sequence data; the most prolific of these is pattern recognition. Depending on whether training samples are available, a pattern recognition system falls into one of two categories: supervised learning or unsupervised learning. In supervised learning, each sample in the training set is labeled and the cost of mislabeling samples is given. After the system has been adjusted to achieve the minimum cost when classifying the training samples, it is used to assign labels to unknown objects. In unsupervised learning, no prior knowledge about the objects is assumed, and hence the system attempts to group the data into natural groups, or clusters.
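As an illustration of the unsupervised case, the Python sketch below groups genes by their expression profiles using k-means. The text names no particular clustering algorithm at this point, so k-means is an assumed stand-in. The deterministic start (the first k profiles serve as initial centroids) is a simplification, and the profiles are made-up log ratios.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(profiles, k, iterations=20):
    """Assign each profile to one of k clusters; returns a list of labels.

    Starts from the first k profiles as centroids (a simplification;
    practical implementations usually use random restarts).
    """
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every profile.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(profiles, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical log ratios for four genes across three conditions.
gene_profiles = [
    [2.0, 2.1, 1.9],     # up-regulated
    [-1.8, -2.0, -2.2],  # down-regulated
    [1.9, 2.2, 2.0],     # up-regulated
    [-2.1, -1.9, -2.0],  # down-regulated
]
print(kmeans(gene_profiles, k=2))  # [0, 1, 0, 1]
```

No labels are supplied; the two groups emerge from the data alone, which is what distinguishes this from the supervised setting.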

4. MICROARRAY IMAGE SEGMENTATION


In this section, we discuss the microarray image segmentation problem. The aim is to partition the microarray image pixels into different regions or groups, so that foreground pixels fall into one group and background pixels into another. Other types of pixels may also exist, such as noisy pixels, which are contaminated pixels produced during microarray production and scanning and should be excluded from both the background and the foreground region during segmentation. Depending on the approach used to classify the pixels, a further type is the edge pixels surrounding the foreground region. Because the intensities of these pixels fall between those of the foreground and the background, including or excluding them leads to different signal-to-noise ratios. In short, the goal of segmentation is to obtain the foreground intensity and background intensity of each spot in the microarray image.

In general, image segmentation is the process of distinguishing objects from the background. Image segmentation is usually the first step in vision systems and is the basis for further processing such as description or recognition. The goal of segmentation is to extract important features from images. In practice, segmentation can also be seen as the classification of each image pixel into one of the components of the image. Most image segmentation approaches can be placed in one of five categories: clustering or threshold-based methods, boundary detection methods, region growing methods, shape-based methods, and hybrid methods. Many image segmentation approaches are intended for specific application domains in order to yield better results, for example real-time image segmentation, color image segmentation, 3-D image segmentation, and motion image segmentation.
Clustering or thresholding methods are among the earliest image segmentation techniques. In these methods, information about the pixel and its neighbors is used to classify the pixel into one of the regions. Boundary detection, or edge-based, methods focus on contour detection. The image is segmented based on spatial discontinuity, or edge finding and linking. This is implemented as the convolution of mathematical gradient operators, or of template matching operators that use multiple templates at different orientations of the image. The region growing method performs image segmentation based on spatial similarity among pixels. The image is partitioned into connected regions by grouping neighboring pixels of similar intensity levels. Adjacent regions are then merged under some criterion involving the homogeneity or sharpness of the region boundaries. Shape-based methods utilize some knowledge about the shape of the object to be segmented, e.g., mathematical morphology and template matching. Although many methods exist for general image segmentation, specialized methods have been designed for microarray image analysis. These methods are able to take the characteristics of the microarray image into account. Quite a few such methods exist, and they are still being refined to maximize the information extracted from the microarray image, for example by computing the median of the radii of the edge pixels.
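To make the threshold-based category concrete, the Python sketch below segments a single spot patch with an iterative two-class threshold (equivalent to a one-dimensional 2-means on pixel intensity) and then summarizes foreground and background by their medians. The helper names, the toy patch, and the choice of the median as the summary statistic are assumptions for illustration; the sketch ignores noisy and edge pixels.

```python
from statistics import median


def segment_patch(patch, iterations=50):
    """Return a boolean mask (True = foreground) for a 2-D intensity patch."""
    pixels = [value for row in patch for value in row]
    threshold = (min(pixels) + max(pixels)) / 2
    for _ in range(iterations):
        foreground = [v for v in pixels if v > threshold]
        background = [v for v in pixels if v <= threshold]
        if not foreground or not background:
            break  # flat patch: nothing to separate
        # Move the threshold midway between the two class means.
        updated = (sum(foreground) / len(foreground)
                   + sum(background) / len(background)) / 2
        if updated == threshold:
            break  # converged
        threshold = updated
    return [[value > threshold for value in row] for row in patch]


def spot_intensities(patch):
    """Median foreground and background intensity of one spot patch."""
    mask = segment_patch(patch)
    fg = [v for row, mrow in zip(patch, mask) for v, m in zip(row, mrow) if m]
    bg = [v for row, mrow in zip(patch, mask) for v, m in zip(row, mrow) if not m]
    return median(fg), median(bg)


# Hypothetical 4x4 patch with a bright 2x2 spot in the center.
toy_patch = [
    [10, 12, 11, 10],
    [11, 90, 95, 12],
    [10, 92, 98, 11],
    [12, 10, 11, 10],
]
print(spot_intensities(toy_patch))  # (93.5, 11.0)
```

Because it uses intensity only, this approach places no restriction on spot shape, but on its own it cannot tell a bright contamination pixel from true signal.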

5. APPLICATIONS OF MICROARRAYS
The applications are as follows:
1) Disease diagnosis: DNA microarray technology helps researchers learn more about different diseases, such as heart disease, mental illness, and infectious disease, and especially about cancer. With the evolution of microarray technology, it will be possible for researchers to further classify types of cancer on the basis of the patterns of gene activity in the tumor cells. This will greatly help the pharmaceutical community to develop more effective drugs, as treatment strategies can be targeted directly at the specific type of cancer.
2) Drug discovery: Microarray technology has extensive applications in pharmacogenomics, the study of correlations between therapeutic responses to drugs and the genetic profiles of patients. Comparative analysis of the genes from a diseased and a normal cell helps identify the biochemical constitution of the proteins synthesized by the diseased genes. Researchers can use this information to synthesize drugs that combat these proteins and reduce their effect.
3) Toxicological research: Microarray technology provides a robust platform for research into the impact of toxins on cells and their transmission to progeny. Toxicogenomics establishes correlations between responses to toxicants and the changes in the genetic profiles of the cells exposed to them.
4) GEO: In the recent past, microarray technology has been used extensively by the scientific community. The resulting data is scattered and not easily available for public use. To ease access to this data, the National Center for Biotechnology Information (NCBI) has established the Gene Expression Omnibus, or GEO, a data repository that includes gene expression data from varied sources.
5) Water: DNA chips are used to test drinking water safety. The chips for this purpose are made with gene sequences from a variety of disease-causing microbes.

6. USES OF MICROARRAYS

1) Can get a lot of results fast.
2) Can compare the activity of many genes in diseased and healthy cells.
3) Can follow the activity of many genes at the same time.
4) Can categorize diseases into subgroups. Microarray analysis and simple clustering of differentially expressed genes can reveal previously unknown differences.
5) A very common use of microarrays is in genotyping and the measurement of genetic variation.
6) Microarrays have been used to identify the RNA components of various complexes, shedding light on biological mechanisms of RNA translation and transport.
7) Microarrays could be used to monitor and characterize the trafficking of cellular RNA through such complexes.

7. CONCLUSION
Microarray image segmentation is a specific sub-field of image segmentation. Although many methods exist for image segmentation in general, custom-designed methods for microarray image segmentation are desirable, because they achieve better accuracy by considering the characteristics of the microarray image. Methods found in the literature can be grouped into five categories: the fixed circle method, the adaptive circle approach, adaptive shape techniques, histogram-based methods, and clustering-based methods. The first two are shape-based segmentation techniques. While the first one is too simple and naive to produce good results, the adaptive circle method achieves better results for circle-shaped spots. The histogram, adaptive shape, and clustering-based methods do not restrict the shape and size of the spots to a specific form or value. Adaptive shape methods, in general, produce a smaller foreground area than the actual spots. Histogram methods have been found to give good results, but suffer from the difficulty of choosing a suitable mask size. When clustering-based techniques are used to separate background from foreground in microarray images, the current methods cannot deal well enough with noisy data and weak spots. By choosing a suitable method and tuning its parameters, models suitable for microarray image segmentation can be derived.

Because microarray technology is still growing rapidly, there are as yet no established standards for microarray experiments or for how the raw data should be processed. Quite a few problems remain open and deserve investigation. In microarray image segmentation, it is not enough just to group the pixels into foreground and background; in some cases, the noisy pixels also need to be identified. We suggest that techniques such as Adaline or SVMs can be used to detect noise. In background correction, an adaptive formula could be used to reveal the true foreground intensity instead of the rigid subtraction of background from foreground. Neural networks could also be applied to obtain the true foreground intensity. In gene expression data analysis, hierarchical clustering is widely used, but it suffers from drawbacks such as sensitivity to noise and non-unique solutions. These two problems are currently being investigated.

8. BIBLIOGRAPHY
Chen Y, Dougherty E, and Bittner M. 1997. Ratio-based decisions and the quantitative analysis of cDNA microarray images. Journal of Biomedical Optics, 2:364-374.
Davidson College. MicroArray Genome Imaging and Clustering Tool (MAGIC Tool) User Guide, Department of Biology.
Mukherjee S, Tamayo P, Mesirov J, Slonim D, Verri A, and Poggio T. 1999. Support vector machine classification of microarray data. Technical Report 182, AI Memo 1676, CBCL.
Buhler J, Ideker T, and Haynor D. 2000. Dapple: Improved Techniques for Finding Spots on DNA Microarrays. Technical Report UWTR 2000-08-05, University of Washington.
