Outline
Basic Genomics
Signal Processing for Genomic Sequences
Signal Processing for Gene Expression
Resources and Co-operations
Challenges and Future Work
Basic Genomics
Genome
Every human cell contains 6 feet of double-stranded (ds) DNA
This DNA has 3,000,000,000 base pairs representing 50,000-100,000 genes
This DNA contains our complete genetic code, or genome
DNA regulates all cell functions, including response to disease, aging and development
Gene expression pattern: snapshot of DNA in a cell
Gene expression profile: DNA mutation or polymorphism over time
Genetic pathways: changes in genetic code accompanying metabolic and functional changes, e.g. disease or aging
Figure: translation of the mRNA sequence CCUGAGCCAACUAUUGAUGAA into the protein sequence PEPTIDE
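The translation step in the figure can be sketched in a few lines of Python. This is a minimal illustration, not a complete tool: the function name `translate` and the partial `CODON_TABLE` are assumptions introduced here, and the table holds only the codons that occur in the example sequence (taken from the standard genetic code).

```python
# Partial codon table: only the codons needed for the example on the slide.
# A real translator would list all 64 codons of the standard genetic code.
CODON_TABLE = {
    "CCU": "P", "CCA": "P",  # proline
    "GAG": "E", "GAA": "E",  # glutamic acid
    "ACU": "T",              # threonine
    "AUU": "I",              # isoleucine
    "GAU": "D",              # aspartic acid
}


def translate(mrna: str) -> str:
    """Translate an mRNA string codon by codon, starting at position 0."""
    usable = len(mrna) - len(mrna) % 3  # ignore a trailing partial codon
    protein = []
    for i in range(0, usable, 3):
        codon = mrna[i:i + 3]
        # "?" marks codons that are missing from this partial table.
        protein.append(CODON_TABLE.get(codon, "?"))
    return "".join(protein)


print(translate("CCUGAGCCAACUAUUGAUGAA"))  # PEPTIDE
```

Reading the same string from position 1 or 2 instead of 0 would give different codons, which is why the choice of reading frame matters.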
The Problem
Genomic information is digital: a sequence of the letters A, T, C and G
Signal processing deals with numerical sequences, so character strings have to be mapped into one or more numerical sequences
Identification of protein coding regions
Prediction of whether or not a given DNA segment is part of a protein coding region
Prediction of the proper reading frame
Compared with traditional methods, signal processing methods are much quicker and, in some cases, can be even more accurate
$a = 1 + j,\quad t = 1 - j,\quad c = -1 - j,\quad g = -1 + j$
$y[n] = x[n] + x[n-1]/2 + x[n-2]/4$
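As an illustration of the mapping and filtering steps, the sketch below converts a DNA string into a complex sequence and passes it through a three-tap filter. The signs are assumptions of this sketch: it takes a = 1 + j, t = 1 - j, c = -1 - j, g = -1 + j and y[n] = x[n] + x[n-1]/2 + x[n-2]/4, with x[n] treated as zero for n < 0. The names `BASE_TO_COMPLEX`, `dna_to_signal` and `fir_filter` are hypothetical.

```python
# Assumed complex mapping of the four bases (signs are an assumption here).
BASE_TO_COMPLEX = {
    "A": 1 + 1j,
    "T": 1 - 1j,
    "C": -1 - 1j,
    "G": -1 + 1j,
}


def dna_to_signal(seq):
    """Map a DNA string to a list of complex numbers, one per base."""
    return [BASE_TO_COMPLEX[base] for base in seq.upper()]


def fir_filter(x):
    """Compute y[n] = x[n] + x[n-1]/2 + x[n-2]/4 with zero initial conditions."""
    y = []
    for n in range(len(x)):
        prev1 = x[n - 1] if n >= 1 else 0  # x[n-1], zero before the start
        prev2 = x[n - 2] if n >= 2 else 0  # x[n-2], zero before the start
        y.append(x[n] + prev1 / 2 + prev2 / 4)
    return y


signal = dna_to_signal("ATCG")
print(fir_filter(signal))
```

With this mapping, complementary bases (A/T and C/G) are complex conjugates of each other, which is one reason such an assignment is convenient.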
Signal Analysis
Spectral analysis (Fourier transform, periodogram)
Spectrogram
Wavelet analysis
HMT: wavelet-based Hidden Markov Tree
Spectral envelope (using optimal string to numerical value mapping)
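A common use of spectral analysis on DNA is to look for a period-3 component, which tends to be stronger in protein coding regions. The sketch below is one simple way to compute it, not necessarily the method used in these slides: it assumes binary indicator sequences (one per base) rather than a complex mapping, and evaluates the squared DFT magnitude at frequency index N/3. The names `dft_power` and `period3_power` are hypothetical.

```python
import cmath


def dft_power(x, k):
    """Squared magnitude of the DFT of x at frequency index k."""
    n_total = len(x)
    acc = sum(
        x[n] * cmath.exp(-2j * cmath.pi * k * n / n_total)
        for n in range(n_total)
    )
    return abs(acc) ** 2


def period3_power(seq):
    """Sum of the period-3 spectral power over the four base indicator sequences."""
    seq = seq.upper()
    n_total = len(seq) - len(seq) % 3  # trim so that N/3 is an integer
    seq = seq[:n_total]
    k = n_total // 3  # frequency index corresponding to period 3
    total = 0.0
    for base in "ACGT":
        # Indicator sequence: 1 where this base occurs, 0 elsewhere.
        indicator = [1.0 if b == base else 0.0 for b in seq]
        total += dft_power(indicator, k)
    return total


print(period3_power("ATG" * 10))  # strongly periodic: large value
print(period3_power("A" * 30))    # no period-3 structure: close to zero
```

Sliding this measure over a long sequence in fixed-length windows gives a crude spectrogram-like profile in which candidate coding regions appear as peaks.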
Figure panels: (a) 1st section (1000 bp), (b) 2nd section (1000 bp), (c) 3rd section (1000 bp), (d) 4th section (954 bp)
Conjecture: the 4th quarter is actually non-coding
Microarray Reaction
Figure: microarray workflow, showing sample preparation (mRNA target), printing (0.1 nl/spot) and scanning of the microarray (excitation by laser 1 and laser 2, emission)
Image Segmentation
Simple way: fixed circle method
Advanced: fast marching level set segmentation
Figure: segmentation results for the advanced method and the fixed circle method
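The fixed circle method can be sketched as follows: every spot is assumed to be a circle of known center and radius, and the spot intensity is the mean of the pixels inside it. This is a minimal illustration under those assumptions; the function name `fixed_circle_mean`, its parameters and the toy image are hypothetical, and the fast marching level set approach is not shown.

```python
def fixed_circle_mean(image, center_row, center_col, radius):
    """Mean intensity of the pixels whose centers lie inside the given circle.

    image is a list of rows, each row a list of pixel intensities.
    """
    values = []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            # Keep the pixel if it is within `radius` of the circle center.
            if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2:
                values.append(value)
    return sum(values) / len(values) if values else 0.0


# Toy 5x5 image with a small bright spot in the middle.
spot = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 0, 0],
    [0, 9, 9, 9, 0],
    [0, 0, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
print(fixed_circle_mean(spot, 2, 2, 1))  # circle matches the spot
print(fixed_circle_mean(spot, 2, 2, 2))  # circle too large: background dilutes the mean
```

The second call shows the weakness of a fixed circle: when the real spot is smaller or irregular, background pixels are averaged in, which is the motivation for adaptive segmentation.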
Data: 96 samples of normal and malignant lymphocytes
Results: scatter plots of 12 independent components
Comparison: closely related to the results of hierarchical clustering