
GENOMICS

WHAT IS GENOMICS?

• Genomics is the subdiscipline of genetics
devoted to the
– mapping,
– sequencing,
– and functional
analysis of genomes.
WHAT IS GENOMICS?
• How is Genomics different from Genetics?

– Genetics looks at single genes, one at a
time, like a picture or snapshot.

– Genomics looks at the big picture and
examines all the genes as an entire
system.
HISTORY OF GENOMICS
• Genomics can be said to have appeared in
the 1980s, and took off in the 1990s with the
initiation of genome projects for several
biological species.

• The most important tools here are
microarrays and bioinformatics.
HISTORY OF GENOMICS
cont…
• Without a doubt, the introduction of
computers into molecular biology laboratories
was one of the key factors in the
development of genomics.

• Laboratory automation produced large
amounts of data, and the need to analyze,
combine and understand these data
led to the development of “bioinformatics”.
RELATED
DEVELOPMENTS
• Bioinformatics and computational biology
involve the use of techniques including
– applied mathematics,
– informatics,
– statistics,
– computer science,
– artificial intelligence,
– chemistry and biochemistry,
to solve biological problems usually on the
molecular level.
WHAT IS A GENOME?
• The genome broadly refers to the total
amount of DNA of a single cell (haploid cell in
the case of a diploid organism) of an
organism, including its genes.

“The whole hereditary information of an
organism that is encoded in the DNA”
WHAT IS A GENOME?
Genes provide the information for making all
proteins that are necessary for the
expression of characters.

Gene → Protein → Character

Characters refer to how an organism looks, its
physiology, its ability to fight infections and even
its behavior.
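The step from gene to protein can be sketched in a few lines. This is a toy illustration only: the 15-base "gene" is hypothetical, and the codon table is a deliberately partial subset of the standard genetic code covering just the codons used here.

```python
# Partial standard genetic code: only the codons used in this toy example.
CODON_TABLE = {
    "AUG": "M",  # methionine (start)
    "GCC": "A",  # alanine
    "AAA": "K",  # lysine
    "UGG": "W",  # tryptophan
    "UAA": "*",  # stop
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate codon by codon until a stop codon or the end of the mRNA.

    Raises KeyError for codons missing from the partial table above.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_gene = "ATGGCCAAATGGTAA"  # hypothetical sequence, not a real gene
print(translate(transcribe(toy_gene)))  # MAKW
```

A real analysis would use the full 64-codon table; the partial table keeps the sketch short.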
• The genome is found inside every cell, and in
those that have a nucleus, the genome is
situated inside the nucleus. It is made up of
DNA.
• DNA sequencing techniques enable
scientists to determine the exact order or
sequence of the bases of a genome.
• The sequence information of the genome will
show
• the position of every gene along the
chromosome,
• the regulatory regions that flank each
gene,
and
• the coding sequence that determines the
protein produced by each gene.
LOOKING AT A GENOME
• The key question about the genome is how
many genes it contains.
• We can think about the total number of
genes at four levels, corresponding to
successive stages in gene expression:
1. Genome
2. Transcriptome
3. Proteome
4. Proteins
SOME QUICK FACTS ABOUT
GENOMES
• Individual genomes show extensive variation.
• Not all genes are essential.
In yeast and fly,
deletions of <50% of the genes have detectable effects.
QUICK FACTS cont…

• A substantial part of most eukaryotic nuclear
genomes is made up of repetitive DNA.

– Repetitive DNA: individual sequence elements
that are repeated many times over, either in
tandem arrays or interspersed throughout the
genome.

– Single copy DNA: includes most genes,
and is made up of sequences that are not
repeated elsewhere.
QUICK FACTS cont…
• Extrachromosomal genes:

– The mitochondria of eukaryotic organisms, as well as
the chloroplasts of photosynthetic cells,
contain DNA molecules that carry a limited
number of genes.

– These genes code for the RNAs and some of
the proteins required in the organelle.
Genes and Proteins & the role
of Introns
• Introns:
The name derives from the term "intragenic
regions". Introns are non-coding sections of
precursor mRNA (pre-mRNA) that are removed
before the mature mRNA is formed.

• Exons:
Are coding sections that remain in the
mRNA sequence.
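A minimal sketch of splicing, assuming the exon coordinates are already known. The pre-mRNA below is hypothetical; its single intron starts with GU and ends with AG, as most introns do. The function name and the 0-based, end-exclusive coordinate convention are choices made for this illustration.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments (0-based, end-exclusive coordinates) in order.

    Everything between the listed exons (the introns) is discarded.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# exon 1 (6 bases) + intron (10 bases) + exon 2 (9 bases)
pre_mrna = "AUGGCC" + "GUAAGUCCAG" + "AAAUGGUAA"
exons = [(0, 6), (16, 25)]

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCAAAUGGUAA
```

In a cell the exon boundaries are recognized by the splicing machinery; here they are simply supplied as input.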
INTRONS AND EXONS
• Introns are common in eukaryotic pre-mRNA,
but in prokaryotes they are only found in
tRNA and rRNA.
• Unlike introns, exons are coding sections that
remain in the mRNA sequence.
INTRONS AND EXONS
• It is now recognized that introns are
“a complex mix of different DNA, much of
which are vital to the life of the cell”.
• Introns are thought to provide a major selective
advantage and consequently are
characteristic of higher, more developed
organisms.
• The relationship of introns to cancer and
their use as tumor markers is also being
explored.
• Are genes uniformly distributed in
chromosomes?

– Some chromosomes are relatively poor in
genes, and have >25% of their sequences as
“deserts” – regions longer than 500 kb where
there are no genes.
WHY SEQUENCE
GENOMES?
• Because there is a need to put information
about the genomes of flora and fauna in the
context of the fields that they serve.

• Genomic sciences will serve those who
choose genetic modification as a method for
crop improvement as well as those who apply
conventional breeding methods to improve
and develop agricultural practices.
WHY SEQUENCE
GENOMES?
• This information is used by physiologists and
other scientists in research on the
relationships between stress, genes and yield
potential.

• It can also be used to produce sufficient
amounts of safe and nutritious food in times
of increased population growth.
WHY SEQUENCE
GENOMES?
• It can be used to conserve and protect
agricultural and other environments.

• It can serve the farmer/producer under
increasing financial pressure by providing
higher yields through improved varieties.
WHY SEQUENCE
GENOMES?
• We need information and technology to
– improve human health,
– harness natural energy,
– understand and react in a positive
manner to global climate change,
– clean up our environment and
– ensure food safety.
ANTICIPATED BENEFITS
OF GENOME RESEARCH
• Molecular Medicine
• Microbial Genomics
• Risk Assessment
• Bioarchaeology, Anthropology, Evolution and
Human Migration
• DNA Identification (Forensics)
• Agriculture, Livestock Breeding, and
Bioprocessing
GENOME MAPPING
• Genomes can be mapped by,

– Linkage,

– Restriction cleavage, or

– DNA sequence.
LINKAGE MAP
• A genetic / linkage map identifies the
distance between mutations in terms of
recombination frequencies.

• A linkage map can also be constructed by
measuring recombination between sites in
genomic DNA.
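As a rough sketch of how recombination frequency becomes a map distance: the fraction of recombinant offspring between two markers is expressed as a percentage, and 1% recombination is conventionally treated as 1 centimorgan (cM). The cross counts below are invented for illustration, and the linear conversion is only a reasonable approximation for closely linked markers.

```python
def recombination_frequency(recombinants: int, total_offspring: int) -> float:
    """Fraction of scored offspring that are recombinant for two markers."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return recombinants / total_offspring


def map_distance_cm(recombinants: int, total_offspring: int) -> float:
    """Approximate map distance, taking 1% recombination as 1 cM."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    return 100.0 * recombinants / total_offspring


# Hypothetical test cross: 120 recombinant offspring out of 1000 scored.
print(map_distance_cm(120, 1000))  # 12.0
```

For markers far apart, double crossovers make the observed frequency underestimate the true distance, so mapping functions are used instead of this simple conversion.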
RESTRICTION CLEAVAGE
• A restriction map is constructed by
cleaving DNA into fragments and
measuring the distances between the
sites of cleavage.

• Large changes in the genome can be
recognized because they affect the size or
number of restriction fragments. Point
mutations are more difficult to detect.
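The idea can be illustrated with a simulated digest of a short linear molecule. The sequences are invented; the enzyme is EcoRI, which recognizes GAATTC and cuts after the G. The sketch shows that a point mutation outside a recognition site leaves the fragment pattern unchanged, while one that destroys a site changes it.

```python
def cut_positions(dna: str, site: str, cut_offset: int) -> list[int]:
    """Positions (0-based) at which the enzyme cuts a linear DNA molecule."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start + cut_offset)
        start = dna.find(site, start + 1)
    return positions


def fragment_lengths(dna: str, site: str, cut_offset: int) -> list[int]:
    """Lengths of the fragments from a complete digest, in map order."""
    cuts = [0] + cut_positions(dna, site, cut_offset) + [len(dna)]
    return [right - left for left, right in zip(cuts, cuts[1:])]


ECORI_SITE, ECORI_OFFSET = "GAATTC", 1

# Hypothetical 24-base molecules.
wild_type = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
silent_mutant = "AAAT" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"  # change outside any site
site_mutant = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTA" + "TT"    # second site destroyed

print(fragment_lengths(wild_type, ECORI_SITE, ECORI_OFFSET))      # [5, 12, 7]
print(fragment_lengths(silent_mutant, ECORI_SITE, ECORI_OFFSET))  # [5, 12, 7]
print(fragment_lengths(site_mutant, ECORI_SITE, ECORI_OFFSET))    # [5, 19]
```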
DNA SEQUENCE

• Genes can be located by analyzing the
protein-coding potential of the DNA sequence.

• The principle is to obtain a series of
overlapping fragments of DNA, which can
be connected into a continuous map.
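One simple way to look at protein-coding potential is to scan for open reading frames (ORFs): a start codon followed, in the same reading frame, by a stop codon. The sketch below is a simplification under stated assumptions: it scans only the given strand, ignores introns, and the `min_codons` threshold and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3) -> list[tuple[int, int]]:
    """Return (start, end) of ORFs on the given strand, end-exclusive.

    An ORF here is an ATG followed in the same frame by a stop codon,
    spanning at least min_codons codons (start and stop included).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


sequence = "CC" + "ATGGCCAAATGGTAA" + "CC"  # hypothetical sequence
print(find_orfs(sequence))  # [(2, 17)]
```

Real gene finders also scan the reverse strand and, in eukaryotes, must account for introns interrupting the reading frame.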
HUMAN GENOME
PROJECT
The Human Genome

• The human genome is large and complex,
although it is not the largest genome known.

• Stretched out, the DNA of a single human cell
is about 6 feet long. Early estimates put the
number of genes at 30,000 to 40,000.
HUMAN GENOME
PROJECT
The Human Genome

• The DNA is organized into a haploid
chromosomal set of 22 autosomes and one sex
chromosome.
HUMAN GENOME
PROJECT
• The US Human Genome Project was a 13-year
effort coordinated by the
– Department of Energy (DOE) and
– National Institutes of Health (NIH).

• This project was launched in 1986 by Charles
DeLisi and was originally planned to last for
15 years.
HUMAN GENOME
PROJECT
Goals
• Identify all of the approximately 30,000 genes in human DNA.

• Determine the sequences of the 3 billion chemical
base pairs that make up human DNA.

• Store this information in databases.

• Improve tools for data analysis.


HUMAN GENOME
PROJECT
Goals
• Transfer related technologies to the private
sector.

• Address the ethical, legal and social issues
(ELSI) that may arise from the project.
HUMAN GENOME
PROJECT
Milestones
1986  The birth of the Human Genome
      Project.
1990  Project initiated as a joint effort of the US
      Department of Energy and the
      National Institutes of Health.
1994  Genetic Privacy Act proposed, to regulate
      the collection, analysis, storage and use
      of DNA samples and genetic
      information.
HUMAN GENOME
PROJECT
Milestones
1996  Wellcome Trust joins the project.
1998  Celera Genomics formed to
      sequence much of the human
      genome in 3 years.
1999  Completion of the sequence of
      chromosome 22, the first human
      chromosome to be sequenced.
2000  Completion of the working draft of
      the entire human genome.
HUMAN GENOME
PROJECT
Milestones
2001  Analyses of the working draft are
      published.
2003  HGP sequencing is completed
      and the project is declared finished
      two years ahead of schedule.
HUMAN GENOME
PROJECT
• Whose DNA is being sequenced?
– Samples of blood (female) and
sperm (male) were collected from a large number of people.

– Celera Genomics collected samples from
individuals who were Hispanic, Asian,
Caucasian, and African-American.

– The donor identities were protected.


HUMAN GENOME
PROJECT
• The first step towards sequencing the
genome is creating maps.

• Maps are of various types:


– Genetic Linkage Maps

– Physical Maps

– Contig Maps
HUMAN GENOME
PROJECT
Sequencing
• Chromosomes are broken down into much
shorter pieces.
• Each short piece is used as a template to
generate a set of fragments.
• The fragments in a set are separated by gel
electrophoresis.
• The final base at the end of each fragment is
identified.
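The logic of the last three steps can be sketched as follows. This is a simplified model, assuming one fragment per possible termination point: each fragment is represented only by its length and its final (labelled) base, and sorting by length stands in for gel electrophoresis. The function names and the example sequence are hypothetical.

```python
import random


def terminated_fragments(new_strand: str) -> list[tuple[int, str]]:
    """One (length, final base) pair for every possible termination point."""
    fragments = [(n, new_strand[n - 1]) for n in range(1, len(new_strand) + 1)]
    random.shuffle(fragments)  # the reaction yields an unordered mixture
    return fragments


def read_sequence(fragments: list[tuple[int, str]]) -> str:
    """Order fragments by length (as a gel does) and read each final base."""
    return "".join(base for _, base in sorted(fragments))


mixture = terminated_fragments("GATTACA")
print(read_sequence(mixture))  # GATTACA
```

Because every length from 1 to the full read is represented once, reading the final bases from shortest to longest fragment recovers the sequence of the synthesized strand.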
HUMAN GENOME
PROJECT
Sequencing
• Automated sequencers analyze the resulting
electropherograms, giving the output as a
four-colour chromatogram.

• After the bases are “read”, computers are
used to assemble the short sequences into
long continuous stretches.
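A toy version of the assembly step is a greedy overlap merge: repeatedly join the two reads with the longest suffix-to-prefix overlap. This is only a sketch of the idea, not how production assemblers work; the reads are invented, error-free, and free of repeats, and `min_length` is an arbitrary threshold chosen for the example.

```python
def overlap(left: str, right: str, min_length: int = 3) -> int:
    """Length of the longest suffix of left that is a prefix of right."""
    for size in range(min(len(left), len(right)), min_length - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_length: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(reads)
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, left in enumerate(contigs):
            for j, right in enumerate(contigs):
                if i != j:
                    size = overlap(left, right, min_length)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical overlapping reads from an 18-base region.
reads = ["ACGTTGCAAG", "GCAAGGCTCA", "GCTCATTG"]
print(greedy_assemble(reads))  # ['ACGTTGCAAGGCTCATTG']
```

Repetitive DNA is what makes real assembly hard: identical repeats create ambiguous overlaps that a greedy strategy can join incorrectly.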
A CLOSER LOOK AT THE
HUMAN GENOME
• The human genome contains 3164.7 million
nucleotide bases (approx. 3 billion A, C, T
and G).

• The average gene is made up of 3000
bases, but sizes of genes vary greatly.
A CLOSER LOOK AT THE
HUMAN GENOME
• The total number of genes is estimated at
around 30000.

• Almost all (99.9%) nucleotide bases are
exactly the same in all people.
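To put the 99.9% figure in perspective, a short calculation using the genome size of 3164.7 million bases given for the human genome shows how many positions the remaining 0.1% represents. Integer arithmetic in parts per thousand is used to avoid rounding noise; the constant and function names are chosen for this illustration.

```python
GENOME_SIZE_BASES = 3_164_700_000  # 3164.7 million bases
SHARED_PER_THOUSAND = 999          # 99.9% identical between people


def differing_bases(genome_size: int, shared_per_thousand: int) -> int:
    """Number of base positions expected to differ between two people."""
    return genome_size * (1000 - shared_per_thousand) // 1000


print(differing_bases(GENOME_SIZE_BASES, SHARED_PER_THOUSAND))  # 3164700
```

So even a 0.1% difference corresponds to roughly three million base positions.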
A CLOSER LOOK AT THE
HUMAN GENOME
• Less than 2 % of the genome codes for
protein.
A CLOSER LOOK AT THE
HUMAN GENOME
• Repeated sequences that do not code for
proteins (“junk DNA”) make up at least 50%
of the genome.

• Repetitive sequences fall into five classes:
1. Transposons
2. Processed pseudogenes
3. Simple sequence repeats
4. Segmental duplications
5. Tandem repeats (blocks of one type of sequence)
A CLOSER LOOK AT THE
HUMAN GENOME
• The sequence of the human genome
emphasizes the importance of
transposons.

• Most of the transposons in the human
genome are nonfunctional; very few are
currently active.

• They have played an active role in
shaping the genome.
A CLOSER LOOK AT THE
HUMAN GENOME
• Some present genes originated as
transposons, and evolved into their present
condition after losing the ability to transpose.

• Almost 50 genes appear to have originated
like this.
A CLOSER LOOK AT THE
HUMAN GENOME
• The human genome’s gene-dense “urban
centres” are predominantly composed of C
and G bases.

• The gene-poor “deserts” are rich in A and T
bases.

• Genes appear to be concentrated in random
areas along the genome, with vast expanses
of non-coding DNA in between.
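Base composition of this kind is usually summarized as GC content, the fraction of bases that are G or C, measured in windows along the sequence. A minimal sketch, using an invented 20-base sequence and an arbitrary window size; real analyses use windows of thousands of bases:

```python
def gc_fraction(sequence: str) -> float:
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def gc_by_window(sequence: str, window: int) -> list[float]:
    """GC fraction of consecutive, non-overlapping windows.

    A final window shorter than `window` is dropped.
    """
    return [
        gc_fraction(sequence[i:i + window])
        for i in range(0, len(sequence) - window + 1, window)
    ]


# Hypothetical sequence: a GC-rich stretch followed by an AT-rich stretch.
toy = "GCGCGGCCGC" + "ATATTAAATA"
print(gc_by_window(toy, 10))  # [1.0, 0.0]
```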
A CLOSER LOOK AT THE
HUMAN GENOME
• Stretches of up to 30000 G and C bases
repeating over and over occur adjacent to
gene-rich areas, forming a barrier between
the genes and the “junk” DNA.

• Chromosome 1 has the most
genes (2968) and the Y chromosome the fewest
(231).
SEQUENCING THE
GENOMES OF OTHER
ORGANISMS
• The genomes of many organisms have been
sequenced, and sequencing is still being carried out at a
rapid pace.

• There are many medical, genetic and
commercial reasons for sequencing the
genomes of various organisms.
• As of September 2007, complete
sequences were known for

– 1879 viruses,

– 577 bacterial species, and

– roughly 23 eukaryotic species (of which
about half are fungi).
PROKARYOTIC
GENOMES
Escherichia coli Genome

• Escherichia coli is considered a model bacterium.

• It is a model organism for studying many essential
processes of life.

• The genome of E. coli strain K-12 has been
sequenced.
Escherichia coli Genome

• 4639 kb in length
• Comprises approximately 4288 genes.
– 1897 genes coding for known proteins
– 397 unidentified open-reading frames
• The 4288 genes take up approximately 80%
of the DNA molecule with the remaining 20%
being made up of intergenic regions.
• Note on the genomes of other bacteria:
– The basic features of gene organization, with
numerous operons but few repeated genes,
appear to be the same in all bacteria.
– Many of the bacteria with larger genomes
have more complex life cycles.
– The bacteria with the smallest genomes
are mostly obligate parasites.
Escherichia coli Genome
• Benefits of the genome project:
– To study the role of small proteins found in E. coli.
– To identify proteins that are crucial to E. coli.
– To infer, by comparing genes, how
particular genes originated.
– To compare the sequence of E. coli strain K-12
with that of the deadly E.
coli strain O157:H7, which has also
been sequenced.
EUKARYOTIC GENOMES
Saccharomyces cerevisiae Genome

• One of the most important fungal organisms
used in biotechnological processes.
• Considered a model eukaryotic
organism.
• The first eukaryotic organism to have its
entire genome sequenced.
Saccharomyces cerevisiae Genome

• 16 chromosomes (n)
• Approximate genome size – 12068 kb
• 5885 potential protein-coding genes.
Caenorhabditis elegans (Nematode)
Genome

• An often used simple model for
multicellular organisms.
• 19099 known and predicted genes.
• One gene per 5076 bp.
Drosophila melanogaster (Fruit Fly)
Genome

• Has been the most important tool for genetics
studies in the twentieth century.
• Second multicellular organism to have its
genome sequenced.
• Genome is about 180 Mb in size
• 4 pairs of chromosomes (2n = 8)
• 13601 predicted genes
Drosophila melanogaster (Fruit Fly)
Genome

• Interestingly, the Drosophila genome
contains genes that are similar to 177 of
289 human genes that are responsible for
diseases.
Arabidopsis thaliana (Thale / Mouse
Ear Cress) Genome

• Used as a model plant in plant research.
• This was the first plant genome to be completely
sequenced.
• 10 chromosomes (2n).
Arabidopsis thaliana (Thale /
Mouse Ear Cress) Genome

• Spans 125 Mb
• Contains a total of 25498 genes, which code for
11601 proteins
– Of these proteins, 35% are unique to plants
• Of the total genes, 9% were classified
experimentally, while 30% were
unclassified.
• At least 70% of the genes are duplicated.
Arabidopsis thaliana (Thale / Mouse
Ear Cress) Genome
• Impacts of plant genetics and research:
– Greatly simplify the process of forward
genetics.
– Study of human diseases.
– Improve food crops.
Oryza sativa L. (rice) Genome

• One of the most important food crops in the
world.
• Scientists use rice as a model plant in
cereal genomics.
• 24 chromosomes (2n).
Organism                     Type   Genome size   Number of genes predicted

Oryza sativa ssp. indica     Rice   466 Mb        46022 – 55615
Oryza sativa ssp. japonica   Rice   420 Mb        32000 – 50000
Mus musculus (Laboratory Mouse)
Genome
• The sequence of the mouse genome is
important for understanding the contents of
the human genome and it also serves as a
key experimental tool for biomedical
research.

• 20 pairs of chromosomes (2n = 40)
Mus musculus (Laboratory Mouse)
Genome
• The draft sequence was generated by
assembling the sevenfold sequence
coverage from female mice of the B6 strain.

• Genome size is 2.5 Gb.
• Seems to contain about 30000 protein-coding
genes.
COMPARING THE GENOMES
OF DIFFERENT ORGANISMS
• Why Compare?
– Need to better understand the individual
genomes
– To understand the functioning of individual
genes
– To derive a comparative study of basic
functions
– To better understand evolutionary processes
COMPARATIVE CEREAL
GENOMICS
• Rice is now considered as the model system
for studying cereal genomics because of its
relationship to other cereals.

• It is extremely difficult to carry out studies at the
molecular level in wheat, maize, barley and rye.

• Gene content in rice is comparable to that of other
grass plants, and gene order and sequences
have been conserved during evolution.
COMPARATIVE CEREAL
GENOMICS
• The conservation of the order of genes
(Synteny) allows the identification of linkages
between plant families through their
genomes.

• This will provide information to understand
the structure and evolution pattern of genes
and genomes.
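Conserved gene order can be detected by comparing the order of shared genes along two genomes. The sketch below finds runs of genes that are adjacent and in the same order in both lists. It rests on several assumptions: the gene names are hypothetical, each gene appears once per genome, and only blocks in the same orientation are reported (inverted blocks are missed).

```python
def syntenic_blocks(order_a: list[str], order_b: list[str],
                    min_genes: int = 2) -> list[list[str]]:
    """Runs of genes that are adjacent and in the same order in both genomes."""
    position_b = {gene: index for index, gene in enumerate(order_b)}
    blocks = []
    current = []
    for gene in order_a:
        extends = (
            bool(current)
            and gene in position_b
            and position_b[gene] == position_b[current[-1]] + 1
        )
        if extends:
            current.append(gene)
            continue
        # The run is broken: keep it if long enough, then start a new one.
        if len(current) >= min_genes:
            blocks.append(current)
        current = [gene] if gene in position_b else []
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


# Hypothetical gene orders along one rice and one wheat chromosome.
rice = ["g1", "g2", "g3", "g4", "g5", "g6"]
wheat = ["g4", "g5", "g6", "x1", "g1", "g2", "g3"]
print(syntenic_blocks(rice, wheat))  # [['g1', 'g2', 'g3'], ['g4', 'g5', 'g6']]
```

In this example the two blocks are conserved internally even though their order along the chromosome has been rearranged.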
COMPARATIVE CEREAL
GENOMICS
• Through the identification of different loci, the
existence of synteny will assist in
isolating an important gene in the small
genome of rice
and
using it as a probe to isolate the
corresponding homologue in a plant
with a much larger genome such as
wheat or maize.
COMPARATIVE CEREAL
GENOMICS
• Such studies will
– Make an important contribution to cereal
breeding.
– Be the basis for maximizing the breeding
potential of all cereal plants.

• Rice will serve as the “reference”
genome for comparative studies and a
donor of genes for biotech manipulations.
COMPARATIVE
ANALYSIS OF THE
HUMAN AND MOUSE
GENOMES
• The mouse genome is 14% smaller than the
human genome.
• At the nucleotide level, approximately 40% of
the human genome can be aligned to the
mouse genome.
• The mammalian genome is evolving in a
non-uniform manner.
• The mouse and human genomes seem to
contain about 30000 protein-coding genes.
• Mouse-human sequence comparisons
allow an estimate of the rate of protein
evolution in mammals.
• Similar types of repeat sequences have
accumulated in the corresponding genomic
regions in both species.
GENERAL GENOMIC
COMPARISONS
• Unlike the human’s seemingly random
distribution of gene-rich areas, many other
organisms’ genomes are more uniform, with
genes evenly spaced throughout.

• Although humans appear to have stopped
accumulating repeated DNA over 50 million
years ago, there seems to be no such decline
in rodents.
GENERAL GENOMIC
COMPARISONS
Organism                             Genome Size (Bases)   Estimated Genes
Human (Homo sapiens)                 3 billion             30,000
Laboratory mouse (M. musculus)       2.6 billion           30,000
Thale cress (A. thaliana)            100 million           25,000
Roundworm (C. elegans)               97 million            19,000
Fruit fly (D. melanogaster)          137 million           13,000
Yeast (S. cerevisiae)                12.1 million          6,000
Bacterium (E. coli)                  4.6 million           3,200
Human immunodeficiency virus (HIV)   9700                  9
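One way to read the comparison is through gene density, the number of genes per million bases (Mb). The sketch below takes the cellular organisms from the table and uses the table's own rounded figures, so the results are rough; the dictionary and function names are chosen for this illustration.

```python
# (genome size in bases, estimated genes), rounded figures from the table
GENOMES = {
    "Human (H. sapiens)": (3_000_000_000, 30_000),
    "Laboratory mouse (M. musculus)": (2_600_000_000, 30_000),
    "Thale cress (A. thaliana)": (100_000_000, 25_000),
    "Roundworm (C. elegans)": (97_000_000, 19_000),
    "Fruit fly (D. melanogaster)": (137_000_000, 13_000),
    "Yeast (S. cerevisiae)": (12_100_000, 6_000),
    "Bacterium (E. coli)": (4_600_000, 3_200),
}


def genes_per_megabase(size_bases: int, genes: int) -> float:
    """Average number of genes per million bases of genome."""
    return genes / (size_bases / 1_000_000)


# List the organisms from lowest to highest gene density.
ranked = sorted(GENOMES, key=lambda name: genes_per_megabase(*GENOMES[name]))
for name in ranked:
    print(f"{name:32s} {genes_per_megabase(*GENOMES[name]):7.1f} genes per Mb")
```

With these figures the human genome comes out at about 10 genes per Mb and E. coli at nearly 700, which reflects how much of the larger genomes is non-coding DNA.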
GENOMES AND
EVOLUTION
• Comparing the human genome
sequence with sequences found in other
species reveals much about the process of
evolution.

• Comparisons of different genomes show a
steady increase in gene number as
additional genes are added to make
eukaryotes, make multicellular organisms,
make animals, and make vertebrates.
GENOMES AND
EVOLUTION
• Most of the genes that are unique to
vertebrates are concerned with the immune
or nervous system.
GENOMES AND
EVOLUTION
• We see, therefore, that the progression from
bacteria to vertebrates requires addition of
groups of genes representing the necessary
new functions at each stage.
GENOMES AND
EVOLUTION
• Comparing the human proteome in more
detail with proteomes of other organisms,
– 46% of the yeast proteome,
– 43% of the worm proteome, and
– 61% of the fly proteome
is represented in the human proteome.

• A key group of approx. 1300 proteins is
present in all four proteomes.
• The common proteins are housekeeping
proteins required for essential functions.
ISSUES OF CONCERN
Ethical, Legal and Social issues of
the Human Genome Project
• Fairness in the use of genetic information.
• Privacy and confidentiality of genetic
information.
• Psychological impact, stigmatization, and
discrimination.
• Reproductive issues.
• Clinical issues.
• Uncertainties associated with gene tests for
susceptibilities and complex conditions.
ISSUES OF CONCERN
Ethical, Legal and Social issues of
the Human Genome Project
• Fairness in access to advanced genomic
technologies.
• Conceptual and philosophical implications.
• Health and environmental issues.
• Commercialization of products.
• Education, Standards, and Quality control.
• Patent issues.
ISSUES OF CONCERN
• Some questions to consider:
– Who should have access to your genetic
information?

– How does knowing their predisposition to
disease affect an individual?

– Should screening be done when there is
no treatment available?
ISSUES OF CONCERN
Example situation:

Human Gene Prospecting in Iceland


• The human gene pool of Iceland is much
more homogeneous than the gene pools of
most other populations.
Human Gene Prospecting in Iceland cont.
• Iceland’s national health service has kept
superb medical records since 1915.

• deCODE Genetics, a private company, has an
exclusive license from the government of
Iceland to construct and analyze a genetic
database derived from the country’s health
records.
Human Gene Prospecting in Iceland cont.
• deCODE Genetics has a deal with the Swiss
pharmaceutical giant Hoffmann-La Roche.

• The contract specifies that Hoffmann-La Roche
must provide the people of Iceland,
free of charge, with all drugs, diagnostic tests and
other products resulting from this research.
Human Gene Prospecting in Iceland cont.

• The key issue in the ongoing debate in
Iceland is the question of presumed consent
and informed consent.
ISSUES OF CONCERN
“Will the poor get poorer whilst losing their valuable
genetic resources, and the rich get richer?
Will the exorbitantly expensive research into genomics
and proteomics limit this powerful technology to the rich
only?
Can we manage without this technology?
Will someone sequence the tea genome, for example?
Should we not do it first, being so famous for Ceylon
tea?
Can we not use this technology to identify and isolate
those wonderful genes from our traditional rice varieties
and transfer them to present day varieties?
Can we not manipulate the rubber genome to produce
better quality rubber and lead the world market?”
THE FUTURE
• Future Challenges – What we still don’t know
– Beyond the HGP:
– Gene number, exact locations, and functions
– Gene regulation
– DNA sequence organization
– Chromosomal structure and organization
– Noncoding DNA types, amount, distribution, information content,
and functions
– Coordination of gene expression, protein synthesis, and post-
translational events
– Interaction of proteins in complex molecular machines
– Predicted vs experimentally determined gene function
– Evolutionary conservation among organisms
THE FUTURE
• Future Challenges – What we still don’t know
– Beyond the HGP:
– Protein conservation (structure and function)
– Proteomes (total protein content and function) in organisms
– Correlation of SNPs (single-base DNA variations among
individuals) with health and disease
– Disease-susceptibility prediction based on gene sequence
variation
– Genes involved in complex traits and multigene diseases
– Complex systems biology including microbial consortia useful for
environmental restoration
– Developmental genetics, genomics
THE FUTURE
Functional Genomics

Transcriptomics

Proteomics

Structural Genomics

Experimental methodologies and Comparative
Genomics
THE FUTURE
Genomes to Life:
A DOE Systems Biology Program
Exploring Microbial Genomes for
Energy and the Environment

HapMap:
Chart genetic variation within the human genome
THE FUTURE
• What does the future hold for us?
• How far will this new science take us?
• What will become the boundary of man?
THE FUTURE

“We still do not have in our hands the
answer to a most fundamental
question:
What makes us human…?”
The staff of the Department of Biotechnology of
the Faculty of Agriculture and Plantation
Management.
Prof. D.P.S.T.G. Attanayake, B.Sc. (Agric)
(Peredeniya), Ph.D. (Birm)
Prof. E.R.K. Perera, B.Sc. Hons. (Agric), M.Sc.
(VPI & SU, USA), Ph.D. (VPI & SU, USA),
Postdoctoral Animal Biotechnology (VPI &
SU, USA)
Prof. Athula Perera, B.Sc. (Agric) (Sri Lanka),
M.Sc. (Japan), Ph.D. (UK), Postdoctoral training
– Biotechnology & Biosafety (UK, USA,
GROUP MEMBERS
Ikram Mohideen 066020
Nimhani Perera 066119
Kokila Harshanie 066092
Janaki Gunathilake 066016
Damith Dharmarathne 066066
Janaka Sampath 066094
Pubudu Gokarella 066013
Nushri Jamal Mohamed 066115
Thilini Priyadharshani 066049
Sandun Ranasinghe 066084