
5/27/2013

Genome

TOPIC 1 GENOME STUDIES

The genome is the total genetic information possessed by an organism, present in every cell, tissue, and organ of the body. Every cell contains a complete copy of the instructions, written in the four-letter language of DNA (A, C, T, G). If the genome (DNA molecule) of a typical bacterium were fully extended, it would be about 2 mm long. In comparison, the diameter of the bacterium itself is only about 0.001 mm.
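The length comparison above can be checked with a short calculation. This is a minimal sketch, assuming the standard textbook rise of 0.34 nm per base pair for B-form DNA and an E. coli-sized genome of 4.6 million base pairs; neither figure is stated on this slide, and the function name is purely illustrative.

```python
# Rise per base pair in B-form DNA (assumed textbook value, not from the slide).
RISE_PER_BP_NM = 0.34


def contour_length_mm(base_pairs: int) -> float:
    """Length of a fully extended double-stranded DNA molecule, in millimetres."""
    return base_pairs * RISE_PER_BP_NM * 1e-6  # convert nm to mm


genome_bp = 4_600_000       # assumed: an E. coli-sized bacterial genome
cell_diameter_mm = 0.001    # bacterial diameter quoted in the text

length_mm = contour_length_mm(genome_bp)
print(f"Extended genome length: {length_mm:.2f} mm")
print(f"Length relative to cell diameter: {length_mm / cell_diameter_mm:.0f}x")
```

Under these assumptions the extended DNA comes to about 1.6 mm, the same order of magnitude as the 2 mm quoted above and more than a thousand times the diameter of the cell.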

Dr Choo QC (TOPIC 1)


Genome
The amount of protein sequence information in a cell cannot be easily estimated from its genome size because:
(a) Not all DNA codes for proteins
- Introns
- Regulatory regions (e.g. promoters)
(b) Some genes exist in multiple copies
(c) Genes can be alternatively spliced

Genome of Selected Organisms


Organism                    Number of genes (approx.)   Number of base pairs (× 10^6)
Escherichia coli            4,500                       4.6
Saccharomyces cerevisiae    6,000                       12.1
Caenorhabditis elegans      19,000                      95.5
Arabidopsis thaliana        25,000                      117
Drosophila melanogaster     13,000                      180
Homo sapiens                25,000                      3,200
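The table shows why genome size is a poor guide to gene number. As a rough sketch, gene density (genes per million base pairs) can be computed from the approximate figures above. The Drosophila gene count is taken here as roughly 13,000, which is an assumption of this sketch, and the variable names are illustrative only.

```python
# (approximate number of genes, genome size in millions of base pairs)
genomes = {
    "Escherichia coli": (4500, 4.6),
    "Saccharomyces cerevisiae": (6000, 12.1),
    "Caenorhabditis elegans": (19000, 95.5),
    "Arabidopsis thaliana": (25000, 117),
    "Drosophila melanogaster": (13000, 180),  # assumed approximate count
    "Homo sapiens": (25000, 3200),
}

# Gene density: genes per million base pairs (Mb).
density = {name: genes / size_mb for name, (genes, size_mb) in genomes.items()}

for name, value in density.items():
    print(f"{name:26s} {value:8.1f} genes per Mb")
```

With these numbers E. coli carries close to 980 genes per Mb while the human genome carries about 8, even though the human genome is roughly 700 times larger.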


Genomics
Genomics: the study of genomes
Involves large data sets (e.g. 3 billion base pairs for the human genome)
Uses high-throughput methods (fast methods for data collection)

Genomics
Genomics studies include:
- DNA sequencing
- Genomic library construction
- PCR amplification and cloning
- Hybridization techniques


Genomics
- Genome variation within a population
- Transcriptional control of genes
- Proteome (complete protein content of a cell or organism at a given time)

Histones: DNA-binding proteins
Chromatin: DNA-histone complexes
Nucleosome: eight histone proteins form the core octamer around which DNA is coiled; linker histones act as a clamp that prevents the coiled DNA from detaching

Genes
An ordered sequence of nucleotides that encodes a specific product
Physical and functional units of heredity


Genes
May be turned on or off (by regulatory mechanisms) in response to:
- The environment, e.g. concentration of nutrients and stress
- Development of the organism
Bacterial genomes may also have operons: contiguous sets of several genes whose products catalyze successive steps of a biochemical pathway

Genes
Genes comprise only about 2% of the human genome
The remainder consists of non-coding regions
Functions of non-coding regions: providing chromosomal structural integrity and regulating where, when, and in what quantity proteins are made
The human genome is estimated to contain ~25,000 genes

[Figure: chromosome 22]

The genome as a book:
There are 23 chapters, called CHROMOSOMES
The chapters are bound together by what are called FOLDINGS
Each chapter contains several thousand stories, called GENES
Each story is made up of paragraphs, called EXONS, which are interrupted by advertisements, called INTRONS
Each paragraph is made up of words, called CODONS
Each word is written in letters, called BASES, which are glued together with BONDS
Together, these make up the GENOME


Proteins
Large, complex molecules made up of chains of small chemical compounds called amino acids
Perform most life functions
Make up the majority of cellular structures

Proteins
A nucleotide sequence can be translated into an amino acid sequence using the universal genetic code
Chemical properties distinguish the 20 different amino acids
These properties cause the protein chains to fold into specific three-dimensional structures that define their particular functions in the cell

Why Study Proteins?


Genomes: information
Proteins: action

Proteins rely on their regular three-dimensional structure for function
They must have the right shape and chemistry to carry out their biological role
This means bringing amino acids together not only in a particular sequence, but also in a particular spatial relationship

Amino Acids
Monomeric building blocks of proteins
Joined by a covalent bond (the peptide bond), formed by dehydration synthesis between the carboxyl group of one amino acid and the amino group of the next
Twenty different amino acids
- Same general structure
- Differ in side chain (R group)
- All organisms have the same set of 20
Different activities and shapes of proteins are due to different amino acid sequences

What Amino Acids Look Like


[Figure: general structure of an amino acid. A central carbon atom carries an amino group, a carboxyl group, a hydrogen atom, and a side chain (R).]

Different side chain (R group): different chemical and physical properties

Three Classes of Amino Acids - Classification Based on Polarity


(A) Nonpolar (hydrophobic)
(B) Polar uncharged
(C) Charged: basic and acidic

Primary Structure

Determines 2°, 3°, and 4° structures

THREE-DIMENSIONAL STRUCTURES OF PROTEINS

EXAMPLE: Sickle cell anemia - Single amino acid change in hemoglobin related to disease


Sickle Cell Hemoglobin (HbS)
Caused by a point mutation (a single A to T change in the DNA)
Amino acid substitution (Glu to Val) at the 6th position of the 146-amino-acid β-globin chain
Low-oxygen conditions cause HbS to aggregate into rod-shaped polymers
This distorts the red blood cell (RBC) into a sickle shape

Normal mRNA      GUG CAC CUG ACU CCU GAG GAG AAG
Normal protein   val his leu thr pro GLU glu lys
Position          1   2   3   4   5   6   7   8

Mutation (in DNA)

Mutant mRNA      GUG CAC CUG ACU CCU GUG GAG AAG
Mutant protein   val his leu thr pro VAL glu lys
Position          1   2   3   4   5   6   7   8

NOTE: Glu is a negatively charged amino acid; it is replaced by Val, which has no charge
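The effect of the HbS point mutation can be reproduced by translating the two mRNA fragments codon by codon. The sketch below uses only the seven codons that appear in these fragments (a small subset of the standard genetic code); the function names are illustrative and not part of any library.

```python
# Subset of the standard genetic code covering the codons in the fragments.
CODON_TABLE = {
    "GUG": "Val", "CAC": "His", "CUG": "Leu", "ACU": "Thr",
    "CCU": "Pro", "GAG": "Glu", "AAG": "Lys",
}


def translate(mrna: str) -> list[str]:
    """Translate a space-separated string of mRNA codons into amino acids."""
    return [CODON_TABLE[codon] for codon in mrna.split()]


def differences(a: list[str], b: list[str]) -> list[tuple[int, str, str]]:
    """Return (1-based position, residue in a, residue in b) where a and b differ."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


normal_mrna = "GUG CAC CUG ACU CCU GAG GAG AAG"
mutant_mrna = "GUG CAC CUG ACU CCU GUG GAG AAG"

normal_protein = translate(normal_mrna)
mutant_protein = translate(mutant_mrna)

print("Normal:", "-".join(normal_protein))
print("Mutant:", "-".join(mutant_protein))
print("Changes:", differences(normal_protein, mutant_protein))
```

The only difference reported is at position 6, where Glu becomes Val, matching the substitution described for HbS.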

Primary Structure
The amino acid sequence of the polypeptide chain
Primary structure determines final shape and function

2° Structure
(i) α-helix
(ii) β-sheet
(iii) Loops and turns

Secondary Structure
Repeated coiling or folding of the polypeptide, stabilized by hydrogen bonding
Local description of structure
Major types:
- α-helix
- β-sheets
- Loops and turns

α-Helix
The carbonyl oxygen (C=O) of residue n hydrogen-bonds with the amide hydrogen (N-H) of the residue 4 amino acids further along the chain (n + 4)
A common secondary structure in both fibrous and globular proteins
Side chains point outwards from the helix
Amino acids with bulky side chains are less common in α-helices
Glycine and proline destabilize the α-helix

β-Strand and β-Sheet

Strands may be parallel or antiparallel
Antiparallel β-sheets are more stable
Side chains point alternately above and below the plane of the β-sheet
β-sheets are common motifs in proteins

Loops and turns = Non-repetitive structure

Loops
- Usually contain hydrophilic residues
- Connect α-helices and β-sheets
Turns
- Loops with < 5 amino acids are called turns
- Allow the peptide chain to reverse direction
- Proline and glycine are prevalent in β-turns

3° Structure
Third level of protein organization
3-D arrangement

Types of tertiary interactions
(A) Bonds: covalent, ionic, hydrogen
(B) Hydrophobic interactions

4° Structure
Describes the organization of subunits in a protein with multiple subunits
Subunits are held together by non-covalent interactions

Proteins
Much new protein sequence data is now determined by translation of DNA sequences rather than by direct sequencing of proteins (an expensive procedure)
However, a protein sequence translated from a genome sequence remains hypothetical until it is verified experimentally

Proteomics
Proteome: the complete set of proteins produced within a cell
Proteomics: the study of proteins
The proteome of an organism changes depending on environmental stimuli (such as heat shock) and growth
The rate of synthesis of different proteins varies among tissues, cell types, and states of activity

Picking out genes from genomes


Bioinformatics software can assist scientists in finding novel genes in a genome
The software identifies open reading frames (ORFs): regions of DNA sequence that begin with an initiation codon (ATG) and end with a stop codon
An ORF is a potential protein-coding region
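The ORF definition above can be sketched in a few lines. This is a deliberately simplified illustration, not how any particular gene-finding program works: it scans only the given strand (real tools also scan the reverse complement), uses the three standard stop codons TAA, TAG, and TGA, and the function name and `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int, str]]:
    """Find ORFs (ATG ... stop) in the three forward reading frames.

    Returns (start, end, sequence) tuples with 0-based start and exclusive end.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return orfs


example = "CCATGAAATTTTAGGG"  # toy sequence containing one short ORF
for start, end, seq in find_orfs(example):
    print(start, end, seq)
```

In practice a minimum length far larger than two codons is used, because short ATG-to-stop stretches occur frequently by chance.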

Genome Sequencing Projects

1977 First viral genome: Sanger et al. sequence bacteriophage φX174. This virus has 5386 base pairs (encoding 11 genes). Note: Accession J02482
1981 Human mitochondrial genome: 16,500 base pairs
1986 Chloroplast genome: 156,000 base pairs (most are 120 kb to 200 kb)
1995 First genome of a free-living organism: the bacterium Haemophilus influenzae


1997 More bacteria and archaea: Escherichia coli, 4.6 Mb; 4200 proteins (38% of unknown function)
1998 First multicellular organism: the nematode Caenorhabditis elegans, 97 Mb; 19,000 genes
1999 First human chromosome: chromosome 22 (49 Mb, 673 genes)

Significance and Importance of Genome Studies

Human Genome Project (and others)

Potential benefits
(A) Molecular medicine:
- Improved, more accurate diagnosis of disease
- Earlier detection of genetic predispositions to disease
- Ability to assess risk for certain diseases, e.g. cancer, Type II diabetes, heart disease
- Drugs designed to target specific gene products that cause disease
- Gene therapy and control systems for drugs
- Replacement of defective genes for certain diseases
- Pharmacogenomics ("custom drugs"): drug therapy based on genotype

Human Genome Project (and others)

(B) Archaeology, anthropology, evolution, and human migration
- Study evolution through germline mutations in lineages
- Study migration of different population groups based on female genetic inheritance
- Study mutations on the Y chromosome to trace lineage and migration of males
- Compare breakpoints in the evolution of mutations with ages of populations and historical events

Human Genome Project (and others)

(C) DNA forensics (identification)
- Identify potential suspects whose DNA may match evidence left at crime scenes
- Exonerate persons wrongly accused of crimes
- Identify crime and catastrophe victims
- Establish paternity and other family relationships
- Identify endangered and protected species as an aid to wildlife officials (could be used for prosecuting poachers)
- Detect bacteria and other organisms that may pollute air, water, soil, and food
- Determine pedigree for seed or livestock breeds

Human Genome Project (and others)

(D) Agriculture, livestock breeding, and bioprocessing
- Disease-, insect-, and drought-resistant crops
- Healthier, more productive, disease-resistant farm animals
- More nutritious produce
- Biopesticides
- Edible vaccines incorporated into food products
- New environmental cleanup uses for plants

Genomes of Prokaryotes

Gene Regulation in Bacteria

The Operon


The Operator

Operons in E. coli


(a) Lac operon (Inducible Operon)


[Figure: the lac operon. A regulatory gene (lacI) lies upstream of the promoter, the operator, and the structural genes lacZ, lacY, and lacA, which encode β-galactosidase, permease, and transacetylase.]
(a) Lactose absent, repressor active, operon off: no RNA made
(b) Lactose present, repressor inactive, operon on: allolactose (inducer) binds the repressor and inactivates it, and RNA polymerase transcribes the operon

(b) Trp operon (Repressible Operon)


A repressible operon that is on by default
Turns off only in the presence of its end product, the amino acid tryptophan
Encodes enzymes for the production of tryptophan
The structural genes within the trp operon code for repressible enzymes

[Figure: the trp operon]
(a) Tryptophan absent, repressor inactive, operon on

Inducible & Repressible Enzyme


Inducible enzymes usually function in catabolic pathways
- Synthesis is induced by a chemical signal
Repressible enzymes usually function in anabolic pathways
- Synthesis is repressed by high levels of the end product
Regulation of the trp and lac operons involves negative control of genes, because the operons are switched off by the active form of the repressor
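The negative-control logic of the two operons can be written as a pair of small functions. This is a simplified sketch that models only the repressor, as described in these slides; it ignores other layers of regulation, and the function names are illustrative.

```python
def lac_operon_on(lactose_present: bool) -> bool:
    """Inducible operon: the repressor is active unless the inducer is present."""
    # Allolactose (derived from lactose) binds the repressor and inactivates it.
    repressor_active = not lactose_present
    return not repressor_active


def trp_operon_on(tryptophan_present: bool) -> bool:
    """Repressible operon: the repressor is inactive unless the corepressor is present."""
    # Tryptophan (corepressor) binds the repressor and activates it.
    repressor_active = tryptophan_present
    return not repressor_active


for present in (False, True):
    print(f"lactose={present}: lac operon on = {lac_operon_on(present)}")
    print(f"tryptophan={present}: trp operon on = {trp_operon_on(present)}")
```

In both functions the operon is off exactly when the repressor is active, which is what makes both cases negative control; they differ only in whether the small molecule inactivates or activates the repressor.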

[Figure: the trp operon]
(b) Tryptophan present, repressor active, operon off: tryptophan (corepressor) binds the repressor protein and activates it; no RNA made

Genome of prokaryotes
Large, circular, double-stranded DNA
Usually < 5 Mbp
May contain plasmids
Environment-specific genes are carried on plasmids and other types of mobile genetic elements

Genomes of prokaryotes
The protein-coding regions of bacterial genomes:
- Do not contain introns
- Are partially organized into operons: genes located alongside one another are transcribed into a single mRNA molecule, under common transcriptional control


Genomes of prokaryotes
In bacteria, the genes of many operons code for proteins with related functions For instance, successive genes in the trp operon of E. coli code for proteins that catalyze successive steps in the biosynthesis of tryptophan

Genome of Escherichia coli


Genome of Escherichia coli


Contains 4,639,221 bp in a single circular DNA molecule, with no plasmids
Relatively gene dense
Genes coding for proteins or structural RNAs occupy ~89% of the sequence
Average size of an ORF is 317 amino acids
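The gene-density figure can be roughly cross-checked from the other numbers in these slides. The sketch below multiplies the 4288 protein-coding genes listed later in this topic by the average ORF size and compares the result with the genome length; it ignores stop codons, overlapping genes, and structural RNA genes, so it is only an approximation.

```python
GENOME_BP = 4_639_221          # E. coli genome size from the slide
PROTEIN_CODING_GENES = 4288    # gene count listed later in these slides
AVERAGE_ORF_AA = 317           # average ORF size in amino acids
BP_PER_CODON = 3

# Approximate number of base pairs occupied by protein-coding sequence.
coding_bp = PROTEIN_CODING_GENES * AVERAGE_ORF_AA * BP_PER_CODON
coding_fraction = coding_bp / GENOME_BP

print(f"Protein-coding base pairs: {coding_bp:,}")
print(f"Fraction of genome: {coding_fraction:.1%}")
```

The estimate comes to about 88% of the genome, in line with the ~89% quoted once structural RNA genes are included.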

Genome of Escherichia coli


Most transcription units contain only one gene, but E. coli also has operons, in which a set of genes is grouped in one place
The E. coli genome is estimated to contain 2584 operons
Operons vary in size, although few contain more than five genes
Genes within operons tend to have related functions

Genome of Escherichia coli

The largest class of proteins is the enzymes: approximately 30% of total genes
Many enzymatic functions are shared by more than one protein; such proteins have arisen by duplication, or differ in specificity, regulation, or intracellular location

Genome of Escherichia coli


4288 protein-coding genes
122 structural RNA genes
Non-coding repeat sequences
Regulatory elements
Transposable elements
Prophage remnants
Patches of unusual composition, likely to be foreign elements introduced by horizontal gene transfer