
WEHI Postgraduate Seminar Series 2003

Tools for Maximising the Value of Genomic Data

Keith Satterley, Bioinformatics,
The Walter & Eliza Hall Institute of Medical Research
2 June 2003
keith@wehi.edu.au
http://bioinf.wehi.edu.au/resources/presentations.html

Overview
1. Genomic data: what is it, where is it
2. Gene Finding
   1. GenScan
3. Comparative Genomics
   1. Gene Finding
      1. Slam
      2. Twinscan
   2. Finding Regulatory Regions
      1. rVista
      2. Consite
      3. Toucan
4. Programming Tools
   1. Languages
      1. Perl
      2. BioPerl
      3. BioJava
      4. Bio???
   2. Slipper: a Perl program & results
5. Link References
6. Acknowledgements

1953 to 2003

http://www.geneticscongress2003.com/index.php

Genomic data
Whole genome data sets. According to
http://www.ebi.ac.uk/genomes/ as at 28-May-03

Archaea 16
Bacteria 107
Organelles 308
Phages 112
Plasmids 280
Viroids 40
Viruses 880

TOTAL: 1743

Eukaryota (completed chromosomes)
http://www.ebi.ac.uk/genomes/

Description | Chromosomes | Proteins
Anopheles gambiae: Ensembl project data | 2L 2R 3L 3R X |
MUSTARD: Arabidopsis thaliana complete genome | I II III IV V | I II III IV V; Proteome pages
WORM: Caenorhabditis elegans complete genome | I II III IV V X | FastA; Proteome pages
FLY: Drosophila melanogaster complete genome | X, 2-4, Y | FastA; Proteome pages
Encephalitozoon cuniculi complete genome | I II III IV V VI VII VIII IX X XI | I II III IV V VI VII VIII IX X XI
HUMAN: Homo sapiens complete genome: Ensembl project data | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y | Proteome pages
Homo sapiens complete genome parts: CON files | 14 21q |
Leishmania major | |
MOUSE: Mus musculus complete genome: Ensembl project data | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 X | Proteome pages
Oryza sativa | |
Plasmodium falciparum | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
RAT: Rattus norvegicus complete genome: Ensembl project data | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 X |
YEAST: Saccharomyces cerevisiae strain S288C complete genome | I-XVI | FastA; Proteome pages
YEAST: Schizosaccharomyces pombe strain 972h- complete genome | I-III | I II III; Proteome pages
Trypanosoma brucei | |

Note: one row of the table carries the annotation "source MIPS".

http://gnn.tigr.org/sequenced_genomes/genome_guide_p1.shtml

gnn.tigr.org

GOLD Genomes Online Database

http://www.genomesonline.org/

It's a fact:
Counting at 1 base per second, 24 hours a day,
it would take you about
95 years to count the DNA in one cell.
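The counting claim can be checked with a few lines of arithmetic. The sketch below is only illustrative: the genome size of about 3 billion bases is an assumption (the slide does not state it), and a year is taken as 365.25 days. Under those assumptions the genome takes about 95 years to count, and 100 trillion items take a little over 3.1 million years.

```python
# Back-of-envelope check of the "1 base per second" counting claims.
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60


def years_to_count(items, per_second=1.0):
    """Years needed to count `items` things at `per_second` items per second."""
    return items / per_second / SECONDS_PER_YEAR


GENOME_BASES = 3_000_000_000      # assumption: ~3 billion bases, not from the slide
HUNDRED_TRILLION = 100 * 10**12   # the "100 trillion" figure quoted on the slide

genome_years = years_to_count(GENOME_BASES)
trillion_years = years_to_count(HUNDRED_TRILLION)

print(f"Genome: about {genome_years:.0f} years")
print(f"100 trillion: about {trillion_years / 1e6:.2f} million years")
```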

Francis Collins
Director, National Human Genome Research Institute
25 April 2003
"Here in the very month of the 50th anniversary of the
discovery of DNA's double helix, I am pleased and honored,
perhaps I should say exhilarated, to declare the goals
of the Human Genome Project to be completed."

3.1 million years to count to 100 trillion!

"...the information that will matter to you about your life is a fraction of your genetic
code, probably less than 1 percent." J. Craig Venter, 25-04-2003 (Bio-IT World)
http://www.genomesonline.org/

Most Recent Genomics News


BETHESDA, Md., May 20, 2003
By June, researchers from the Whitehead/MIT Center and the
Genome Sequencing Center at Washington University School
of Medicine expect to complete the sequencing work
(approximately four-fold coverage) necessary to create an
initial working draft of the genome of the chimpanzee (Pan
troglodytes).
The Whitehead/MIT team expects to complete a high-quality
draft of the dog genome sequence within the next 12 months.
After the genome of the boxer is sequenced, researchers plan
to sample and analyze DNA from 10 to 20 other dog breeds,
including the beagle, to study genetic variation within the
canine species.
http://www.genome.gov/11007358

Gene Finding
Gene finding is about detecting coding regions and
inferring gene structure.
Gene finding is difficult:
- DNA sequence signals have low information content (degenerate and highly unspecific).
- It is difficult to discriminate real signals.
- Sequencing errors.
Prokaryotes: high gene density and simple gene structure; short genes carry little information; overlapping genes.
Eukaryotes: low gene density and complex gene structure; alternative splicing; pseudogenes.
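To make "detecting coding regions" concrete for the simple prokaryotic case, here is a minimal open reading frame (ORF) scan in Python. It is only a sketch of the most basic signal-based idea, not how any of the gene finders in this talk work: it looks at the forward strand only, treats ATG as the only start codon, and the `min_codons` cutoff is a hypothetical parameter chosen for illustration.

```python
# Naive forward-strand ORF scan: ATG ... stop codon, in each of three frames.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, frame) tuples; end is exclusive and includes the stop codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i  # first in-frame ATG opens a candidate ORF
            elif start is not None and codon in STOPS:
                # keep the ORF only if it has enough codons before the stop
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


example = "CCATGAAATTTGGGTAACC"
print(find_orfs(example))
```

Short genes are easily lost with a cutoff like this, which is one reason the slide notes that short genes carry little information.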

Gene Finding
A good gene finding review has been
prepared by Lorenzo Cerutti of the Swiss
Institute of Bioinformatics. It is an EMBnet
course (September 2002) entitled "Gene
Finding".
It is at:
http://www.ch.embnet.org/CoursEMBnet/Pages02/slides/gene_finding.pdf

Gene Finders

GenScan - Uses generalized hidden Markov models to predict complete gene structure.
http://genes.mit.edu/GENSCAN.html

MZEF - Designed to predict only internal coding exons.
http://www.cshl.org/genefinder

FGENES - Uses linear discriminant analysis.
http://genomic.sanger.ac.uk/gf/gf.shtml

GeneFinder
http://www.cshl.org/genefinder

GRAIL 1, 1a, 2
http://compbio.ornl.gov

HMMgene - Designed to predict complete gene structure.
http://genome.cbs.dtu.dk/services/HMMgene

Genewise - Uses HMMs. Genewise is part of the Wise2 package.
http://www.sanger.ac.uk/Software/Wise2

Procrustes - Predicts gene structure from homology found in proteins.
http://hto-13.usc.edu/software/procrustes/index.html

GeneMark.hmm - Recently modified to predict gene structure in eukaryotes.
http://opal.biology.gatech.edu/GeneMark

Geneid - Recently updated to a new and faster version.
http://www1.imim.es/geneid.html

Gene Finders
1. Overall performance is best for HMMgene and GENSCAN.
2. The accuracy of some programs depends on G+C content. The exceptions are
HMMgene and GENSCAN, which use different parameter sets for different
G+C contents.
3. For almost all the tested programs, medium exons (70-200 nucleotides
long) are most accurately predicted. Accuracy decreases for shorter and
longer exons, except for HMMgene.
4. Internal exons are much more likely to be correctly predicted than initial or
terminal exons (a weakness of start/stop codon detection).
5. Initial and terminal exons are most likely to be missed completely.
6. Only HMMgene and GENSCAN have reliable scores for exon prediction.

Gene prediction limits

1. Existing predictors are for protein coding regions
2. Non-coding areas are not detected (5' and 3' UTR)
3. Non-coding RNA genes are missed
4. Predictions are for typical genes
5. Partial genes are often missed
6. Training sets may be biased
7. Atypical genes use other grammars

GenScan
GENSCAN was developed by Chris Burge and Samuel Karlin, Department of Mathematics, Stanford University.
Genscan is a general probabilistic model of the gene structure of human genomic sequences.
Genscan identifies complete exon/intron structures
of genes in both strands of genomic DNA.
The new Genscan Web Server is at
http://genes.mit.edu/GENSCAN.html
Genscan is also available for WEHI people, with a greater choice of options, at
http://www.wehi.edu.au/resources/PBC/index.html
Reference: Burge, C. and Karlin, S. (1997) Prediction of Complete Gene Structures in Human Genomic DNA. J. Mol. Biol. 268, 78-94.
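The idea behind hidden Markov model gene finders can be illustrated with a toy example. The Python sketch below is emphatically not GENSCAN's model: it is a two-state HMM ("noncoding" and "coding") with invented start, transition and emission probabilities, decoded with the standard Viterbi algorithm in log space. A real generalized HMM has many more states (exons, introns, splice signals) and explicit length distributions.

```python
import math

# Toy two-state HMM; every probability below is invented for illustration.
STATES = ("noncoding", "coding")
START_P = {"noncoding": 0.5, "coding": 0.5}
TRANS_P = {
    "noncoding": {"noncoding": 0.9, "coding": 0.1},
    "coding": {"noncoding": 0.1, "coding": 0.9},
}
EMIT_P = {
    "noncoding": {"A": 0.35, "C": 0.15, "G": 0.15, "T": 0.35},  # AT-rich
    "coding": {"A": 0.15, "C": 0.35, "G": 0.35, "T": 0.15},     # GC-rich
}


def viterbi(seq):
    """Most probable state path for a non-empty DNA string under the toy model."""
    scores = [{s: math.log(START_P[s]) + math.log(EMIT_P[s][seq[0]]) for s in STATES}]
    back = []
    for base in seq[1:]:
        prev = scores[-1]
        col, ptr = {}, {}
        for s in STATES:
            # best predecessor state for arriving in state s at this position
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS_P[p][s]))
            col[s] = prev[best] + math.log(TRANS_P[best][s]) + math.log(EMIT_P[s][base])
            ptr[s] = best
        scores.append(col)
        back.append(ptr)
    # trace back from the best final state
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


path = viterbi("ATATATAT" + "GCGCGCGCGC" + "ATATATAT")
print("".join("C" if s == "coding" else "n" for s in path))
```

With these made-up parameters the GC-rich middle block is labelled coding and the AT-rich flanks noncoding, which shows how compositional differences plus state persistence drive the segmentation.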

Comparative Genomics

Quotes from the 50/50 series of interviews by Bio-IT World
Gene Myers
Professor, Dept. of Electrical Engineering & Computer Sciences,
University of California, Berkeley.

"If you take a sequence and just run a gene prediction program on it, the programs don't
usually do very well. But if you take human
and mouse sequence, and compare them
against each other looking for similar
regions, you get better predictions. And
the more genomes we have, the better it will
get."

Quotes from the 50/50 series of interviews by Bio-IT World
Richard Durbin
Head of Informatics, Wellcome Trust Sanger Institute.

"Looking at the similarity between the human genome and other species is a really powerful way
to get at functional sequences and to allow us to
work on them in different species.
Several groups, including ours, have gene-finding
methods for comparative genomics. This is an
active area where we will see significant advances
in the next few years."

Comparative Genomics

The assumption that underlies comparative genomics is that the two genomes had a common ancestor and that each organism is a combination
of the ancestor and the action of evolution.

Evolution can be broadly thought of as the combination of two processes:
mutational forces that generate random mutations in the genome
sequence, and selection pressures that
1. Eliminate random mutations (negative selection),
2. Have no effect on mutations (neutral selection) or,
3. Increase the frequency of mutant alleles in the population as a result
of a gain in fitness (positive selection).

The combined action of mutation and selection is represented generally by a RATE MATRIX of base-pair changes between the two observed genomes.
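A first step toward such a matrix is simply counting which base in one genome is aligned to which base in the other. The Python sketch below does that for a pair of aligned strings and normalises each row to proportions. It is a hedged illustration: the two short "human" and "mouse" strings are invented, and the result is a table of observed substitution proportions, which is a starting point for estimating a rate matrix rather than the rate matrix itself.

```python
from collections import Counter

BASES = "ACGT"


def substitution_counts(aligned_a, aligned_b):
    """Count aligned base pairs (a, b), skipping columns with gaps or ambiguity codes."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    counts = Counter()
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a in BASES and b in BASES:
            counts[(a, b)] += 1
    return counts


def row_normalise(counts):
    """Proportion of each base b in genome 2, given base a in genome 1."""
    matrix = {}
    for a in BASES:
        total = sum(counts[(a, b)] for b in BASES)
        matrix[a] = {b: (counts[(a, b)] / total if total else 0.0) for b in BASES}
    return matrix


human = "ACGTACGT-ACG"  # invented aligned fragments, '-' marks a gap
mouse = "ACGTATGTAACA"
counts = substitution_counts(human, mouse)
matrix = row_normalise(counts)

for a in BASES:
    print(a, " ".join(f"{matrix[a][b]:.2f}" for b in BASES))
```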

Comparative Genomics

[Figure: evolutionary relationship between metazoans that are sequenced, or due for sequencing, including human, mouse, rat and C. elegans. Evolutionary distances are in millions of years.]

Comparative Genomics
Comparative genomics may be defined as the derivation of genomic information following
comparison of the information content of two or
more species' genome sequences.

There is a good article in Nature Reviews Genetics, April 2003, Vol 4 No 4, pp 251-262:
"Comparative genomics: genome-wide analysis in metazoan eukaryotes",
Abel Ureta-Vidal, Laurence Ettwiller &
Ewan Birney, 2003

http://www.nature.com/cgi-taf/DynaPage.taf?file=/nrg/journal/v4/n4/full/nrg1043_fs.html

The similarity is such that human chromosomes can be cut
(schematically at least) into about 150 pieces (only about 100 are
large enough to appear here), then reassembled into a reasonable
approximation of the mouse genome.
http://www.ornl.gov/TechResources/Human_Genome/graphics/slides/ttmousehuman.html

Comparative Genomics
There has been an explosion in the
availability of tools, which may make it
difficult to decide which tool is most
suitable for your research.
Indeed, to interpret these resources, you
must be aware of the differences between
them and between their underlying
assumptions.

Whole Genome Alignments

Kbrowser http://hanuman.math.berkeley.edu/cgibin/kbrowser
A multiple genome browser, currently set up for human, mouse and rat,
based on the MAVID alignments, UCSC genome browser.

Comparative Gene Prediction

SLAM
http://baboon.math.berkeley.edu/~syntenic/slam.html

Example of a comparative gene finder.
Employs a generalised pair hidden Markov model
approach for predicting gene structures within
syntenic genomic sequences.
Performs gene finding and alignment of the
sequences simultaneously.

SLAM
SLAM has been used for whole genome annotation projects.
For the Mouse/Human analysis, SLAM used a human/mouse synteny map,
giving segments which were further broken up into 300kb pieces.
These pieces were aligned by AVID.
SLAM then ran on all syntenic pieces using the AVID alignments as guides.
Predictions with coding lengths < 120 were discarded.
SLAM also predicted conserved non-coding sequences (CNS), the first de novo
prediction of CNS in the human and mouse genomes.
The results are available at
http://bio.math.berkeley.edu/slam/mouse/
A similar result is available for Human/Rat.
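Two of the bookkeeping steps described above, cutting syntenic segments into 300kb pieces and discarding short predictions, are easy to sketch in Python. This is not SLAM's code: the function names and the dictionary layout are hypothetical, and the unit of the 120 cutoff is not stated on the slide, so the sketch simply compares numbers.

```python
def split_into_pieces(start, end, piece_size=300_000):
    """Split the half-open interval [start, end) into consecutive pieces of at most piece_size."""
    return [(s, min(s + piece_size, end)) for s in range(start, end, piece_size)]


def keep_long_predictions(predictions, min_coding_length=120):
    """Drop predictions whose coding length is below the cutoff."""
    return [p for p in predictions if p["coding_length"] >= min_coding_length]


# A hypothetical 750 kb syntenic segment becomes three pieces.
pieces = split_into_pieces(0, 750_000)
print(pieces)

# Hypothetical predictions: only the longer one survives the filter.
predictions = [
    {"id": "g1", "coding_length": 90},
    {"id": "g2", "coding_length": 737},
]
kept = keep_long_predictions(predictions)
print([p["id"] for p in kept])
```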

seq1 SLAM CDS 2421 2478 . + 2 gene_id "000001"; transcript_id "000001.1"; frame "1"; exontype "internal"
seq1 SLAM CDS 3127 3805 . + 1 gene_id "000001"; transcript_id "000001.1"; frame "1"; exontype "internal"
------------------------------------------------------------
seq2 SLAM CDS 2134 2191 . + 2 gene_id "000001"; transcript_id "000001.1"; frame "2"; exontype "internal"
seq2 SLAM CDS 2867 3545 . + 1 gene_id "000001"; transcript_id "000001.1"; frame "2"; exontype "internal"
------------------------------------------------------------
> Protein 1: (244,244) aa (incomplete protein)
Y
Z
...

  1 KCEAIASDCF LSGNVDIELK DHNNCISKIN VEDQKNCALS WAFASIYHLE 50
     CE IAS CF LSGNVDIE K D ++C S I   E+Q NC LS W F S  HLE
  1 TCERIASSCF LSGNVDIEWK DKSSCFSSIE TEEQGNCNLS WLFTSKTHLE 50

http://baboon.math.berkeley.edu/~syntenic/slam.html
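Output in this GFF-like layout is straightforward to post-process. The Python sketch below parses the two seq1 lines shown above into dictionaries and sums the coding length. It assumes the nine whitespace-separated columns seen in the example (real GFF files are tab-delimited), and the key name `frame_column` for the eighth column is my own label, not SLAM's.

```python
import re

# key "value" pairs in the attribute column, e.g. gene_id "000001"
ATTRIBUTE = re.compile(r'(\w+) "([^"]*)"')


def parse_line(line):
    """Parse one GFF-like prediction line into a dictionary."""
    fields = line.strip().split(None, 8)
    if len(fields) != 9:
        raise ValueError("expected 9 columns, got %d" % len(fields))
    seqname, source, feature, start, end, score, strand, frame_column, attrs = fields
    return {
        "seqname": seqname,
        "source": source,
        "feature": feature,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "frame_column": frame_column,
        "attributes": dict(ATTRIBUTE.findall(attrs)),
    }


lines = [
    'seq1 SLAM CDS 2421 2478 . + 2 gene_id "000001"; transcript_id "000001.1"; frame "1"; exontype "internal"',
    'seq1 SLAM CDS 3127 3805 . + 1 gene_id "000001"; transcript_id "000001.1"; frame "1"; exontype "internal"',
]
exons = [parse_line(line) for line in lines]

# Coordinates are inclusive, so each exon length is end - start + 1.
coding_length = sum(e["end"] - e["start"] + 1 for e in exons)
print(coding_length, exons[0]["attributes"]["gene_id"])
```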

TwinScan
One of the first gene predictors to substantially
exceed the performance of GENSCAN on a
genomic scale by using mouse-human
comparison was TWINSCAN (Korf et al. 2001).

http://genes.cs.wustl.edu/query.html

Other Comparative Gene Predictors
DoubleScan - http://www.sanger.ac.uk/cgibin/doublescan/submit
A program for comparative ab initio prediction of
protein coding genes in mouse and human DNA.
Generates exon candidates in both sequences.
SGP-1....http://soft.ice.mpg.de/sgp-1
SGP-1 is a similarity-based gene prediction
program. Given two genomic DNA sequences, it
post-processes the pairwise local alignment to
predict single or multiple gene models of protein
coding genes in forward and reverse strands.

Regulatory Sequence
Leroy Hood brought out this point in
his talk at the Bio2001 meeting in San
Diego (24-28 June 2001) with his statement
that
"The difference between man and
monkey is gene regulation."

Quotes from the 50/50 series of interviews by Bio-IT World
Lincoln Stein
Associate Professor, Cold Spring Harbor Laboratory.

"I think the places that we should be looking at
now are the non-repetitive, unique, non-coding DNA. If they are conserved, they
must be important. There are discoveries in
there."

Finding regulatory regions

rVISTA....http://teapot.jgi-psf.org/ovcharen/rvista/index.html
Consite....http://forkhead.cgb.ki.se/cgi-bin/consite
Footprinter....http://abstract.cs.washington.edu/~blanchem/FootPrinterWeb/FootprinterInput.pl
Toucan....http://www.esat.kuleuven.ac.be/~saerts/software/toucan.php/
Trafac....http://trafac.chmcc.org/trafac/index.jsp

VISTA is a set of tools for comparative genomics. It was designed to visualize
long sequence alignments of DNA from two or more species with annotation
information.
AVID: the alignment engine behind VISTA. AVID is a program for globally
aligning DNA sequences of arbitrary length.
mVISTA (main VISTA): a program for visualizing alignments of an
arbitrary number of genomic sequences from different species.
rVISTA (regulatory VISTA): combines a transcription factor binding site
database search with a comparative sequence analysis.
http://teapot.jgi-psf.org/ovcharen/rvista/index.html

rVista
http://teapot.jgi-psf.org/ovcharen/rvista/index.html

A program that combines transcription factor
binding site (TFBS) searches with
comparative sequence analysis.
1. Human and mouse sequences are aligned using the global
alignment program MAVID.
2. Potential transcription factor binding sites are predicted
by the Match program, based on the TRANSFAC Professional 5.3 library.
3. The human-mouse sequence conservation of a DNA
region spanning each transcription factor binding site is assessed using a
novel strategy.
4. Human and/or mouse annotation determines the genomic location of each
predicted transcription factor hit.
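The overall shape of this pipeline (align, predict sites, keep only the conserved ones) can be sketched in a few lines of Python. This is a simplified stand-in, not rVISTA's method: the alignment is given as two invented pre-aligned strings, site prediction is an exact string match instead of Match with TRANSFAC matrices, and conservation is plain percent identity over the site plus a small flank. The `flank` and `min_identity` parameters are hypothetical.

```python
def find_sites(aligned_seq, motif):
    """Start columns where motif occurs exactly in the aligned sequence (gap-free hits only)."""
    hits = []
    pos = aligned_seq.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = aligned_seq.find(motif, pos + 1)
    return hits


def identity(aligned_a, aligned_b, start, end):
    """Fraction of identical, non-gap columns in [start, end)."""
    cols = list(zip(aligned_a[start:end], aligned_b[start:end]))
    same = sum(1 for a, b in cols if a == b and a != "-")
    return same / len(cols) if cols else 0.0


def conserved_sites(human, mouse, motif, flank=2, min_identity=0.8):
    """Motif hits in the human sequence whose surrounding window is conserved in mouse."""
    kept = []
    for start in find_sites(human, motif):
        lo = max(0, start - flank)
        hi = min(len(human), start + len(motif) + flank)
        score = identity(human, mouse, lo, hi)
        if score >= min_identity:
            kept.append((start, round(score, 2)))
    return kept


# Invented aligned fragments: the motif occurs twice in human, but only the
# first occurrence lies in a region that is conserved in mouse.
human = "GGTGACTCAGGACCTTGACTCATT"
mouse = "GGTGACTCAGCTAAGCATGGCATT"
print(conserved_sites(human, mouse, "TGACTCA"))
```

The second hit is dropped because its window is poorly conserved, which is the filtering effect that makes comparative analysis reduce false-positive binding site predictions.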

ConSite
http://forkhead.cgb.ki.se/cgi-bin/consite

"Identification of conserved regulatory elements by comparative genome analysis"
Boris Lenhard, Albin Sandelin, Luis Mendoza, Pär Engström,
Niclas Jareborg and Wyeth W Wasserman
Journal of Biology (BioMed Central, Open Access)

ConSite - Identification of conserved regulatory elements by comparative genome analysis

ConSite is a web-based tool for detecting
transcription factor binding sites in
genomic sequences using phylogenetic
footprinting.
Two orthologous genomic sequences are
aligned, and transcription factor binding
sites are only reported for those regions in
the alignment which exceed a certain
threshold of conservation.
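A conservation threshold of this kind is usually applied to a sliding-window identity profile along the alignment. The Python sketch below shows one simple way to compute such a profile and merge the windows that pass a cutoff. It is an illustration only: the window size, the 70% cutoff and the two short sequences are invented, and ConSite's own scoring may differ.

```python
def window_identity(aligned_a, aligned_b, window=10):
    """Percent identity for every window of `window` consecutive alignment columns."""
    matches = [1 if a == b and a != "-" else 0 for a, b in zip(aligned_a, aligned_b)]
    n = len(matches)
    return [100.0 * sum(matches[i:i + window]) / window for i in range(n - window + 1)]


def conserved_regions(profile, window, cutoff=70.0):
    """Merge windows at or above the cutoff into (start, end) column ranges, end exclusive."""
    regions = []
    for i, pct in enumerate(profile):
        if pct < cutoff:
            continue
        start, end = i, i + window
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], end)  # extend the previous region
        else:
            regions.append((start, end))
    return regions


# Invented alignment: first ten columns identical, last ten completely different.
seq_a = "ACGTACGTAC" + "TTTTTTTTTT"
seq_b = "ACGTACGTAC" + "GGGGGGGGGG"
profile = window_identity(seq_a, seq_b, window=5)
regions = conserved_regions(profile, window=5)
print(regions)
```

Binding site predictions would then be reported only if they fall inside one of the returned regions. Note that a region can extend slightly past the last identical column, because a window that is mostly conserved still passes the cutoff.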

ConSite
The method is implemented as a graphical web
application, ConSite, which is at:
http://forkhead.cgb.ki.se/cgi-bin/consite or
http://www.phylofoot.org/
Various tools are made available at phylofoot.org.

[Screenshots: ConSite at http://www.phylofoot.org/, including the Sequence View]

Toucan
http://www.esat.kuleuven.ac.be/~saerts/software/toucan.php

Toucan is a workbench for regulatory sequence analysis on metazoan genomes:
comparative genomics, detection of significant
transcription factor binding sites, and detection
of cis-regulatory modules (combinations of
binding sites) in sets of coexpressed/coregulated
genes.
It is a standalone Java application that is tightly linked
with Ensembl, and was built using the BioJava
package.

Perl: A Programming Language

What is Perl?
Perl stands for
Practical Extraction and Report Language,
and was invented by Larry Wall.
Perl is supported by its users and was written
entirely by volunteers.

Programming Tools

Perl
Perl is remarkably good for slicing, dicing,
twisting, wringing, smoothing,
summarizing and otherwise mangling text!
Perl's powerful regular expression
matching and string manipulation
operators simplify this job in a way that is
unequalled by any other modern
language.

Perl & Genome Data

"Although genome informatics groups are
constantly tinkering with other "high level"
languages such as Python, Tcl and recently Java,
nothing comes close to Perl's popularity.
In short, when the genome project was
foundering in a sea of incompatible data formats,
rapidly-changing techniques, and monolithic data
analysis programs that were already antiquated
on the day of their release, Perl saved the day."

Lincoln Stein.

Perl One-Liners!
Take a BLAST output and print all of the
gi's (GenBank Identifiers) matched, one per
line.
Solution: one line of Perl.
perl -pe 'next unless ($_) = /^>gi\|(\d+)/;$_.="\n"' filename.blast
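For readers more at home in Python, the same extraction can be written with an equivalent regular expression. This is a sketch for comparison rather than a replacement for the one-liner, and the three sample lines are invented stand-ins for BLAST output.

```python
import re

# Same pattern as the Perl one-liner: a hit line starting with ">gi|" followed by digits.
GI_LINE = re.compile(r"^>gi\|(\d+)")


def extract_gis(lines):
    """Return the gi number from every line that starts with '>gi|<digits>'."""
    return [m.group(1) for m in map(GI_LINE.match, lines) if m]


sample = [  # invented lines in the style of BLAST output
    ">gi|1234567|gb|AAA00000.1| hypothetical protein",
    "          Length = 244",
    ">gi|7654321|emb|CAA00000.1| another protein",
]
for gi in extract_gis(sample):
    print(gi)
```

To run it over a real file, pass an open file handle instead of the list, for example `extract_gis(open("filename.blast"))`.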

Perl Modules/Programs
Perl can be used for complex programs.
The RepeatMasker program is written in Perl. It
calls other programs written in other
languages (Crossmatch, written in C).
Slipper is a 4500 line program written in Perl. It
calls RepeatMasker and Primer3 repeatedly and
processes the output files from them, writing
summarised results to disk.

SLiPPER
Sequence Length Polymorphism and Primer FindER

Programming: Keith Satterley, Specifications: Grant Morahan


Division of Bioinformatics & Genetics,
The Walter & Eliza Hall Institute of Medical Research

Slipper

Masks Alu etc. repeats (using RepeatMasker);
Selects SSLRs with user-specified parameters;
Designs primers (using Primer3).
Grant Morahan selects and tests chosen SSLRs
to become Microsatellite Markers on the Mouse
Genome.
Aim: to derive a first generation systematic map of the
mouse, with sub-centiMorgan (1Mb) resolution,
then extend to 10 times this density over 50 strains.
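The repeat selection step can be pictured with a small regular-expression scan for tandem repeats. The Python sketch below is not Slipper's algorithm (Slipper is written in Perl and its selection rules are not given on the slides); the parameters `unit_min`, `unit_max` and `min_copies` are hypothetical examples of the kind of user-specified settings mentioned above, and the test sequence is invented.

```python
import re


def find_microsatellites(seq, unit_min=2, unit_max=4, min_copies=5):
    """Return (start, end, unit, copies) for perfect tandem repeats; end is exclusive."""
    # A short unit (tried shortest first) followed by at least min_copies - 1 more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (unit_min, unit_max, min_copies - 1)
    )
    found = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        copies = (m.end() - m.start()) // len(unit)
        found.append((m.start(), m.end(), unit, copies))
    return found


# Invented sequence with a (CA)8 dimer repeat and a (GATA)5 tetramer repeat.
seq = "GGATT" + "CA" * 8 + "TTGAC" + "GATA" * 5 + "CC"
for start, end, unit, copies in find_microsatellites(seq):
    print(start, end, unit, copies)
```

In a pipeline like the one described, the flanking sequence of each reported repeat would then be handed to a primer design program.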

UTILITY OF SLIPPER
[Figure: number of SSRs (0 to 40; dimer repeats and multimer repeats) against position on chromosome (bp, 0 to 1,400,000). Legend: O* = polymorphic; O* B6 = NZO.]

Possible SLIPPER Data Analysis

STRAIN RELATEDNESS AND EVOLUTION
- graphic depiction of allele sharing between strains
- probability of IBD v allele convergence by mutation
- comparison of close strain relatedness
(e.g. B6 v B10; B6 v Ka; D1 v D2; NOD v NOR)
- overall strain relatedness > cladogram
- pairwise strain dimorphism rate,
useful for choosing 2 strains to be used in a cross
- comparison of results for reduced strain set with MIT markers
- comparison of haplotypes with Phenome database
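The pairwise strain dimorphism rate listed above can be computed directly from a table of allele sizes. The Python sketch below assumes one reading of the term: the fraction of markers, typed in both strains, at which the two strains carry different alleles. The marker names and allele sizes are invented for illustration.

```python
from itertools import combinations


def dimorphism_rate(alleles_a, alleles_b):
    """Fraction of markers typed in both strains at which the allele sizes differ."""
    shared = [
        m for m in alleles_a
        if m in alleles_b and alleles_a[m] is not None and alleles_b[m] is not None
    ]
    if not shared:
        return 0.0
    return sum(alleles_a[m] != alleles_b[m] for m in shared) / len(shared)


def pairwise_rates(genotypes):
    """Dimorphism rate for every unordered pair of strains."""
    return {
        (a, b): dimorphism_rate(genotypes[a], genotypes[b])
        for a, b in combinations(sorted(genotypes), 2)
    }


genotypes = {  # invented allele sizes (bp) at four hypothetical markers
    "B6": {"m1": 120, "m2": 98, "m3": 150, "m4": 210},
    "B10": {"m1": 120, "m2": 98, "m3": 150, "m4": 212},
    "NOD": {"m1": 124, "m2": 102, "m3": 150, "m4": 206},
}
rates = pairwise_rates(genotypes)
for pair, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(pair, rate)
```

Pairs with a high rate are the informative ones for a cross, since more markers will distinguish the two parental strains.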

O|B|F - Open Bioinformatics Foundation
The Open Bioinformatics Foundation is a non-profit,
volunteer-run organization focused on supporting open
source programming in bioinformatics.
The foundation grew out of the volunteer projects
Bioperl, BioJava and Biopython.
Its activities:
- Underwriting and supporting the BOSC conferences.
- Organizing and supporting developer-centric "hackathon" events.
- Managing servers, bank account & other assets.

Open Bioinformatics Foundation


PROJECTS
BioPerl
BioJava
BioPython
BioRuby
BioPipe
BioSQL / OBDA
MOBY
DAS
BioPathways
EMBOSS

Open Bioinformatics Foundation

June 27-28 2003 - 4th Annual Bioinformatics Open Source Conference
www.open-bio.org/bosc/

ISMB 2003 - Brisbane

Normally held in Europe and North America.
For 2 days beforehand: BOSC (Open Source conference),
Biopathways, BioOntology, Text Mining & WEB03.
Tutorials on Sunday: choose 2 from 15 offered.
ISMB for 4 days: over 50 talks, none in parallel!
http://www.iscb.org/ismb2003/index.shtml

The bioperl project


Officially organized in 1995 and existing
informally for several years prior, The
Bioperl Project is an international
association of developers of open source
Perl tools for bioinformatics, genomics and
life science research.

What is BioPerl?
Bioperl is a toolkit of Perl modules useful in
building bioinformatics solutions in Perl.
It is built in an object-oriented manner.
The collection of modules can be used to
run a large range of bioinformatics
programs and process their output files.
There are modules to carry out analyses,
to graph data and to read many data
formats.

BioJava
http://www.biojava.org/
The BioJava Project is an open-source
project dedicated to providing Java tools for
processing biological data.
BioJava is a general bioinformatics toolkit. It
provides a framework for building everything
from simple scripts to complete applications.
BioJava is designed to be used as a library.

BioJava
http://www.biojava.org/

Currently, there are objects for:

Sequences and features
- IO
- Processing, storing, manipulating
- Visualising

Dynamic programming
- Single-sequence and pair-wise HMMs
- Viterbi-path, Forward and Backward algorithms
- Training models
- Sampling sequences from models

External file formats and programs
- GFF
- Blast
- Meme

Sequence Databases
- BioCorba interoperability
- ACeDB client
- DAS client

Other Open Source Projects

BioDAS - Distributed Annotation System
(DAS): a server system for the sharing of
Reference Sequences.
Biopython - tools for computational
molecular biology in Python (an excellent
language for beginners, yet superb for
experts).
BioRuby, BioSQL, MOBY, BioPathways
and BioOpera.

LINKS

Internet Resources
Prediction of exons and gene structure
SLAM....http://baboon.math.berkeley.edu/~syntenic/slam.html
SGP-1....http://soft.ice.mpg.de/sgp-1
TwinScan....http://genes.cs.wustl.edu
Finding regulatory regions by phylogenetic footprinting
Consite....http://forkhead.cgb.ki.se/cgi-bin/consite
rVISTA....http://teapot.jgi-psf.org/ovcharen/rvista/index.html
Toucan....http://www.esat.kuleuven.ac.be/~saerts/software/toucan.php
Whole-genome alignments in genome browser
ECR browser....http://nemo.lbl.gov/ecrBrowser
Ensembl....http://www.ensembl.org
UCSC....http://genome.ucsc.edu
A comprehensive, straightforward Links Page, one of the best!

http://apollo11.isto.unibo.it/

Genome Links from Ewan Birney et al.

Genome aligners
AVID....http://www-gsd.lbl.gov/vista/details_avid.htm
BLASTZ....http://bio.cse.psu.edu
BLAT....http://genome.ucsc.edu
Exonerate....http://www.ensembl.org/Docs/wiki/html/EnsemblDocs/Exonerate.html
GLASS....http://crossspecies.lcs.mit.edu
LAGAN/MLAGAN....http://lagan.stanford.edu
MegaBLAST....http://www.ncbi.nih.gov/blast/tracemb.html
MUMmer....http://www.tigr.org/software/mummer
PatternHunter....http://www.bioinformaticssolutions.com/products/ph.php
WABA....http://www.cse.ucsc.edu/~kent/xenoAli/index.html

Prediction of exons or coding regions


DIALIGN2....http://bibiserv.techfak.uni-bielefeld.de/dialign
ExoFish....http://www.genoscope.cns.fr/proxy/cgi-bin/exofish.cgi
OrthoSeq....http://www.phylofoot.org/cgi-bin/orthoseq.cgi
ROSETTA/GLASS....http://crossspecies.lcs.mit.edu

Prediction of exons and gene structure


DoubleScan....http://www.sanger.ac.uk/Software/analysis/doublescan
SLAM....http://baboon.math.berkeley.edu/~syntenic/slam.html
SGP-1....http://soft.ice.mpg.de/sgp-1
TwinScan....http://genes.cs.wustl.edu

Genomics Web sites

Functional and Comparative Genomics Research - More technical information
on HGP involvement with comparative and functional genomics.
Virtual Library of Genetics - Links to genetic and genomic information
organized by organism.
Microbial Genome Program - U.S. Department of Energy program to study the
genetic material of microbes that may be useful in helping DOE fulfill its
missions.
DOE Joint Genome Institute - Consortium of U.S. Department of Energy
researchers developing and exploiting new technologies as a means for
discovering and characterizing the basic principles and relationships
underlying living systems.
A Quick Guide to Sequenced Genomes - Illustrated index of organisms that
have had their genomes sequenced. From the Genome News Network.
Model Organisms for Biomedical Research - Information on model organisms
from the National Institutes of Health.
Mouse Genome Resources - Gateway to mouse resources in and beyond
National Center for Biotechnology Information (NCBI) resources.
Functional Genomics - Gateway to functional genomics sources from Science.
Ecce homology: A primer on comparative genomics - From Modern Drug
Discovery, a publication of the American Chemical Society.

Image Gallery links

http://www.ornl.gov/TechResources/Human_Genome/education/images.html

Gallery 1: Genome Science


Gallery 2: Genome Tools and Technologies
Gallery 3: Genomes to Life
Gallery 4: Human Genome Project
Gallery 5: Ethical, Legal, and Social Issues;
Genomic Medicine

Other Website Image Galleries and
Resources
NIH NHGRI Press Photos
CSHL Eugenics Archive
RasMol Protein Gallery
Photos of normal and abnormal chromosomes
Access Excellence Graphics Gallery
The Why Files Cool Image Gallery
Genetics Animation Gallery

Molecular Expressions Photo Gallery
Gene Maps
1999 Online Gene Map from NCBI.
Clickable 1996 Gene Map from Science magazine.
You can click on any one of the 24 different human
chromosomes and see examples of genes found.

Chromosome Maps
human chromosome 16
human chromosome 19

U.S. Government Image Galleries


Argonne National Laboratory Photo Gallery
Brookhaven National Laboratory Image Library
Fermi National Laboratory Photo Database
Jefferson Laboratory Picture Exchange
Lawrence Berkeley National Laboratory Image Gallery
Lawrence Livermore National Laboratory Image Gallery
Los Alamos National Laboratory Photo Gallery
National Human Genome Research Institute Image Gallery
National Renewable Energy Laboratory Photo Library
Oak Ridge National Laboratory Image Gallery
Pacific Northwest National Laboratory Photo Gallery
Stanford Linear Accelerator Center Photo Archives
Sandia National Laboratory Photo Gallery
U.S. DOE Image Gallery

Acknowledgements
WEHI Bioinformatics group
Tim Beissbarth
Alex Gout
Terry Speed
All the others in Bioinformatics who provide a
great environment to work in and with.

Grant Morahan
WEHI ITS - who provide the best infrastructure
of anywhere I know of.

"Year by year we are becoming better equipped to
accomplish the things we are striving for.
But what are we actually striving for?"
- Bertrand de Jouvenel, 1903-1987

"Success is the ability to go
from failure to failure without
losing your enthusiasm."
- Winston Churchill, 1874-1965