December 1, 1999
Scientists Complete First Chapter of Book of Life With Decoding of First Human Chromosome
An
international team of researchers has achieved a scientific milestone
by unraveling for the first time the genetic code of an entire human
chromosome.
Reported
in this week's issue of Nature (Dec. 2), researchers at the Sanger
Centre near Cambridge, England; University of Oklahoma, Norman,
OK; Washington University, St. Louis, MO; and Keio University in
Japan have succeeded in deciphering the sequence of the 33.5 million
"letters," or chemical components, that make up the DNA of chromosome
22.
This
sequence includes the longest continuous stretch of DNA ever deciphered
and assembled. It is over 23 million letters in length.
Each
human gene is made up of a series of chemical building blocks represented
by letters, A (adenine), T (thymine), G (guanine) and C (cytosine).
The number and order of these letters, also called bases, determine
what we are, how we look, and the diseases to which we may be predisposed.
The chromosome 22 team has deduced the text of one chapter of the
human genetic instruction book.
The
next mammoth task is to determine what it all means. Sequencing
and mapping efforts have already revealed that chromosome 22 is
implicated in the workings of the immune system, congenital heart
disease, schizophrenia, mental retardation, birth defects, and several
cancers including leukemia. But the scientific team agrees that
many more secrets are to be discovered in this decoded text.
The
sequencing of chromosome 22 permits scientists for the first time
to view the entire DNA of a chromosome.
"This
is the first time that we have been able to see the organization
of a chromosome at the base pair level," said Dr. Ian Dunham, senior
research fellow at the Sanger Centre and leader of the research
team that deciphered chromosome 22. "This immediately suggests new
experiments and avenues of research which can be pursued."
"To
see the entire sequence of a human chromosome for the first time
is like seeing an ocean liner emerge out of the fog, when all you've
ever seen before were rowboats," said Dr. Francis Collins, director
of the National Human Genome Research Institute of the National
Institutes of Health, which supported the U.S. contribution to the
sequencing of chromosome 22.
University
of Oklahoma scientist Dr. Bruce Roe, one of the researchers who
deciphered the sequence of chromosome 22, added, "It's incredible.
For the first time we can stand back and view a picture of all the
structures and other features of a human chromosome, to see how
a chromosome is organized. Now we can begin to understand where
genes are located on chromosomes, how they express themselves, how
deletions that give rise to disease-causing mutations occur, and
how chromosomes are duplicated and inherited."
Chromosome
22 is the first of the 23 human chromosome pairs to be deciphered. It
was chosen because of its relatively small size, its association with
several diseases, and the groundwork laid by several scientists beginning
in the early 1990s.
Because
protein-coding genes do not seem to occur on the short arm of chromosome
22, the scientists focused on the chromosome's long arm, which is
richer in genes than other human chromosomes. Ninety-seven
percent of this arm was sequenced.
The
sequence contains 11 gaps, areas that could not be deciphered
with current technology. The location and size of the gaps were
determined. The 33.5 million bases of sequenced DNA are of extremely
high quality, with an error rate of less than one in 50,000 bases.
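To put the quoted accuracy in concrete terms, the short Python sketch below multiplies out the two figures given above. The function and constant names are illustrative only and do not come from the research team.

```python
# Figures quoted in the release.
SEQUENCED_BASES = 33_500_000   # bases of chromosome 22 sequenced
BASES_PER_ERROR = 50_000       # "less than one [error] in 50,000 bases"

def expected_error_bound(bases: int, bases_per_error: int) -> float:
    """Upper bound on the expected number of miscalled bases."""
    return bases / bases_per_error

bound = expected_error_bound(SEQUENCED_BASES, BASES_PER_ERROR)
print(f"Fewer than about {bound:.0f} miscalled bases expected in the sequence")
```

In other words, the stated error rate implies fewer than roughly 670 miscalled bases across the whole sequenced region.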
The
sequence reveals the following about the landscape of chromosome
22:
- At least
545 genes and 134 pseudogenes (genes that once functioned
but no longer do) were detected on the chromosome, with 200
to 300 additional genes likely. If representative of other chromosomes,
this count suggests that the total number of genes on all human
chromosomes will not be substantially more or less than the
previously estimated number of 80,000.
- The genes
range in size from 1,000 to 583,000 bases of DNA, with a mean
size of about 19,000 bases. A total of 39 percent of the chromosome
is copied into RNA (exons and introns), while only 3 percent
of the chromosome encodes protein.
- A total of
247 genes were revealed by computer analyses to be identical
to previously identified human genes or protein sequences. Computer
analysis of the chromosome 22 sequence found 150 additional
genes with DNA sequence similarity to known genes. An additional
148 predicted genes containing sequence homologous to known
genetic markers (ESTs) were identified.
- Several gene
families appear to have arisen by tandem duplication. There
are families of genes that are interspersed among other genes
and distributed over large chromosomal regions.
- The chromosome
shows unexpected long-range complexity, with an elaborate
array of repeat sequences near its centromere.
The existence of so much repetitive DNA could help
explain how this chromosome rearranges or reshuffles its DNA,
leading to human disorders such as DiGeorge syndrome, which
includes a form of mental retardation, and how chromosome structure
changes over time.
- Unexpectedly,
recombination is increased in several regions and suppressed
in others. These differences will probably play
a role in health and disease.
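The extrapolation from the chromosome 22 gene count to the whole genome can be checked with simple proportional scaling. The Python sketch below is only a rough illustration: the whole-genome size of about 3 billion bases is an assumption that is not stated in this release, and because the long arm of chromosome 22 is described as gene-rich, scaling linearly by length is at best an approximate check.

```python
CHR22_BASES = 33_500_000       # sequenced bases, from the release
GENES_FOUND = 545              # genes detected, from the release
EXTRA_GENES_MAX = 300          # upper end of "200 to 300 additional ones"
GENOME_BASES = 3_000_000_000   # ASSUMPTION: approximate human genome size

def extrapolate(genes_on_chr22: int) -> int:
    """Scale a chromosome 22 gene count to the whole genome by length."""
    return round(genes_on_chr22 / CHR22_BASES * GENOME_BASES)

low = extrapolate(GENES_FOUND)
high = extrapolate(GENES_FOUND + EXTRA_GENES_MAX)
print(f"Naive whole-genome estimate: about {low:,} to {high:,} genes")
```

Under these assumptions the naive estimate runs from roughly 49,000 to 76,000 genes, the same order of magnitude as the previously estimated 80,000.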
Comparing
the chromosome 22 sequence to known gene sequences of the mouse,
a lab animal frequently used to facilitate understanding of human
genetic disorders, the research team found 160 human genes that
have comparable sequences in the mouse. Examining the chromosomal
locations of the mouse genes that have counterparts on the human
chromosome 22 shows that the order of the genes along the chromosome
in the two species is genetically conserved, although the mouse
homologs of human genes on chromosome 22 are dispersed to eight
different mouse chromosomal regions.
The
sequencing of the DNA of chromosome 22 was conducted as part of
the international Human Genome Project, which involves scientists
in the U.S., England, Japan, France, Germany and China.
In
deciphering chromosome 22, scientists used the approach that has
been developed and widely tested by the Human Genome Project. This
approach involves sequencing overlapping cloned segments of DNA
from known locations on the chromosome.
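As a toy illustration of the idea, the Python sketch below joins short overlapping segments that are already ordered by their position on the chromosome. It is a simplified teaching example, not the software used by the Human Genome Project; the function names and the sample sequences are invented for illustration.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0

def assemble(ordered_clones: list, min_overlap: int = 3) -> str:
    """Merge clones, already ordered by map position, into one sequence."""
    if not ordered_clones:
        return ""
    contig = ordered_clones[0]
    for clone in ordered_clones[1:]:
        size = overlap_length(contig, clone, min_overlap)
        if size == 0:
            # Neighbors that share no sequence would leave a gap.
            raise ValueError("adjacent clones do not overlap")
        contig += clone[size:]  # append only the part not already covered
    return contig

# Hypothetical clone sequences; each overlaps its neighbor by four bases.
clones = ["ATGCGTAC", "GTACCTGA", "CTGATTGC"]
print(assemble(clones))
```

Because each segment's location is known in advance, only neighboring segments need to be compared, which is what keeps this kind of assembly tractable. Neighbors that share no sequence raise an error here, loosely mirroring the gaps described in this release.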
Until
now, scientists were uncertain about whether an entire human chromosome
could be sequenced in this manner. For example, they did not know
whether insurmountable problems would prevent assembling their sequencing
data. The presence of a small number of unclonable gaps was not
unexpected, but the scientists carrying out this project adhered
to the agreed-upon standard that a chromosome should not be considered
"essentially complete" until the sequence of the regions that are clonable
and sequenceable with current technology has been determined to
high accuracy, and the sizes of any remaining gaps have been determined.
"That
chromosome 22 was essentially sequenced by using overlapping clones
increases our confidence that the Human Genome Project will be able
to complete a 'working draft' of the DNA sequence of the human genome
in spring 2000 and finish it by 2003," said Dr. Richard Wilson,
co-director of the Genome Sequencing Center at Washington University
School of Medicine in St. Louis and member of the research team
that deciphered chromosome 22.
The
results of the Human Genome Project, which are freely accessible
through public databases such as GenBank
(www.ncbi.nlm.nih.gov/genome/seq), give scientists insight into
the way genes are arranged along a strip of DNA and pave the way
for major advances in the diagnosis and treatment of disease.
Knowing
the identity and order of the chemical components of the DNA of
the 23 pairs of chromosomes that are found in almost every human
cell provides a tool to determine the basis of health and disease.
"The fact that all of this information is now freely available for
scientists to use, without the constraints of patents and fees,
is of major importance, if the knowledge of our genetic make-up
is to be used for the good of mankind," said Dr. Michael Morgan,
chief executive of the Wellcome Trust Genome Campus, which is home
to the Sanger Centre.
For
more information contact:
Cathy
Yarbrough, NHGRI
Tel: 301-594-0954 or
e-mail at cyarbrou@mail.nih.gov