

INTERNATIONAL RICE GENOME SEQUENCING PROJECT



Vision and Goals:

Fundamental plant biological information from a model plant: Because rice is both a member of the Gramineae and a crop plant, a wealth of fundamental information about important aspects of plant biology can be learned from its genomic sequence. Rice is a model for learning about yield, hybrid vigor, and single-gene and multigenic disease resistance. Different races of rice are adapted to a wide variety of environmental situations, from tropical flooding to temperate dry land, so it is a model for real-life adaptive responses. Because its genome is collinear with those of the other grasses, rice is a key to knowledge of their genomic organization. Comparison of the sequence of the dicot Arabidopsis thaliana with that of rice, a model monocot, will tell us what genomic structures these two different groups of angiosperms have in common and how they differ.

While the goals of the International Rice Genome Project must be focused, the information provided by the International Project can be exploited by the entire research community to learn:

The function and map location of cereal and ultimately all plant genes.
How to use map-based sequence information to identify and provide markers for agronomically significant genes.
The molecular basis of plant growth and development so that fundamental questions in plant physiology, biochemistry, cell biology, and pathology can be addressed.
The relationship, if any, of genome structure to gene expression.


The primary goal is the complete genome sequence of rice.

The primary activity in the first year will be to prepare and distribute clones for sequencing. During this period, it is anticipated that the libraries will be quality controlled and that the clones will be end sequenced and fingerprinted. Subsequent years will be devoted to large-scale genomic sequencing. The objective is to complete the task in ten years.

The time line below for the first five years indicates that more than 170 Mb of the 430 Mb genome will be sequenced by 2003, that chromosomes 6 and 10 will have been completed, and that the sequencing of chromosomes 1 and 2 will be well underway.


Five Year Plan: 1998-2003.

The purpose of the international collaboration is to accelerate the completion of this goal.

The International Collaboration is best achieved by sharing materials and technologies and by the timely release of sequence and related information. To this end, scientists interested in the genomic sequencing of rice participated in a workshop held in conjunction with the International Symposium on Plant Molecular Biology in Singapore on September 23, 1997 (ftp://genome1.bio.bnl.gov/pub/maize/rice.html). A Working Group, nominated in Singapore, met on February 5, 1998 to develop this document.


Membership in the Rice Genome Project:

Any group willing to sequence large stretches of contiguous genomic DNA is welcome to join the collaborative effort as long as it is willing to follow the agreed-upon guidelines. Participants agree to share materials, including libraries, and to the timely release to public databases of physical mapping information and annotated DNA sequences. A group must agree to sequence one megabase of DNA per year to maintain membership. Members agree to declare their sequencing plans and to provide detailed plans and progress on their respective web pages.

Individual sequencing groups are encouraged to claim large chromosomal regions or entire chromosomes, if they have the sequencing capacity, to increase the likelihood that entire chromosomes are completed. Groups may claim chromosomal regions which they agree to sequence within one to three years.

Post-sequencing activities, such as functional genomics, are beyond the scope of the International Rice Sequencing Project. Further, the Project does not encompass the cloning and sequencing of specific rice genes for research purposes or industrial sequencing efforts. While the International Project will be happy to share information with these individual efforts, their conduct is beyond the scope of these agreements.


The Rice Genome Working Group:

The Working Group is the body that will make decisions that pertain to the goals, strategies, and coordination of the collaborative effort. The Working Group will be responsible for planning the most efficient means of completing the project. Among its responsibilities will be assigning regions to be sequenced that will avoid duplication and maximize overall progress.

The Working Group comprises representatives of each research group participating in the International Rice Genome Sequencing Project. As Japan is recognized as having a leadership role in the Project, the head of the RGP will be the permanent chairman of the Working Group.

Major policy decisions, including sequencing assignments, will be taken by representatives from each of the major national groups participating in the Project. Currently, these representatives come from Japan, China, Korea, Europe, and the U.S.

The Working Group will meet annually in Japan. Interim meetings, as needed, may be held elsewhere. The meetings will be open to the public. Results of Working Group meetings will be posted on web sites and published in the RICE GENOME.


Methodology

The Oryza sativa ssp. japonica cultivar, Nipponbare, also known as GA3, will be sequenced. Seed from a single plant will be distributed by Dr. Sasaki for the purpose of making libraries. The primary reasons for choosing this cultivar are that more than 20,000 EST sequences from the strain have been released to DDBJ and that a physical map based on YACs that covers over 50% of the genome has been published. Sequencing other cultivars is strongly discouraged, as genetic polymorphisms cannot be distinguished from sequencing errors. Moreover, groups not sequencing from one of the shared libraries would not benefit from the associated accumulated knowledge and the other advantages of collaboration. It is recognized that comparative mapping and sequencing of other rice subspecies yields valuable information that the International Rice Genome Sequencing Project would like to share. Nevertheless, the primary goal of the Project is the complete sequence of the genome of a single cultivar.

The RGP will make a PAC library with 20-fold genome coverage. Dr. Rod Wing will make three BAC libraries using partial digests of different enzymes to generate the inserts. 60,000 BAC clones will be isolated to provide 20-fold coverage of the genome. The quality of these libraries and their coverage will be verified by hybridizing each with 100 single-copy EST probes, and the number of clones and their insert sizes will be measured. It is expected that inserts will be greater than 120 kb. The number of clones with organellar DNA and rRNA repeats will also be determined.
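The coverage figures above follow from simple arithmetic: fold coverage is the number of clones multiplied by the average insert size, divided by the genome size. The following is a minimal sketch, assuming the 430 Mb genome size quoted earlier in this document; the function names and the average insert sizes used in the example are illustrative assumptions, not values fixed by the Project.

```python
import math

GENOME_SIZE_BP = 430_000_000  # 430 Mb, the genome size quoted in this document


def library_coverage(num_clones, mean_insert_bp, genome_bp=GENOME_SIZE_BP):
    """Fold coverage = total cloned bases / genome size."""
    return num_clones * mean_insert_bp / genome_bp


def clones_needed(fold, mean_insert_bp, genome_bp=GENOME_SIZE_BP):
    """Smallest whole number of clones that reaches the requested fold coverage."""
    return math.ceil(fold * genome_bp / mean_insert_bp)


if __name__ == "__main__":
    # Hypothetical average insert size of exactly 120 kb (the stated lower bound).
    print(round(library_coverage(60_000, 120_000), 1))
    print(clones_needed(20, 120_000))
```

Under these assumptions, 60,000 clones with inserts of exactly 120 kb give somewhat less than 20-fold coverage; reaching 20-fold with 60,000 clones would require an average insert of roughly 143 kb, which is consistent with the expectation that inserts exceed 120 kb.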

The BACs will be fingerprinted for the purposes of preparing contigs and checking the integrity (deletions or rearrangements) of the clones. The information generated will also be invaluable where repeated sequences make BAC end sequences ambiguous. In addition, where there is multi-fold coverage, the assembly program can pick out inserts that have deletions or rearrangements. Fingerprinting information will be publicly available so that individual laboratories can verify the quality of the contigs they plan to sequence.

The RGP plans to increase the number of currently mapped ESTs to 8,000 in order to physically map their PAC clones. These mapped ESTs are an unmatched resource in preparing a physical map as they provide sequence, map location, and direction.

In parallel with fingerprinting, the BAC and PAC clones will be subjected to end-sequencing. This should provide an STS every 3 to 4 kb on average and will allow genome sequencers to pick the clones with minimum overlap.
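The 3 to 4 kb figure can be checked with a short calculation. Each clone contributes two end sequences, so the average spacing depends only on the fold coverage and the average insert size. This is a rough sketch that assumes clone ends fall uniformly along the genome and that every end read succeeds; the function name is hypothetical.

```python
def mean_sts_spacing_bp(fold_coverage, mean_insert_bp):
    """Average distance between clone-end sequences along the genome.

    N clones give 2N end sequences. With N = fold_coverage * genome / insert,
    the genome size cancels out:
        spacing = genome / (2N) = insert / (2 * fold_coverage)
    """
    return mean_insert_bp / (2 * fold_coverage)


if __name__ == "__main__":
    for insert_bp in (120_000, 160_000):
        print(insert_bp, mean_sts_spacing_bp(20, insert_bp))
```

At 20-fold coverage, average inserts of 120 kb to 160 kb correspond to one end sequence every 3 kb to 4 kb.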


Accuracy:

The Rice Genome Sequencing Project will serve as a model for all other grasses and is expected to cost about $200M. The sequence will be used by other researchers and will thus be scrutinized. It is imperative that these resources not be squandered on inaccurate results. In part, this problem has been addressed by insisting on sequencing DNA from the same cultivar, if not the same plant, to minimize variation due to genetic polymorphism.

Fingerprinting of multiply overlapping inserts is a means of verifying that the BACs chosen for sequencing have not been rearranged. Collinearity with the genome should also be verified by probing restriction enzyme digests of genomic or the appropriate YAC DNA with the BAC and comparing this with digests of the BAC itself.

The Rice Genome Sequencing Project will adopt the standard of the Human Genome Project, agreed at its Bermuda meetings in 1996 and 1997, of less than one error in 10,000 bp. While this level of accuracy is difficult to verify, the standard is typically achieved by a combination of high-quality shotgun sequence reads, appropriate redundancy, and the insistence that at least 97% of all bases be sequenced on both strands or on multiple templates using two chemistries. Minimum error estimation values of 40 provided by PHRAP are consistent with this level of accuracy. Less than 1% of the sequence may have a PHRAP value below 40, and regions that fall below this value should be indicated. Further, restriction sites predicted from the sequence must conform to observed digest patterns.
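The link between a quality value of 40 and one error in 10,000 bp can be made explicit. The sketch below assumes the PHRAP values are on the usual Phred scale, where Q = -10 log10(P) and P is the estimated per-base error probability. The function names and the list-of-integers input are illustrative assumptions, not part of any PHRAP output format.

```python
from typing import Sequence


def error_probability(quality: float) -> float:
    """Convert a Phred-scaled quality value to an estimated error probability."""
    return 10 ** (-quality / 10)


def fraction_below(qualities: Sequence[int], threshold: int = 40) -> float:
    """Fraction of bases whose quality value falls below the threshold."""
    if not qualities:
        raise ValueError("no quality values given")
    return sum(q < threshold for q in qualities) / len(qualities)


def meets_standard(qualities: Sequence[int], threshold: int = 40,
                   max_fraction: float = 0.01) -> bool:
    """True if less than max_fraction of the bases fall below the threshold."""
    return fraction_below(qualities, threshold) < max_fraction


if __name__ == "__main__":
    print(error_probability(40))             # one error in 10,000 bases
    example = [50] * 995 + [30] * 5          # hypothetical per-base values
    print(fraction_below(example), meets_standard(example))
```

A real check would also report the positions of the low-quality bases, since the standard asks that such regions be indicated.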



Sequence Release:

The Rice Genome Sequencing Project agrees to the immediate release of finished, but not necessarily annotated, sequence in units of intact BAC or PAC inserts. These finished sequences will conform to the accuracy standards described above. Release means submission to a public database such as DDBJ, EMBL, or GenBank. In keeping with the NHGRI recommendations, automated release of assemblies greater than 2 kb to local Web sites is encouraged.



Annotation

Members of the Working Group, while recognizing the importance of annotation to the value of sequence information, view annotation as separate from release of finished sequence. Each sequencing group is responsible for annotating the sequence it contributes. A uniform standard of annotation has been agreed upon that checks the integrity of the sequence, assigns and identifies regions of homology, delineates potential open reading frames, and names and indicates the beginnings and ends of genes. Common annotation software will be adopted. The annotator must state whether coding sequences and splice sites were determined experimentally or by using software. It is recognized that the use of published cDNA sequences greatly facilitates this task.

A contig will be considered complete when no more clones can be found from available resources.

If gaps cannot be closed, the method of sizing and the reasons for not closing must be stated.

Exact details on how adjacent BACs or PACs were assembled with a minimum overlap of 100 bp should also be stated.
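As an illustration of the 100 bp minimum overlap, the sketch below measures the exact overlap between the end of one finished clone sequence and the start of the next. It is a simplified assumption-laden example: it accepts only perfect matches, whereas real assembly software tolerates mismatches, and the function names are hypothetical.

```python
def suffix_prefix_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    # Try the longest candidate first and shrink until a match is found.
    for length in range(max_len, 0, -1):
        if left.endswith(right[:length]):
            return length
    return 0


def has_minimum_overlap(left: str, right: str, minimum: int = 100) -> bool:
    """True if the two sequences share an exact end overlap of at least `minimum` bp."""
    return suffix_prefix_overlap(left, right) >= minimum


if __name__ == "__main__":
    shared = "ACGTTGCA" * 15                 # 120 bp shared region (made up)
    clone_a = "G" * 50 + shared
    clone_b = shared + "C" * 50
    print(suffix_prefix_overlap(clone_a, clone_b))
    print(has_minimum_overlap(clone_a, clone_b))
```

The simple loop is quadratic in the worst case, which is acceptable for a demonstration but not for whole BAC-sized sequences.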

It is hoped that annotation will be expanded to include recognition of genetic markers, ESTs, known genes, and syntenic regions. An annotation workshop is projected for the Working Group meetings.


Rice Genome Database

An integrated database established in Japan will facilitate collaboration, coordinate sequencing work, and provide methods for submitting, using, and sharing information. Sequences will be released to one of the public databases, DDBJ, EMBL, or GenBank. The Rice Genome Database will pick up new submissions from the public databases. The Database will store and manage the annotation information. Each participant will maintain a Web site with a standardized format that describes work in progress and sequences completed. The Database will be linked with the Web sites of each of the Project's participating laboratories and thus be able to maintain a registry of clones being sequenced, monitor progress, and coordinate activities. The Database will also be linked with sites that are providing fingerprinting information and end sequences. With ever-expanding databases, annotation is never complete. It may be advisable to assign the task of periodically updating the annotation of rice genomic sequence to the Rice Genome Database.

The larger goals for the Project envision the use of sequence information to provide biological lessons for rice and other cereals. The Rice Genome Database is a means for linking all genomic information related to rice DNA sequence. This information comes from existing genomic databases and from work that derives from DNA sequencing, such as determination of gene function. The Rice Genome Database will thus be linked with other rice and cereal databases and to international groups that will be learning about the function of rice and other cereal genes.



Outreach

To be successful, this large sequencing effort needs the broad support of scientists working on rice and other cereals who will be the potential end users of the sequence information. Ultimately, it is the public at large who supports the project, and steps toward public education should be undertaken. The public must believe that the project is worthwhile, well organized, and credible. There are a number of ways that the Rice Genome Sequencing Project will attempt to engender this support:

Timely release of finished, annotated sequence blocks as well as the availability of mapped BACs and YACs.
ORYZA will report the results from the Working Group meetings as well as news of the Project.
Internet access to The Rice Genome Database will engender awareness and utility of the Project.
Publications from participating sequencing laboratories should acknowledge that they are part of the Project.

The IRGSP welcomes the participation in its activities of all scientists who can contribute to its goals.


Last modified on March 4, 1999 by B. Burr.