Source: BAYLOR COLLEGE OF MEDICINE submitted to
COMPREHENSIVE IDENTIFICATION AND MAPPING AND CHARACTERIZATION OF HESSIAN FLY GENES USING A INNOVATIVE WHOLE GENOME SEQUENCING APPROACH
PROJECT DIRECTOR: Richards, S.
PERFORMING ORGANIZATION
(N/A)
BAYLOR COLLEGE OF MEDICINE
HOUSTON, TX 77030
NON TECHNICAL SUMMARY:
The Hessian fly is a major pest of US wheat crops, and the world's most important wheat pest. Researchers, including many in the US funded by the USDA, are trying to find better ways to control this pest and reduce the damage done to wheat crop yields. Both basic and applied research into this pest often requires genetic information, specifically about gene and protein structure. Examples include identification of pesticide resistance genes; protein sequences of pesticide target proteins, to allow better pesticide design; and bacterial expression of pesticide targets, allowing interactions between a pesticide and its target to be studied and better understood. Identifying single genes and proteins of interest is expensive and time-consuming when conducted one gene at a time. In this proposal we will rapidly and very inexpensively identify, characterize and map every gene in the genome to speed research into this important pest species.
We are applying new massively parallel sequencing technologies to dramatically reduce the cost of sequencing projects of this size, from tens of millions of dollars in the late 1990s, to about 5 million dollars around 2003, to $400,000 in this proposal. In other species, the availability of the "toolkit" of genes and proteins that make up an organism has dramatically accelerated research progress and results. For example, laboratories can now study the entire set of ligand-gated ion channels (a target of several major pesticides) with the confidence that they are not missing any, and with the full protein sequence of each gene. Until now the high cost of sequencing made this global approach uneconomic for species with small communities of researchers; at the new lower costs, it is research on an insect species without a whole genome sequence, and a full description of the gene and protein sets, that is uneconomical. Whilst there is always a delay between the acquisition of primary basic data and actual results in the field, we have no doubt that the data produced by this proposal will dramatically speed the efforts of Hessian fly researchers to reduce the damage caused by this important pest.
OBJECTIVES:
We will identify, characterize and map the vast majority of genes of the wheat pest Mayetiola destructor, the Hessian fly.
1. Generate raw sequence data representing 12-fold coverage of the Hessian fly genome with 19 runs (21 attempted, allowing for a 10% failure rate) of the GS-FLX genome sequencer (454 Inc.), each run generating 100Mb of sequence in 250bp reads.
2. Generate 32X clone coverage paired-end data with 3kb and 10kb insert sizes using the 454 GS-20. This paired-end data will be used in the assembly process to determine the order and orientation of the majority of contigs in the assembled sequence.
3. Assemble 2Gb of raw 454 GS-FLX sequence reads and paired-end data into sequence scaffolds of ordered and oriented contigs, followed by placement on the existing physical map.
4. Generate ~1,200,000 EST sequences from a variety of Hessian fly tissues, to provide an extensive transcribed sequence data set to drive automated gene identification and annotation.
5. Produce an automated annotation of the assembled Hessian fly genome sequence based on EST data and protein homologies, using the BCM-HGSC import of the Ensembl gene annotation pipeline, and other gene prediction programs including NCBI Gnomon.
6. Deposit data in public databases and on the BCM-HGSC website; establish database collaborations with FlyBase and the KSU Arthropod Genomics Center.
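The sequencing targets in objectives 1 to 3 can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the genome size is merely the value implied by the stated run yield and 12-fold coverage (not a measured figure), and the paired-end calculation assumes, purely for illustration, that the full 32X clone coverage came from the 3kb library alone, which the objectives do not state.

```python
import math

# Figures quoted in objectives 1-3.
SUCCESSFUL_RUNS = 19          # GS-FLX runs needed
ATTEMPTED_RUNS = 21           # runs budgeted
FAILURE_RATE = 0.10           # allowed run failure rate
BASES_PER_RUN = 100_000_000   # 100Mb per run
READ_LENGTH_BP = 250          # bases per read
FOLD_COVERAGE = 12            # target sequence coverage


def total_bases(runs, bases_per_run):
    """Raw sequence yield from a number of successful runs."""
    return runs * bases_per_run


def implied_genome_size(raw_bases, fold_coverage):
    """Genome size implied by a raw yield and a fold-coverage target."""
    return raw_bases / fold_coverage


def expected_successful_runs(attempted, failure_rate):
    """Expected number of usable runs given a per-run failure rate."""
    return attempted * (1 - failure_rate)


def pairs_for_clone_coverage(target_fold, genome_bp, insert_bp):
    """Read pairs needed so that insert spans cover the genome target_fold times.

    Clone coverage is taken here as (pairs * insert size) / genome size.
    """
    return math.ceil(target_fold * genome_bp / insert_bp)


raw = total_bases(SUCCESSFUL_RUNS, BASES_PER_RUN)         # 1.9Gb, i.e. the ~2Gb of objective 3
genome = implied_genome_size(raw, FOLD_COVERAGE)          # roughly 158Mb (implied, not measured)
reads_per_run = BASES_PER_RUN // READ_LENGTH_BP           # 400,000 reads per run
usable_runs = expected_successful_runs(ATTEMPTED_RUNS, FAILURE_RATE)  # about 18.9, i.e. ~19

# Illustration only: assumes all 32X clone coverage comes from 3kb inserts.
pairs_3kb = pairs_for_clone_coverage(32, genome, 3_000)

print(f"raw yield: {raw / 1e9:.1f} Gb")
print(f"implied genome size: {genome / 1e6:.1f} Mb")
print(f"reads per run: {reads_per_run:,}")
print(f"expected usable runs: {usable_runs:.1f}")
print(f"3kb pairs for 32X clone coverage: {pairs_3kb:,}")
```

Under these figures the 19 successful runs supply about 1.9Gb, which matches the "2Gb of raw 454 GS-FLX sequence" named in objective 3, and 21 attempted runs at a 10% failure rate give about 19 usable runs.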
APPROACH:
We will generate 12-fold random sequence coverage of the Hessian fly genome using a pyrosequencing technology platform from 454. Additionally, paired end sequence data and transcription data will be generated. This random sequence will be assembled using the Atlas assembly suite of software tools developed at the Baylor College of Medicine Human Genome Sequencing Center into a draft genome sequence. Gene sequences will be annotated automatically using existing annotation software pipelines with reference to extensive transcription sequence data also generated by this project. All results will be placed in multiple publicly accessible data repositories.
CRIS NUMBER: 0212900
SUBFILE: CRIS
PROJECT NUMBER: TEXR-2007-04624
SPONSOR AGENCY: NIFA
PROJECT TYPE: NRI COMPETITIVE GRANT
PROJECT STATUS: TERMINATED
MULTI-STATE PROJECT NUMBER: (N/A)
START DATE: Feb 1, 2008
TERMINATION DATE: Jan 31, 2011
GRANT PROGRAM: ENTOMOLOGY/NEMATOLOGY
GRANT PROGRAM AREA: Plant Systems
CLASSIFICATION HEADINGS
KA211 - Insects, Mites, and Other Arthropods Affecting Plants
S3110 - Insects
F1130 - Entomology and acarology
G4.2 - Reduce Number and Severity of Pest and Disease Outbreaks
RESEARCH EFFORT CATEGORIES
BASIC: 100%
APPLIED: (N/A)%
DEVELOPMENTAL: (N/A)%
KEYWORDS: hessian fly; wheat pest; genome sequence; pyrosequencing; massively parallel; insect genomics; gene annotation
PROGRESS: Feb 1, 2009 TO Jan 31, 2010
OUTPUTS: Progress report. Delays are due to a decision to wait for the 454 Titanium platform and to assembly difficulties. We have completed all sequence generation goals. The original objectives are listed here, each followed by our current progress.
1. Generate raw sequence data representing 12-fold coverage of the Hessian fly genome: We generated 23X coverage of the Hessian fly genome on the 454 platform, with an average read length of 323.2 bp, and an additional 12X coverage on the Illumina platform.
2. Generate 32X "clone" coverage paired-end data with 3kb and 10kb insert sizes: We generated 41X clone coverage of the Hessian fly genome in 3kb insert sizes and 338X clone coverage in 20kb insert sizes.
3. Assemble 2Gb of raw 454 GS-FLX sequence reads and paired-end data into sequence scaffolds of ordered and oriented contigs, followed by placement on the existing physical map: A version 0.5 assembly is available on the BCM-HGSC website (link below) with a contig N50 of 9.8kb and a scaffold N50 of 271kb. Unfortunately, a contig N50 of 9.8kb is unlikely to encompass the majority of genes in single contigs. An improved assembly is being prepared; its current statistics include a contig N50 of 14.1kb and a scaffold N50 of 1.06Mb. Whilst these statistics are significantly better, we require further improvement in the contig N50 before accepting a final assembly (the improved scaffold N50 is more than sufficient). Our aim is to increase the contig N50 beyond 20kb, to ensure that most genes are contained within a single contig.
4. Generate ~1,200,000 EST sequences from a variety of Hessian fly tissues: One Illumina paired-end lane with 110bp read length produced 7.15 million clones, each with 220bp of raw sequence data. The vastly increased depth of sequencing on the Illumina platform allows annotation of a higher percentage of Hessian fly genes. We are currently (March 2010) performing additional EST sequencing using RNA from mixed-sex eggs, male and female mid-stage larvae, and mixed-sex late-stage larvae.
5. Produce an automated annotation of the assembled Hessian fly genome sequence: We are waiting on the final assembly and additional EST data to start work on this aim.
6. Deposit data in public databases: We are in the process of placing all raw sequence data into the short read trace archive. The initial version of the assembly is already available to the public via the HGSC website at: http://www.hgsc.bcm.tmc.edu/project-species-i-Hessian_fly.hgscpageLocation=Hessian_fly . As additional assemblies and annotations become available, they will also be made available to the public as described in the original grant.
PARTICIPANTS: Nothing significant to report during this reporting period.
TARGET AUDIENCES: Nothing significant to report during this reporting period.
PROJECT MODIFICATIONS: Nothing significant to report during this reporting period.
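The progress report judges assembly quality by contig and scaffold N50. As a reminder of what that statistic measures, here is a minimal sketch of the usual definition: the length L such that sequences of length L or longer hold at least half of the assembled bases. The function name and the toy contig lengths are hypothetical; only the 9.8kb, 14.1kb and 20kb figures come from the report, and this is not the code used by the Atlas assembler.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sorts lengths from longest to shortest and returns the length at which
    the running total first reaches half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("n50() needs at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (made-up lengths): total 250, half 125.
# Running totals are 100, then 160, so the N50 is 60.
toy_contigs = [100, 60, 40, 30, 20]
toy_n50 = n50(toy_contigs)

# Contig N50 values reported above, compared with the stated 20kb aim.
TARGET_CONTIG_N50_BP = 20_000
reported_contig_n50_bp = {"version 0.5": 9_800, "improved": 14_100}
shortfall_bp = {
    name: TARGET_CONTIG_N50_BP - value
    for name, value in reported_contig_n50_bp.items()
}

print(f"toy N50: {toy_n50}")
for name, gap in shortfall_bp.items():
    print(f"{name}: {gap:,} bp short of the 20kb contig N50 aim")
```

Because N50 is weighted by bases rather than by sequence count, a few long scaffolds can raise the scaffold N50 sharply while the contig N50 moves only modestly, which is consistent with the report treating the two statistics separately.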
IMPACT: 2009-02-01 TO 2010-01-31
Impact: The intermediate Hessian fly assembly has already been used extensively by Dr. Stuart's laboratory and other laboratories to accelerate their research into the Hessian fly. In particular, Dr. Stuart's work to clone susceptibility and resistance genes has been greatly accelerated. We expect this impact to grow as the final assembly and annotation are released and advertised more broadly.
PUBLICATION INFORMATION: 2009-02-01 TO 2010-01-31
No publications reported this period
PROJECT CONTACT INFORMATION
NAME: Stephen Richards
PHONE: 713-798-6667
FAX: 713-798-5741