1. What can I find in the Foxtail Millet Database?
We set up this database to present the entire genome sequence of the foxtail millet accession "zhanggu", assembled from ~64 Gb of paired-end reads (insert sizes from 170 bp to 40 kbp) generated on the Illumina Genome Analyzer. In total, 46.3% of the genome was identified as transposable elements, and a reference set of 38,801 genes was created. We also sequenced the foxtail millet transcriptome, analyzed syntenic and collinear blocks among the grasses, and provide extensive functional annotation of the genome. In addition, we provide genetic markers (SNPs, indels, and SVs) identified by resequencing a second foxtail millet accession, "A2", and by analyzing two F2 populations. All of these features are presented in Map View and are freely available for download. Furthermore, a BLAST web service offers online alignment of query sequences against the foxtail millet genome.
2. How do I use this database?
There are six top-level pages: Home, a short introduction to the foxtail millet project and database; Mapview, a genome browser that shows detailed annotation for each genomic region; Blast, a homology search service against the foxtail millet genome; Download, data access via FTP; Help, equivalent to this FAQ, with answers to the main questions; and Links, a collection of related bioinformatics tools. Each page carries its own help information. If you have any further questions about the data, please contact email@example.com.
3. How did you assemble such a large genome from such short reads?
We assembled the short reads using SOAPdenovo, a genome assembler based on de Bruijn graph theory. First, we assembled the short reads from the small insert-size libraries (<500 bp) into contigs using sequence overlap information alone, breaking contigs at the boundaries of ambiguous connections caused by repeat sequences. Then, the paired-end information was used step by step, from the smallest (170 bp) to the largest (40 kb) insert size, to join the contigs into scaffolds. Finally, we used the paired-end information to retrieve reads and performed local assembly to fill the small gaps inside the scaffolds.
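The contig step described above can be illustrated with a toy de Bruijn graph. This is a minimal sketch, not SOAPdenovo's implementation: the function names (build_de_bruijn, contigs_from_graph), the example reads, and the choice k=4 are assumptions made for illustration. It ignores reverse complements, sequencing errors, coverage, and isolated cycles. Each read is cut into k-mers, each k-mer links its (k-1)-base prefix to its (k-1)-base suffix, and contigs are the unbranched paths, so they stop wherever a node has more than one predecessor or successor (an ambiguous connection).

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Link each (k-1)-mer to the (k-1)-mers that follow it in some read."""
    succ = defaultdict(set)  # node -> set of successor nodes
    pred = defaultdict(set)  # node -> set of predecessor nodes
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            left, right = kmer[:-1], kmer[1:]
            succ[left].add(right)
            pred[right].add(left)
    return succ, pred


def contigs_from_graph(succ, pred):
    """Spell out every unbranched path; paths end at ambiguous nodes."""
    nodes = set(succ) | set(pred)

    def is_simple(node):
        # Exactly one way in and one way out: safe to walk through.
        return len(pred.get(node, ())) == 1 and len(succ.get(node, ())) == 1

    contigs = []
    for node in nodes:
        if is_simple(node):
            continue  # contigs start only at branch points or path ends
        for nxt in succ.get(node, ()):
            path = node + nxt[-1]
            cur = nxt
            while is_simple(cur):
                cur = next(iter(succ[cur]))
                path += cur[-1]
            contigs.append(path)
    return sorted(contigs)


# Three overlapping toy reads with no repeated 3-mer: one contig.
succ, pred = build_de_bruijn(["ATGGCGT", "GGCGTGC", "GTGCAAT"], k=4)
print(contigs_from_graph(succ, pred))  # ['ATGGCGTGCAAT']

# Two reads that diverge after "CGT": the contig is broken at the branch.
succ, pred = build_de_bruijn(["ACGTA", "ACGTC"], k=4)
print(contigs_from_graph(succ, pred))  # ['ACGT', 'CGTA', 'CGTC']
```

In the second example the graph alone cannot say which branch follows "ACGT". Resolving such cases is the role of the paired-end scaffolding and gap-filling steps.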
4. How did you call SNPs, indels, and SVs in the foxtail millet genome?
In addition to the complete sequencing and de novo assembly of the domesticated foxtail millet "zhanggu", we resequenced another accession, "A2", to identify sites of genetic variation between the two accessions, and we analyzed the SNPs, indels, and SVs in two F2 populations.
5. How did you do the comparative genomics analysis?
We carried out the comparative genomics analysis at two levels: the genome level and the gene level. Within each genome, we identified all duplicated segments to infer its evolutionary history, including whole-genome duplications (WGDs), segmental duplications, and transposon events. We also compared related grass genomes to identify homologous chromosomes and collected all collinear blocks to determine which genes were retained or lost. We then related these genomic dynamics to the corresponding functional changes in order to infer the relationship between genotypes and phenotypes.
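As an illustration of what a collinear block is, the sketch below groups homologous gene pairs into runs that keep the same gene order in two genomes. It is a simplified greedy pass under stated assumptions, not the method used for this database: the function name collinear_blocks, the parameters max_gap and min_size, and the anchor coordinates are all hypothetical, and only same-orientation blocks are considered. Each anchor is a pair (gene index in genome A, gene index in genome B). Dedicated synteny tools typically use more robust chaining that tolerates stray anchors.

```python
def collinear_blocks(anchors, max_gap=5, min_size=3):
    """Group homologous gene pairs into same-orientation collinear runs.

    anchors  -- iterable of (index_in_A, index_in_B) gene-order pairs
    max_gap  -- largest allowed step between consecutive anchors (assumed)
    min_size -- smallest number of anchors reported as a block (assumed)
    """
    blocks = []
    current = []
    for a, b in sorted(anchors):
        if current:
            prev_a, prev_b = current[-1]
            # Extend the run only if both genomes advance by a small step.
            if 0 < a - prev_a <= max_gap and 0 < b - prev_b <= max_gap:
                current.append((a, b))
                continue
            if len(current) >= min_size:
                blocks.append(current)
        current = [(a, b)]
    if len(current) >= min_size:
        blocks.append(current)
    return blocks


# Hypothetical anchors: two conserved runs plus one isolated pair.
anchors = [(1, 10), (2, 11), (3, 13),
           (20, 50), (21, 51), (22, 52), (23, 54),
           (40, 5)]
for block in collinear_blocks(anchors):
    print(block[0], "->", block[-1], "size", len(block))
```

Genes that fall inside a block's span in one genome but have no partner in the other are candidates for gene loss, which is the kind of retention-or-loss question the analysis above addresses.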
6. Contact us
Beijing Genomics Institute (BGI) - Shenzhen
Tel: +86 (0) 755 22321495
Address: Beishan Industrial Zone, Beishan Road, Yantian District, Shenzhen, China, 518083.