BH12.12/VariationRDF
From TogoWiki
RDF conversion of genomic variation information
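Goals
Integration of variation information and omics annotation information
Integrate (1) variation analysis results and (2) gene set information through RDF in order to derive new findings. At BH12.12 this is implemented with actinomycetes from the integrated microbial database as the case study. Converting to RDF the variation information (Annovar, GVF) produced by comparing genome references with SRA data in the DDBJ pipeline is also in scope.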
Participants
- Fujisawa
- Kaminuma
Please add yourself
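Work at BH12.12
Generate variation information and SBML for S. lividans TK24 against the reference (the actinomycete Streptomyces coelicolor M145)
Note: for S. lividans 66, see Lewis et al., 2010
- Converted test data, including TogoAnnotation streptomyces data, into triples (done by Fujisawa)
- Presented the proposal to DBCLS and the integrated microbial database members (done at SPARQLthon 12/12/03)
- Shared the data and working environment (done 12/12/19)
(ToDo)
- Output the S. lividans variation information through a SPARQL endpoint
- Build the workflow: S. lividans variation information → RDF → SBML file generation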
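The first step of the variation → RDF → SBML flow listed above is reading variant records. A minimal sketch, assuming the variants arrive as GVF feature lines (nine tab-separated GFF3-style columns with `key=value` attributes); the example line, its coordinates, and its alleles are made up for illustration and are not taken from the S. lividans results. The function name `parse_gvf_line` is hypothetical.

```python
# Hypothetical GVF feature line; coordinates and alleles are illustrative only.
EXAMPLE_LINE = (
    "NC_003888.3\texample\tSNV\t1000\t1000\t.\t+\t.\t"
    "ID=var1;Variant_seq=A;Reference_seq=G"
)


def parse_gvf_line(line):
    """Parse one GVF feature line (nine tab-separated columns) into a dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(cols))
    seqid, source, so_type, start, end, score, strand, phase, attrs = cols

    # The ninth column holds semicolon-separated key=value pairs.
    attributes = {}
    for pair in attrs.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            attributes[key] = value

    return {
        "seqid": seqid,
        "type": so_type,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }


record = parse_gvf_line(EXAMPLE_LINE)
print(record["seqid"], record["start"], record["attributes"]["Variant_seq"])
```

Pragma and comment lines (those starting with `#`) are not handled here and would need to be skipped before calling the parser.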
Surveyed data
Source | Year | Accession | Strain |
---|---|---|---|
REFSEQ | 2002 | NC_003888.3 | Streptomyces coelicolor A3(2) |
SRA | 2008 | SRA002846 SRS001037 | Streptomyces lividans TK24 |
SRA | 2010 | SRA050038 SRS151654 | Streptomyces violaceusniger Tu 4113 |
SRA | 2012 | SRA060310 SRS372023 | Streptomyces sp. HPH0547 [Taxonomy ID: 1203592] |
SRA | --- | SRA089971 SRS439072 | Streptomyces sp. TAA204 |
SRA | --- | SRA090119 SRS440666 | Streptomyces sp. TAA040 |
SRA | --- | SRA090649 SRS441180 | Streptomyces sp. CNT318 |
SRA | --- | SRA090029 SRS439118 | Streptomyces sp. CNQ865 |
SRA | --- | SRA090033 SRS439121 | Streptomyces sp. CNT360 |
SRA | --- | SRA090027 SRS439116 | Streptomyces sp. CNH287 |
SRA | --- | SRA090120 SRS440668 | Streptomyces sp. TAA486 |
SBML file for S. coelicolor.
Information on S. lividans 66
Surveyed ontologies
- ALFRED => polymorphism / allele frequency
- VariO => variation information
- Starpath => pathway + orthology
- SO/FALDO => sequence type / genomic location
- MAO => multiple alignment
- GELO => genomic element ontology (aligning array probes to RefSeq identifiers)
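To show how two of the surveyed ontologies could fit together, here is a rough sketch that writes one single-nucleotide variant as Turtle, typed with an SO term and located with FALDO. It is an illustration under assumptions, not the schema adopted in this project: the `ex:` namespace, the `ex:referenceSeq` and `ex:variantSeq` predicates, the use of identifiers.org IRIs for the reference sequence, and the example values are all hypothetical. The alleles are assumed to be plain nucleotide strings, so no literal escaping is done.

```python
FALDO = "http://biohackathon.org/resource/faldo#"
OBO = "http://purl.obolibrary.org/obo/"
EX = "http://example.org/variation/"        # hypothetical namespace
REFSEQ = "http://identifiers.org/refseq/"   # assumed IRI scheme for the reference


def snv_to_turtle(var_id, seqid, position, ref_allele, alt_allele):
    """Serialize one SNV as Turtle using SO (type) and FALDO (location)."""
    lines = [
        "@prefix faldo: <%s> ." % FALDO,
        "@prefix obo: <%s> ." % OBO,
        "@prefix ex: <%s> ." % EX,
        "",
        # SO_0001483 is the Sequence Ontology term for SNV.
        "ex:%s a obo:SO_0001483 ;" % var_id,
        "    faldo:location [",
        "        a faldo:ExactPosition ;",
        "        faldo:position %d ;" % position,
        "        faldo:reference <%s%s>" % (REFSEQ, seqid),
        "    ] ;",
        # Hypothetical predicates for the two alleles.
        '    ex:referenceSeq "%s" ;' % ref_allele,
        '    ex:variantSeq "%s" .' % alt_allele,
    ]
    return "\n".join(lines) + "\n"


turtle = snv_to_turtle("var1", "NC_003888.3", 1000, "G", "A")
print(turtle)
```

An indel would need a `faldo:Region` with `faldo:begin` and `faldo:end` instead of a single `faldo:ExactPosition`.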
Test environment
- NIG Genome Informatics Laboratory (大量研) server http://semantic.annotation.jp/
- Virtuoso 6.1.5
Results
- Triple dataset (in progress)
1 | triple# | 369,317 |
2 | predicate# | 11 |
S. lividans: alignment / variation detection
- genome coverage 92%, sequence depth 18x - 27,588 SNPs and 779 indels
- SPARQL endpoint (not yet available)
http://semantic.annotation.jp/sparql/
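Once the endpoint is available, the triple and predicate counts reported above could be rechecked with a query along these lines. This is a sketch only: it builds the request URL without sending it, the `format` parameter is a Virtuoso convention rather than part of the SPARQL protocol, and whether the installed Virtuoso version accepts this aggregate syntax is an assumption. The helper name `build_request_url` is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://semantic.annotation.jp/sparql/"

# Count all triples and the number of distinct predicates in the store.
COUNT_QUERY = """
SELECT (COUNT(*) AS ?triples) (COUNT(DISTINCT ?p) AS ?predicates)
WHERE { ?s ?p ?o }
"""


def build_request_url(endpoint, query):
    """Build a SPARQL GET request URL asking for JSON results."""
    params = urlencode({
        "query": query.strip(),
        # Virtuoso-style parameter for choosing the result format (assumption).
        "format": "application/sparql-results+json",
    })
    return endpoint + "?" + params


url = build_request_url(ENDPOINT, COUNT_QUERY)
print(url)
```

The resulting URL can be fetched with any HTTP client, for example `urllib.request.urlopen(url)`.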
Additional data
Source | Year | Accession | Strain |
---|---|---|---|
REFSEQ | 1996 | NC_000911.1 | Synechocystis sp. PCC 6803 Kazusa strain |
SRA | --- | SRA030087 SRS172931 | Synechocystis sp. PCC 6803 Shestakov strain |
DRA | --- | DRA000401 DRS000705 | Synechocystis sp. PCC 6803 GT-I strain |
DRA | --- | DRA000401 DRS000706 | Synechocystis sp. PCC 6803 PCC-N strain |
DRA | --- | DRA000401 DRS000707 | Synechocystis sp. PCC 6803 PCC-P strain |
SRA | --- | SRA056276 SRS------ | Nostoc(Anabaena) sp. PCC7120 |
SRA | 2009 | SRA008127 SRS002043 | Planktothrix rubescens NIVA CYA 98 |
SRA | 2012 | SRA061019 SRS373696 | Calothrix sp. PCC 7507 |
SRA | 2012 | SRA061120 SRS373697 | Stanieria cyanosphaera PCC 7437 |
SRA | 2012 | SRA061121 SRS373698 | Pleurocapsa sp. PCC 7319 |
SRA | 2012 | SRA061122 SRS373699 | Nostoc sp. PCC 7107 |
SRA | 2012 | SRA061123 SRS373927 | Cyanobacterium stanieri PCC 7202 |
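Issues / future plans
- Integrate phenotype annotation information (formats such as hapmap, plink, etc.)
- Build analysis systems that include phenotypes (variant + trait search, GWAS analysis pipeline, trait prediction, etc.)
- Integrate population information (allele mining, etc.)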