BH12.12/TogoGenome
From TogoWiki
RDF conversion of genome information
- Creating an ontology for INSDC
- SO
- FALDO
- INSDC.owl
- Feature/Qualifier - FT <-> SO
- DB XREF - Identifiers.org
- Generating RDF for RefSeq (prokaryote) entries
- The latest version is the one stored as <http://v5.genome.db/> at DDBJ's http://fat:8892/sparql
- Source data: ~ktym/project/rdfgenome/wget_prokaryote.v5/**/*.ttl
- 42GB
- 455,322,591 triples
- 19,981,922 URIs
- 75,363,941 UUIDs
- 96 predicates
152898623 rdf:type
44102584 rdfs:label
40926808 faldo:reference
40926808 faldo:position
30128855 rdfs:seeAlso
20463404 faldo:end
20463404 faldo:begin
20459536 obo:so_part_of
13973729 insdc:location_string
13973729 faldo:location
13868333 insdc:feature_locus_tag
6701192 insdc:feature_product
6486350 obo:so_has_part
6486350 insdc:feature_transl_table
6486350 insdc:feature_codon_start
6485705 insdc:feature_translation
4419283 insdc:feature_note
2793204 insdc:feature_gene
1706510 insdc:feature_inference
762918 insdc:feature_EC_number
309060 insdc:feature_function
164543 insdc:feature_old_locus_tag
126161 insdc:feature_pseudo
72012 insdc:feature_gene_synonym
20815 insdc:feature_experiment
14092 insdc:feature_codon_recognized
12731 insdc:feature_operon
11131 insdc:feature_rpt_family
10385 insdc:feature_mobile_element_type
6145 insdc:feature_anticodon
4558 insdc:feature_rpt_type
:
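A per-predicate tally like the one above can be obtained by grouping triples on their predicate. The notes do not show the query that was actually used, so the SPARQL string below is an assumed shape, and the `count_predicates` helper and its sample data are hypothetical; the helper only mirrors the same grouping locally in Python.

```python
from collections import Counter

# Assumed query shape (SPARQL 1.1 aggregates); the query actually used
# to produce the counts is not shown in these notes.
PREDICATE_COUNT_QUERY = """
SELECT ?p (COUNT(*) AS ?count)
FROM <http://v5.genome.db/>
WHERE { ?s ?p ?o }
GROUP BY ?p
ORDER BY DESC(?count)
"""


def count_predicates(triples):
    """Tally predicates in an iterable of (subject, predicate, object) tuples."""
    counts = Counter(predicate for _subject, predicate, _object in triples)
    # most_common() sorts by descending count, like ORDER BY DESC(?count).
    return counts.most_common()


# Tiny hand-made sample for illustration only, not real data.
sample = [
    ("urn:uuid:a", "rdf:type", "obo:SO_0000704"),
    ("urn:uuid:a", "rdfs:label", "slr1311"),
    ("urn:uuid:b", "rdf:type", "obo:SO_0000316"),
]

for predicate, count in count_predicates(sample):
    print(count, predicate)
```

On a store of this size the server-side query is the practical route; the local helper is only useful for checking a small Turtle sample before loading it.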
- History
- v1: RefSeq -> Turtle converter written with BioRuby
- v2: Revised the URIs
- v3: Changed URIs to URNs (UUIDs)
- v4: Adopted Identifiers.org; bug fixes
- v5: Fixes for the FALDO update; provisional migration to the INSDC ontology
- v6: Formal adoption of INSDC.owl (planned)
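The two URI decisions in the history (UUID-based URNs for feature nodes in v3, Identifiers.org URIs for cross-references in v4) can be sketched as follows. This is not the converter's code: the prefix table is a hypothetical two-entry mapping whose namespaces (`ncbigene`, `ncbigi`) are taken from the query results in these notes, and the use of random (version 4) UUIDs is an assumption.

```python
import uuid

# Hypothetical mapping from INSDC db_xref prefixes to Identifiers.org
# namespaces. The converter's real table is not shown in these notes.
IDENTIFIERS_ORG_NAMESPACES = {
    "GeneID": "ncbigene",
    "GI": "ncbigi",
}


def db_xref_to_uri(db_xref):
    """Turn a db_xref such as 'GeneID:951890' into an Identifiers.org URI.

    Returns None when the prefix is not in the mapping or the ID is empty.
    """
    prefix, _, local_id = db_xref.partition(":")
    namespace = IDENTIFIERS_ORG_NAMESPACES.get(prefix)
    if namespace is None or not local_id:
        return None
    return f"http://identifiers.org/{namespace}/{local_id}"


def mint_feature_urn():
    """Mint a urn:uuid: identifier for a feature node (random UUID assumed)."""
    return uuid.uuid4().urn


print(db_xref_to_uri("GeneID:951890"))
print(db_xref_to_uri("GI:16329178"))
print(mint_feature_urn())
```

Random URNs keep feature nodes globally unique without designing a URI scheme, at the cost of identifiers that change on every conversion run; the Identifiers.org links are what remain stable across versions.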
Stanzathon
UniProt
% wget http://www.uniprot.org/uniprot/P16033.rdf
% rapper -i rdfxml -o turtle P16033.rdf > P16033.ttl
rapper: Parsing URI file:///Users/ktym/P16033.rdf with parser rdfxml
rapper: Serializing with serializer turtle
rapper: Parsing returned 702 triples
% export SPARQL_ENDPOINT="http://beta.sparql.uniprot.org/sparql"
% sparql.rb query '
prefix up: <http://purl.uniprot.org/core/>
prefix tax: <http://purl.uniprot.org/taxonomy/>
select *
where {
  ?s up:locusName "slr1311" .
  ?s ?p ?o .
}'
s p o
_5031363033330011 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.uniprot.org/core/Gene>
_5031363033330011 <http://purl.uniprot.org/core/locusName> slr1311
_5031363033330011 <http://www.w3.org/2004/02/skos/core#prefLabel> psbA2
_5031363033330011 <http://www.w3.org/2004/02/skos/core#altLabel> psbA-2
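For readers without `sparql.rb`, the same query can be sent with the Python standard library. This is a minimal sketch assuming the endpoint accepts GET requests with a `query` parameter as in the SPARQL 1.1 Protocol and can return SPARQL JSON results; the function names are hypothetical, and the endpoint URL is the one from the session above, which may no longer be reachable.

```python
import json
import urllib.parse
import urllib.request

QUERY = """
PREFIX up: <http://purl.uniprot.org/core/>
SELECT * WHERE {
  ?s up:locusName "slr1311" .
  ?s ?p ?o .
}
"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as a GET request URL."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_query(endpoint, query, timeout=30):
    """Send the query and return the parsed SPARQL JSON result document."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def main():
    # Not called automatically, because it needs network access.
    results = run_query("http://beta.sparql.uniprot.org/sparql", QUERY)
    for row in results["results"]["bindings"]:
        print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```

Calling `main()` prints one `s p o` row per binding, comparable to the tabular output of the session above.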
TogoGenome
% export SPARQL_ENDPOINT="http://lod.dbcls.jp/openrdf-sesame/repositories/togogenome"
% sparql.rb query '
select *
where {
  ?s rdfs:label "slr1311" .
  ?s ?p ?o .
}
'
s p o
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000316>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://biohackathon.org/resource/faldo#location> <urn:uuid:3114165b-ffee-4816-b9bf-811dbbcb9b06>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigene/951890>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigi/16329178>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbiprotein/NP_439906.1>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#label> slr1311
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_gene> psbA2
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_locus_tag> slr1311
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://purl.obolibrary.org/obo/so_part_of> <urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_translation> MTTTLQQRESASLWEQFCQWVTSTNNRIYVGWFGTLMIPTLLTATTCFIIAFIAAPPVDIDGIREPVAGSLLYGNNIISGAVVPSSNAIGLHFYPIWEAASLDEWLYNGGPYQLVVFHFLIGIFCYMGRQWELSYRLGMRPWICVAYSAPVSAATAVFLIYPIGQGSFSDGMPLGISGTFNFMIVFQAEHNILMHPFHMLGVAGVFGGSLFSAMHGSLVTSSLVRETTEVESQNYGYKFGQEEETYNIVAAHGYFGRLIFQYASFNNSRSLHFFLGAWPVIGIWFTAMGVSTMAFNLNGFNFNQSILDSQGRVIGTWADVLNRANIGFEVMHERNAHNFPLDLASGEQAPVALTAPAVNG
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://purl.obolibrary.org/obo/so_has_part> node9
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_codon_start> 1
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_transl_table> 11
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_product> photosystem II D1 protein
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000704>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://biohackathon.org/resource/faldo#location> <urn:uuid:d90ec492-6164-4f68-b47c-d89e19506302>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigene/951890>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/2000/01/rdf-schema#label> slr1311
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://rdf.insdc.org/feature_gene> psbA2
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://rdf.insdc.org/feature_locus_tag> slr1311
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://purl.obolibrary.org/obo/so_part_of> <urn:uuid:182f171a-7928-4324-8d41-f3e820a872fd>
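The result above contains two feature nodes for slr1311: a CDS (SO_0000316) that is `so_part_of` a gene (SO_0000704), which in turn is `so_part_of` a third node whose type is not in the output. The sketch below walks that chain over a hand-copied subset of the triples; the helper names are hypothetical and nothing here queries the endpoint.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SO_PART_OF = "http://purl.obolibrary.org/obo/so_part_of"
SO = "http://purl.obolibrary.org/obo/"

CDS = "urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116"
GENE = "urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c"
PARENT = "urn:uuid:182f171a-7928-4324-8d41-f3e820a872fd"

# Subset of the triples copied from the query result above.
TRIPLES = [
    (CDS, RDF_TYPE, SO + "SO_0000316"),
    (CDS, SO_PART_OF, GENE),
    (GENE, RDF_TYPE, SO + "SO_0000704"),
    (GENE, SO_PART_OF, PARENT),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def part_of_chain(triples, start):
    """Follow so_part_of links upward from start.

    Returns a list of (node, rdf_type) pairs; rdf_type is None when the
    node's type is not present in the given triples.
    """
    chain, seen, node = [], set(), start
    while node is not None and node not in seen:  # 'seen' guards against cycles
        seen.add(node)
        types = objects(triples, node, RDF_TYPE)
        chain.append((node, types[0] if types else None))
        parents = objects(triples, node, SO_PART_OF)
        node = parents[0] if parents else None
    return chain


for node, rdf_type in part_of_chain(TRIPLES, CDS):
    print(node, rdf_type)
```

The last node prints with type `None` because the query only returned triples whose subject carries the label "slr1311"; a follow-up query on that URN would be needed to see what it is.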