BH12.12/TogoGenome
From TogoWiki
Converting genome information to RDF
- Creating the INSDC ontology
- SO
- FALDO
- INSDC.owl
- Feature/Qualifier - FT <-> SO
- DB XREF - Identifiers.org
- Generated RDF for RefSeq (prokaryote) entries
- The latest version is the one stored at DDBJ's http://fat:8892/sparql as <http://v5.genome.db/>
- Source data: ~ktym/project/rdfgenome/wget_prokaryote.v5/**/*.ttl
- 42GB
- 455,322,591 triples
- 19,981,922 URIs
- 75,363,941 UUIDs
- 96 predicates
- Triple counts per predicate in the latest store (DDBJ's http://fat:8892/sparql, <http://v5.genome.db/>)
  152898623 rdf:type
   44102584 rdfs:label
   40926808 faldo:reference
   40926808 faldo:position
   30128855 rdfs:seeAlso
   20463404 faldo:end
   20463404 faldo:begin
   20459536 obo:so_part_of
   13973729 insdc:location_string
   13973729 faldo:location
   13868333 insdc:feature_locus_tag
    6701192 insdc:feature_product
    6486350 obo:so_has_part
    6486350 insdc:feature_transl_table
    6486350 insdc:feature_codon_start
    6485705 insdc:feature_translation
    4419283 insdc:feature_note
    2793204 insdc:feature_gene
    1706510 insdc:feature_inference
     762918 insdc:feature_EC_number
     309060 insdc:feature_function
     164543 insdc:feature_old_locus_tag
     126161 insdc:feature_pseudo
      72012 insdc:feature_gene_synonym
      20815 insdc:feature_experiment
      14092 insdc:feature_codon_recognized
      12731 insdc:feature_operon
      11131 insdc:feature_rpt_family
      10385 insdc:feature_mobile_element_type
       6145 insdc:feature_anticodon
       4558 insdc:feature_rpt_type
          :
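- History
- v1: RefSeq -> Turtle converter written with BioRuby
- v2: Revised the URIs
- v3: Changed the URIs to URNs (UUIDs)
- v4: Adopted Identifiers.org; bug fixes
- v5: Fixes for the FALDO update; provisional migration to the INSDC ontology
- v6: Formal adoption of INSDC.owl (planned)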
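The URI decisions recorded for v3 and v4 can be illustrated with a small Ruby sketch. This is not the actual converter: `DB_XREF_NAMESPACES`, `feature_urn`, `db_xref_uri` and `feature_turtle` are hypothetical names, and only the two db_xref prefixes whose Identifiers.org namespaces appear in the query results on this page (GeneID -> ncbigene, GI -> ncbigi) are assumed in the mapping.

```ruby
require 'securerandom'

RDFS = 'http://www.w3.org/2000/01/rdf-schema#'

# Hypothetical mapping from INSDC db_xref prefixes to Identifiers.org
# namespaces. Only namespaces seen in the query results are listed here.
DB_XREF_NAMESPACES = {
  'GeneID' => 'ncbigene',
  'GI'     => 'ncbigi'
}.freeze

# Mint a UUID-based URN for a feature (the v3 style of identifier).
def feature_urn
  "urn:uuid:#{SecureRandom.uuid}"
end

# Turn a db_xref value such as "GeneID:951890" into an Identifiers.org URI.
# Returns nil when the prefix is not in the table or the ID is missing.
def db_xref_uri(db_xref)
  prefix, id = db_xref.split(':', 2)
  namespace = DB_XREF_NAMESPACES[prefix]
  return nil if namespace.nil? || id.nil? || id.empty?
  "http://identifiers.org/#{namespace}/#{id}"
end

# Emit triples for one feature: its label plus one rdfs:seeAlso per db_xref.
# String#inspect is used for quoting, which is enough for plain ASCII labels.
def feature_turtle(subject, label, db_xrefs)
  lines = ["<#{subject}> <#{RDFS}label> #{label.inspect} ."]
  db_xrefs.each do |xref|
    uri = db_xref_uri(xref)
    lines << "<#{subject}> <#{RDFS}seeAlso> <#{uri}> ." if uri
  end
  lines.join("\n")
end

puts feature_turtle(feature_urn, 'slr1311', ['GeneID:951890', 'GI:16329178'])
```

Because the subject is a freshly generated UUID, each run prints a different URN; unmapped db_xref prefixes are skipped rather than guessed.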
Stanzathon
UniProt
% wget http://www.uniprot.org/uniprot/P16033.rdf
% rapper -i rdfxml -o turtle P16033.rdf > P16033.ttl
rapper: Parsing URI file:///Users/ktym/P16033.rdf with parser rdfxml
rapper: Serializing with serializer turtle
rapper: Parsing returned 702 triples
% export SPARQL_ENDPOINT="http://beta.sparql.uniprot.org/sparql"
% sparql.rb query '
prefix up: <http://purl.uniprot.org/core/>
prefix tax: <http://purl.uniprot.org/taxonomy/>
select *
where {
?s up:locusName "slr1311" .
?s ?p ?o .
}'
s p o
_5031363033330011 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.uniprot.org/core/Gene>
_5031363033330011 <http://purl.uniprot.org/core/locusName> slr1311
_5031363033330011 <http://www.w3.org/2004/02/skos/core#prefLabel> psbA2
_5031363033330011 <http://www.w3.org/2004/02/skos/core#altLabel> psbA-2
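A query like the one above can also be sent from Ruby through the standard SPARQL protocol, which passes the query text in a `query` parameter of an HTTP GET. The sketch below is an assumption about how such a client could be written, not a description of how `sparql.rb` works internally; `sparql_request_uri` is a hypothetical helper, and the network call is left commented out because the beta endpoint may no longer respond.

```ruby
require 'uri'
require 'net/http'

# Same environment variable as in the shell session above.
ENDPOINT = ENV.fetch('SPARQL_ENDPOINT', 'http://beta.sparql.uniprot.org/sparql')

QUERY = <<~SPARQL
  prefix up: <http://purl.uniprot.org/core/>
  select *
  where {
    ?s up:locusName "slr1311" .
    ?s ?p ?o .
  }
SPARQL

# Build the GET URL used by the SPARQL protocol: <endpoint>?query=<encoded query>
def sparql_request_uri(endpoint, query)
  uri = URI.parse(endpoint)
  uri.query = URI.encode_www_form(query: query)
  uri
end

uri = sparql_request_uri(ENDPOINT, QUERY)
puts uri

# Uncomment to run against a live endpoint.
# request = Net::HTTP::Get.new(uri)
# request['Accept'] = 'application/sparql-results+json'
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.body
```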
TogoGenome
% export SPARQL_ENDPOINT="http://lod.dbcls.jp/openrdf-sesame/repositories/togogenome"
% sparql.rb query '
select *
where {
?s rdfs:label "slr1311" .
?s ?p ?o .
}
'
s p o
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000316>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://biohackathon.org/resource/faldo#location> <urn:uuid:3114165b-ffee-4816-b9bf-811dbbcb9b06>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigene/951890>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigi/16329178>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbiprotein/NP_439906.1>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#label> slr1311
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_gene> psbA2
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_locus_tag> slr1311
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://purl.obolibrary.org/obo/so_part_of> <urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_translation> MTTTLQQRESASLWEQFCQWVTSTNNRIYVGWFGTLMIPTLLTATTCFIIAFIAAPPVDIDGIREPVAGSLLYGNNIISGAVVPSSNAIGLHFYPIWEAASLDEWLYNGGPYQLVVFHFLIGIFCYMGRQWELSYRLGMRPWICVAYSAPVSAATAVFLIYPIGQGSFSDGMPLGISGTFNFMIVFQAEHNILMHPFHMLGVAGVFGGSLFSAMHGSLVTSSLVRETTEVESQNYGYKFGQEEETYNIVAAHGYFGRLIFQYASFNNSRSLHFFLGAWPVIGIWFTAMGVSTMAFNLNGFNFNQSILDSQGRVIGTWADVLNRANIGFEVMHERNAHNFPLDLASGEQAPVALTAPAVNG
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://purl.obolibrary.org/obo/so_has_part> node9
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_codon_start> 1
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_transl_table> 11
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_product> photosystem II D1 protein
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000704>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://biohackathon.org/resource/faldo#location> <urn:uuid:d90ec492-6164-4f68-b47c-d89e19506302>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigene/951890>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/2000/01/rdf-schema#label> slr1311
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://rdf.insdc.org/feature_gene> psbA2
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://rdf.insdc.org/feature_locus_tag> slr1311
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://purl.obolibrary.org/obo/so_part_of> <urn:uuid:182f171a-7928-4324-8d41-f3e820a872fd>
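The rows above describe two linked features: a CDS (SO_0000316) and its gene (SO_0000704), connected through `so_part_of`. As a rough illustration, the following Ruby sketch groups such `s p o` rows by subject so that the link can be followed. `ROWS` is a shortened copy of the output above, and `group_by_subject` is a hypothetical helper, not part of `sparql.rb`.

```ruby
RDF_TYPE   = '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>'
SO_PART_OF = '<http://purl.obolibrary.org/obo/so_part_of>'
PRODUCT    = '<http://rdf.insdc.org/feature_product>'

# A few rows copied from the result listing (CDS first, then its gene).
ROWS = <<~TEXT
  <urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000316>
  <urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://purl.obolibrary.org/obo/so_part_of> <urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c>
  <urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_product> photosystem II D1 protein
  <urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000704>
  <urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://purl.obolibrary.org/obo/so_part_of> <urn:uuid:182f171a-7928-4324-8d41-f3e820a872fd>
TEXT

# Build { subject => { predicate => [objects] } } from whitespace-separated
# rows. The object may contain spaces (e.g. a product name), so each row is
# split into at most three fields.
def group_by_subject(text)
  features = {}
  text.each_line do |line|
    s, p, o = line.strip.split(/\s+/, 3)
    next if o.nil? # skip blank or incomplete rows
    features[s] ||= {}
    (features[s][p] ||= []) << o
  end
  features
end

features = group_by_subject(ROWS)
features.each do |subject, props|
  type   = props.fetch(RDF_TYPE, []).first
  parent = props.fetch(SO_PART_OF, []).first
  puts "#{subject} type=#{type} part_of=#{parent}"
end
```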