BH12.12/TogoGenome

From TogoWiki

== Stanzathon ==

UniProt

<pre>
% wget http://www.uniprot.org/uniprot/P16033.rdf

% rapper -i rdfxml -o turtle P16033.rdf > P16033.ttl
rapper: Parsing URI file:///Users/ktym/P16033.rdf with parser rdfxml
rapper: Serializing with serializer turtle
rapper: Parsing returned 702 triples

% export SPARQL_ENDPOINT="http://beta.sparql.uniprot.org/sparql"
% sparql.rb query '
prefix up: <http://purl.uniprot.org/core/>
prefix tax: <http://purl.uniprot.org/taxonomy/>
select *
where {
  ?s up:locusName "slr1311" .
  ?s ?p ?o .
}'

s p o
_5031363033330011 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.uniprot.org/core/Gene>
_5031363033330011 <http://purl.uniprot.org/core/locusName> slr1311
_5031363033330011 <http://www.w3.org/2004/02/skos/core#prefLabel> psbA2
_5031363033330011 <http://www.w3.org/2004/02/skos/core#altLabel> psbA-2
</pre>
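
The same lookup can be scripted without `sparql.rb`. The following is a minimal Python sketch, assuming the endpoint follows the standard SPARQL protocol (a GET request with a `query` parameter) and can return SPARQL JSON results. The helper names `build_query`, `request_url`, `rows`, and `run` are illustrative and not part of any tool used above, and the beta endpoint URL is copied from the session and may no longer be reachable.

```python
import json
import urllib.parse
import urllib.request

# Endpoint copied from the session above; it may no longer be reachable.
ENDPOINT = "http://beta.sparql.uniprot.org/sparql"


def build_query(locus_name):
    """Return the query used above for a given locus name.

    The name is inserted as-is, so it must not contain double quotes.
    """
    return (
        "PREFIX up: <http://purl.uniprot.org/core/>\n"
        "SELECT * WHERE {\n"
        f'  ?s up:locusName "{locus_name}" .\n'
        "  ?s ?p ?o .\n"
        "}"
    )


def request_url(endpoint, query):
    """Encode the query as a SPARQL protocol GET URL."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def rows(results):
    """Flatten a SPARQL JSON result document into (s, p, o) value tuples."""
    return [
        tuple(binding[name]["value"] for name in ("s", "p", "o"))
        for binding in results["results"]["bindings"]
    ]


def run(locus_name, endpoint=ENDPOINT):
    """Send the query and return the result rows (needs network access)."""
    request = urllib.request.Request(
        request_url(endpoint, build_query(locus_name)),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return rows(json.load(response))
```

Calling `run("slr1311")` against a live endpoint would be expected to return rows comparable to the `s p o` listing above, though the blank node label will differ between runs.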
 
TogoGenome

<pre>
% export SPARQL_ENDPOINT="http://lod.dbcls.jp/openrdf-sesame/repositories/togogenome"

% sparql.rb query '
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
select *
where {
  ?s rdfs:label "slr1311" .
  ?s ?p ?o .
}
'
s p o
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000316>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://biohackathon.org/resource/faldo#location> <urn:uuid:3114165b-ffee-4816-b9bf-811dbbcb9b06>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigene/951890>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigi/16329178>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbiprotein/NP_439906.1>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://www.w3.org/2000/01/rdf-schema#label> slr1311
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_gene> psbA2
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_locus_tag> slr1311
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://purl.obolibrary.org/obo/so_part_of> <urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c>
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_translation> MTTTLQQRESASLWEQFCQWVTSTNNRIYVGWFGTLMIPTLLTATTCFIIAFIAAPPVDIDGIREPVAGSLLYGNNIISGAVVPSSNAIGLHFYPIWEAASLDEWLYNGGPYQLVVFHFLIGIFCYMGRQWELSYRLGMRPWICVAYSAPVSAATAVFLIYPIGQGSFSDGMPLGISGTFNFMIVFQAEHNILMHPFHMLGVAGVFGGSLFSAMHGSLVTSSLVRETTEVESQNYGYKFGQEEETYNIVAAHGYFGRLIFQYASFNNSRSLHFFLGAWPVIGIWFTAMGVSTMAFNLNGFNFNQSILDSQGRVIGTWADVLNRANIGFEVMHERNAHNFPLDLASGEQAPVALTAPAVNG
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://purl.obolibrary.org/obo/so_has_part> node9
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_codon_start> 1
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_transl_table> 11
<urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116> <http://rdf.insdc.org/feature_product> photosystem II D1 protein
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000704>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://biohackathon.org/resource/faldo#location> <urn:uuid:d90ec492-6164-4f68-b47c-d89e19506302>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://identifiers.org/ncbigene/951890>
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://www.w3.org/2000/01/rdf-schema#label> slr1311
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://rdf.insdc.org/feature_gene> psbA2
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://rdf.insdc.org/feature_locus_tag> slr1311
<urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c> <http://purl.obolibrary.org/obo/so_part_of> <urn:uuid:182f171a-7928-4324-8d41-f3e820a872fd>
</pre>
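
The TogoGenome result mixes two subjects that share the label slr1311: a CDS feature (typed SO_0000316) and the gene (typed SO_0000704) that it is part of. As a rough illustration of how the rows relate, the Python sketch below groups a subset of the triples printed above by subject and follows `so_part_of` links upward. The helper names `group_by_subject` and `parents` are hypothetical, and nothing here is claimed about what the last UUID in the chain represents.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SO_PART_OF = "http://purl.obolibrary.org/obo/so_part_of"
SO_CDS = "http://purl.obolibrary.org/obo/SO_0000316"   # Sequence Ontology: CDS
SO_GENE = "http://purl.obolibrary.org/obo/SO_0000704"  # Sequence Ontology: gene

CDS = "urn:uuid:aaf399d2-f84a-4feb-a689-966311a3b116"
GENE = "urn:uuid:8683a33d-e496-43da-a4ce-a454faeb228c"

# A subset of the rows printed above, as (subject, predicate, object).
TRIPLES = [
    (CDS, RDF_TYPE, SO_CDS),
    (CDS, "http://rdf.insdc.org/feature_product", "photosystem II D1 protein"),
    (CDS, SO_PART_OF, GENE),
    (GENE, RDF_TYPE, SO_GENE),
    (GENE, "http://rdf.insdc.org/feature_gene", "psbA2"),
    (GENE, SO_PART_OF, "urn:uuid:182f171a-7928-4324-8d41-f3e820a872fd"),
]


def group_by_subject(triples):
    """Index triples as {subject: {predicate: [objects]}}."""
    index = defaultdict(lambda: defaultdict(list))
    for s, p, o in triples:
        index[s][p].append(o)
    return index


def parents(index, subject):
    """Follow so_part_of links upward, nearest parent first."""
    chain = []
    seen = {subject}
    current = subject
    while current in index and index[current].get(SO_PART_OF):
        current = index[current][SO_PART_OF][0]
        if current in seen:  # guard against cycles in the data
            break
        seen.add(current)
        chain.append(current)
    return chain


index = group_by_subject(TRIPLES)
for node in [CDS] + parents(index, CDS):
    # Nodes without a type triple in this subset print an empty list.
    types = index[node].get(RDF_TYPE, []) if node in index else []
    print(node, types)
```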

Latest revision as of 07:28, 20 December 2012 (Thu)

RDF conversion of genome information

  • Creating an ontology for INSDC
    • SO
    • FALDO
    • INSDC.owl
      • Feature/Qualifier - FT <-> SO
      • DB XREF - Identifiers.org
  • Generating RDF for RefSeq (prokaryote) entries
    • The latest version is the data stored as <http://v5.genome.db/> at the DDBJ endpoint http://fat:8892/sparql
      • Source data: ~ktym/project/rdfgenome/wget_prokaryote.v5/**/*.ttl
      • 42GB
      • 455,322,591 triples
      • 19,981,922 URIs
      • 75,363,941 UUIDs
      • 96 predicates
        152898623 rdf:type
         44102584 rdfs:label
         40926808 faldo:reference
         40926808 faldo:position
         30128855 rdfs:seeAlso
         20463404 faldo:end
         20463404 faldo:begin
         20459536 obo:so_part_of
         13973729 insdc:location_string
         13973729 faldo:location
         13868333 insdc:feature_locus_tag
          6701192 insdc:feature_product
          6486350 obo:so_has_part
          6486350 insdc:feature_transl_table
          6486350 insdc:feature_codon_start
          6485705 insdc:feature_translation
          4419283 insdc:feature_note
          2793204 insdc:feature_gene
          1706510 insdc:feature_inference
           762918 insdc:feature_EC_number
           309060 insdc:feature_function
           164543 insdc:feature_old_locus_tag
           126161 insdc:feature_pseudo
            72012 insdc:feature_gene_synonym
            20815 insdc:feature_experiment
            14092 insdc:feature_codon_recognized
            12731 insdc:feature_operon
            11131 insdc:feature_rpt_family
            10385 insdc:feature_mobile_element_type
             6145 insdc:feature_anticodon
             4558 insdc:feature_rpt_type
                :
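  • History
    • v1: RefSeq -> Turtle converter built with BioRuby
    • v2: Revised the URIs
    • v3: Changed the URIs to URNs (UUIDs)
    • v4: Adopted Identifiers.org; bug fixes
    • v5: Fixes for the FALDO update; provisional migration to the INSDC ontology
    • v6: Official adoption of INSDC.owl (planned)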
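
Two of the design decisions in the list above, URN (UUID) subjects (v3) and Identifiers.org cross-references (v4), can be sketched in a few lines of Python. This is not the BioRuby converter itself, only an illustration under stated assumptions: the `db_xref` keys `GeneID` and `GI` are assumed to be what the converter reads from the flat file, random (version 4) UUIDs are an assumption about how URNs are minted, and the function names are hypothetical. The Identifiers.org namespaces `ncbigene` and `ncbigi` are the ones that appear in the generated data.

```python
import uuid

# db_xref database names mapped to Identifiers.org namespaces.
# The left-hand keys are an assumption about the converter's input.
IDENTIFIERS_ORG = {
    "GeneID": "ncbigene",
    "GI": "ncbigi",
}

RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OBO = "http://purl.obolibrary.org/obo/"


def xref_uri(db_xref):
    """Turn 'GeneID:951890' into an Identifiers.org URI, or None if unmapped."""
    database, _, accession = db_xref.partition(":")
    namespace = IDENTIFIERS_ORG.get(database)
    if namespace is None or not accession:
        return None
    return f"http://identifiers.org/{namespace}/{accession}"


def new_feature_urn():
    """Mint a random UUID URN to use as a feature subject."""
    return uuid.uuid4().urn


def feature_turtle(subject, so_class, label, db_xrefs):
    """Render one feature as a Turtle statement.

    The label is inserted as-is, so it must not contain double quotes.
    """
    pairs = [
        ("a", f"<{OBO}{so_class}>"),
        (f"<{RDFS}label>", f'"{label}"'),
    ]
    for xref in db_xrefs:
        uri = xref_uri(xref)
        if uri is not None:  # silently skip databases without a mapping
            pairs.append((f"<{RDFS}seeAlso>", f"<{uri}>"))
    body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
    return f"<{subject}>\n    {body} .\n"


print(feature_turtle(new_feature_urn(), "SO_0000704", "slr1311", ["GeneID:951890"]))
```

Because the subject is a fresh random URN on every run, rerunning a converter written this way would produce different subjects for the same feature. Whether the real converter keeps UUIDs stable across versions is not stated here.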
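
The per-predicate counts listed above could be reproduced with an aggregate SPARQL query. The query actually used is not recorded on this page, so the Python sketch below is only one plausible way to do it. It assumes that <http://v5.genome.db/> is a named graph and that the endpoint supports SPARQL 1.1 aggregates. The prefix table is inferred from the full URIs that the generated data uses, and the function names are hypothetical.

```python
# Graph name from the list above, assumed to be a named graph.
GRAPH = "http://v5.genome.db/"

# Prefixes inferred from the full URIs used in the generated data.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "faldo": "http://biohackathon.org/resource/faldo#",
    "obo": "http://purl.obolibrary.org/obo/",
    "insdc": "http://rdf.insdc.org/",
}


def predicate_count_query(graph=GRAPH):
    """Build a query that counts triples per predicate, largest first."""
    return (
        "SELECT ?p (COUNT(*) AS ?count)\n"
        f"FROM <{graph}>\n"
        "WHERE { ?s ?p ?o }\n"
        "GROUP BY ?p\n"
        "ORDER BY DESC(?count)"
    )


def abbreviate(uri):
    """Shorten a full URI to prefix:local form when a prefix matches."""
    for prefix, namespace in PREFIXES.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri


def format_counts(rows):
    """Format (predicate URI, count) pairs in the 'count predicate' layout."""
    return [f"{count} {abbreviate(uri)}" for uri, count in rows]


print(predicate_count_query())
```

On a store of this size (455,322,591 triples), a full scan like this may be slow or hit an endpoint timeout, so running it per predicate or offline against the Turtle files may be more practical.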