BH12.12/SPARQLthon5/INSDC Ontology

From TogoWiki

Latest revision as of 10:39, 21 August 2013 (Wed)


INSDC

  • Updated insdc.owl

bioproject

We reviewed the data published by INSDC.

Public data

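We checked the data on the FTP sites as of 2013-02-20. The situation for ENA is unknown.

  • NCBI: ftp://ftp.ncbi.nlm.nih.gov/bioproject/summary.txt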

Organism Name   TaxID   Project Accession       Project ID      Project Type    Project Data Type       Date
Borrelia burgdorferi B31        224326  PRJNA3  3       Primary submission      Genome sequencing       2003/02/23
Treponema denticola ATCC 35405  243275  PRJNA4  4       Primary submission      Genome sequencing       2004/04/06
Treponema pallidum subsp. pallidum str. Nichols 243276  PRJNA5  5       Primary submission      Genome sequencing       2003/02/25
Magnetospirillum magnetotacticum MS-1   272627  PRJNA6  6       Primary submission      Genome sequencing       2003/02/25
Campylobacter fetus subsp. venerealis str. Azul-94      593452  PRJNA7  7       Primary submission      Genome sequencing       2009/04/22
.
.
.
68749 entries

  • NCBI: ftp://ftp.ncbi.nlm.nih.gov/bioproject/refseq-genbank.csv

Refseq accn,Genbank accn,Organism name,TaxID
PRJNA116,PRJNA10719,Arabidopsis thaliana,3702
PRJNA122,PRJNA12269,Oryza sativa Japonica Group,39947
PRJNA122,PRJNA13141,Oryza sativa Japonica Group,39947
PRJNA127,PRJNA13836,Schizosaccharomyces pombe 972h-,284812
PRJNA127,PRJNA20755,Schizosaccharomyces pombe,4896
PRJNA128,PRJNA13838,Saccharomyces cerevisiae S288c,559292
PRJNA128,PRJNA43747,Saccharomyces cerevisiae S288c,559292
PRJNA132,PRJNA13841,Neurospora crassa OR74A,367110
PRJNA148,PRJNA13173,Plasmodium falciparum 3D7,36329
PRJNA155,PRJNA13833,Encephalitozoon cuniculi GB-M1,284813
.
.
.
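7606 entries

  • NCBI: ftp://ftp.ncbi.nlm.nih.gov/bioproject/bioproject.xml
  • DDBJ: ftp://ftp.ddbj.nig.ac.jp/ddbj_database/bioproject/ddbj_summary.txt
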
Organism Name	Project Accession	Project ID	Project Type	Project Data Type	Released	Updated
Gluconobacter frateurii NBRC 101659	PRJDB2		Primary submission	Genome sequencing	2012/05/14	2012/05/14
Gordonia otitidis NBRC 100426	PRJDB3		Primary submission	Genome sequencing	2011/11/01	2012/02/16
Gordonia rhizosphera NBRC 16068	PRJDB4		Primary submission	Genome sequencing	2011/11/01	2012/08/29
Escherichia coli str. K-12 substr. MDS42	PRJDB5		Primary submission	Genome sequencing	2011/12/02	2012/02/08
.
.
.
448 entries

  • DDBJ: ftp://ftp.ddbj.nig.ac.jp/ddbj_database/bioproject/ddbj_core_bioproject.xml

RDF

  • prokaryotes.txt.bioproject_refseq.nt
<http://www.ncbi.nlm.nih.gov/bioproject/43389>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/NZ_CM000855.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/NZ_CM000855.1>     <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/43389> .
<http://www.ncbi.nlm.nih.gov/nuccore/NZ_CM000855.1>     <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/43389>  <http://www.w3.org/2000/01/rdf-schema#label>    "Campylobacter jejuni subsp. jejuni 414" .
<http://www.ncbi.nlm.nih.gov/bioproject/43389>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/CM000855.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/CM000855.1>        <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/43389> .
<http://www.ncbi.nlm.nih.gov/nuccore/CM000855.1>        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/43391>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/NZ_CM000854.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/NZ_CM000854.1>     <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/43391> .
<http://www.ncbi.nlm.nih.gov/nuccore/NZ_CM000854.1>     <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/43391>  <http://www.w3.org/2000/01/rdf-schema#label>    "Campylobacter jejuni subsp. jejuni 1336" .
<http://www.ncbi.nlm.nih.gov/bioproject/43391>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/CM000854.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/CM000854.1>        <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/43391> .
<http://www.ncbi.nlm.nih.gov/nuccore/CM000854.1>        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/57587>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/NC_002163.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/NC_002163.1>       <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/57587> .
<http://www.ncbi.nlm.nih.gov/nuccore/NC_002163.1>       <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/57587>  <http://www.w3.org/2000/01/rdf-schema#label>    "Campylobacter jejuni subsp. jejuni NCTC 11168 = ATCC 700819" .
<http://www.ncbi.nlm.nih.gov/bioproject/57587>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/AL111168.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/AL111168.1>        <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/57587> .
<http://www.ncbi.nlm.nih.gov/nuccore/AL111168.1>        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/57899>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/NC_003912.7> .
<http://www.ncbi.nlm.nih.gov/nuccore/NC_003912.7>       <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/57899> .
<http://www.ncbi.nlm.nih.gov/nuccore/NC_003912.7>       <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/57899>  <http://www.w3.org/2000/01/rdf-schema#label>    "Campylobacter jejuni RM1221" .
<http://www.ncbi.nlm.nih.gov/bioproject/57899>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/CP000025.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/CP000025.1>        <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/57899> .
<http://www.ncbi.nlm.nih.gov/nuccore/CP000025.1>        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/58503>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/NC_008787.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/NC_008787.1>       <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/58503> .
<http://www.ncbi.nlm.nih.gov/nuccore/NC_008787.1>       <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/58503>  <http://www.w3.org/2000/01/rdf-schema#label>    "Campylobacter jejuni subsp. jejuni 81-176" .
<http://www.ncbi.nlm.nih.gov/bioproject/58503>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/CP000538.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/CP000538.1>        <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/58503> .
<http://www.ncbi.nlm.nih.gov/nuccore/CP000538.1>        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000340> .
<http://www.ncbi.nlm.nih.gov/bioproject/58503>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/NC_008787.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/NC_008787.1>       <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/58503> .
<http://www.ncbi.nlm.nih.gov/nuccore/NC_008787.1>       <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000155> .
<http://www.ncbi.nlm.nih.gov/bioproject/58503>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/CP000549.1> .
<http://www.ncbi.nlm.nih.gov/nuccore/CP000549.1>        <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/58503> .
<http://www.ncbi.nlm.nih.gov/nuccore/CP000549.1>        <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>       <http://purl.obolibrary.org/obo/SO_0000155> .
<http://www.ncbi.nlm.nih.gov/bioproject/58503>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/nuccore/CP000550.1> .
  • summary.txt.bioproject_taxid_gold.nt
<http://www.ncbi.nlm.nih.gov/bioproject/116>    <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/taxonomy/3702> .
<http://www.ncbi.nlm.nih.gov/taxonomy/3702>     <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/116> .
<http://www.ncbi.nlm.nih.gov/bioproject/116>    <http://www.w3.org/2000/01/rdf-schema#label>    "RefSeq Genome" .
<http://www.ncbi.nlm.nih.gov/bioproject/116>    <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/10719> .
<http://www.ncbi.nlm.nih.gov/bioproject/10719>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/116> .
<http://www.ncbi.nlm.nih.gov/bioproject/10719>  <http://www.w3.org/2000/01/rdf-schema#label>    "Genome sequencing" .
<http://www.ncbi.nlm.nih.gov/bioproject/10719>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/taxonomy/3702> .
<http://www.ncbi.nlm.nih.gov/taxonomy/3702>     <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/10719> .
<http://www.ncbi.nlm.nih.gov/bioproject/122>    <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/taxonomy/39947> .
<http://www.ncbi.nlm.nih.gov/taxonomy/39947>    <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/122> .
<http://www.ncbi.nlm.nih.gov/bioproject/122>    <http://www.w3.org/2000/01/rdf-schema#label>    "RefSeq Genome" .
<http://www.ncbi.nlm.nih.gov/bioproject/122>    <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/13141> .
<http://www.ncbi.nlm.nih.gov/bioproject/13141>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/122> .
<http://www.ncbi.nlm.nih.gov/bioproject/13141>  <http://www.w3.org/2000/01/rdf-schema#label>    "Genome sequencing" .
<http://www.ncbi.nlm.nih.gov/bioproject/13141>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/taxonomy/39947> .
<http://www.ncbi.nlm.nih.gov/taxonomy/39947>    <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/13141> .
<http://www.ncbi.nlm.nih.gov/bioproject/13141>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.genomesonline.org/cgi-bin/GOLD/GOLDCards.cgi?goldstamp=Gc00603> .
<http://www.genomesonline.org/cgi-bin/GOLD/GOLDCards.cgi?goldstamp=Gc00603>     <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/13141> .
<http://www.ncbi.nlm.nih.gov/bioproject/127>    <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/taxonomy/284812> .
<http://www.ncbi.nlm.nih.gov/taxonomy/284812>   <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/127> .
<http://www.ncbi.nlm.nih.gov/bioproject/127>    <http://www.w3.org/2000/01/rdf-schema#label>    "RefSeq Genome" .
<http://www.ncbi.nlm.nih.gov/bioproject/127>    <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/20755> .
<http://www.ncbi.nlm.nih.gov/bioproject/20755>  <http://www.w3.org/2000/01/rdf-schema#seeAlso>  <http://www.ncbi.nlm.nih.gov/bioproject/127> .
<http://www.ncbi.nlm.nih.gov/bioproject/20755>  <http://www.w3.org/2000/01/rdf-schema#label>    "Genome sequencing" .
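
The reciprocal rdfs:seeAlso links in summary.txt.bioproject_taxid_gold.nt can be reproduced from a refseq-genbank.csv row with a few lines of Ruby. This is only a minimal sketch, not the converter actually used: the helper names (numeric_id, see_also_pair, csv_to_ntriples) are hypothetical, the rdfs:label and GOLD triples are left out because they come from other input, and the row is split on plain commas on the assumption that only the organism name could contain one.

```ruby
BIOPROJECT = 'http://www.ncbi.nlm.nih.gov/bioproject/'
TAXONOMY   = 'http://www.ncbi.nlm.nih.gov/taxonomy/'
SEE_ALSO   = 'http://www.w3.org/2000/01/rdf-schema#seeAlso'

# The sample triples use the bare numeric ID, so drop the PRJNA/PRJDB prefix.
def numeric_id(accession)
  accession.sub(/\APRJ[A-Z]+/, '')
end

# Two N-Triples lines linking a and b in both directions.
def see_also_pair(a, b)
  ["<#{a}> <#{SEE_ALSO}> <#{b}> .", "<#{b}> <#{SEE_ALSO}> <#{a}> ."]
end

# Convert the text of refseq-genbank.csv (header row included) to N-Triples.
def csv_to_ntriples(text)
  text.each_line.drop(1).flat_map do |line| # drop(1) skips the header row
    fields = line.chomp.split(',')
    next [] if fields.size < 4              # skip blank or malformed lines

    refseq  = BIOPROJECT + numeric_id(fields[0])
    genbank = BIOPROJECT + numeric_id(fields[1])
    taxon   = TAXONOMY + fields[-1]         # TaxID is always the last column
    see_also_pair(refseq, taxon) +
      see_also_pair(refseq, genbank) +
      see_also_pair(genbank, taxon)
  end
end

sample = <<~CSV
  Refseq accn,Genbank accn,Organism name,TaxID
  PRJNA116,PRJNA10719,Arabidopsis thaliana,3702
CSV

puts csv_to_ntriples(sample)
```

For the Arabidopsis thaliana row this prints six triples, in the same order as the seeAlso lines for bioproject/116 and bioproject/10719 above.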

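XML2RDF converter development

Build a converter that reads the XML instead of the flat files.

  • The XML contains no data linking a BioProject ID to INSDC entry accessions
  • The XML files are large, so parse them with Nokogiri::XML::SAX::Document
  • Project IDs carry the PRJDB prefix. Should a URI such as http://identifiers.org/bioproject/PRJDB116 be used as the subject?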
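
The streaming approach in the list above could look roughly like the sketch below. To stay free of gem dependencies it uses REXML's stream parser from Ruby's bundled libraries rather than Nokogiri. With Nokogiri the same logic would go into a Nokogiri::XML::SAX::Document subclass that overrides start_element, characters and end_element. The element and attribute names (ArchiveID, accession, Title), the class name BioProjectListener and the inline sample are assumptions made for illustration and should be checked against the real XML. The identifiers.org subject URI is the open question from the list, not a settled choice.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

RDFS_LABEL   = 'http://www.w3.org/2000/01/rdf-schema#label'
SUBJECT_BASE = 'http://identifiers.org/bioproject/' # proposed subject prefix

# Event-driven listener: nothing is kept in memory except the current
# accession and title, so the size of the input file does not matter.
class BioProjectListener
  include REXML::StreamListener

  attr_reader :triples

  def initialize
    @triples = []
    @accession = nil
    @in_title = false
    @title = String.new
  end

  def tag_start(name, attrs)
    case name
    when 'ArchiveID' # assumed element carrying the PRJDB accession
      @accession = attrs.to_h['accession']
    when 'Title'     # assumed element carrying the project title
      @in_title = true
      @title = String.new
    end
  end

  # Text may arrive in several chunks, so accumulate it.
  def text(chunk)
    @title << chunk if @in_title
  end

  def tag_end(name)
    return unless name == 'Title'

    @in_title = false
    return if @accession.nil?

    # Escape backslashes and double quotes for an N-Triples literal.
    literal = @title.strip.gsub('\\') { '\\\\' }.gsub('"') { '\\"' }
    @triples << "<#{SUBJECT_BASE}#{@accession}> <#{RDFS_LABEL}> \"#{literal}\" ."
  end
end

# Made-up sample; the real file layout may differ.
sample = <<~XML
  <PackageSet>
    <Package>
      <Project>
        <ProjectID><ArchiveID accession="PRJDB2" archive="DDBJ"/></ProjectID>
        <ProjectDescr><Title>Gluconobacter frateurii NBRC 101659</Title></ProjectDescr>
      </Project>
    </Package>
  </PackageSet>
XML

listener = BioProjectListener.new
REXML::Document.parse_stream(sample, listener)
puts listener.triples
```

For a real file, an open File object can be passed to REXML::Document.parse_stream in place of the string so that the document is read incrementally.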

Previous session

  • http://wiki.lifesciencedb.jp/mw/index.php/BH12.12/INSDCオントロジー
