SPARQLthon61/MicrobeDB.jp-Umakaviewer

From TogoWiki

Revision as of 07:38, 24 October 2017 (Tue)

Notes and feedback from running the tools against the MicrobeDB.jp SPARQL endpoint

Help output of the metadata command

%java -jar ../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar
  Version: 20161212-1
Usage: java org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl [options]
   [options]
       1. to print a list of graphURIs
            -g endpointURL
       2. to crawl whole data in the endpoint
            -ac endpointURL crawlName outputFileName
       3. to crawl the specified graph in the endpoint
            -gc endpointURL crawlName graphURI outputFileName

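Feedback on metadata generation

The version differs

→ Not a problem: the difference comes from a bug-fix update. Confirmed with Yamaguchi-san.

It is unclear where crawlName is used

outputFileName actually specifies a directory; if that directory does not exist, the run ends with an error after the crawl has executed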

%java -jar ../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar -ac http://localhost:18895/sparql mdb mdb_output.txt
log4j:WARN No appenders could be found for logger (org.apache.jena.riot.stream.JenaIOEnvironment).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
-----------------------------------------------------------
  Graph: http://microbedb.jp/assembly        1 / 33
-----------------------------------------------------------
/Users/tf/github/metadata/mdb_20171023/mdb_output.txt/turtle_mdb_assembly_1.ttl
P
properties
http://ddbj.nig.ac.jp/ontologies/nucleotide/dblink
http://www.ncbi.nlm.nih.gov/assembly/refseq_category
http://www.ncbi.nlm.nih.gov/assembly/asm_name
http://www.ncbi.nlm.nih.gov/assembly/assembly_level
http://www.ncbi.nlm.nih.gov/assembly/taxon
.
.
.
java.io.FileNotFoundException: /Users/tf/github/metadata/mdb_20171023/mdb_output.txt/turtle_mdb_assembly_1.ttl (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
    at org.sparqlbuilder.metadata.crawler.datastructure.SchemaCategory.write2File(SchemaCategory.java:20)
    at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.crawl(RDFsCrawlerImpl.java:270)
    at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.main(RDFsCrawlerImpl.java:86)
80747 msec.
Error occured.
Exception in thread "main" java.lang.Exception: Error occured s(341)
    at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.crawl(RDFsCrawlerImpl.java:286)
    at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.main(RDFsCrawlerImpl.java:86)
  • The same error occurs when crawling a single graph:
%java -jar ../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar -gc http://localhost:18893/sparql graph_18893 http://microbedb.jp/chebi chebi
  Version: 20161212-1
log4j:WARN No appenders could be found for logger (org.apache.jena.riot.stream.JenaIOEnvironment).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
-----------------------------------------------------------
  Graph: http://microbedb.jp/chebi        1 / 1
-----------------------------------------------------------
/Users/tf/github/metadata/mdb_20171023/chebi/turtle_graph_18893_chebi_1.ttl
.
.
.
#EndpointAccess: 161
java.io.FileNotFoundException: /Users/tf/github/metadata/mdb_20171023/chebi/turtle_graph_18893_chebi_1.ttl (No such file or directory)
	at java.io.FileOutputStream.open0(Native Method)
	at java.io.FileOutputStream.open(FileOutputStream.java:270)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
	at org.sparqlbuilder.metadata.crawler.datastructure.SchemaCategory.write2File(SchemaCategory.java:20)
	at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.crawl(RDFsCrawlerImpl.java:270)
	at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.main(RDFsCrawlerImpl.java:103)
130204 msec.
Error occured.
Exception in thread "main" java.lang.Exception: Error occured s(341)
	at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.crawl(RDFsCrawlerImpl.java:286)
	at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.main(RDFsCrawlerImpl.java:103)
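
Both failures above come from the output directory not existing when the crawler tries to write its Turtle file. A minimal sketch of a wrapper that creates the directory first and then assembles the command line follows. The function name `build_crawl_command` is hypothetical; the jar path and the `-ac` / `-gc` argument order are taken from the help output and commands above.

```python
import os

# Jar path as used in the commands above.
JAR = "../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar"


def build_crawl_command(endpoint, crawl_name, output_dir, graph_uri=None):
    """Create output_dir if it is missing and return the crawler argument list.

    Hypothetical helper: it only prepares the directory and the arguments,
    it does not run the crawler itself.
    """
    # The crawler treats its last argument as a directory and fails at
    # write time if it does not exist, so create it up front.
    os.makedirs(output_dir, exist_ok=True)
    if graph_uri is None:
        # Option 2 in the help text: crawl the whole endpoint.
        return ["java", "-jar", JAR, "-ac", endpoint, crawl_name, output_dir]
    # Option 3 in the help text: crawl one named graph.
    return ["java", "-jar", JAR, "-gc", endpoint, crawl_name, graph_uri, output_dir]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Creating the directory by hand with `mkdir` before running the jar should have the same effect.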

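Crawling the whole SPARQL endpoint takes a long time when it also contains large ontologies such as fma

→ Switched to crawling graph by graph, excluding fma, taxonomy, so, sio and similar graphs. The graph list is obtained as follows: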

%java -jar ../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar -g http://localhost:18895/sparql
  Version: 20161212-1
log4j:WARN No appenders could be found for logger (org.apache.jena.riot.stream.JenaIOEnvironment).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

. . .
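
The graph-by-graph approach can be sketched as a small filter over the graph list. This is only an illustration: it assumes the large ontologies live in graphs whose URI ends with their name (for example a hypothetical `http://microbedb.jp/taxonomy`), which the notes above do not confirm, and `graphs_to_crawl` is a made-up helper name.

```python
# Names of the graphs to skip, taken from the note above.
EXCLUDED = ("fma", "taxonomy", "so", "sio")


def graphs_to_crawl(graph_uris, excluded=EXCLUDED):
    """Return the graph URIs whose last path segment is not in excluded.

    Assumption: each graph URI ends with the dataset name, as in
    http://microbedb.jp/assembly.
    """
    selected = []
    for uri in graph_uris:
        # Compare the whole last segment, so "so" does not match "sometool".
        name = uri.rstrip("/").rsplit("/", 1)[-1]
        if name not in excluded:
            selected.append(uri)
    return selected
```

Each remaining URI can then be passed to the crawler with the `-gc` option, one run per graph.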

An unexplained error was printed when running graph by graph with -gc

ERROR: duplicate PDRD found! (L1144): 

→ This error is printed when Domain and Range are defined repeatedly for the same property. It is presumably emitted for graphs into which OWL has been imported.
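
To see which properties might trigger this message, one could look for properties that carry more than one `rdfs:domain` or `rdfs:range` statement. The sketch below illustrates that condition over plain (subject, predicate, object) tuples. It is not the crawler's actual check, and the helper name `repeated_domain_range` is hypothetical.

```python
from collections import defaultdict

RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"


def repeated_domain_range(triples):
    """Map (property, predicate) to its objects when declared more than once.

    triples is an iterable of (subject, predicate, object) strings. Only
    rdfs:domain and rdfs:range statements are considered.
    """
    seen = defaultdict(list)
    for s, p, o in triples:
        if p in (RDFS_DOMAIN, RDFS_RANGE):
            seen[(s, p)].append(o)
    # Keep only the properties with repeated declarations.
    return {key: objs for key, objs in seen.items() if len(objs) > 1}
```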

Feedback on umakaparser

taxonomy.ttl is too large and causes an error during build_index

%umakaparser build_index ontology/taxonomy/taxonomy.ttl --dist test_index
(4, 50300, datetime.timedelta(0, 0, 56585))
(5, 100300, datetime.timedelta(0, 0, 113394))
(6, 150300, datetime.timedelta(0, 0, 181359))
(8, 200300, datetime.timedelta(0, 0, 238139))
.
.
.
(253, 12450300, datetime.timedelta(0, 17, 255336))
(254, 12500300, datetime.timedelta(0, 17, 328418))
Traceback (most recent call last):
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/bin/umakaparser", line 11, in <module>
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 716, in __call__
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 696, in main
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 1060, in invoke
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 889, in invoke
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 534, in invoke
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/services.py", line 34, in build_index
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/scripts/services/assets.py", line 114, in index_owl
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/scripts/services/assets.py", line 50, in separate_large_owl
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/scripts/services/assets.py", line 29, in output
IOError: [Errno 24] Too many open files: '/Users/tf/github/metadata/mdb_20171023/tmpgIsfwy/tmp1p1CMl'
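
Errno 24 means the process ran out of file descriptors. Raising the limit with `ulimit -n` before running might work around it, although that was not tried here. Another way to avoid the condition when splitting a large input is to close each temporary file before opening the next. The sketch below shows that pattern for generic line-oriented input. It is not umakaparser's implementation, `split_into_chunks` is a hypothetical name, and splitting Turtle on arbitrary line boundaries is not in general valid, so treat it as an illustration of the file-handle handling only.

```python
import os


def split_into_chunks(lines, chunk_size, out_dir):
    """Write lines to numbered chunk files, keeping one file open at a time.

    Returns the list of chunk file paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    buffer = []

    def flush():
        path = os.path.join(out_dir, "chunk_%05d.txt" % len(paths))
        # The with-block closes the file before the next chunk is opened,
        # so the number of open descriptors does not grow with input size.
        with open(path, "w") as handle:
            handle.writelines(buffer)
        paths.append(path)
        del buffer[:]

    for line in lines:
        buffer.append(line)
        if len(buffer) >= chunk_size:
            flush()
    if buffer:
        # Write the final, possibly shorter, chunk.
        flush()
    return paths
```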