SPARQLthon61/MicrobeDB.jp-Umakaviewer

From TogoWiki


Revision as of 07:41, 24 October 2017 (Tue)

Shared notes and feedback from running the tools against the MicrobeDB.jp SPARQL endpoint


Help output of the metadata command

%java -jar ../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar
  Version: 20161212-1
Usage: java org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl [options]
   [options]
       1. to print a list of graphURIs
            -g endpointURL
       2. to crawl whole data in the endpoint
            -ac endpointURL crawlName outputFileName
       3. to crawl the specified graph in the endpoint
            -gc endpointURL crawlName graphURI outputFileName

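Feedback on metadata generation

The version differs

→ Not a problem; the difference comes from a bug-fix update. Confirmed with Yamaguchi-san.

It is unclear where crawlName is used

outputFileName actually specifies a directory; if that directory does not exist, the run fails with an error at the end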

%java -jar ../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar -ac http://localhost:18895/sparql mdb mdb_output.txt
log4j:WARN No appenders could be found for logger (org.apache.jena.riot.stream.JenaIOEnvironment).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
-----------------------------------------------------------
  Graph: http://microbedb.jp/assembly        1 / 33
-----------------------------------------------------------
/Users/tf/github/metadata/mdb_20171023/mdb_output.txt/turtle_mdb_assembly_1.ttl
P
properties
http://ddbj.nig.ac.jp/ontologies/nucleotide/dblink
http://www.ncbi.nlm.nih.gov/assembly/refseq_category
http://www.ncbi.nlm.nih.gov/assembly/asm_name
http://www.ncbi.nlm.nih.gov/assembly/assembly_level
http://www.ncbi.nlm.nih.gov/assembly/taxon
.
.
.
java.io.FileNotFoundException: /Users/tf/github/metadata/mdb_20171023/mdb_output.txt/turtle_mdb_assembly_1.ttl (No such file or directory)
    at java.io.FileOutputStream.open0(Native Method)
    at java.io.FileOutputStream.open(FileOutputStream.java:270)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
    at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
    at org.sparqlbuilder.metadata.crawler.datastructure.SchemaCategory.write2File(SchemaCategory.java:20)
    at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.crawl(RDFsCrawlerImpl.java:270)
    at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.main(RDFsCrawlerImpl.java:86)
80747 msec.
Error occured.
Exception in thread "main" java.lang.Exception: Error occured s(341)
    at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.crawl(RDFsCrawlerImpl.java:286)
    at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.main(RDFsCrawlerImpl.java:86)
  • The same happens when crawling a single graph
%java -jar ../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar -gc http://localhost:18893/sparql graph_18893 http://microbedb.jp/chebi chebi
  Version: 20161212-1
log4j:WARN No appenders could be found for logger (org.apache.jena.riot.stream.JenaIOEnvironment).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
-----------------------------------------------------------
  Graph: http://microbedb.jp/chebi        1 / 1
-----------------------------------------------------------
/Users/tf/github/metadata/mdb_20171023/chebi/turtle_graph_18893_chebi_1.ttl
.
.
.
#EndpointAccess: 161
java.io.FileNotFoundException: /Users/tf/github/metadata/mdb_20171023/chebi/turtle_graph_18893_chebi_1.ttl (No such file or directory)
	at java.io.FileOutputStream.open0(Native Method)
	at java.io.FileOutputStream.open(FileOutputStream.java:270)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
	at org.sparqlbuilder.metadata.crawler.datastructure.SchemaCategory.write2File(SchemaCategory.java:20)
	at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.crawl(RDFsCrawlerImpl.java:270)
	at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.main(RDFsCrawlerImpl.java:103)
130204 msec.
Error occured.
Exception in thread "main" java.lang.Exception: Error occured s(341)
	at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.crawl(RDFsCrawlerImpl.java:286)
	at org.sparqlbuilder.metadata.crawler.sparql.RDFsCrawlerImpl.main(RDFsCrawlerImpl.java:103)
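
Both traces above end in a FileNotFoundException because the directory given as outputFileName did not exist when the crawler tried to write its Turtle file. A minimal sketch of a pre-flight step, assuming the crawler writes into that directory as the paths in the output suggest. The helper name ensure_output_dir is hypothetical and not part of the metadata tool.

```python
import os


def ensure_output_dir(path):
    """Create the crawler's output directory if it is missing.

    Hypothetical helper: the metadata crawler treats its
    outputFileName argument as a directory, so it must exist
    before the crawl starts.
    """
    if not os.path.isdir(path):
        os.makedirs(path)
    return os.path.abspath(path)
```

Calling ensure_output_dir("mdb_output.txt") before the java -jar ... -ac run would avoid losing a crawl that only fails when the results are written.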

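Crawling the whole SPARQL endpoint takes a long time when it contains large ontologies such as fma

→ Switched to crawling graph by graph, excluding fma, taxonomy, so, sio, and the like. The graph list is retrieved as follows: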

%java -jar ../metadata-0.0.1-SNAPSHOT-jar-with-dependencies.jar -g http://localhost:18895/sparql
  Version: 20161212-1
log4j:WARN No appenders could be found for logger (org.apache.jena.riot.stream.JenaIOEnvironment).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

. . .
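
The per-graph approach can be sketched as a small script that filters the graph list and builds one -gc command line per remaining graph. This is only an illustration under assumptions: the graph URIs passed in are placeholders for the elided list above, the exclusion is done on the last path segment of each URI, and the same name is reused for crawlName and the output directory. The function names are hypothetical.

```python
def graph_name(graph_uri):
    """Return the last path segment of a graph URI, e.g. 'chebi'."""
    return graph_uri.rstrip("/").rsplit("/", 1)[-1]


def build_crawl_commands(jar, endpoint, graph_uris, excluded):
    """Build one '-gc' command line per graph that is not excluded.

    Argument order follows the tool's help text:
    -gc endpointURL crawlName graphURI outputFileName
    """
    commands = []
    for uri in graph_uris:
        name = graph_name(uri)
        if name in excluded:
            continue  # skip large ontologies such as fma or taxonomy
        commands.append(["java", "-jar", jar, "-gc", endpoint, name, uri, name])
    return commands
```

Each returned list can be handed to subprocess.call once the corresponding output directory exists.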

An unexplained error was printed when running per graph with -gc

ERROR: duplicate PDRD found! (L1144): 

→ This error is printed when Domain and Range are defined repeatedly for the same property. It is probably triggered by graphs into which an OWL file was imported.
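
To locate the properties that might trigger this message, one could scan a graph's triples for properties carrying more than one rdfs:domain or rdfs:range statement. The sketch below is one plausible reading of "defined repeatedly" and is not the crawler's own check; the function name and the tuple-based triple representation are assumptions made for illustration.

```python
from collections import defaultdict

RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def find_repeated_domain_range(triples):
    """Return (property, predicate) pairs stated more than once.

    triples is an iterable of (subject, predicate, object) strings.
    Only rdfs:domain and rdfs:range statements are counted.
    """
    counts = defaultdict(int)
    for subject, predicate, _obj in triples:
        if predicate in (RDFS + "domain", RDFS + "range"):
            counts[(subject, predicate)] += 1
    return sorted(key for key, count in counts.items() if count > 1)
```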

Feedback on umakaparser

Error in build_index on taxonomy.ttl

%umakaparser build_index ontology/taxonomy/taxonomy.ttl --dist test_index
(4, 50300, datetime.timedelta(0, 0, 56585))
(5, 100300, datetime.timedelta(0, 0, 113394))
(6, 150300, datetime.timedelta(0, 0, 181359))
(8, 200300, datetime.timedelta(0, 0, 238139))
.
.
.
(253, 12450300, datetime.timedelta(0, 17, 255336))
(254, 12500300, datetime.timedelta(0, 17, 328418))
Traceback (most recent call last):
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/bin/umakaparser", line 11, in <module>
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 716, in __call__
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 696, in main
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 1060, in invoke
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 889, in invoke
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 534, in invoke
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/services.py", line 34, in build_index
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/scripts/services/assets.py", line 114, in index_owl
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/scripts/services/assets.py", line 50, in separate_large_owl
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/scripts/services/assets.py", line 29, in output
IOError: [Errno 24] Too many open files: '/Users/tf/github/metadata/mdb_20171023/tmpgIsfwy/tmp1p1CMl'
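
Errno 24 means the process reached its per-process limit on open file descriptors. A possible user-side workaround, independent of any change to umakaparser itself, is to raise the soft limit before launching the command, either with ulimit -n in the shell or from Python as sketched below. The function name and the default of 4096 are assumptions, and whether a higher limit is enough for taxonomy.ttl has not been verified here.

```python
import resource


def raise_open_file_limit(wanted=4096):
    """Raise the soft open-file limit toward `wanted`, capped by the hard limit.

    Returns the soft limit in effect afterwards. Child processes
    started later inherit the new limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    # Only ever raise the limit; never lower an already generous one.
    if soft != resource.RLIM_INFINITY and soft < target:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```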

Error in build_index on mccv.ttl

%umakaparser build_index ontology/mccv/mccv.ttl --dist test_index
(4, 871, datetime.timedelta(0, 0, 8726))
Traceback (most recent call last):
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/bin/umakaparser", line 11, in <module>
    sys.exit(cmd())
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/services.py", line 34, in build_index
    output = index_owl(owl_data_ttl, target_properties, dist)
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/site-packages/umakaviewer/scripts/services/assets.py", line 127, in index_owl
    p.map(output_process, ((prefix, temp_file, output_properties, temp_dir) for temp_file in temp_files))
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/Users/tf/.anyenv/envs/pyenv/versions/2.7.10/lib/python2.7/multiprocessing/pool.py", line 567, in get
    raise self._value
UnicodeEncodeError: 'ascii' codec can't encode characters in position 1-13: ordinal not in range(128)
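
Under Python 2.7 this kind of UnicodeEncodeError typically appears when unicode text containing non-ASCII characters is written to a byte-oriented file or stream, which forces an implicit ASCII encode. The traceback does not show the failing line inside the worker process, so the following is only a sketch of the usual remedy, assuming the ontology contains non-ASCII literals that end up being written to a file. The function name write_labels is hypothetical.

```python
import io


def write_labels(path, labels):
    """Write unicode labels one per line, encoded explicitly as UTF-8.

    io.open behaves the same on Python 2 and 3, so no implicit
    ASCII conversion takes place when the text is written.
    """
    with io.open(path, "w", encoding="utf-8") as handle:
        for label in labels:
            handle.write(label + u"\n")
```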