Using R functionality from Ruby

From TogoWiki


Revision as of 16:54, 22 October 2010 (Fri)


RSRuby

On Tiger (Mac OS X 10.4):

$ export LD_LIBRARY_PATH=:/Library/Frameworks/R.framework/Resources/lib
$ ruby187/bin/gem install rsruby-0.5.1.1.gem -- --with-R-dir=$R_HOME
Building native extensions.  This could take a while...
Successfully installed rsruby-0.5.1.1
1 gem installed
Installing ri documentation for rsruby-0.5.1.1...
Installing RDoc documentation for rsruby-0.5.1.1...
$ R --version
R version 2.8.0 (2008-10-20)

Code to use it:

ENV['R_HOME']='/Library/Frameworks/R.framework/Resources/lib'
require 'rsruby'

RinRuby

Blog post by Nikaido-san

Rserve + Rserve-Ruby-client

gem install rserve-client

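Benchmark

Three-way comparison by the author of Rserve-Ruby-client

Everything up to this point is preliminary research.


From here on are the results of actually testing things.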

Comparing BioRuby with RinRuby and Rserve, and giving up on RSRuby and RSOAP

The benchmark apparently measures how fast a 6.8 MB nucleotide FASTA file with 5210 entries can be read and translated.

Input data:

% head -50 test-dna.fa
>2L52.1
atgtcaatggtaagaaatgtatcaaatcagagcgaaaaattggaaatttt
gtcatgtaaatgggtaggatgtctcaaatcaacagaagtgttcaaaacgg
ttgaaaagttattagatcatgttacggctgatcatattccagaagttatt
gtaaacgatgacgggtcggaggaagtcgtttgtcagtgggattgctgcga
aatgggtgccagtcgtggaaatcttcaaaaaaagaaagagtggatggaga
atcacttcaaaacacgtcatgttcgcaaagcaaaaatattcaaatgctta
attgaggattgccctgtggtaaagtcaagtagtcaggaaattgaaaccca
tctcagaataagtcatccaataaatccgaaaaaagagagactgaaagagt
ttaaaagttctaccgaccacatcgaacctactcaagctaatagagtatgg
acaattgtgaacggagaggttcaatggaagactccaccgcgggttaaaaa
aaagactgtgatatactatgatgatgggccgaggtatgtatttccaacgg
gatgtgcgagatgcaactatgatagtgacgaatcagaactggaatcagat
gagttttggtcagccacagagatgtcagataatgaagaagtatatgtgaa
cttccgtggaatgaactgtatctcaacaggaaagtcggccagtatggtcc
cgagcaaacgaagaaattggccaaaaagagtgaagaaaaggctatcgaca
caaagaaacaatcagaaaactattcgaccaccagagctgaataaaaataa
tatagagataaaagatatgaactcaaataaccttgaagaacgcaacagag
aagaatgcattcagcctgtttctgttgaaaagaacatcctgcattttgaa
aaattcaaatcaaatcaaatttgcattgttcgggaaaacaataaatttag
agaaggaacgagaagacgcagaaagaattctggtgaatcggaagacttga
aaattcatgaaaactttactgaaaaacgaagacccattcgatcatgcaaa
caaaatataagtttctatgaaatggacggggatatagaagaatttgaagt
gtttttcgatactcccacaaaaagcaaaaaagtacttctggatatctaca
gtgcgaagaaaatgccaaaaattgaggttgaagattcattagttaataag
tttcattcaaaacgtccatcaagagcatgtcgagttcttggaagtatgga
agaagtaccatttgatgtggaaataggatattga
>2RSSE.1
atgacagtggcgagttacagtatggtgctgtgtggctcatctgatgatca
tcgctatcgaggcagaatcgaaaaagtaaaattcggcgtacccataaacg
aagcatttgcccatgacattcccgccacgcttctcatgctcttgctcaaa
gtgaacaaggatggacccgcgaaaaaggatatttggcgagcgcccggaaa
tcaggctcaagtgcgaaaattgtcgcaagtgatgcaacacgggcggcttg
taaatatcgagaatttcacggtttacacggcggcatctgtcatcaaaaag
tttctttcaaagttgccaaacggcatttttggacgggataatgaggagac
actgttcaatagtgcatcgactggaatggatattgagaagcagagacagg
tgttttataggatatttggatcacttccagtcgcatcccaacacttgctc
gtcctacttttcggcacatttcgggtcgtcgccgactcgtcggacggtca
ttcgaacgcgatgaacccgaatgcgatcgcgatttcggtggcaccatcgc
tttttcacacttgtatacacgatggacggacggcgcgagtagaagacctt
caacggttcaagctggcctcgaacattgtgtgctcgataatttgctcatt
cggcgacacgaagctcttcccacgcgagtgctacgagtattacgccagat
acacgggtcgcacgttgcgaatcgacgagaatcgaatgttcacttttcat
aatccatccaaccgtcgtgctcgtggcgaagagttctccgcgttggcggc
aaagtgtgcgggcgcctactcgctggccgccatccacctggccgaagaag
cgtcaccggagcccactccgacaacctcgaagcctccacgtggcaacggc
gtcgggcgtgccgggagtctgaagcagcacgcgttgacccagacgacgga
tcatccgaagagaagcgtgtcgatcgcggctaaggatccgtatccaactg
atttaaggacatcggtcagctgtgatttttga
>2RSSE.2
 :
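
To make the input format concrete, here is a minimal sketch of reading FASTA records like the ones shown above using only the Ruby standard library. The helper name `each_fasta_entry` is hypothetical and is not part of BioRuby; the benchmark scripts on this page use `Bio::FlatFile` instead.

```ruby
require 'stringio'

# Hypothetical helper (not a BioRuby API): yields [entry_id, sequence]
# for each FASTA record read from an IO-like object.
def each_fasta_entry(io)
  return enum_for(:each_fasta_entry, io) unless block_given?

  id = nil
  seq = String.new
  io.each_line do |line|
    line = line.chomp
    if line.start_with?(">")
      # A new header closes the previous record, if there was one.
      yield id, seq if id
      id = line[1..-1][/\S+/]   # first whitespace-delimited token after ">"
      seq = String.new
    elsif id
      seq << line.strip         # sequence lines are concatenated
    end
  end
  yield id, seq if id           # emit the last record
end

# Usage with a tiny in-memory sample (the first lines of two entries above).
sample = StringIO.new(">2L52.1\natgtcaatgg\ntaagaaatgt\n>2RSSE.1\natgacagtgg\n")
each_fasta_entry(sample) do |id, seq|
  puts "#{id}\t#{seq.length}"
end
```

Counting entries in the real file would then be something like `File.open("test-dna.fa") { |f| each_fasta_entry(f).count }`, assuming the file is plain single-byte text.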

Results: Ruby 1.9 is considerably faster than 1.8. BioRuby on its own is faster than going through R (although BioLib + Ruby is reportedly extremely fast). Rserve is reasonably stable in use.

  • pure BioRuby

About 3 seconds with Ruby 1.9, about 6 seconds with Ruby 1.8.

% time ruby-1.9 DNAtranslate-bioruby.rb test-dna.fa > /dev/null
ruby-1.9 DNAtranslate-bioruby.rb test-dna.fa > /dev/null  2.96s user 0.07s system 97% cpu 3.120 total

% time ruby-1.9 DNAtranslate-bioruby.rb test-dna.fa 10 > /dev/null
ruby-1.9 DNAtranslate-bioruby.rb test-dna.fa 10 > /dev/null  28.25s user 0.29s system 91% cpu 31.250 total
% time ruby-1.8 DNAtranslate-bioruby.rb test-dna.fa > /dev/null
ruby-1.8 DNAtranslate-bioruby.rb test-dna.fa > /dev/null  5.36s user 0.18s system 97% cpu 5.666 total

% time ruby-1.8 DNAtranslate-bioruby.rb test-dna.fa 10 > /dev/null
ruby-1.8 DNAtranslate-bioruby.rb test-dna.fa 10 > /dev/null  52.98s user 1.48s system 91% cpu 59.319 total
  • RinRuby

Execution stalled partway through, so it could not be benchmarked.

  • Rserve

About 6 seconds with Ruby 1.9, about 16 seconds with Ruby 1.8.

% time ruby-1.9 DNAtranslate-rserve.rb test-dna.fa > /dev/null 
ruby-1.9 DNAtranslate-rserve.rb test-dna.fa > /dev/null  4.16s user 0.25s system 76% cpu 5.737 total

% time ruby-1.9 DNAtranslate-rserve.rb test-dna.fa 10 > /dev/null
ruby-1.9 DNAtranslate-rserve.rb test-dna.fa 10 > /dev/null  39.71s user 1.90s system 75% cpu 55.061 total
% time ruby-1.8 DNAtranslate-rserve.rb test-dna.fa > /dev/null
ruby-1.8 DNAtranslate-rserve.rb test-dna.fa > /dev/null  13.69s user 0.28s system 87% cpu 16.010 total

% time ruby-1.8 DNAtranslate-rserve.rb test-dna.fa 10 > /dev/null
ruby-1.8 DNAtranslate-rserve.rb test-dna.fa 10 > /dev/null  135.66s user 2.56s system 82% cpu 2:46.68 total
  • RSRuby

Installation was difficult (on Snow Leopard, at least?), so I never got as far as testing it.

  • RSOAP

Still at the stage of finding out how to install the server.
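
Going back to the single-pass wall-clock totals reported above for pure BioRuby and Rserve, the relative slowdowns can be worked out with a few lines of Ruby. This is only arithmetic on the numbers already listed; the hash and the helper name `slowdown` are made up for illustration.

```ruby
# Wall-clock totals in seconds, copied from the single-pass `time` output above.
TOTALS = {
  "BioRuby / Ruby 1.9" => 3.120,
  "BioRuby / Ruby 1.8" => 5.666,
  "Rserve / Ruby 1.9"  => 5.737,
  "Rserve / Ruby 1.8"  => 16.010,
}

# Hypothetical helper: how many times longer `slower` took than `faster`.
def slowdown(totals, slower, faster)
  (totals.fetch(slower) / totals.fetch(faster)).round(2)
end

[
  ["BioRuby / Ruby 1.8", "BioRuby / Ruby 1.9"],
  ["Rserve / Ruby 1.9",  "BioRuby / Ruby 1.9"],
  ["Rserve / Ruby 1.8",  "BioRuby / Ruby 1.8"],
  ["Rserve / Ruby 1.8",  "Rserve / Ruby 1.9"],
].each do |slower, faster|
  puts "#{slower} vs #{faster}: #{slowdown(TOTALS, slower, faster)}x"
end
```

These are single measurements on one machine, so the ratios are rough indications rather than precise figures.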

pure BioRuby

% cat DNAtranslate-bioruby.rb
require 'rubygems'
require 'bio'

fasta = ARGV.shift
repeat = (ARGV.shift || 1).to_i

repeat.times do
  Bio::FlatFile.auto(fasta).each do |entry|
    puts ">#{entry.entry_id}"
    puts entry.naseq.translate
  end
end
% time ruby-1.9 DNAtranslate-bioruby.rb test-dna.fa > /dev/null
ruby-1.9 DNAtranslate-bioruby.rb test-dna.fa > /dev/null  2.96s user 0.07s system 97% cpu 3.120 total

% time ruby-1.9 DNAtranslate-bioruby.rb test-dna.fa 10 > /dev/null
ruby-1.9 DNAtranslate-bioruby.rb test-dna.fa 10 > /dev/null  28.25s user 0.29s system 91% cpu 31.250 total

% time ruby-1.8 DNAtranslate-bioruby.rb test-dna.fa > /dev/null
ruby-1.8 DNAtranslate-bioruby.rb test-dna.fa > /dev/null  5.36s user 0.18s system 97% cpu 5.666 total

% time ruby-1.8 DNAtranslate-bioruby.rb test-dna.fa 10 > /dev/null
ruby-1.8 DNAtranslate-bioruby.rb test-dna.fa 10 > /dev/null  52.98s user 1.48s system 91% cpu 59.319 total

Ruby 1.9's gsub(/re/, hash) would be worth trying.
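
As a rough idea of what that could look like, here is a minimal sketch of codon translation with `String#gsub` taking a Hash as the replacement (available from Ruby 1.9). The table is the standard genetic code in TCAG order; the names `CODON_TABLE` and `translate_with_gsub` are hypothetical, and this is not how BioRuby's `translate` is implemented. It has not been benchmarked here.

```ruby
# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = %w[t c a g]
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {}
BASES.each_with_index do |b1, i|
  BASES.each_with_index do |b2, j|
    BASES.each_with_index do |b3, k|
      CODON_TABLE[b1 + b2 + b3] = AMINO[16 * i + 4 * j + k]
    end
  end
end
# Assumption: codons containing anything other than t/c/a/g become "X".
CODON_TABLE.default = "X"

# Hypothetical helper: translates frame 1 of a single-line DNA string.
def translate_with_gsub(dna)
  seq = dna.downcase
  seq = seq[0, seq.length - seq.length % 3]  # drop a trailing partial codon
  seq.gsub(/.../, CODON_TABLE)               # each 3-base match is looked up in the hash
end

# First seven codons of entry 2L52.1 from the input data.
puts translate_with_gsub("atgtcaatggtaagaaatgta")
```

The whole sequence is rewritten in a single `gsub` call, so the per-codon work happens inside the regexp engine and hash lookup rather than in a Ruby-level loop.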

RinRuby

A library written in pure Ruby for running R. http://blog.itoshi.tv/2010/09/rinruby/

% sudo gem install rinruby
Successfully installed rinruby-2.0.1
1 gem installed
Installing ri documentation for rinruby-2.0.1...
Installing RDoc documentation for rinruby-2.0.1...
% cat DNAtranslate-rinruby.rb
require 'rubygems'
require 'rinruby'
require 'bio'

R.echo(enable = false)

R.pull('library(GeneR)')

fasta = ARGV.shift
repeat = (ARGV.shift || 1).to_i

repeat.times do
  Bio::FlatFile.auto(fasta).each do |entry|
    puts ">#{entry.entry_id}"
    ntseq = entry.seq
    result = R.pull(%Q[strTranslate("#{ntseq}")])
    puts result
  end
end
% ruby-1.8 DNAtranslate-rinruby.rb test-dna.fa
>2L52.1
MSMVRNVSNQSEKLEILSCKWVGCLKSTEVFKTVEKLLDHVTADHIPEVIVNDDGSEEVVCQWDCCEMGASRGNLQKKKEWMENHFKTRHVRKAKIFKCLIEDCPVVKSSSQEIETHLRISHPINPKKERLKEFKSSTDHIEPTQANRVWTIVNGEVQWKTPPRVKKKTVIYYDDGPRYVFPTGCARCNYDSDESELESDEFWSATEMSDNEEVYVNFRGMNCISTGKSASMVPSKRRNWPKRVKKRLSTQRNNQKTIRPPELNKNNIEIKDMNSNNLEERNREECIQPVSVEKNILHFEKFKSNQICIVRENNKFREGTRRRRKNSGESEDLKIHENFTEKRRPIRSCKQNISFYEMDGDIEEFEVFFDTPTKSKKVLLDIYSAKKMPKIEVEDSLVNKFHSKRPSRACRVLGSMEEVPFDVEIGY*
>2RSSE.1
MTVASYSMVLCGSSDDHRYRGRIEKVKFGVPINEAFAHDIPATLLMLLLKVNKDGPAKKDIWRAPGNQAQVRKLSQVMQHGRLVNIENFTVYTAASVIKKFLSKLPNGIFGRDNEETLFNSASTGMDIEKQRQVFYRIFGSLPVASQHLLVLLFGTFRVVADSSDGHSNAMNPNAIAISVAPSLFHTCIHDGRTARVEDLQRFKLASNIVCSIICSFGDTKLFPRECYEYYARYTGRTLRIDENRMFTFHNPSNRRARGEEFSALAAKCAGAYSLAAIHLAEEASPEPTPTTSKPPRGNGVGRAGSLKQHALTQTTDHPKRSVSIAAKDPYPTDLRTSVSCDF*
>2RSSE.2
MRVPTIQENEPMRNQPSTSRATTKPMPTMARLNNRLSSSVGEVLIEGISDVEELSDLDIIRPLTACGGDRSLSYLQYVHENQARRMRSRSEWFLSPVSNAKKTSSKSVDYFGPVTIEENPKPTPLPKPARAPLQTSSKSNISNASDDSVPTRRRSLKMQMRAAAFASNPSHSLDYQEVGASNPRLRGHTSVEDDTWLAEVVPHDEIPKKRRSLKKKTSTQF*
>3R5.1
MFSPLECRLAVACKFQDDRYYKLFHQYFDLLAQVHSVVETMDGLWMLRVWRAQKFGPESIKERRERQLFHVTQFSFKRYIVPPNPRIGKAIEEFGKEYLIEINVYDEHRADLVSLNSGNFVAIQNVHAASTPHREIQILHGGGEAYQRGISTVPVDFEVDAFQNFKRKVESVLENVLYDENFIEFQQPEEVTENRVPEEPQLNIHKDPLPSDLPQ*
>4R79.1a
MLDHVLLLTYCLVSTVVRSQPSADVFRSFAGYIPEDHRVTHHEWQNSGKFQGDIDGVDPNLLKLPEGPVLFNALKNKQLTWEGGVIPYEMDTAFSPNEIKILEKAFDSYRRTTCIRFEKREGQTDYLNIVKGYGCYSQVGRTGGKQEISLGRGCFFHEIIVHELMHSVGFWHEHSRADRDDHIKINWDNILPGMKSQFDKISAVLQDLQGENYDYKSIMHYDSTAFSRNGRNTIETVENGFTQVIGTAMDLSPLDIVKINKLYSCKTKKKEKVKPATTEEPHQLIPQVVDKNSVDSGEKCVDHFADCPHFAQYCTRASFFFVMKSYCPFTCKHCPGDRKLKKSG*
>4R79.1b
MHSVGFWHEHSRADRDDHIKINWDNILPGMKSQFDKISAVLQDLQGENYDYKSIMHYDSTAFSRNGRNTIETVENGFTQVIGTAMDLSPLDIVKINKLYSCKTKKKEKVKPATTEEPHQLIPQVVDKNSVDSGEKCVDHFADCPHFAQYCTRASFFFVMKSYCPFTCKHCPGDRKLKKSG*
>4R79.2a
MEVESATNSSSEESIFHEQTRKLFRLCDTKHIGLIGQSDLETLVDLIPQDDLHKIARFIGDQKVNEQAFCRILKAIVNQSLQQNMAKNEVEIPCILDKSQYLEESSLEDEMREIEQNKSYLDDPLTKILKKELEEIKNYEDFQVRNEQVLDNIIIKKPLYRPIQPEQSIPKVSSLAEELNAIGKKVLQKEEIEEEVTQPDRIFKVVFVGDSAVGKTCFLHRFCHNRFKPLFNATIGVDFTVKTMKIPPNRAIAMQLWDTAGQERFRSITKQYFRKADGVVLMFDVTSEQSFLNVRNWIDSVRAGVDDATVMCLVGNKMDLFGSDIARSAVYRAAEKLAVEFKIPFFETSAYTGFGIDTCMRQMAENLQRREDNHLEEALKLDINSNYKKRSWCCI*
>4R79.2b
MAKNEVEIPCILDKSQYLEESSLEDEMREIEQNKSYLDDPLTKILKKELEEIKNYEDFQVRNEQVLDNIIIKKPLYRPIQPEQSIPKVSSLAEELNAIGKKVLQKEEIEEEVTQPDRIFKVVFVGDSAVGKTCFLHRFCHNRFKPLFNATIGVDFTVKTMKIPPNRAIAMQLWDTAGQERFRSITKQYFRKADGVVLMFDVTSEQSFLNVRNWIDSVRAGVDDATVMCLVGNKMDLFGSDIARSAVYRAAEKLAVEFKIPFFETSAYTGFGIDTCMRQMAENLQRREDNHLEEALKLDINSNYKKRSWCCI*
>6R55.2
MPTLYLSTDYTVTNWKGDGSPLSFSRVDKEYTYTHTHTLSHTTSLSIFFLTFSSVSSFVHFRKKMYVIFIVRQIILISFHLINNL*
>AC3.1
MIMFTEAEVMSFSYAVDFGVPEWLKLYYHVISVVSTVISFFSMYIILFQSGKMDGYRFYLFYMQFAGWLMDLHLSTFMQFIPLFPVFGGYCTGLLTQIFRIDDSFQTTYTAFTICLVASALNSCFVRKHQAISKISSKYLLDNVTYCIVIFLLNIYPVIAASLLYLSMLNKSEQVELVKSVYPNLVDKFASLPNYVVFDSNIWAIVFFAFIFFGCTYTLVLIVTTTYQMFKILDDNRKHISASNYAKHRATLRSLLAQFTTCFLIVGPASLFSLLVVIRYEHSQVATHWTIVALTLHSSANAIVMVITYPPYRHFVMLWKTNRSFHFASSQYQRSTLPNTRIQTERSIAVTITTH*
>AC3.2
MYTFLFLLLSLLAVDAGKILVYSPSISRSHLISNGRIADALVDAGHDVVMFITEYEPLTEFTGTKKAKVITMKGFSTKFAEDMDGIGEYLLSSSRLSFLERLMFEKTCTGACDDLMTRREELEQLRAYNFDVAFSEQIDLCGVGIVRYLGIKNHLWISTTPIMDAVSYNLGIPAPSSYVPTIEENDNGDKMDFWQRTFNLYMKIGSILIHRYGTDGTTEVFRKYIPDFPNVREIAANSSLCFVNSDEVLDLPRPTITKAIYVGGLGIPKVSKPLDKKFTNIMSKGKEGVVIISLGSIIPFGDLPAAAKEGVLRAIQEISDYHFLIKIAKGDNNTKKLVEGIKNVDVAEWLPQVDILSHPRLKLFVMHGGINGLVETAIQAVPTVIVPVFADQFRNGRMVEKRGIGKVLLKLDIGYESFKNTVLTVLNTPSYKKNAIRIGKMMRDKPFSPEERLTKWTQFAIDHGVLEELHVEGSRLNTIIYYNLDVIAFVLFVFVAVLHVFIYAFKFLCCKKRSQSNIKKSKKNN*
>AC3.3

It stalls at this point, so it was not usable in practice. The same happens with Ruby 1.9.

^C/usr/local/lib/ruby/gems/1.9.1/gems/rinruby-2.0.1/lib/rinruby.rb:679:in `read': Interrupt
	from /usr/local/lib/ruby/gems/1.9.1/gems/rinruby-2.0.1/lib/rinruby.rb:679:in `pull_engine'
	from /usr/local/lib/ruby/gems/1.9.1/gems/rinruby-2.0.1/lib/rinruby.rb:466:in `pull'
	from DNAtranslate-rinruby.rb:28:in `block (2 levels) in <main>'
	from /Users/ktym/lib/ruby/bio/io/flatfile.rb:336:in `each_entry'
	from DNAtranslate-rinruby.rb:25:in `block in <main>'
	from DNAtranslate-rinruby.rb:24:in `times'
	from DNAtranslate-rinruby.rb:24:in `<main>'

Interrupting it by hand shows that it seems to be stuck waiting on a read from the socket. I wonder what would happen if the code used eval instead of pull.
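
One generic way to keep a benchmark from hanging forever on a blocked read like this is to wrap the call in a deadline using Ruby's standard `timeout` library. The sketch below uses `sleep` as a stand-in for the stuck call; the helper name `call_with_deadline` is hypothetical, and whether the R session is still usable after a timeout is not something this sketch addresses.

```ruby
require 'timeout'

# Hypothetical helper: runs the block, giving up after `seconds`.
# Returns [:ok, value] on success or [:timed_out, nil] if the deadline passed.
def call_with_deadline(seconds)
  [:ok, Timeout.timeout(seconds) { yield }]
rescue Timeout::Error
  [:timed_out, nil]
end

# Stand-in for a call that never returns (for example a blocked socket read).
status, _value = call_with_deadline(0.2) { sleep 5; :never_reached }
puts status

# A call that finishes in time passes its value through.
status, value = call_with_deadline(1) { 6 * 7 }
puts "#{status} #{value}"
```

In the benchmark loop this would at least let the script report which entry stalled instead of requiring Ctrl-C.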

Rserve

Install rserve-client. http://github.com/clbustos/Rserve-Ruby-client

% sudo gem install rserve-client
Successfully installed rserve-client-0.2.5
1 gem installed
Installing ri documentation for rserve-client-0.2.5...
Installing RDoc documentation for rserve-client-0.2.5...

Install Rserve. http://www.rforge.net/Rserve/files/

% wget http://www.rforge.net/Rserve/snapshot/Rserve_0.6-2.tar.gz

% R CMD INSTALL Rserve_0.6-2.tar.gz
* Installing to library ‘/Library/Frameworks/R.framework/Resources/library’
* Installing *source* package ‘Rserve’ ...
checking whether to compile the server... yes
checking whether to compile the client... no
checking for gcc... gcc -arch i386 -std=gnu99
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc -arch i386 -std=gnu99 accepts -g... yes
checking for gcc -arch i386 -std=gnu99 option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -arch i386 -std=gnu99 -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... rm: conftest.dSYM: is a directory
rm: conftest.dSYM: is a directory
yes
checking for sys/wait.h that is POSIX.1 compatible... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for string.h... (cached) yes
checking for memory.h... (cached) yes
checking sys/time.h usability... yes
checking sys/time.h presence... yes
checking for sys/time.h... yes
checking for unistd.h... (cached) yes
checking for sys/stat.h... (cached) yes
checking for sys/types.h... (cached) yes
checking sys/socket.h usability... yes
checking sys/socket.h presence... yes
checking for sys/socket.h... yes
checking sys/un.h usability... yes
checking sys/un.h presence... yes
checking for sys/un.h... yes
checking netinet/in.h usability... yes
checking netinet/in.h presence... yes
checking for netinet/in.h... yes
checking netinet/tcp.h usability... yes
checking netinet/tcp.h presence... yes
checking for netinet/tcp.h... yes
checking for an ANSI C-conforming const... yes
checking whether byte ordering is bigendian... no
checking whether time.h and sys/time.h may both be included... yes
checking for pid_t... yes
checking vfork.h usability... no
checking vfork.h presence... no
checking for vfork.h... no
checking for fork... yes
checking for vfork... yes
checking for working fork... yes
checking for working vfork... (cached) yes
checking return type of signal handlers... void
checking for memset... yes
checking for mkdir... yes
checking for rmdir... yes
checking for select... yes
checking for socket... yes
checking for library containing crypt... none required
checking crypt.h usability... no
checking crypt.h presence... no
checking for crypt.h... no
checking for socklen_t... yes
checking for connect... yes
checking for dlopen in -ldl... yes
configure: creating ./config.status
config.status: creating src/Makefile
config.status: creating src/client/cxx/Makefile
config.status: creating src/config.h
** libs
** arch - i386
gcc -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386  -I/usr/local/include   -DDAEMON -Iinclude -I. -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -fPIC  -g -O2 -c Rserv.c -o Rserv.o
gcc -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386  -I/usr/local/include   -DDAEMON -Iinclude -I. -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -fPIC  -g -O2 -c session.c -o session.o
gcc -arch i386 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386  -I/usr/local/include   -DDAEMON -Iinclude -I. -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -fPIC  -g -O2 -c md5.c -o md5.o
gcc -arch i386 -std=gnu99 Rserv.o session.o md5.o -o Rserve -F/Library/Frameworks/R.framework/.. -framework R -ldl 
cp Rserve Rserve.so
gcc -arch i386 -std=gnu99 -Iinclude -I. -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/i386 -c Rserv.c -o Rserv_d.o -DNODAEMON -DRSERV_DEBUG -g  -I/usr/local/include -g -O2
gcc -arch i386 -std=gnu99 Rserv_d.o session.o md5.o -o Rserve.dbg -F/Library/Frameworks/R.framework/.. -framework R -ldl 
cp Rserve Rserve-bin.so
cp Rserve.dbg Rserve-dbg.so
./mergefat Rserve "/Library/Frameworks/R.framework/Resources/bin/Rserve"
./mergefat Rserve.dbg "/Library/Frameworks/R.framework/Resources/bin/Rserve.dbg"
** R
** inst
** preparing package for lazy loading
** help
*** installing help indices
 >>> Building/Updating help pages for package 'Rserve'
     Formats: text html latex example
  Rclient                           text    html    latex   example
  Rserv                             text    html    latex
** building package indices ...
* DONE (Rserve)
% R CMD Rserve
R version 2.9.2 (2009-08-24)
Copyright (C) 2009 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

Rserv started in daemon mode.

The Rserve server is up and running.
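Before running client code it can help to confirm that the daemon is actually accepting connections. The following is a minimal sketch using only Ruby's standard library; it assumes Rserve listens on its default port 6311 on the local machine, and the helper name `rserve_listening?` and the one-second timeout are illustrative, not part of any Rserve client API.

```ruby
require 'socket'
require 'timeout'

# Returns true if something accepts TCP connections on host:port.
# 6311 is Rserve's default port; the helper name and timeout are assumptions
# made for this sketch.
def rserve_listening?(host = '127.0.0.1', port = 6311, timeout_sec = 1.0)
  Timeout.timeout(timeout_sec) do
    TCPSocket.new(host, port).close
    true
  end
rescue SystemCallError, SocketError, Timeout::Error
  # Connection refused, unreachable host, or no answer in time
  false
end

# Example:
#   abort 'Rserve is not reachable' unless rserve_listening?
```

A check like this only shows that the port is open; it does not verify that the process behind it is really Rserve.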

% cat rserve-test.rb
require 'rubygems'
require 'rserve'
include Rserve
c = Connection.new
x = c.eval("R.version.string");
puts x.as_string

Run the test code:

% ruby-1.8 rserve-test.rb
R version 2.9.2 (2009-08-24)

OK, that works.

Next, install the GeneR package from Bioconductor.

% R
> source("http://bioconductor.org/biocLite.R")
> biocLite()

After a very long wait the installation finished. However, GeneR does not appear to be part of the default package set.

> library(GeneR)
Error in library(GeneR) : there is no package called 'GeneR'

According to http://www.bioconductor.org/packages/2.3/bioc/html/GeneR.html, the package has to be requested by name:

> source("http://bioconductor.org/biocLite.R")
> biocLite("GeneR")
Using R version 2.9.2, biocinstall version 2.4.13.
Installing Bioconductor version 2.4 packages:
[1] "GeneR"
Please wait...

trying URL 'http://bioconductor.org/packages/2.4/bioc/bin/macosx/universal/contrib/2.9/GeneR_2.14.0.tgz'
Content type 'application/x-gzip' length 411393 bytes (401 Kb)
opened URL
==================================================
downloaded 401 Kb

The downloaded packages are in
     /var/folders/lt/ltVmCLsiF3mLKUpLCN3GlU+++TM/-Tmp-//RtmpHRXj4g/downloaded_packages

That is apparently the way to do it.

> library(GeneR)
Attaching package: 'GeneR'
     The following object(s) are masked from package:utils :
      relist

Give it a quick try.

> seq = "atgacagtggcgagttacagtatggtgctgtgtggctcatctgatgatca
+ tcgctatcgaggcagaatcgaaaaagtaaaattcggcgtacccataaacg
+ aagcatttgcccatgacattcccgccacgcttctcatgctcttgctcaaa
+ gtgaacaaggatggacccgcgaaaaaggatatttggcgagcgcccggaaa
+ tcaggctcaagtgcgaaaattgtcgcaagtgatgcaacacgggcggcttg
+ taaatatcgagaatttcacggtttacacggcggcatctgtcatcaaaaag
+ tttctttcaaagttgccaaacggcatttttggacgggataatgaggagac
+ actgttcaatagtgcatcgactggaatggatattgagaagcagagacagg
+ tgttttataggatatttggatcacttccagtcgcatcccaacacttgctc
+ gtcctacttttcggcacatttcgggtcgtcgccgactcgtcggacggtca
+ ttcgaacgcgatgaacccgaatgcgatcgcgatttcggtggcaccatcgc
+ tttttcacacttgtatacacgatggacggacggcgcgagtagaagacctt
+ caacggttcaagctggcctcgaacattgtgtgctcgataatttgctcatt
+ cggcgacacgaagctcttcccacgcgagtgctacgagtattacgccagat
+ acacgggtcgcacgttgcgaatcgacgagaatcgaatgttcacttttcat
+ aatccatccaaccgtcgtgctcgtggcgaagagttctccgcgttggcggc
+ aaagtgtgcgggcgcctactcgctggccgccatccacctggccgaagaag
+ cgtcaccggagcccactccgacaacctcgaagcctccacgtggcaacggc
+ gtcgggcgtgccgggagtctgaagcagcacgcgttgacccagacgacgga
+ tcatccgaagagaagcgtgtcgatcgcggctaaggatccgtatccaactg
+ atttaaggacatcggtcagctgtgatttttga
+ "
> seq
[1] "atgacagtggcgagttacagtatggtgctgtgtggctcatctgatgatca\ntcgctatcgaggcagaatcgaaaaagtaaaattcggcgtacccataaacg\naagcatttgcccatgacattcccgccacgcttctcatgctcttgctcaaa\ngtgaacaaggatggacccgcgaaaaaggatatttggcgagcgcccggaaa\ntcaggctcaagtgcgaaaattgtcgcaagtgatgcaacacgggcggcttg\ntaaatatcgagaatttcacggtttacacggcggcatctgtcatcaaaaag\ntttctttcaaagttgccaaacggcatttttggacgggataatgaggagac\nactgttcaatagtgcatcgactggaatggatattgagaagcagagacagg\ntgttttataggatatttggatcacttccagtcgcatcccaacacttgctc\ngtcctacttttcggcacatttcgggtcgtcgccgactcgtcggacggtca\nttcgaacgcgatgaacccgaatgcgatcgcgatttcggtggcaccatcgc\ntttttcacacttgtatacacgatggacggacggcgcgagtagaagacctt\ncaacggttcaagctggcctcgaacattgtgtgctcgataatttgctcatt\ncggcgacacgaagctcttcccacgcgagtgctacgagtattacgccagat\nacacgggtcgcacgttgcgaatcgacgagaatcgaatgttcacttttcat\naatccatccaaccgtcgtgctcgtggcgaagagttctccgcgttggcggc\naaagtgtgcgggcgcctactcgctggccgccatccacctggccgaagaag\ncgtcaccggagcccactccgacaacctcgaagcctccacgtggcaacggc\ngtcgggcgtgccgggagtctgaagcagcacgcgttgacccagacgacgga\ntcatccgaagagaagcgtgtcgatcgcggctaaggatccgtatccaactg\natttaaggacatcggtcagctgtgatttttga\n"
> strTranslate(seq)
[1] "MTVASYSMVLCGSSDD-SLSRQNRKSKIRRTHK-KHLPMTFPPRFSCSCS-VNKDGPAKKDIWRAPG-SGSSAKIVASDATRAA-*ISRISRFTRRHLSSK-FLSKLPNGIFGRDNEE-TVQ*CIDWNGY*EAET-CFIGYLDHFQSHPNTC-VLLFGTFRVVADSSDG-FERDEPECDRDFGGTI-FFTLVYTMDGRRE*KT-QRFKLASNIVCSIICS-RRHEALPTRVLRVLRQ-TRVARCESTRIECSLF-NPSNRRARGEEFSALA-KVCGRLLAGRHPPGRR-RHRSPLRQPRSLHVAT-VGRAGSLKQHALTQTT-SSEEKRVDRG*GSVSN-I*GHRSAVIF-"

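Looks fine, in the sense that strTranslate runs. Note, though, that the string pasted this way contains newline characters (the \n in the output above). They show up as "-" in the result and shift the reading frame: the 2RSSE.1 translation further down starts with the same residues (MTVASYSMVLCGSSDD) but continues without gaps.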
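The seq value printed above still contains the line breaks of the pasted text, and the translated string has a "-" wherever a codon straddles one of them. A minimal sketch of removing whitespace before a sequence is handed to strTranslate; the helper name `clean_sequence` is made up for this example.

```ruby
# Remove all whitespace (spaces, tabs, line breaks) from a sequence that
# was pasted over several lines. The helper name is illustrative only.
def clean_sequence(raw)
  raw.gsub(/\s+/, '')
end

pasted = "atgacagtgg\ncgagttacag\n"
puts clean_sequence(pasted)   # prints atgacagtggcgagttacag
```

Inside R itself the same cleanup could be done on the string before calling strTranslate.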

The Python test program I was given:

# Read a FASTA file a number of times (default once), translate
# using R/Bioconductor GeneR and print to STDOUT
#
# Usage:
#
#   python DNAtranslate.py dna.fa [n]
#
# Example:
#
#   python DNAtranslate.py ../../../test/data/test-dna.fa
#

verbose=False

import sys
import time
from Bio.Seq import Seq
from Bio import SeqIO
from Bio.Alphabet import generic_dna

import subprocess
import pyRserve

fn = sys.argv[1]
times = 1
if len(sys.argv) > 2:
  times = int(sys.argv[2])

# Start the RServer
subprocess.Popen([r"R","CMD", "Rserve"], stdout=subprocess.PIPE).wait()

time.sleep(0.5)
conn = pyRserve.rconnect()
conn('library(GeneR)')

if verbose:
  print >> sys.stderr, 'Biopython translate ',fn, ':', times
for i in range(0, times):
  if verbose:
    print >> sys.stderr, i+1
  for seq_record in SeqIO.parse(fn, "fasta", generic_dna):
    print ">",seq_record.id
    ntseq = str(seq_record.seq)
    print conn('strTranslate("'+ntseq+'")')

# Kill the RServer
subprocess.Popen([r"killall", "Rserve"], stdout=subprocess.PIPE)

Translated into Ruby:

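require 'rubygems'
require 'rserve'
require 'bio'

# Connect to the Rserve daemon started with "R CMD Rserve"
rserve = Rserve::Connection.new

# Load GeneR once on the R side
rserve.eval('library(GeneR)')

fasta = ARGV.shift                # FASTA file to translate
repeat = (ARGV.shift || 1).to_i   # number of passes over the file (default 1)

repeat.times do
  Bio::FlatFile.auto(fasta).each do |entry|
    puts ">#{entry.entry_id}"
    ntseq = entry.seq
    # Translate on the R side and print the resulting protein sequence
    result = rserve.eval(%Q[strTranslate("#{ntseq}")])
    puts result.as_string
  end
end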
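The script builds the R expression by interpolating the sequence straight into a string. That is fine for clean FASTA input, but a quote or backslash in the data would change the expression that R evaluates. A minimal sketch of a guard, assuming sequences consist only of letters plus "-" and "*"; the helper name `str_translate_call` and the set of accepted characters are choices made for this example.

```ruby
# Build the R call strTranslate("...") from a Ruby string.
# Anything other than letters, "-" and "*" is rejected so that quotes or
# backslashes in the input cannot alter the R expression.
# The helper name and the accepted character set are assumptions.
def str_translate_call(ntseq)
  seq = ntseq.to_s.gsub(/\s+/, '')
  unless seq =~ /\A[A-Za-z*-]*\z/
    raise ArgumentError, 'unexpected character in sequence'
  end
  %Q[strTranslate("#{seq}")]
end

puts str_translate_call("atgaca\ngtggcg")   # prints strTranslate("atgacagtggcg")
```

In the loop above this would replace the inline %Q[...] expression, for example `rserve.eval(str_translate_call(entry.seq))`.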

Run it:

% ruby-1.8 DNAtranslate-rserve.rb test-dna.fa           
>2L52.1
MSMVRNVSNQSEKLEILSCKWVGCLKSTEVFKTVEKLLDHVTADHIPEVIVNDDGSEEVVCQWDCCEMGASRGNLQKKKEWMENHFKTRHVRKAKIFKCLIEDCPVVKSSSQEIETHLRISHPINPKKERLKEFKSSTDHIEPTQANRVWTIVNGEVQWKTPPRVKKKTVIYYDDGPRYVFPTGCARCNYDSDESELESDEFWSATEMSDNEEVYVNFRGMNCISTGKSASMVPSKRRNWPKRVKKRLSTQRNNQKTIRPPELNKNNIEIKDMNSNNLEERNREECIQPVSVEKNILHFEKFKSNQICIVRENNKFREGTRRRRKNSGESEDLKIHENFTEKRRPIRSCKQNISFYEMDGDIEEFEVFFDTPTKSKKVLLDIYSAKKMPKIEVEDSLVNKFHSKRPSRACRVLGSMEEVPFDVEIGY*
>2RSSE.1
MTVASYSMVLCGSSDDHRYRGRIEKVKFGVPINEAFAHDIPATLLMLLLKVNKDGPAKKDIWRAPGNQAQVRKLSQVMQHGRLVNIENFTVYTAASVIKKFLSKLPNGIFGRDNEETLFNSASTGMDIEKQRQVFYRIFGSLPVASQHLLVLLFGTFRVVADSSDGHSNAMNPNAIAISVAPSLFHTCIHDGRTARVEDLQRFKLASNIVCSIICSFGDTKLFPRECYEYYARYTGRTLRIDENRMFTFHNPSNRRARGEEFSALAAKCAGAYSLAAIHLAEEASPEPTPTTSKPPRGNGVGRAGSLKQHALTQTTDHPKRSVSIAAKDPYPTDLRTSVSCDF*
 :

Timing:

% time ruby-1.9 DNAtranslate-rserve.rb test-dna.fa > /dev/null 
ruby-1.9 DNAtranslate-rserve.rb test-dna.fa > /dev/null  4.16s user 0.25s system 76% cpu 5.737 total

% time ruby-1.9 DNAtranslate-rserve.rb test-dna.fa 10 > /dev/null
ruby-1.9 DNAtranslate-rserve.rb test-dna.fa 10 > /dev/null  39.71s user 1.90s system 75% cpu 55.061 total

% time ruby-1.8 DNAtranslate-rserve.rb test-dna.fa > /dev/null
ruby-1.8 DNAtranslate-rserve.rb test-dna.fa > /dev/null  13.69s user 0.28s system 87% cpu 16.010 total

% time ruby-1.8 DNAtranslate-rserve.rb test-dna.fa 10 > /dev/null
ruby-1.8 DNAtranslate-rserve.rb test-dna.fa 10 > /dev/null  135.66s user 2.56s system 82% cpu 2:46.68 total
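To compare the runs, the "total" fields need to be in the same unit, since the shell prints longer runs as minutes:seconds. A small sketch of that arithmetic using the wall-clock totals of the two 10-pass runs above; the helper name `total_seconds` is made up for this example.

```ruby
# Convert the "total" field printed by the shell's time ("55.061" or
# "2:46.68") into seconds. The helper name is illustrative only.
def total_seconds(field)
  field.split(':').map(&:to_f).reduce(0.0) { |acc, part| acc * 60 + part }
end

# Wall-clock totals copied from the 10-pass runs above.
totals = { 'ruby-1.9' => '55.061', 'ruby-1.8' => '2:46.68' }

totals.each do |ruby, field|
  puts format('%s: %.2f s per pass', ruby, total_seconds(field) / 10)
end

ratio = total_seconds(totals['ruby-1.8']) / total_seconds(totals['ruby-1.9'])
puts format('ruby-1.8 / ruby-1.9: %.2f', ratio)
```

With these numbers the Rserve client is roughly three times slower under ruby-1.8 than under ruby-1.9.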


RSRuby

% sudo gem install rsruby -- --with-R-dir=/Library/Frameworks/R.framework/Resources

However, the build complains that R.h cannot be found, even when the include directory is given explicitly:

% sudo gem-1.8 install rsruby -- --with-R-dir=/Library/Frameworks/R.framework/Resources --with-R-include=/Library/Frameworks/R.framework/Resources/include
Building native extensions.  This could take a while...
ERROR:  Error installing rsruby:
     ERROR: Failed to build gem native extension.

/usr/local/bin/ruby-1.8 extconf.rb --with-R-dir=/Library/Frameworks/R.framework/Resources --with-R-include=/Library/Frameworks/R.framework/Resources/include
checking for main() in -lR... yes
checking for R.h... no

ERROR: Cannot find the R header, aborting.
*** extconf.rb failed ***
Could not create Makefile due to some reason, probably lack of
necessary libraries and/or headers.  Check the mkmf.log file for more
details.  You may need configuration options.

Provided configuration options:
     --with-opt-dir
     --without-opt-dir
     --with-opt-include
     --without-opt-include=${opt-dir}/include
     --with-opt-lib
     --without-opt-lib=${opt-dir}/lib
     --with-make-prog
     --without-make-prog
     --srcdir=.
     --curdir
     --ruby=/usr/local/bin/ruby-1.8
     --with-R-dir
     --with-R-include=${R-dir}/include
     --with-R-lib
     --without-R-lib=${R-dir}/lib
     --with-Rlib
     --without-Rlib

Gem files will remain installed in /usr/local/lib/ruby/gems/1.8/gems/rsruby-0.5.1.1 for inspection.
Results logged to /usr/local/lib/ruby/gems/1.8/gems/rsruby-0.5.1.1/ext/gem_make.out

The options are clearly being passed to extconf.rb.

% ls /Library/Frameworks/R.framework/Resources/include
R.h           Rdefines.h    Rinternals.h  S.h           ppc/
R_ext/        Rembedded.h   Rmath.h       i386/
Rconfig.h     Rinterface.h  Rversion.h    libintl.h

R.h is right here, though.

% cd /usr/local/lib/ruby/gems/1.8/gems/rsruby-0.5.1.1/ext
% sudo ruby-1.8 -rmkmf -e 'create_makefile("rsruby_c")'
creating Makefile
% sudo vi Makefile

#INCFLAGS = -I. -I$(topdir) -I$(hdrdir) -I$(srcdir)
INCFLAGS = -I. -I$(topdir) -I$(hdrdir) -I$(srcdir) -I/Library/Frameworks/R.framework/Resources/include

#ldflags  = -L.
ldflags  = -L. -L/Library/Frameworks/R.framework/Resources/lib

So I tried to force it through by generating a Makefile directly and patching the R paths in by hand.

% make
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I. -I/Library/Frameworks/R.framework/Resources/include -D_XOPEN_SOURCE -D_DARWIN_C_SOURCE   -fno-common -g -O2 -pipe -fno-common   -c Converters.c
In file included from /Library/Frameworks/R.framework/Resources/include/R.h:40,
                 from ./rsruby.h:37,
                 from Converters.c:32:
/Library/Frameworks/R.framework/Resources/include/Rconfig.h:9:28: error: x86_64/Rconfig.h: No such file or directory
Converters.c: In function ‘to_ruby_vector’:
Converters.c:356: warning: assignment discards qualifiers from pointer target type
Converters.c:384: warning: assignment discards qualifiers from pointer target type
Converters.c: In function ‘to_ruby_hash’:
Converters.c:601: warning: assignment discards qualifiers from pointer target type
{standard input}:unknown:FATAL:can't create output file: Converters.o
make: *** [Converters.o] Error 1

It looks like the architecture is not being set correctly.

% ls /Library/Frameworks/R.framework/Resources/include
R.h           Rdefines.h    Rinternals.h  S.h           ppc/
R_ext/        Rembedded.h   Rmath.h       i386/
Rconfig.h     Rinterface.h  Rversion.h    libintl.h

% less /Library/Frameworks/R.framework/Resources/include/Rconfig.h
/* This is an automatically generated universal stub for architecture-dependent
headers. */
#ifdef __i386__
#include "i386/Rconfig.h"
#elif defined __ppc__
#include "ppc/Rconfig.h"
#elif defined __ppc64__
#include "ppc64/Rconfig.h"
#elif defined __x86_64__
#include "x86_64/Rconfig.h"
#elif defined __arm__
#include "arm/Rconfig.h"
#else
#error "Unsupported architecture."
#endif

The R package ships headers for i386 and ppc, but the build apparently expects x86_64.
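The failing include suggests that the compiler defines __x86_64__ while the R framework only provides i386 and ppc headers. A minimal sketch for checking this kind of mismatch up front: it lists the architecture subdirectories that contain Rconfig.h and prints the architecture Ruby reports for itself. The helper name `r_header_archs` is made up; the include path is the one used in the session above.

```ruby
require 'rbconfig'

# List the architecture subdirectories of an R include directory that
# actually contain Rconfig.h (for example "i386" and "ppc").
# The helper name is illustrative only.
def r_header_archs(include_dir)
  Dir.glob(File.join(include_dir, '*', 'Rconfig.h'))
     .map { |path| File.basename(File.dirname(path)) }
     .sort
end

include_dir = '/Library/Frameworks/R.framework/Resources/include'
puts "Ruby reports arch: #{RbConfig::CONFIG['arch']}"
puts "R headers present for: #{r_header_archs(include_dir).join(', ')}"
```

Note that the architecture Ruby reports and the one the compiler targets by default can differ, so this is only a first hint.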

% sudo vi Makefile

#CPPFLAGS =   -D_XOPEN_SOURCE -D_DARWIN_C_SOURCE $(DEFS) $(cppflags)
CPPFLAGS = -D__i386__ -D_XOPEN_SOURCE -D_DARWIN_C_SOURCE $(DEFS) $(cppflags)

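Force the i386 branch by defining the macro by hand. Note that this only selects which Rconfig.h is included; it does not change the architecture the compiler targets.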

% sudo make
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I. -I/Library/Frameworks/R.framework/Resources/include -D__i386__ -D_XOPEN_SOURCE -D_DARWIN_C_SOURCE   -fno-common -g -O2 -pipe -fno-common   -c Converters.c
Converters.c: In function ‘to_ruby_vector’:
Converters.c:356: warning: assignment discards qualifiers from pointer target type
Converters.c:384: warning: assignment discards qualifiers from pointer target type
Converters.c: In function ‘to_ruby_hash’:
Converters.c:601: warning: assignment discards qualifiers from pointer target type
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I. -I/Library/Frameworks/R.framework/Resources/include -D__i386__ -D_XOPEN_SOURCE -D_DARWIN_C_SOURCE   -fno-common -g -O2 -pipe -fno-common   -c R_eval.c
R_eval.c: In function ‘get_last_error_msg’:
R_eval.c:143: warning: return discards qualifiers from pointer target type
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I. -I/Library/Frameworks/R.framework/Resources/include -D__i386__ -D_XOPEN_SOURCE -D_DARWIN_C_SOURCE   -fno-common -g -O2 -pipe -fno-common   -c robj.c
gcc -I. -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I/usr/local/lib/ruby/1.8/i686-darwin10.0.0 -I. -I/Library/Frameworks/R.framework/Resources/include -D__i386__ -D_XOPEN_SOURCE -D_DARWIN_C_SOURCE   -fno-common -g -O2 -pipe -fno-common   -c rsruby.c
cc -dynamic -bundle -undefined suppress -flat_namespace -o rsruby_c.bundle Converters.o R_eval.o robj.o rsruby.o -L. -L/usr/local/lib -L. -L/Library/Frameworks/R.framework/Resources/lib    -ldl -lobjc  

The compile went through. However, I do not know how to install the result.

For now, unpack the gem and have a look.

% tar xvfz /usr/local/lib/ruby/gems/1.8/cache/rsruby-0.5.1.1.gem
data.tar.gz
metadata.gz

% sudo tar xvfz data.tar.gz
x History.txt
x License.txt
x Manifest.txt
x README.txt
x Rakefile.rb
x examples/arrayfields.rb
x examples/bioc.rb
x examples/dataframe.rb
x examples/erobj.rb
x ext/Converters.c
x ext/Converters.h
x ext/R_eval.c
x ext/R_eval.h
x ext/extconf.rb
x ext/robj.c
x ext/rsruby.c
x ext/rsruby.h
x lib/rsruby.rb
x lib/rsruby/dataframe.rb
x lib/rsruby/erobj.rb
x lib/rsruby/robj.rb
x test/table.txt
x test/tc_array.rb
x test/tc_boolean.rb
x test/tc_cleanup.rb
x test/tc_eval.rb
x test/tc_extensions.rb
x test/tc_init.rb
x test/tc_io.rb
x test/tc_library.rb
x test/tc_matrix.rb
x test/tc_modes.rb
x test/tc_robj.rb
x test/tc_sigint.rb
x test/tc_to_r.rb
x test/tc_to_ruby.rb
x test/tc_util.rb
x test/tc_vars.rb
x test/test_all.rb

% sudo gzcat metadata.gz > metadata

So this is how a gem is laid out; metadata looks like the gemspec. I ran out of time, though, and gave up here.
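The same inspection can be done from Ruby, since a .gem file is a plain tar archive. A minimal sketch using the tar reader that ships with RubyGems; the helper name `gem_entry_names` is made up, and the path in the usage comment is the one from the session above.

```ruby
require 'rubygems/package'

# Return the names of the top-level entries of a .gem file
# (data.tar.gz, metadata.gz, ...). The helper name is illustrative only.
def gem_entry_names(io)
  names = []
  Gem::Package::TarReader.new(io) do |tar|
    tar.each { |entry| names << entry.full_name }
  end
  names
end

# Usage with a real file:
#   File.open('/usr/local/lib/ruby/gems/1.8/cache/rsruby-0.5.1.1.gem', 'rb') do |f|
#     puts gem_entry_names(f)
#   end
```

This only lists the outer archive; the source files themselves sit inside data.tar.gz, as the tar output above shows.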

RSOAP

I gave up on RSOAP because SOAP is not usable in Ruby 1.9 and setting up an RSOAP server looked like a lot of work.
