Running python script (importing spacy) from Java using Runtime.exec

Question

I'm having this problem. I have a Python script that computes semantic similarity. The script imports spacy and has one method that takes two parameters. When I run the script in my terminal, everything works.

import spacy
import sys

# Create your models here.
class Clustering():
    nlp = None

    def __init__(self):
        self.nlp = spacy.load("es_core_news_md")

    def process_text(self, text):
        ...
        return " ".join(result)

    def find_similarity(self, text1, text2):
        fixedText1 = self.process_text(text1)
        fixedText2 = self.process_text(text2)

        doc1 = self.nlp(fixedText1)
        doc2 = self.nlp(fixedText2)

        print(doc1.similarity(doc2))

if __name__ == '__main__':
    # sys.argv[1] is the method name; the two texts follow it.
    clus = Clustering()  # __init__ takes no arguments
    clus.find_similarity(sys.argv[2], sys.argv[3])

This is how I run the script:

python semanticsimilarity.py find_similarity 'El perro se salió del pozo' 'El banano se salió del pozo'

However, when I run it in Java:

Process p = Runtime.getRuntime().exec("python semanticsimilarity.py find_similarity 'El perro se salió del pozo' 'El banano se salió del pozo'");
p.waitFor();

I get this error:

Traceback (most recent call last):
  File "semanticsimilarity.py", line 9, in <module>
    import spacy
ImportError: No module named spacy

Is there any way to work around this? I think Java is trying to run the script inside the JVM or something; I don't know.

Thank you.

Answer 1

Score: 1

This most likely has to do with environment variables: PATH, LD_LIBRARY_PATH, and PYTHONPATH. Either set them in the environment that starts Java, or wrap the python command in a shell script that sets them explicitly.

Grab the values from the terminal where the script works:

echo PATH=$PATH; echo PYTHONPATH=$PYTHONPATH; echo LD_LIBRARY_PATH=$LD_LIBRARY_PATH
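
To check whether the JVM really sees a different environment, it may help to print the same three variables from inside Java and compare them with the terminal output. This is a minimal sketch; the class name EnvCheck and the helper pick are made up for illustration and are not part of the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EnvCheck {
    // Hypothetical helper: copies the named variables out of an environment map,
    // substituting "<unset>" for any that are missing.
    static Map<String, String> pick(Map<String, String> env, String... names) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String name : names) {
            out.put(name, env.getOrDefault(name, "<unset>"));
        }
        return out;
    }

    public static void main(String[] args) {
        // System.getenv() is the environment that child processes inherit by default.
        pick(System.getenv(), "PATH", "PYTHONPATH", "LD_LIBRARY_PATH")
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

If the values differ from what the terminal prints, that difference is a likely candidate for the missing module.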


Option 1: Set them in your profile

export PATH=
export PYTHONPATH=
export LD_LIBRARY_PATH=

Option 2: Create a wrapper script, wrapSymSim.sh

#!/bin/sh
# Fill in the values printed by the working terminal.
export PATH=
export PYTHONPATH=
export LD_LIBRARY_PATH=

python semanticsimilarity.py "$@"

Call this from your Java code:

Process p = Runtime.getRuntime().exec("wrapSymSim.sh find_similarity 'El perro se salió del pozo' 'El banano se salió del pozo'");
p.waitFor();
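
One thing to watch for: Runtime.exec(String) splits the command on whitespace and does not interpret quotes, so each quoted sentence would probably arrive as several separate arguments. A sketch that passes the arguments as a list and also captures the script's output follows. The class name SimilarityLauncher, its helpers, and the "./wrapSymSim.sh" location are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class SimilarityLauncher {
    // Builds the argument vector; each sentence stays a single argument,
    // so no shell-style quoting is needed.
    static List<String> command(String program, String text1, String text2) {
        return List.of(program, "find_similarity", text1, text2);
    }

    // Runs the command and returns everything it printed (stdout and stderr merged).
    static String run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true); // include Python tracebacks in the output
        Process p = pb.start();
        StringBuilder out = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.append(line).append('\n');
            }
        }
        p.waitFor();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumes the wrapper is executable and sits in the working directory.
        System.out.print(run(command("./wrapSymSim.sh",
                "El perro se salió del pozo", "El banano se salió del pozo")));
    }
}
```

Reading the merged output stream also makes errors such as the ImportError visible from Java instead of being silently discarded.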

Answer 2

Score: 0

Your Python script depends on variables set up by your command-line shell. The setup suggested by @PrasadU should work, or you could use the ProcessBuilder class to set the same variables for Python.
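
The ProcessBuilder route might look roughly like this. It is a sketch under assumptions: the class name PythonEnvLauncher and the helper withEnv are hypothetical, and "/path/to/site-packages" is a placeholder for whatever the working terminal reports.

```java
import java.util.List;
import java.util.Map;

public class PythonEnvLauncher {
    // Returns a ProcessBuilder whose child environment is the JVM's environment
    // plus the given overrides (for example PATH, PYTHONPATH, LD_LIBRARY_PATH).
    static ProcessBuilder withEnv(List<String> command, Map<String, String> overrides) {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.environment().putAll(overrides);
        pb.inheritIO(); // forward the script's output and errors to this console
        return pb;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder value: replace with what the working terminal prints.
        Map<String, String> env = Map.of("PYTHONPATH", "/path/to/site-packages");
        ProcessBuilder pb = withEnv(
                List.of("python", "semanticsimilarity.py", "find_similarity",
                        "El perro se salió del pozo", "El banano se salió del pozo"),
                env);
        int exit = pb.start().waitFor();
        System.out.println("exit code: " + exit);
    }
}
```

If the wrong interpreter is still picked up, replacing "python" with the absolute path of the interpreter that has spaCy installed may be the more reliable choice.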

Since your normal command-line shell already has this setup, another approach is to launch that shell first and let it launch your Python script, which is what actually happens when you start the script from a terminal.

So if you use Windows CMD.EXE or Linux Bash, say, one of these may work as-is:

Process p = Runtime.getRuntime().exec("/bin/bash -c \"python semanticsimilarity.py find_similarity 'El perro se salió del pozo' 'El banano se salió del pozo'\"");

OR

Process p = Runtime.getRuntime().exec("CMD.EXE /c \"python semanticsimilarity.py find_similarity 'El perro se salió del pozo' 'El banano se salió del pozo'\"");
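
One caveat with the single-string form: Runtime.exec(String) splits the command on whitespace and does not interpret quotes, so the quoted command may not reach bash as one argument. A sketch that passes the command line as a single list element instead is shown here. The class name ShellLauncher and the helper viaBash are hypothetical, and using -l (login shell) is an assumption about where the variables are defined.

```java
import java.util.List;

public class ShellLauncher {
    // Builds a command that asks bash to run the given command line.
    // "-l" requests a login shell so profile files are read; whether that
    // reproduces the terminal's setup depends on where the variables are set.
    static List<String> viaBash(String commandLine) {
        return List.of("/bin/bash", "-lc", commandLine);
    }

    public static void main(String[] args) throws Exception {
        // The shell, not Java, interprets the single quotes in this string.
        String line = "python semanticsimilarity.py find_similarity "
                + "'El perro se salió del pozo' 'El banano se salió del pozo'";
        Process p = new ProcessBuilder(viaBash(line)).inheritIO().start();
        System.out.println("exit code: " + p.waitFor());
    }
}
```

If the variables live in a file that only interactive shells read, this may still not pick them up, in which case setting them explicitly is the safer option.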

huangapple
  • Posted on July 22, 2020, 12:21:43
  • When reposting, please keep the link to this article: https://go.coder-hub.com/63026829.html