ModuleNotFoundError: No module named 'haystack.nodes'
Question
I am following the tutorial on the Haystack website for an extractive QA system, and I am trying to convert a PDF to text.
Link to the blog: https://www.deepset.ai/blog/automating-information-extraction-with-question-answering
I installed haystack with pip, but I get the error below. I also tried !pip install haystack.nodes, but that doesn't work either.
Note: I am using Google Colab for this.
Here is my code and the full error:
!pip -q install haystack haystack.nodes
path = '/content/drive/MyDrive/Colab Notebooks/NLP/Information Extraction QA with Haystack (Adidas Financial corpus)'
from haystack.nodes import PDFToTextConverter
pdf_converter = PDFToTextConverter(remove_numeric_tables=True, valid_languages=['en'])
converted = pdf_converter.convert(file_path = path, meta = { 'company': 'Company_1', 'processed': False })
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-7-61021fb3b7b8> in <cell line: 1>()
----> 1 from haystack.nodes import PDFToTextConverter
2
3 pdf_converter = PDFToTextConverter(remove_numeric_tables=True, valid_languages=['en'])
4
5 converted = pdf_converter.convert(file_path = path, meta = { 'company': 'Company_1', 'processed': False })
Answer 1
Score: 1
To install Haystack, you need to run pip install farm-haystack. The PyPI package is called farm-haystack, not just haystack, as Stefano mentioned.
A good starting point is the Haystack tutorials, which you can run as Python notebooks on Google Colab, for example this tutorial using the PDFToTextConverter.
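As a rough sketch of how to verify the install from a notebook, the snippet below uses only the standard library to report whether haystack.nodes can be imported and which distributions are present. The helper names module_available and installed_version are made up for illustration and are not part of Haystack. In Colab you would presumably run !pip install farm-haystack in a cell first, and restart the runtime if a different haystack package had already been imported.

```python
import importlib.util
from importlib import metadata
from typing import Optional


def module_available(name: str) -> bool:
    """Return True if `name` (e.g. 'haystack.nodes') can be located for import."""
    try:
        # find_spec imports parent packages, so a missing parent raises ImportError.
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def installed_version(dist: str) -> Optional[str]:
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist)
    except metadata.PackageNotFoundError:
        return None


# 'farm-haystack' is the distribution named in the answer above;
# 'haystack' is the package the question installed by mistake.
print("haystack.nodes importable:", module_available("haystack.nodes"))
print("farm-haystack version:", installed_version("farm-haystack"))
print("haystack version:", installed_version("haystack"))
```

If the first line prints False while the last line shows a version, the wrong package is probably installed; uninstalling it before installing farm-haystack is the likely fix.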
Answer 2
Score: 0
Do not name any of your own files haystack.py, otherwise the import will fail. This applies to every project: never give a file the same name as the library you are importing.
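To illustrate the point, here is a minimal sketch that looks for a local file or folder that would shadow a library name. The function shadowing_candidates is a hypothetical helper written for this example, and the demo runs in a temporary directory rather than a real project.

```python
import tempfile
from pathlib import Path
from typing import List


def shadowing_candidates(module_name: str, directory: str = ".") -> List[Path]:
    """List paths in `directory` that Python could import instead of the library."""
    base = Path(directory)
    # A same-named .py file or folder next to your script is found first on sys.path.
    candidates = [base / f"{module_name}.py", base / module_name]
    return [p for p in candidates if p.exists()]


# Demo: create a stray haystack.py in a scratch directory and detect it.
with tempfile.TemporaryDirectory() as tmp:
    clean = [p.name for p in shadowing_candidates("haystack", tmp)]
    (Path(tmp) / "haystack.py").write_text("# accidental local file\n")
    found = [p.name for p in shadowing_candidates("haystack", tmp)]

print("before:", clean)
print("after:", found)
```

Calling shadowing_candidates("haystack") with no directory checks the current working directory, which is where a notebook or script would usually pick up a stray file.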