Using pdftotext in Google Colab
Question
The laptop provided by my German research institute broke down, and I am now using a new laptop from my Dutch institute on which I have not yet set up Python and Jupyter Notebook. That is why I wanted to run my code in Google Colab, but I found that the pdftotext Python package cannot be installed there.
Using
!pip install pdftotext
or
!apt-get install
both result in this error:
E: Unable to locate package pdftotext
I assume that I am missing dependencies. Is there any way to make this work in Google Colab, or will I need to run my code elsewhere?
Answer 1
Score: 1
Per the pdftotext README on GitHub, some system dependencies must be installed before the package itself can be installed.
- Update the package lists in the Google Colab session, then install the required packages.
!sudo apt-get update
!sudo apt install build-essential libpoppler-cpp-dev pkg-config python3-dev
- Next, install pdftotext with pip.
!pip install pdftotext
- Finally, test that the package works correctly. Below is a code block from the package's repository.
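import pdftotext

# Load the PDF (replace your_pdf.pdf with a file available in the Colab session)
with open("your_pdf.pdf", "rb") as f:
    pdf = pdftotext.PDF(f)

# Print the text of each page
for page in pdf:
    print(page)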
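Before running the test above, it may help to confirm that the install steps actually made the module visible to the current Colab session. The following is only a minimal sketch using the standard library: the helper names is_importable and pages_to_text are hypothetical (they are not part of pdftotext), and pages_to_text assumes only that the PDF object yields one string per page, as the loop above suggests.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the named top-level module can be found in this session."""
    # find_spec returns None when a top-level module is not installed.
    return importlib.util.find_spec(module_name) is not None


def pages_to_text(pages, separator="\n\n"):
    """Join an iterable of page strings into a single string (hypothetical helper)."""
    return separator.join(pages)


if is_importable("pdftotext"):
    print("pdftotext is available in this session")
else:
    print("pdftotext is missing; run the apt and pip steps first")
```

If the check passes, pages_to_text(pdf) could be used in place of the print loop to collect the whole document into one string.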