Using pdftotext in Google Colab

Question

The laptop provided by my German research institute broke down, and I am now using a new laptop from my Dutch institute on which I have not yet set up Python and Jupyter Notebook. That is why I wanted to run my code in Google Colab, but I found that the pdftotext Python package cannot be installed there.

Using !pip install pdftotext or !apt-get install both result in this error notification:

    E: Unable to locate package pdftotext

I assume that I am missing dependencies. Is there any way to make this work in Google Colab, or will I need to run my code elsewhere?

Answer 1

Score: 1

Per the README of pdftotext on GitHub, additional dependencies must be installed before the package itself can be installed.

  1. Update the package lists in the Google Colab session, then install the required packages.

    !sudo apt-get update
    !sudo apt-get install -y build-essential libpoppler-cpp-dev pkg-config python3-dev

  2. Next, install pdftotext with pip.

    !pip install pdftotext

  3. Finally, check that the package works. The code block below comes from the package's repo.

    import pdftotext

    # Open the PDF in binary mode and load it
    with open("your_pdf.pdf", "rb") as f:
        pdf = pdftotext.PDF(f)

    # Print the text of each page in turn
    for page in pdf:
        print(page)
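
If the pip step still fails, it may help to see which of the prerequisites from step 1 are actually visible in the session. The following is only a rough diagnostic sketch, not part of the pdftotext package: the function name check_pdftotext_prereqs is made up for illustration, and it assumes the poppler development files register themselves with pkg-config under the name poppler-cpp.

```python
import importlib.util
import shutil
import subprocess


def check_pdftotext_prereqs():
    """Return a dict mapping each prerequisite to True (found) or False (missing).

    Hypothetical helper for illustration; not provided by pdftotext itself.
    """
    has_pkg_config = shutil.which("pkg-config") is not None

    # Assumption: libpoppler-cpp-dev exposes a pkg-config module named "poppler-cpp".
    has_poppler = False
    if has_pkg_config:
        result = subprocess.run(["pkg-config", "--exists", "poppler-cpp"])
        has_poppler = result.returncode == 0

    return {
        "g++ compiler": shutil.which("g++") is not None,
        "pkg-config": has_pkg_config,
        "poppler-cpp headers": has_poppler,
        # True only once `pip install pdftotext` has succeeded
        "pdftotext module": importlib.util.find_spec("pdftotext") is not None,
    }


if __name__ == "__main__":
    for name, found in check_pdftotext_prereqs().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Run in a Colab cell, this prints one line per prerequisite. Anything reported as missing before the pip step points back to the apt-get command in step 1.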

Posted by huangapple on June 19, 2023 at 06:03:49
Source: https://go.coder-hub.com/76502697.html