Unable to get page count. Is poppler installed and in PATH?

Symptom

Traceback (most recent call last):
  File "/home/lp/iocr-pdf-service/app/utils.py", line 135, in request_api_update
    img_list, img_base64_list = pdf2img_bytes(file, show_flag=True)
  File "/home/lp/iocr-pdf-service/app/utils.py", line 42, in pdf2img_bytes
    image_from_path = convert_from_bytes(pdf_file, output_folder=path, fmt=IMAGE_FMT)
  File "/usr/local/python3.6.9/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 380, in convert_from_bytes
    hide_annotations=hide_annotations,
  File "/usr/local/python3.6.9/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 128, in convert_from_path
    pdf_path, userpw, ownerpw, poppler_path=poppler_path
  File "/usr/local/python3.6.9/lib/python3.6/site-packages/pdf2image/pdf2image.py", line 595, in pdfinfo_from_path
    "Unable to get page count. Is poppler installed and in PATH?"
pdf2image.exceptions.PDFInfoNotInstalledError: Unable to get page count. Is poppler installed and in PATH?

Solution

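Poppler is not installed on the system. Poppler is a free software utility library for rendering Portable Document Format (PDF) documents, and pdf2image relies on its command-line tools.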
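
One quick way to confirm this diagnosis from Python is to check whether the Poppler command-line tools can be found on PATH. This is a minimal sketch, assuming pdf2image shells out to `pdfinfo` for the page count and to `pdftoppm` for rendering; the helper name `missing_poppler_tools` is hypothetical and not part of pdf2image.

```python
import shutil

# Executables that pdf2image invokes; both ship with poppler-utils.
POPPLER_TOOLS = ("pdfinfo", "pdftoppm")


def missing_poppler_tools(tools=POPPLER_TOOLS, path=None):
    """Return the names of the tools that cannot be found on PATH.

    `path` overrides the PATH environment variable (useful for testing).
    """
    return [name for name in tools if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = missing_poppler_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("Poppler tools found on PATH")
```

If the script reports `pdfinfo` as missing, the service will most likely raise the PDFInfoNotInstalledError shown above.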

Poppler website: https://poppler.freedesktop.org/

CentOS

Install it with the following command:

yum install poppler poppler-cpp-devel poppler-utils

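"poppler-utils" is the package that provides the command-line tools (such as pdfinfo and pdftoppm) that pdf2image calls. If it is not installed, the following error appears: Exception: Unable to get page count. Is poppler installed and in PATH?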

Ubuntu

Install it with the following command:

apt-get install -y poppler-utils libpoppler-dev

After installation, check the Poppler version with the following command:

pdftotext -v

Output that starts with "pdftotext version" means the installation succeeded:

pdftotext version 0.86.1
Copyright 2005-2020 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1996-2011 Glyph & Cog, LLC
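
The same check can be scripted, for example in a service health check. This is a sketch, assuming the banner has the "pdftotext version X.Y.Z" format shown above; the banner may be written to stderr rather than stdout, so the code reads both streams. `parse_poppler_version` and `installed_poppler_version` are hypothetical helper names.

```python
import re
import subprocess


def parse_poppler_version(banner):
    """Extract (major, minor, patch) from a 'pdftotext version X.Y.Z' banner."""
    match = re.search(r"version (\d+)\.(\d+)(?:\.(\d+))?", banner)
    if match is None:
        return None
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups(default="0"))


def installed_poppler_version(tool="pdftotext"):
    """Run `<tool> -v` and return the parsed version, or None if unavailable."""
    try:
        result = subprocess.run(
            [tool, "-v"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,  # decode output as text (works on Python 3.6)
        )
    except OSError:
        # Raised when the executable cannot be found or started.
        return None
    return parse_poppler_version(result.stdout + result.stderr)
```

A return value of None from `installed_poppler_version()` means either that the tool is not on PATH or that its banner did not match the assumed format.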

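Windows

poppler-windows download page: https://github.com/oschwartz10612/poppler-windows/releases/

Download the zip archive and extract it to any directory, preferably on the C: drive, for example C:\Program Files.

There are two ways to use it:

  1. Add the extracted directory C:\Program Files\poppler-23.05.0\Library\bin to the system PATH environment variable. Restart PyCharm afterwards so that it picks up the new value.

  2. Modify the source: change the default poppler path in pdf2image.py inside the installed package.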

    def convert_from_path(
        pdf_path,
        dpi=200,
        output_folder=None,
        first_page=None,
        last_page=None,
        fmt="ppm",
        jpegopt=None,
        thread_count=1,
        userpw=None,
        use_cropbox=False,
        strict=False,
        transparent=False,
        single_file=False,
        output_file=uuid_generator(),
        poppler_path=r'C:\Program Files\poppler-23.05.0\Library\bin',  # change this to the path where you extracted poppler
        grayscale=False,
        size=None,
        paths_only=False,
        use_pdftocairo=False,
        timeout=None,
    ):
        ...  # function body unchanged
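
A change made inside site-packages is lost whenever pdf2image is reinstalled or upgraded. Since `convert_from_path` and `convert_from_bytes` take `poppler_path` as a keyword argument, the same effect can be achieved from the calling code. This is a sketch, assuming the install directory from option 1; `POPPLER_BIN`, `poppler_kwargs`, and `pdf_bytes_to_images` are hypothetical names, not part of pdf2image.

```python
import shutil

# Assumed location, taken from the example above; adjust to your own install.
POPPLER_BIN = r"C:\Program Files\poppler-23.05.0\Library\bin"


def poppler_kwargs(poppler_bin=POPPLER_BIN, path=None):
    """Build keyword arguments for pdf2image's convert_* functions.

    If pdfinfo is already reachable through PATH, nothing extra is needed;
    otherwise point pdf2image at the given Poppler bin directory.
    `path` overrides the PATH environment variable (useful for testing).
    """
    if shutil.which("pdfinfo", path=path) is not None:
        return {}
    return {"poppler_path": poppler_bin}


def pdf_bytes_to_images(pdf_bytes):
    """Convert PDF bytes to a list of PIL images via pdf2image."""
    # Imported lazily so this module loads even where pdf2image is absent.
    from pdf2image import convert_from_bytes

    return convert_from_bytes(pdf_bytes, **poppler_kwargs())
```

With this approach the installed package stays untouched, and the same code runs on Linux servers where the tools are already on PATH.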

https://flepeng.github.io/021-Python-71-报错-Unable-to-get-page-count-Is-poppler-installed-and-in-PATH?/
Author: Lepeng
Published: July 5, 2021