pdfplumber

May 25, 2024 | seedling, permanent

tags :

Python Apps

Plumb a PDF for detailed information about each char, rectangle, line, et cetera, and easily extract text and tables. (GitHub)
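The per-object detail mentioned above is available as lists on each page object (`page.chars`, `page.rects`, `page.lines`). A minimal sketch of how one might count them; `summarize_page` and `print_page_summaries` are hypothetical helpers written for this note, not part of pdfplumber, and the path is whatever PDF you have at hand:

```python
def summarize_page(page):
    """Count the low-level objects found on one page.

    Hypothetical helper, not a pdfplumber function. It only relies on
    the page having `chars`, `rects`, and `lines` list attributes.
    """
    return {
        "chars": len(page.chars),
        "rects": len(page.rects),
        "lines": len(page.lines),
    }


def print_page_summaries(path):
    """Print the object counts for every page of the PDF at `path`."""
    import pdfplumber  # imported here so summarize_page has no dependencies

    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            print(page.page_number, summarize_page(page))
```

Calling `print_page_summaries` with the path to a PDF prints one line per page; a page with many chars but no rects or lines probably has no ruled tables to detect.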

Al Rajhi Bank

<2024-02-15 Thu> As of this date, this library parsed this bank's personal account statements cleanly.

! pip install pdfplumber

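import pdfplumber

file_path = "statement.pdf"  # assumption: placeholder path to the statement PDF
transactions = []

with pdfplumber.open(file_path) as pdf:
    for page in pdf.pages:
        # Plain text of the page, printed once per page for inspection.
        text = page.extract_text()
        print(text)
        # Each table is a list of rows; each row is a list of cell strings (or None).
        for table in page.extract_tables():
            transactions.extend(table)
            print(table)

print(transactions)
print("length of transactions:")
print(len(transactions))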
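The `transactions` list built above is a flat list of rows, each row a list of cell strings (or `None` for empty cells). A rough sketch of turning those rows into dicts, assuming the statement repeats its header row on every page; the column names and sample rows below are made up for illustration and are not the bank's actual layout:

```python
def rows_to_records(rows, header):
    """Turn table rows (lists of cells) into dicts keyed by `header`.

    Skips rows that are blank, that repeat the header, or whose length
    differs from the header's.
    """
    records = []
    for row in rows:
        # Normalise None cells to "" and trim surrounding whitespace.
        cells = [(cell or "").strip() for cell in row]
        if not any(cells):
            continue  # blank row
        if cells == header:
            continue  # repeated header row
        if len(cells) != len(header):
            continue  # malformed row; skipped rather than guessed at
        records.append(dict(zip(header, cells)))
    return records


# Hypothetical column names and sample rows, for illustration only.
header = ["Date", "Description", "Amount"]
sample_rows = [
    ["Date", "Description", "Amount"],
    ["2024-01-05", "Grocery store", "-120.50"],
    [None, "", None],
    ["Date", "Description", "Amount"],
    ["2024-01-09", " Salary ", "5000.00"],
]
records = rows_to_records(sample_rows, header)
print(records)
```

With real data, pass `transactions` in place of `sample_rows` and set `header` to the first row that `extract_tables()` returned.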

Links to this note

deepdoctection