Web Connection
Foxit SDK or other options
  Harvey Mushman
  All
  Sep 7, 2023 @ 05:57am

I need to convert a PDF form on a regular basis into something I can consume the data from. I have used iSED Quick PDF in the past, but now when I visit their URL I get redirected to Foxit. I have subscribed to Foxit software for reading and editing PDFs for many years but never considered using their SDK.

Any thoughts about a good PDF-to-XLS (or, better, PDF-to-JSON) converter? I'm working on a limited budget, and Foxit at $3000/year seems like a lot for a single developer license, especially after the application is finished.

re: Foxit SDK or other options
  Carl Chambers
  Harvey Mushman
  Sep 8, 2023 @ 09:39am

Hi Harvey,

I have used ABBYY FineReader for a few years. The version I have (not the latest) does not have an SDK, but it does have a command-line interface that can be invoked programmatically.
I use it to compare PDFs and to convert PDFs to Excel. It is not full automation (the UI is displayed), but it does what I need.

May be worth a look.

Carl

re: Foxit SDK or other options
  Bob Roenigk
  Harvey Mushman
  Sep 13, 2023 @ 08:00am

Harvey,

Would Python be an option? This works.

# Install first: pip install PyPDF2
# (the leading "!" is only needed inside a Jupyter notebook cell)

import PyPDF2

pdf_file_path = 'sample.pdf'
text_file_path = 'output.txt'

# Open the PDF and extract the text of each page
with open(pdf_file_path, 'rb') as pdf_file:
    pdf_reader = PyPDF2.PdfReader(pdf_file)
    # Fall back to an empty string for pages that yield no text
    page_texts = [page.extract_text() or '' for page in pdf_reader.pages]

# Join pages with a newline so text from adjacent pages does not run together
pdf_text = '\n'.join(page_texts)

# Write the extracted text to a UTF-8 text file
with open(text_file_path, 'w', encoding='utf-8') as text_file:
    text_file.write(pdf_text)

print(f"PDF content has been successfully exported to '{text_file_path}'.")
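
Since the original question was about getting form data into JSON, here is a rough sketch of that step. It assumes the PDF is a fillable form (AcroForm fields, not a scanned image) and that PyPDF2's get_fields() can see those fields. The names read_form_fields, fields_to_json, and form.pdf are made up for illustration, and I have not tried this against Harvey's actual form.

```python
import json


def read_form_fields(pdf_path):
    """Return {field name: current value} for the form fields in a PDF."""
    # Imported here so the JSON helper below can be used without PyPDF2.
    from PyPDF2 import PdfReader

    with open(pdf_path, 'rb') as pdf_file:
        reader = PdfReader(pdf_file)
        # get_fields() returns None when the document has no form fields.
        fields = reader.get_fields() or {}
        # Each entry is a Field object; .value is the field's "/V" entry.
        return {name: field.value for name, field in fields.items()}


def fields_to_json(fields):
    """Serialize a {name: value} dictionary to a JSON string."""
    # default=str is a fallback for value types json cannot encode directly.
    return json.dumps(fields, indent=2, sort_keys=True, default=str)
```

Usage would be something like print(fields_to_json(read_form_fields('form.pdf'))). If the form turns out to be a flat or scanned PDF, get_fields() will come back empty and the text-extraction route above (or OCR) would be needed instead.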
