this post was submitted on 13 Nov 2023

LocalLLaMA

I am working on a project where I have to extract tables from PDFs, usually financial reports that contain a lot of tables (both simple tables and tables with merged cells) and graphs.
I have tried the following libraries, none of which gave good results:
Nougat, PyMuPDF (fitz), PyPDF2, pdfplumber, PDFMiner, Camelot, Tabula, pdfquery

What other OCR tools, LLMs, or other approaches would you recommend trying next? Thanks in advance!

[–] vec1nu@alien.top 1 points 10 months ago