uzairkabeer1/Python-PDF-Scraper

Python-PDF-Scraper

Previous version

In the previous version, I used the pdfquery library: the PDF is converted to XML, and the specific data I need is then extracted from that XML.
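As a rough illustration of the PDF-to-XML approach, here is a minimal sketch using pdfquery. The helper names (`bbox_selector`, `extract_region`), the file paths, and the bounding-box coordinates are assumptions made for the example and are not taken from this repository's script.

```python
from typing import Optional, Tuple

BBox = Tuple[float, float, float, float]


def bbox_selector(x0: float, y0: float, x1: float, y1: float,
                  tag: str = "LTTextLineHorizontal") -> str:
    """Build a pdfquery selector matching `tag` elements inside a bounding box."""
    return f'{tag}:in_bbox("{x0}, {y0}, {x1}, {y1}")'


def extract_region(pdf_path: str, bbox: BBox,
                   xml_path: Optional[str] = None) -> str:
    """Load a PDF with pdfquery and return the text found inside `bbox`.

    If `xml_path` is given, the parsed tree is also written out as XML,
    which helps when looking up the coordinates of the data you want.
    """
    import pdfquery  # imported here so the helper above works without it

    pdf = pdfquery.PDFQuery(pdf_path)
    pdf.load()
    if xml_path:
        pdf.tree.write(xml_path, pretty_print=True)
    return pdf.pq(bbox_selector(*bbox)).text()
```

A call might look like `extract_region("sample.pdf", (0, 700, 300, 720), xml_path="sample.xml")`, where the coordinates are read from the generated XML.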

Newer version

This version uses PyMuPDF (imported as fitz) to extract the highlighted text from a PDF. It requires no XML conversion and is a lot faster and more accurate. Before running it, run the command: pip install PyMuPDF (the separate fitz package on PyPI is unrelated to PyMuPDF and does not need to be installed)
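One way highlighted text can be pulled out with PyMuPDF is sketched below, assuming the highlights are stored as standard PDF highlight annotations. The function names (`quads_to_rects`, `words_in_rect`, `extract_highlights`) are hypothetical, and the repository's own script may work differently.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]


def quads_to_rects(vertices: Sequence[Sequence[float]]) -> List[Rect]:
    """Turn a flat list of quad corner points (4 per quad) into bounding boxes."""
    rects = []
    for i in range(0, len(vertices) - 3, 4):
        xs = [p[0] for p in vertices[i:i + 4]]
        ys = [p[1] for p in vertices[i:i + 4]]
        rects.append((min(xs), min(ys), max(xs), max(ys)))
    return rects


def words_in_rect(words: Sequence[tuple], rect: Rect) -> str:
    """Join the words whose centre point lies inside `rect`.

    `words` uses the layout of page.get_text("words"):
    (x0, y0, x1, y1, text, block_no, line_no, word_no).
    """
    x0, y0, x1, y1 = rect
    picked = []
    for w in words:
        cx, cy = (w[0] + w[2]) / 2, (w[1] + w[3]) / 2
        if x0 <= cx <= x1 and y0 <= cy <= y1:
            picked.append(w[4])
    return " ".join(picked)


def extract_highlights(pdf_path: str) -> List[str]:
    """Return the text under each highlight annotation in the PDF."""
    import fitz  # PyMuPDF; imported here so the helpers above work without it

    results = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            words = page.get_text("words")
            for annot in page.annots():
                if annot.type[0] != 8:  # 8 is the highlight annotation type
                    continue
                rects = quads_to_rects(annot.vertices or [])
                parts = [words_in_rect(words, r) for r in rects]
                results.append(" ".join(p for p in parts if p))
    return results
```

Testing each word by its centre point, rather than by any overlap with the highlight, is a design choice of this sketch: it avoids picking up words from neighbouring lines that the highlight box only grazes.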
