In this file, I used the PDFQuery library, together with a PDF-to-XML conversion, to extract specific PDF data.
This version uses PyMuPDF (imported in Python as fitz) to extract the highlighted text from a PDF. It requires no XML conversion and is a lot faster and somewhat more accurate.
Before running it, run the command:
pip install PyMuPDF
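
As a rough illustration of the approach, here is a minimal sketch of reading highlighted text with PyMuPDF. It is not the actual script from this project. The function names extract_highlights and quad_bounding_boxes and the file name example.pdf are made up for the example, and it assumes a reasonably recent PyMuPDF where page.get_text accepts a clip argument.

```python
def quad_bounding_boxes(vertices):
    """Group highlight vertices (4 points per quad) into (x0, y0, x1, y1) boxes."""
    boxes = []
    for i in range(0, len(vertices) - 3, 4):
        xs = [p[0] for p in vertices[i:i + 4]]
        ys = [p[1] for p in vertices[i:i + 4]]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


def extract_highlights(pdf_path):
    """Return a list of (page_number, text), one entry per highlight annotation."""
    # PyMuPDF is imported inside the function (an assumption made for this
    # sketch) so that the helper above can be used without it installed.
    import fitz

    results = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            for annot in page.annots():
                if annot.type[0] != 8:  # 8 is the highlight annotation type
                    continue
                # A highlight that spans several lines has one quad per line.
                boxes = quad_bounding_boxes(annot.vertices or [])
                parts = [
                    page.get_text("text", clip=fitz.Rect(box)).strip()
                    for box in boxes
                ]
                text = " ".join(part for part in parts if part)
                results.append((page.number + 1, text))
    return results
```

Calling extract_highlights("example.pdf") should give one (page number, text) pair per highlight. Clipping by rectangle can pick up characters that only partly overlap the highlighted area, so the result may need some trimming.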