A Streamlit web application for converting various document formats using the Docling library.
Streamlit Application: https://doclingconvert.streamlit.app/
- Convert multiple document formats (PDF, DOCX, HTML, PPTX, Images)
- Multiple output formats (Markdown, JSON, YAML)
- OCR support for scanned documents
- Advanced image resolution settings
- Clean and intuitive interface
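
The features above map fairly directly onto Docling's Python API. The following is a minimal sketch of how such a conversion could be wired up, not the code in this repository's `app.py`. The helper names `build_converter` and `serialize` are hypothetical, the default `images_scale` of 2.0 is an arbitrary illustration, and YAML output is assumed to go through PyYAML.

```python
import json


def build_converter(use_ocr: bool = True, images_scale: float = 2.0):
    """Build a Docling converter for PDFs (hypothetical helper, not from app.py)."""
    # Imported lazily so the module still loads where Docling is not installed.
    from docling.datamodel.base_models import InputFormat
    from docling.datamodel.pipeline_options import PdfPipelineOptions
    from docling.document_converter import DocumentConverter, PdfFormatOption

    options = PdfPipelineOptions()
    options.do_ocr = use_ocr              # OCR for scanned documents
    options.images_scale = images_scale   # higher values render page images at higher resolution
    return DocumentConverter(
        format_options={InputFormat.PDF: PdfFormatOption(pipeline_options=options)}
    )


def serialize(document, output_format: str) -> str:
    """Turn a converted Docling document into Markdown, JSON, or YAML text."""
    fmt = output_format.lower()
    if fmt == "markdown":
        return document.export_to_markdown()
    data = document.export_to_dict()
    if fmt == "json":
        return json.dumps(data, indent=2, ensure_ascii=False)
    if fmt == "yaml":
        import yaml  # PyYAML; assumed to be installed alongside the app
        return yaml.safe_dump(data, allow_unicode=True, sort_keys=False)
    raise ValueError(f"Unsupported output format: {output_format}")
```

With Docling installed, usage would look roughly like `result = build_converter().convert("report.pdf")` followed by `serialize(result.document, "Markdown")`, where `report.pdf` is a placeholder path.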
- Clone the repository:
git clone https://github.com/hparreao/doclingconverter.git
cd doclingconverter
- Create a virtual environment and install dependencies:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
- Run the app locally:
streamlit run app.py
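
If the app fails to start, a quick check that the main packages are importable can help. This is a small sketch, assuming `streamlit` and `docling` are the core dependencies; `requirements.txt` remains the authoritative list, and `missing_packages` is a hypothetical helper name.

```python
import importlib.util

# Assumed core dependencies; adjust to match requirements.txt.
REQUIRED = ("streamlit", "docling")


def missing_packages(names=REQUIRED):
    """Return the top-level packages from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Run it inside the activated virtual environment so that it inspects the same interpreter the app will use.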
- Select the document type from the dropdown
- Upload your document
- Choose the desired output format
- Adjust advanced settings if needed
- Click "Start Conversion"
- Download the converted file
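
The steps above can be sketched as a small Streamlit script. This is an illustration of the flow under stated assumptions, not the repository's actual `app.py`: the widget labels, the `EXTENSIONS` mapping, and the `download_name` and `run_conversion` helpers are all hypothetical, and `run_conversion` is a stand-in for the real Docling call.

```python
# Hypothetical mapping from output format to file extension.
EXTENSIONS = {"Markdown": "md", "JSON": "json", "YAML": "yaml"}


def download_name(uploaded_name: str, output_format: str) -> str:
    """Swap the uploaded file's last extension for the output format's extension."""
    stem = uploaded_name.rsplit(".", 1)[0]
    return f"{stem}.{EXTENSIONS[output_format]}"


def run_conversion(data: bytes, doc_type: str, output_format: str,
                   use_ocr: bool, images_scale: float) -> str:
    """Placeholder: a real app would hand the bytes to Docling here."""
    return (f"Converted {len(data)} bytes of {doc_type} to {output_format} "
            f"(ocr={use_ocr}, images_scale={images_scale})")


def main():
    import streamlit as st  # imported here so the helpers work without Streamlit

    doc_type = st.selectbox("Document type", ["PDF", "DOCX", "HTML", "PPTX", "Image"])
    uploaded = st.file_uploader("Upload your document")
    output_format = st.selectbox("Output format", list(EXTENSIONS))

    with st.expander("Advanced settings"):
        use_ocr = st.checkbox("Enable OCR", value=True)
        images_scale = st.slider("Image resolution scale", 1.0, 4.0, 2.0)

    if uploaded is not None and st.button("Start Conversion"):
        text = run_conversion(uploaded.getvalue(), doc_type, output_format,
                              use_ocr, images_scale)
        st.download_button("Download", data=text,
                           file_name=download_name(uploaded.name, output_format))


if __name__ == "__main__":
    main()
```

Keeping the filename logic in a plain function makes it easy to test without starting Streamlit.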