Extract your docs using Unstructured-IO

Description

This Streamlit app is designed to help you analyze and extract valuable insights from challenging data formats commonly found in enterprise settings, such as HTML, PDF, CSV, PNG, PPTX, and more.

This app uses unstructured.io as a base library, providing an easy way to extract and convert unstructured data into a format compatible with popular vector databases and LLM frameworks. With this tool, you can streamline complex data handling and ensure compatibility with your preferred data analysis pipelines.
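As a rough illustration of what the extraction step looks like in code, the sketch below calls the unstructured library's `partition` function and tallies the returned elements by type. The helper names `extract_elements` and `count_element_types` are hypothetical and not taken from this app. Running `extract_elements` assumes `unstructured` is installed along with whatever extras your file type needs.

```python
from collections import Counter


def extract_elements(path):
    """Partition a local file into a list of unstructured elements."""
    # Imported lazily so the counting helper works without the library installed.
    from unstructured.partition.auto import partition

    return partition(filename=path)


def count_element_types(elements):
    """Count elements by class name (for example Title or NarrativeText)."""
    return Counter(type(el).__name__ for el in elements)
```

A typical call would be `count_element_types(extract_elements("example.pdf"))`, where `example.pdf` is a placeholder path.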

Supported file types:

| Category  | Document Types                                                      |
|-----------|---------------------------------------------------------------------|
| Plaintext | .txt, .eml, .msg, .xml, .html, .md, .rst, .json, .rtf               |
| Images    | .jpeg, .png                                                         |
| Documents | .doc, .docx, .ppt, .pptx, .pdf, .odt, .epub, .csv, .tsv, .xlsx      |

Find out more at unstructured.io.
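If you want to check a file before uploading it, the supported types listed above can be expressed as a small lookup. This is a minimal sketch, not code from the app: `SUPPORTED` and `file_category` are hypothetical names, and the match is on the file extension only.

```python
from pathlib import Path

# Extensions copied from the supported file types listed above.
SUPPORTED = {
    "Plaintext": {".txt", ".eml", ".msg", ".xml", ".html", ".md", ".rst", ".json", ".rtf"},
    "Images": {".jpeg", ".png"},
    "Documents": {".doc", ".docx", ".ppt", ".pptx", ".pdf", ".odt", ".epub",
                  ".csv", ".tsv", ".xlsx"},
}


def file_category(filename):
    """Return the category for a filename, or None if the extension is not listed."""
    ext = Path(filename).suffix.lower()  # case-insensitive match on the extension
    for category, extensions in SUPPORTED.items():
        if ext in extensions:
            return category
    return None
```

For example, `file_category("report.PDF")` returns `"Documents"`, while a `.zip` archive returns `None`.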

To get started, upload any supported document and it will be shown in the preview. You can also adjust the parameters to fine-tune your tests.

Accessing the App

You can access the app on Streamlit Community Cloud at https://unstructured-demo.streamlit.app/.

Getting Started

The app does not require an API key to function. Extractions are processed on the Streamlit Cloud server unless you choose to process them on the unstructured.io server.

If you choose the unstructured.io API, the app includes a temporary key, but it may be rate-limited. You can create your own key at Unstructured. After obtaining your API key, select unstructured.io API, enter your key, and upload your file.
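The choice between local and hosted processing can be sketched roughly as follows. This is an assumption about how such a switch might look, not the app's actual code: `extract` is a hypothetical helper, and unlike the app it requires the caller to supply a key. It relies on `partition` and `partition_via_api` from the unstructured library, and the default API endpoint depends on the installed version.

```python
def extract(path, use_api=False, api_key=""):
    """Partition a file locally, or through the hosted API when use_api is True."""
    if use_api:
        # Fail early rather than sending a request without credentials.
        if not api_key:
            raise ValueError("an API key is required when use_api=True")
        from unstructured.partition.api import partition_via_api

        return partition_via_api(filename=path, api_key=api_key)

    # Default path: process the file on the machine running this code.
    from unstructured.partition.auto import partition

    return partition(filename=path)
```

Calling `extract("example.pdf")` processes the placeholder file locally, and `extract("example.pdf", use_api=True, api_key="...")` sends it to the hosted service instead.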

Feedback

If you have any feedback or questions about this app, please reach out to me on Twitter at @rririanto.

Thank you for checking out the tool!