The ETL+ Platform for GenAI

Welcome to Unstructured! We're trusted by 82% of the Fortune 1000 and used by over 60,000 organizations globally.

We automatically transform complex, unstructured data into clean, structured data for GenAI applications. Data is routed through dynamic transformation and enrichment pipelines to deliver the highest quality output to your LLM. Continuously. Effortlessly. Automatically.
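
To make "unstructured to structured" concrete, here is a minimal sketch in plain Python. It is an illustration only: `Element`, `partition_text`, and `chunk_elements` are hypothetical names, the labeling heuristics are assumptions, and none of this is the API of the unstructured library or the Platform.

```python
import re
from dataclasses import dataclass


@dataclass
class Element:
    """One structured piece of a document (hypothetical type for this sketch)."""
    category: str  # "Title", "ListItem", or "NarrativeText"
    text: str


def partition_text(raw: str) -> list[Element]:
    """Split raw text on blank lines and label each block with a simple heuristic."""
    elements = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        text = " ".join(block.split())  # collapse internal whitespace
        if not text:
            continue
        if re.match(r"^([-*•]|\d+\.)\s+", text):
            # Leading bullet or "1." marker: treat as a list item and drop the marker.
            category = "ListItem"
            text = re.sub(r"^([-*•]|\d+\.)\s+", "", text)
        elif len(text) < 60 and not text.endswith((".", "!", "?", ":")):
            # Short line without closing punctuation: assume it is a heading.
            category = "Title"
        else:
            category = "NarrativeText"
        elements.append(Element(category, text))
    return elements


def chunk_elements(elements: list[Element], max_chars: int = 200) -> list[str]:
    """Group consecutive elements into chunks, starting a new chunk at each Title.

    Chunks aim to stay under max_chars; a single element longer than
    max_chars is kept whole rather than split.
    """
    chunks, current = [], ""
    for el in elements:
        starts_section = el.category == "Title"
        too_long = bool(current) and len(current) + 1 + len(el.text) > max_chars
        if current and (starts_section or too_long):
            chunks.append(current)
            current = ""
        current = f"{current}\n{el.text}" if current else el.text
    if current:
        chunks.append(current)
    return chunks


sample = """Quarterly Report

Revenue grew in the third quarter. Costs were flat.

- Hire two engineers

- Renew the data contract
"""

elements = partition_text(sample)
for el in elements:
    print(el.category, "|", el.text)
```

Calling `chunk_elements(elements)` then groups those elements into text chunks that could be passed on to an embedding step. Real documents (PDFs, HTML, scanned images) need far more than these heuristics, which is the gap the open source libraries and the Platform are meant to fill.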

To get started, check out our open source offerings:

Ready for a more performant and reliable experience? Try Unstructured for free today and experience the next evolution of ETL for GenAI applications.

Learn more:

  • Company Website - Transform complex, unstructured data into clean, structured data. Securely. Continuously. Effortlessly.
  • Extensive Documentation - Our comprehensive docs cover everything from getting started guides to in-depth API references, ensuring you have the resources you need to succeed.
  • Developer Community on Slack - Connect with fellow developers, share knowledge, and get support through our vibrant community Slack channel.

Popular repositories

  1. unstructured Public

    Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website …

    HTML · 12.8k stars · 1k forks

  2. unstructured-api Public

    Python · 814 stars · 172 forks

  3. unstructured-inference Public

    Python · 193 stars · 69 forks

  4. pipeline-sec-filings Public archive

    Preprocessing pipeline notebooks and API supporting text extraction from SEC documents

    Jupyter Notebook · 147 stars · 36 forks

  5. unstructured-python-client Public

    A Python client for the Unstructured Platform API

    Python · 106 stars · 18 forks

  6. unstructured-ingest Public

    HTML · 98 stars · 52 forks

Repositories

Showing 10 of 40 repositories
  • unstructured-python-client Public

    A Python client for the Unstructured Platform API

    Python · 106 stars · MIT license · 18 forks · 13 open issues · 3 open pull requests · Updated Oct 4, 2025
  • notebooks Public
    Jupyter Notebook · 1 star · 0 forks · 0 open issues · 3 open pull requests · Updated Oct 3, 2025
  • unstructured-js-client Public

    A JavaScript/TypeScript client for the Unstructured Platform API

    TypeScript · 57 stars · MIT license · 17 forks · 6 open issues · 1 open pull request · Updated Oct 3, 2025
  • unstructured-platform-plugins Public
    Python · 5 stars · Apache-2.0 license · 2 forks · 0 open issues · 2 open pull requests · Updated Oct 1, 2025
  • docs Public

    Documentation for all Unstructured products and libraries

    MDX · 7 stars · 25 forks · 0 open issues · 8 open pull requests · Updated Oct 1, 2025
  • unstructured Public

    Convert documents to structured data effortlessly. Unstructured is an open-source ETL solution for transforming complex documents into clean, structured formats for language models. Visit our website to learn more about our enterprise-grade Platform product for production-grade workflows, partitioning, enrichments, chunking, and embedding.

    HTML · 12,817 stars · Apache-2.0 license · 1,049 forks · 178 open issues (3 issues need help) · 49 open pull requests · Updated Sep 26, 2025
  • unstructured-api Public
    Python · 814 stars · Apache-2.0 license · 172 forks · 31 open issues · 8 open pull requests · Updated Sep 25, 2025
  • unstructured-ingest Public
    HTML · 98 stars · Apache-2.0 license · 52 forks · 56 open issues · 26 open pull requests · Updated Sep 23, 2025
  • rag-over-hybrid-data-sources Public

    Pipeline from two sources (S3 and Elasticsearch) into a RAG database.

    Jupyter Notebook · 0 stars · 0 forks · 0 open issues · 1 open pull request · Updated Sep 15, 2025
  • unstructured-inference Public
    Python · 193 stars · Apache-2.0 license · 69 forks · 23 open issues · 16 open pull requests · Updated Sep 12, 2025

People

This organization has no public members.