What is Docling?

Docling is an open-source document processing library that converts messy documents into structured data, simplifying downstream document and AI processing. It detects tables, formulas, and reading order, performs OCR, and more.

Docling parses diverse formats, including PDFs with advanced layout understanding, and integrates with the generative AI ecosystem.

Key Features

  • Advanced PDF Understanding - Page layout detection, reading order, table structure, formulas, code blocks, and image classification
  • Multi-Format Support - PDF, DOCX, PPTX, XLSX, HTML, Markdown, images, audio files, and more
  • Unified Document Representation - All documents converted to a consistent DoclingDocument format
  • Flexible Export Formats - Markdown, HTML, JSON, DocTags, and plain text
  • Local Execution - Process documents locally for sensitive data and air-gapped environments
  • AI Framework Integrations - Native integrations with LangChain, LlamaIndex, CrewAI, Haystack, and more
  • Extensive OCR Support - Advanced OCR for scanned PDFs and images
  • Visual Language Models - Support for VLMs like GraniteDocling with MLX acceleration
  • Simple CLI - Command-line interface for quick conversions
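
The conversion and export features listed above can be sketched with Docling's Python API. This is a minimal sketch, assuming the docling package is installed (pip install docling). The helper names convert_document and output_path_for are hypothetical and introduced here only for illustration, and only two of the export formats are shown.

```python
from pathlib import Path


def output_path_for(source: str, out_dir: str = "out") -> Path:
    """Hypothetical helper: map a source file path to a Markdown output path."""
    # Path(...).stem drops the directory part and the final extension.
    return Path(out_dir) / f"{Path(source).stem}.md"


def convert_document(source: str) -> dict:
    """Convert one document and return two of its export forms."""
    # Imported lazily so this module still loads where docling is not installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)  # a local path or a URL
    doc = result.document  # the unified DoclingDocument representation
    return {
        "markdown": doc.export_to_markdown(),
        "dict": doc.export_to_dict(),
    }
```

For example, convert_document("report.pdf")["markdown"] would return the Markdown text for a local file named report.pdf (a placeholder name), which could then be written to output_path_for("report.pdf"). The first conversion may take a while because model weights are typically downloaded on first use. For a quick one-off conversion, the command-line equivalent is roughly: docling report.pdf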

History & Development

Docling was started by the AI for knowledge team at IBM Research Zurich. The project represents years of research and development in document understanding, layout analysis, and information extraction.

In recognition of its value to the open-source community, Docling is now hosted as a project in the LF AI & Data Foundation, ensuring long-term sustainability and community governance.

LF AI & Data Foundation

Docling is part of the LF AI & Data Foundation, a project of the Linux Foundation. The foundation provides a neutral forum for open-source innovation in AI, machine learning, and data technologies.

Being part of the LF AI & Data Foundation ensures:

  • Neutral governance and community-driven development
  • Long-term sustainability and support
  • Collaboration with other leading AI projects
  • Enterprise-grade support and best practices

Technical Report

For detailed information about Docling's inner workings, architecture, and capabilities, refer to the Docling Technical Report:

Docling Technical Report
Deep Search Team
arXiv:2408.09869
DOI: 10.48550/arXiv.2408.09869

The technical report provides comprehensive information about:

  • Document processing pipeline architecture
  • Layout detection and reading order algorithms
  • Table extraction and structure recognition
  • OCR and image processing capabilities
  • Performance benchmarks and evaluations

License

Docling is released under the MIT License, making it free to use, modify, and distribute for both commercial and non-commercial purposes.

The models Docling uses are licensed separately, and some have different terms. Check the license in each model's original package before use.

Contributing

Docling is an open-source project and welcomes contributions from the community. Whether you're fixing bugs, adding features, improving documentation, or helping with testing, your contributions are valuable.

To contribute:

  • Visit the GitHub repository
  • Read the contributing guidelines
  • Submit issues and pull requests
  • Join discussions and help improve the project

Community

Join the Docling community to get help, share ideas, and stay updated.

Acknowledgments

Docling is made possible by:

  • The AI for knowledge team at IBM Research Zurich
  • The open-source community contributors
  • The LF AI & Data Foundation
  • All users and developers who provide feedback and help improve the project

Citation

If you use Docling in your research or projects, please consider citing:

BibTeX
@techreport{Docling,
  author = {Deep Search Team},
  month = {8},
  title = {Docling Technical Report},
  url = {https://arxiv.org/abs/2408.09869},
  eprint = {2408.09869},
  doi = {10.48550/arXiv.2408.09869},
  version = {1.0.0},
  year = {2024}
}

IBM ❤️ Open Source AI

Docling is part of IBM's commitment to open-source AI. The project was started by IBM Research Zurich's AI for knowledge team, demonstrating IBM's dedication to advancing AI research and making cutting-edge technology accessible to everyone.
