Mistral adds a new API that turns any PDF document into an AI-ready Markdown file with pictures

Unlike most OCR APIs, Mistral OCR is a multimodal API, meaning that it can detect when there are illustrations and photos intertwined with blocks of text. The OCR API creates bounding boxes around these graphical elements and includes them in the output.

Mistral OCR also doesn’t just output a big wall of text; the output is formatted in Markdown, a formatting syntax that developers use to add links, headers, and other formatting elements to a plain text file.

LLMs rely heavily on Markdown for their training datasets. Similarly, when you use an AI assistant, such as Mistral’s Le Chat or OpenAI’s ChatGPT, they often generate Markdown to create bullet lists, add links, or put some elements in bold.

[…]

Mistral OCR is available on Mistral’s own API platform or through its cloud partners (AWS, Azure, Google Cloud Vertex, etc.). And for companies working with classified or sensitive data, Mistral offers on-premise deployment.

[…]

Companies and developers will most likely use Mistral OCR with a RAG (aka Retrieval-Augmented Generation) system to use multimodal documents as input in an LLM. And there are many potential use cases. For instance, we could envisage law firms using it to help them swiftly plough through huge volumes of documents.

RAG is a technique that’s used to retrieve data and use it as context with a generative AI model.

Source: Mistral adds a new API that turns any PDF document into an AI-ready Markdown file | TechCrunch

Robin Edgar

Organisational Structures | Technology and Science | Military, IT and Lifestyle consultancy | Social, Broadcast & Cross Media | Flying aircraft

 robin@edgarbv.com  https://www.edgarbv.com