Prepare your unstructured data for training/augmenting an LLM.

DocDat is a Python package that uses object detection, OCR, and LLMs to transform raw documents into structured data for use in RAG or fine-tuning systems.
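As a rough illustration of the three stages named above (detect regions, read their text, structure the result), here is a minimal sketch. It does not use DocDat's actual API: `Region`, `detect_regions`, `run_ocr`, `structure`, `process_page`, and `to_jsonl` are all hypothetical names, and each stage is a plain-Python stand-in for a real model. The only point is how the stages chain together and how the output could be emitted as JSON Lines, a format commonly fed to RAG indexers and fine-tuning jobs.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Region:
    """A detected layout region on a page (hypothetical type, not DocDat's)."""
    kind: str     # e.g. "title" or "paragraph"
    content: str  # stand-in for the cropped image an OCR engine would read


def detect_regions(page: str) -> List[Region]:
    # Stand-in for object detection: every non-empty line becomes a region,
    # and the first one is labeled as the title.
    lines = [ln.strip() for ln in page.splitlines() if ln.strip()]
    return [
        Region("title" if i == 0 else "paragraph", ln)
        for i, ln in enumerate(lines)
    ]


def run_ocr(region: Region) -> str:
    # Stand-in for OCR: the "image" is already text in this sketch.
    return region.content


def structure(kind: str, text: str) -> Dict[str, str]:
    # Stand-in for the LLM step that turns raw text into a structured record.
    return {"type": kind, "text": text}


def process_page(
    page: str,
    detect: Callable[[str], List[Region]] = detect_regions,
    ocr: Callable[[Region], str] = run_ocr,
    to_record: Callable[[str, str], Dict[str, str]] = structure,
) -> List[Dict[str, str]]:
    # Chain the stages: detection -> OCR -> structuring.
    return [to_record(region.kind, ocr(region)) for region in detect(page)]


def to_jsonl(records: List[Dict[str, str]]) -> str:
    # One JSON object per line.
    return "\n".join(json.dumps(record) for record in records)


if __name__ == "__main__":
    sample_page = "Invoice 42\n\nTotal due: 10 EUR\n"
    print(to_jsonl(process_page(sample_page)))
```

Because each stage is passed in as a callable, a real detector, OCR engine, or LLM client could be swapped in without changing `process_page`.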
