Andre Louis

3 days ago

Andre Louis
3 days ago

A utility that extracts text from images or PDFs using a local or remote OpenAI-compatible API endpoint with vision-capable multimodal models. For PDFs, each page is rendered to an image and processed sequentially; outputs are concatenated into a single Markdown document. github.com/robert-mcdermott/do…

GitHub - robert-mcdermott/doc2md: A utility that extracts text from images or PDFs using a local or remote OpenAI-compatible LLM API endpoint with vision-capable multimodal models. For PDFs, each page is rendered to an image and processed sequentially; ou

A utility that extracts text from images or PDFs using a local or remote OpenAI-compatible LLM API endpoint with vision-capable multimodal models. For PDFs, each page is rendered to an image and pr...

^GitHub

reshared this

Andre Louis

Andre Louis 3 days ago • •

GitHub - robert-mcdermott/doc2md: A utility that extracts text from images or PDFs using a local or remote OpenAI-compatible LLM API endpoint with vision-capable multimodal models. For PDFs, each page is rendered to an image and processed sequentially; ou

Andre Louis
3 days ago