Convert PDFs, DOCX, and scanned documents to clean structured markdown or JSON
DocParse converts PDF, DOCX, and PPTX files into clean, structured markdown or JSON. It uses IBM Docling with built-in OCR, so even scanned documents and image-heavy slides are handled automatically. Hand it a document URL and get back readable, copy-paste-ready content in seconds.
Title — A short label for your conversion job.
Example: Convert Q4 earnings report to markdown
Description — Must contain a publicly accessible URL pointing to the document you want converted. The URL should end with a supported extension (.pdf, .docx, or .pptx). You can also provide a base64 data-URI instead (e.g. data:application/pdf;base64,...).
Examples:
https://example.com/reports/annual-2025.pdf
https://example.com/slides/deck.pptx
Requirements (optional) — Choose the output format. Send a JSON object:
{"output_format": "markdown"} (default)
{"output_format": "json"}
If omitted, you'll receive markdown output.
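The input rules above can be checked locally before a job is submitted. The following Python sketch is illustrative only: the helper names (`is_supported_document_url`, `to_data_uri`, `build_requirements`) are hypothetical and not part of DocParse, and treating "publicly accessible" as an http or https scheme is an assumption.

```python
import base64
import json
from urllib.parse import urlparse

# Extensions named in the Description field rules.
SUPPORTED_EXTENSIONS = (".pdf", ".docx", ".pptx")
# Output formats named in the Requirements field rules.
OUTPUT_FORMATS = ("markdown", "json")


def is_supported_document_url(url: str) -> bool:
    """Return True if the URL is http(s) and ends with a supported extension.

    Assumption: a "publicly accessible URL" means an http or https URL.
    """
    if urlparse(url).scheme not in ("http", "https"):
        return False
    return url.lower().endswith(SUPPORTED_EXTENSIONS)


def to_data_uri(raw: bytes, mime_type: str = "application/pdf") -> str:
    """Encode raw document bytes as a base64 data-URI."""
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def build_requirements(output_format: str = "markdown") -> str:
    """Serialize the optional Requirements object as a JSON string."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"output_format must be one of {OUTPUT_FORMATS}")
    return json.dumps({"output_format": output_format})


if __name__ == "__main__":
    print(is_supported_document_url("https://example.com/slides/deck.pptx"))
    print(build_requirements("json"))
```

The extension check is deliberately strict: a URL with a trailing query string would be rejected, which matches the wording "should end with a supported extension" but may be tighter than what the service actually enforces.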
page_count — number of pages
tables_found — number of tables detected
figures_found — number of figures detected
source_filename — the original filename
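The four fields listed above (page_count, tables_found, figures_found, source_filename) could be modeled on the client side roughly as follows. This is a sketch under assumptions: the listing does not document how these fields are delivered, so the flat dictionary shape, the `ConversionMetadata` class name, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class ConversionMetadata:
    """Typed view of the four documented metadata fields."""

    page_count: int
    tables_found: int
    figures_found: int
    source_filename: str

    @classmethod
    def from_mapping(cls, data: Mapping[str, Any]) -> "ConversionMetadata":
        # Assumption: the fields arrive as top-level keys of one mapping.
        # A missing key raises KeyError rather than being silently defaulted.
        return cls(
            page_count=int(data["page_count"]),
            tables_found=int(data["tables_found"]),
            figures_found=int(data["figures_found"]),
            source_filename=str(data["source_filename"]),
        )


if __name__ == "__main__":
    # Illustrative values only, not real output.
    sample = {
        "page_count": 12,
        "tables_found": 3,
        "figures_found": 0,
        "source_filename": "annual-2025.pdf",
    }
    print(ConversionMetadata.from_mapping(sample))
```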