# Using the Docling agent skill [Agent Skills](https://agentskills.io/specification) are folders of instructions that AI coding agents (Cursor, Claude Code, GitHub Copilot, etc.) can load when relevant. ## Where this bundle lives - **Cursor (local):** `~/.cursor/skills/docling-document-intelligence/` (or copy this folder there). - **Docling repository (docs + PRs):** `docs/examples/agent_skill/docling-document-intelligence/` in [github.com/docling-project/docling](https://github.com/docling-project/docling). The two trees are kept in sync; use either source. ## Install (copy into your agent's skills directory) ```bash # From a checkout of the Docling repo cp -r docs/examples/agent_skill/docling-document-intelligence ~/.cursor/skills/ # Or copy from another machine / archive into e.g. ~/.claude/skills/ ``` No extra config is required beyond installing Python dependencies (below). ## Usage Open your agent-enabled IDE and ask, for example: ``` Parse report.pdf and give me a structural outline ``` ``` Convert https://arxiv.org/pdf/2408.09869 to markdown ``` ``` Chunk invoice.pdf for RAG ingestion with 512 token chunks ``` ``` Process scanned.pdf using the VLM pipeline ``` The agent should read `SKILL.md`, match the task, and run the appropriate `docling` CLI command or Python API call. ## Running the docling CLI directly ```bash pip install docling docling-core # Basic conversion to Markdown docling report.pdf --output /tmp/ # JSON output docling report.pdf --to json --output /tmp/ # Custom OCR engine docling report.pdf --ocr-engine rapidocr --output /tmp/ # VLM pipeline docling scanned.pdf --pipeline vlm --output /tmp/ # VLM with specific model docling scanned.pdf --pipeline vlm --vlm-model granite_docling --output /tmp/ # Remote VLM services docling doc.pdf --pipeline vlm --enable-remote-services --output /tmp/ ``` ## Evaluate and refine ```bash docling report.pdf --to json --output /tmp/ docling report.pdf --to md --output /tmp/ python3 scripts/docling-evaluate.py /tmp/report.json --markdown /tmp/report.md ``` If the report shows `warn` or `fail`, follow `recommended_actions`, re-convert with `docling` using the suggested flags, and optionally append a note to `improvement-log.md` (see `SKILL.md` section 7). ## What the skill covers | Task | How to ask | |---|---| | Parse PDF / DOCX / PPTX / HTML / image | "parse this file" | | Convert to Markdown | "convert to markdown" | | Export as structured JSON | "export as JSON" | | Chunk for RAG | "chunk for RAG", "prepare for ingestion" | | Analyze structure | "show me the headings and tables" | | Use VLM pipeline | "use the VLM pipeline", "process scanned PDF" | | Use remote inference | "use vLLM", "call the API pipeline" | ## Further reading - [Agent Skills specification](https://agentskills.io/specification) - [Docling documentation](https://docling-project.github.io/docling/) - [Docling CLI reference](https://docling-project.github.io/docling/reference/cli/) - [Docling GitHub](https://github.com/docling-project/docling)