feat: Implementation of HTML backend with headless browser (#2969)

- Implementation of an HTML backend that (optionally) uses a headless browser (via Playwright) to render HTML pages into images, and adds provenance with bounding boxes to all elements in the converted Docling document.
- Conversion preserves the reading order given by the HTML DOM tree
- Added support for HTML "input" fields: checkboxes, radio buttons, text inputs, etc.
- Added support for the key-value convention in HTML (i.e. elements with ids "key1" and "key1_value1" are paired as key-values; see the test cases for examples)
- Added a heuristic that glues independent inline HTML elements, each containing single-character text, into larger text blocks
- Support for inline styling (bold, italic, etc.)
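
The key-value id convention mentioned above can be illustrated with a minimal sketch. This is not the backend's actual code: the names `IdTextCollector` and `pair_key_values` are hypothetical, it uses only the Python standard library, and it assumes that a value element's id is the key element's id followed by `_value` and an arbitrary suffix (the real matching rule may differ; see the test cases).

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked as "open".
VOID_TAGS = {"input", "br", "img", "hr", "meta", "link"}


class IdTextCollector(HTMLParser):
    """Collect the text content of every element that carries an id."""

    def __init__(self):
        super().__init__()
        self.text_by_id = {}  # id -> accumulated text
        self._open = []       # stack of (tag, id or None) for open elements

    def handle_starttag(self, tag, attrs):
        elem_id = dict(attrs).get("id")
        if elem_id is not None:
            self.text_by_id.setdefault(elem_id, "")
        if tag not in VOID_TAGS:
            self._open.append((tag, elem_id))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag (tolerates unclosed tags).
        for i in range(len(self._open) - 1, -1, -1):
            if self._open[i][0] == tag:
                del self._open[i:]
                break

    def handle_data(self, data):
        # Text belongs to every enclosing element that has an id.
        for _, elem_id in self._open:
            if elem_id is not None:
                self.text_by_id[elem_id] += data


def pair_key_values(html):
    """Pair elements by id: '<key>' with '<key>_value<suffix>' (assumed rule)."""
    collector = IdTextCollector()
    collector.feed(html)
    collector.close()
    texts = {k: v.strip() for k, v in collector.text_by_id.items()}
    pairs = {}
    for elem_id, text in texts.items():
        key_id, sep, _ = elem_id.partition("_value")
        if sep and key_id in texts:
            pairs.setdefault(texts[key_id], []).append(text)
    return pairs


html = (
    '<div><span id="key1">Name</span><span id="key1_value1">Ada</span>'
    '<span id="key2">City</span><span id="key2_value1">Zurich</span>'
    '<p id="other">unrelated</p></div>'
)
pairs = pair_key_values(html)
```

Elements whose id does not follow the convention (such as `other` above) are simply left unpaired in this sketch.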

Signed-off-by: Maksym Lysak <mly@zurich.ibm.com>
Co-authored-by: Maksym Lysak <mly@zurich.ibm.com>

Maxim Lysak committed 1mo ago

1c74a9b9c7c2019b85abef8f0f94381a83b721df

Parent: 90d6dd4

Committed by GitHub <noreply@github.com> on 3/24/2026, 1:28:57 PM