fix(omml): correct LaTeX output for fractions, math operators, and functions (#3122)
* fix(omml): correct LaTeX output for fractions, math operators, and functions
Fixes three related bugs in OMML-to-LaTeX conversion:
A) Fraction raised to a power now produces correct grouping braces:
{\frac{(x-c)}{v}}^{2} instead of \frac{(x-c)}{v}^{2}
Adds dedicated do_ssub/do_ssup/do_ssubsup handlers that wrap
complex base expressions (fractions, radicals) in braces.
B) EN DASH (U+2013) and CIRCUMFLEX ACCENT (U+005E) inside math runs
are now mapped to their math-mode equivalents (- and ^) instead of
being escaped as \text{\textendash} and \text{\textasciicircum}.
C) Adds missing standard math functions to the FUNC dict: log, ln,
exp, det, gcd, deg, hom, ker, dim, arg, inf, sup, lim, Pr.
These now emit proper LaTeX commands (e.g. \log) instead of
falling back to plain italic text.
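A minimal sketch of the behaviour described in (A) and (C), not the
actual omml.py code. The names wrap_base, superscript, function_name,
COMPLEX_PREFIXES and FUNC_SKETCH are hypothetical, and the value format
of the real FUNC dict is assumed rather than copied:

```python
# Illustrative only: every name here is hypothetical, not the omml.py API.

# Bases that need grouping braces before a script is attached (Bug A).
COMPLEX_PREFIXES = ("\\frac", "\\sqrt")

# Simplified stand-in for the FUNC dict, limited to the names added in (C).
FUNC_SKETCH = {
    name: "\\" + name
    for name in (
        "log", "ln", "exp", "det", "gcd", "deg", "hom",
        "ker", "dim", "arg", "inf", "sup", "lim", "Pr",
    )
}


def wrap_base(base: str) -> str:
    """Wrap a fraction or radical base in braces; leave simple bases alone."""
    if base.startswith(COMPLEX_PREFIXES):
        return "{" + base + "}"
    return base


def superscript(base: str, exponent: str) -> str:
    """Build base^{exponent}, grouping the base first when it is complex."""
    return f"{wrap_base(base)}^{{{exponent}}}"


def function_name(name: str) -> str:
    """Return the LaTeX command for a known function, else the plain name."""
    return FUNC_SKETCH.get(name, name)
```

With these assumptions, superscript(r"\frac{(x-c)}{v}", "2") gives
{\frac{(x-c)}{v}}^{2}, a simple base such as x stays unwrapped, and
function_name("log") gives \log while an unknown name is returned
unchanged.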
Closes #3120
Signed-off-by: giulio-leone <giulio.leone@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
* test(omml): add test documents for OMML-to-LaTeX conversion bugs
Add three minimal DOCX files exercising the fixed edge cases:
- omml_frac_superscript.docx: fraction as superscript base (Bug A)
- omml_text_escapes_in_math.docx: en-dash and caret in math runs (Bug B)
- omml_func_log.docx: log function recognition (Bug C)
Each file includes matching groundtruth (md, json, itxt).
Requested-by: @dolfim-ibm
Signed-off-by: Giulio Leone <giulioleone10@gmail.com>
Signed-off-by: giulio-leone <giulio.leone@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
* fix(omml): avoid double-wrapping nested sub/sup containers
Signed-off-by: giulio-leone <giulio.leone@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
* fix(omml): fix Bug B caret escape + use issue #3120 test documents
Bug B fix: prevent escape_latex from re-escaping characters that
process_unicode intentionally mapped to math operators. The caret
character U+005E inside <m:r><m:t> math runs was being converted
to ^ by _MATH_CHAR_MAP, then immediately re-escaped to \^ by
escape_latex. Now do_r restores math-mapped chars after escaping.
Result: x - y\^2 → x - y^2 (correct superscript)
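One way to picture the fix, as a sketch under stated assumptions: the
escaper and character map below are toy stand-ins for escape_latex and
_MATH_CHAR_MAP, convert_run is a hypothetical name, and the sentinel
technique is an illustration that may differ from how do_r actually
restores the characters.

```python
# Toy stand-ins; the real escape_latex and _MATH_CHAR_MAP in omml.py differ.
MATH_CHAR_MAP = {"\u2013": "-", "\u005e": "^"}
ESCAPES = {"^": "\\^", "_": "\\_", "%": "\\%"}


def escape_latex(text: str) -> str:
    """Escape LaTeX-special characters one character at a time."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)


def convert_run(text: str) -> str:
    """Escape a math run while keeping math-mapped characters as operators."""
    # Swap each math-mapped character for a private-use sentinel so the
    # escaper never sees it.
    sentinels = {src: chr(0xE000 + i) for i, src in enumerate(MATH_CHAR_MAP)}
    protected = "".join(sentinels.get(ch, ch) for ch in text)
    escaped = escape_latex(protected)
    # Restore the sentinels as their math-mode equivalents after escaping.
    for src, sentinel in sentinels.items():
        escaped = escaped.replace(sentinel, MATH_CHAR_MAP[src])
    return escaped
```

Here escape_latex("x - y^2") alone reproduces the old x - y\^2 output,
while convert_run("x \u2013 y^2") yields x - y^2 and characters outside
the map, such as %, are still escaped.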
Test documents: replace the minimal programmatic fixtures (~1.2 KB)
with the real Word documents supplied by the reporter of issue #3120
(smroels, ~37 KB each). Regenerate all groundtruth.
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
* test: regenerate groundtruth for omml_text_escapes_in_math
Update .itxt to use proper indented-text export format (item hierarchy)
and refresh .json to match current converter output.
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
* test(omml): regenerate indented text snapshots
The OMML regression documents were exported into the .itxt fixtures using the
wrong format, so the real DOCX end-to-end check failed even though the rebased
converter output was correct.
Regenerate the two broken indented-text snapshots from the current branch so
the MS Word E2E test verifies the actual converter behavior.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* style(omml): apply ruff format normalization
Normalize the multiline condition in omml.py to match the repository
ruff-format output so the pre-commit gate stays clean on the refreshed
PR head.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
* DCO Remediation Commit for giulio-leone <giulio97.leone@gmail.com>
I, giulio-leone <giulio97.leone@gmail.com>, hereby add my Signed-off-by to this commit: 08001d9c5ce1e4c12e31031529b15454d664f85e
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
---------
Signed-off-by: giulio-leone <giulio.leone@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
Signed-off-by: Giulio Leone <giulioleone10@gmail.com>
Co-authored-by: giulio-leone <giulio.leone@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Giulio Leone committed
e36125ba2ddfbe584fc752e6dc7ca0f0f8f58d87
Parent: a0fc3c9
Committed by GitHub <noreply@github.com>
on 3/25/2026, 6:07:31 AM