
fix(omml): correct LaTeX output for fractions, math operators, and functions (#3122)

* fix(omml): correct LaTeX output for fractions, math operators, and functions

Fixes three related bugs in OMML-to-LaTeX conversion:

A) Fraction raised to a power now produces correct grouping braces:
   {\frac{(x-c)}{v}}^{2} instead of \frac{(x-c)}{v}^{2}
   Adds dedicated do_ssub/do_ssup/do_ssubsup handlers that wrap
   complex base expressions (fractions, radicals) in braces.

B) EN DASH (U+2013) and CIRCUMFLEX ACCENT (U+005E) inside math runs are
   now mapped to their math-mode equivalents (- and ^) instead of
   being escaped as \text{\textendash} and \text{\textasciicircum}.

C) Adds missing standard math functions to the FUNC dict: log, ln,
   exp, det, gcd, deg, hom, ker, dim, arg, inf, sup, lim, Pr.
   These now emit proper LaTeX commands (e.g. \log) instead of
   falling back to plain italic text.
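
As a rough illustration of fixes A and C, the sketch below braces a fraction or radical base before attaching an exponent and looks function names up in a table. The names `wrap_base`, `superscript`, and `function_name` are hypothetical, and this `FUNC` table only mirrors the names listed above; the real handlers and dict in omml.py may be structured differently.

```python
# Hypothetical table mirroring the function names listed in this commit;
# the actual FUNC dict in omml.py may hold more entries or another format.
FUNC = {
    name: "\\" + name
    for name in (
        "log", "ln", "exp", "det", "gcd", "deg", "hom",
        "ker", "dim", "arg", "inf", "sup", "lim", "Pr",
    )
}


def wrap_base(base: str) -> str:
    """Brace a base that starts with a fraction or radical command."""
    if base.startswith(("\\frac", "\\sqrt")):
        return "{" + base + "}"
    return base


def superscript(base: str, exponent: str) -> str:
    """Attach an exponent, grouping complex bases first (Bug A)."""
    return f"{wrap_base(base)}^{{{exponent}}}"


def function_name(name: str) -> str:
    """Return the LaTeX command for a known function, else the plain name (Bug C)."""
    return FUNC.get(name, name)


print(superscript(r"\frac{(x-c)}{v}", "2"))
print(function_name("log"))
```

With these assumptions, the fraction base comes out as `{\frac{(x-c)}{v}}^{2}` and a simple base such as `x` is left unbraced.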

Closes #3120

Signed-off-by: giulio-leone <giulio.leone@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>

* test(omml): add test documents for OMML-to-LaTeX conversion bugs

Add three minimal DOCX files exercising the fixed edge cases:
- omml_frac_superscript.docx: fraction as superscript base (Bug A)
- omml_text_escapes_in_math.docx: en-dash and caret in math runs (Bug B)
- omml_func_log.docx: log function recognition (Bug C)

Each file includes matching groundtruth (md, json, itxt).

Requested-by: @dolfim-ibm
Signed-off-by: Giulio Leone <giulioleone10@gmail.com>
Signed-off-by: giulio-leone <giulio.leone@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>

* fix(omml): avoid double-wrapping nested sub/sup containers

Signed-off-by: giulio-leone <giulio.leone@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>

* fix(omml): fix Bug B caret escape + use issue #3120 test documents

Bug B fix: prevent escape_latex from re-escaping characters that
process_unicode intentionally mapped to math operators.  The caret
character U+005E inside <m:r><m:t> math runs was being converted
to ^ by _MATH_CHAR_MAP, then immediately re-escaped to \^ by
escape_latex.  Now do_r restores math-mapped chars after escaping.

Result: x - y\^2 → x - y^2 (correct superscript)
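
The map, escape, restore order described above can be sketched as follows. `MATH_CHAR_MAP`, `ESCAPES`, `escape_latex`, and `convert_run` here are simplified stand-ins written for illustration; they are assumptions, not the actual `_MATH_CHAR_MAP`, `escape_latex`, or `do_r` code in omml.py.

```python
# Assumed mapping of characters to their math-mode equivalents.
MATH_CHAR_MAP = {"\u2013": "-", "^": "^"}

# Assumed subset of the escapes a text-mode escaper would apply.
ESCAPES = {"^": r"\^", "_": r"\_", "%": r"\%"}


def escape_latex(text: str) -> str:
    """Escape characters that are special in LaTeX text mode."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)


def convert_run(text: str) -> str:
    """Map math characters, escape the rest, then undo escapes of mapped chars."""
    mapped = "".join(MATH_CHAR_MAP.get(ch, ch) for ch in text)
    escaped = escape_latex(mapped)
    # Restore every math operator that the escaper rewrote.
    for target in set(MATH_CHAR_MAP.values()):
        rewritten = escape_latex(target)
        if rewritten != target:
            escaped = escaped.replace(rewritten, target)
    return escaped


print(convert_run("x \u2013 y^2"))
```

In this sketch the caret survives as a superscript operator, while characters outside the map, such as `%`, stay escaped.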

Test documents: replace minimal programmatic fixtures (~1.2 KB)
with the real Word documents from issue #3120 reporter (smroels,
~37 KB each).  Regenerate all groundtruth.

Signed-off-by: giulio-leone <giulio97.leone@gmail.com>

* test: regenerate groundtruth for omml_text_escapes_in_math

Update .itxt to use proper indented-text export format (item hierarchy)
and refresh .json to match current converter output.

Signed-off-by: giulio-leone <giulio97.leone@gmail.com>

* test(omml): regenerate indented text snapshots

The OMML regression documents were exported into the .itxt fixtures using the
wrong format, so the real DOCX end-to-end check failed even though the rebased
converter output was correct.

Regenerate the two broken indented-text snapshots from the current branch so
the MS Word E2E test verifies the actual converter behavior.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* style(omml): apply ruff format normalization

Normalize the multiline condition in omml.py to match the repository
ruff-format output so the pre-commit gate stays clean on the refreshed
PR head.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>

* DCO Remediation Commit for giulio-leone <giulio97.leone@gmail.com>

I, giulio-leone <giulio97.leone@gmail.com>, hereby add my Signed-off-by to this commit: 08001d9c5ce1e4c12e31031529b15454d664f85e

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>

---------

Signed-off-by: giulio-leone <giulio.leone@users.noreply.github.com>
Signed-off-by: giulio-leone <giulio97.leone@gmail.com>
Signed-off-by: Giulio Leone <giulioleone10@gmail.com>
Co-authored-by: giulio-leone <giulio.leone@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Giulio Leone committed
e36125ba2ddfbe584fc752e6dc7ca0f0f8f58d87
Parent: a0fc3c9
Committed by GitHub <noreply@github.com> on 3/25/2026, 6:07:31 AM