fix: let user params override engine defaults in API VLM engine (#3116)
* fix: let user params override engine defaults in API VLM engine

  User/model spec params now take precedence over engine defaults (temperature, max_tokens). This enables Azure OpenAI compatibility, where max_completion_tokens must be used instead of max_tokens, and where temperature=0.0 may be rejected by some deployments. When max_completion_tokens is set in user params, a conflicting max_tokens is automatically removed.

  Fixes #3112

  Signed-off-by: majiayu000 <1835304752@qq.com>

* test: add coverage for user param override in API VLM engine

  Signed-off-by: majiayu000 <1835304752@qq.com>

* fix: remove mock unit tests per reviewer feedback

  Signed-off-by: majiayu000 <1835304752@qq.com>

---------

Signed-off-by: majiayu000 <1835304752@qq.com>
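The override behavior the commit describes can be sketched roughly as follows. This is a minimal illustration, not the actual code from the PR; the function name merge_request_params and the dict-based signature are hypothetical, but the precedence rule (user params win) and the max_completion_tokens/max_tokens conflict handling follow the commit message.

```python
def merge_request_params(engine_defaults: dict, user_params: dict) -> dict:
    """Merge engine defaults with user/model-spec request params.

    User params take precedence over engine defaults. If the user supplies
    max_completion_tokens (required by Azure OpenAI deployments), any
    conflicting max_tokens default is dropped so both are never sent.
    """
    merged = {**engine_defaults, **user_params}  # user values override defaults
    if "max_completion_tokens" in user_params:
        merged.pop("max_tokens", None)  # avoid sending both token limits
    return merged


# Example: Azure-style user params override the engine defaults.
params = merge_request_params(
    {"temperature": 0.0, "max_tokens": 512},
    {"temperature": 0.7, "max_completion_tokens": 1024},
)
# params == {"temperature": 0.7, "max_completion_tokens": 1024}
```

Note the design choice implied by the fix: the engine defaults are applied first and then overlaid, rather than unconditionally writing defaults last, which is what previously prevented user overrides.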
lif committed
fdf5e20ccd8ae85ea73effa6c743910ed295564d
Parent: f0e3d1d
Committed by GitHub <noreply@github.com>
on 3/23/2026, 10:38:00 AM