
onnx: add com.microsoft MultiHeadAttention handler

Standard (bidirectional) multi-head attention over unpacked query/key/value,
lowered onto tract Sdpa, with optional present_key/present_value outputs.
Bias, attention/padding masks, packed QKV and past KV cache are rejected
with clear errors.
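
For reference, the computation this handler covers can be sketched in plain Rust. This is a minimal illustration, not the tract implementation: the function names (`multi_head_attention`, `to_present_layout`) are hypothetical, and the tensor layouts are assumptions based on the ONNX Runtime contrib-op description (query/key/value as [batch, seq, heads * head_size], present_key/present_value as [batch, heads, seq, head_size], default scale 1/sqrt(head_size)). It assumes no bias, no mask, no past KV, and equal head sizes for key and value.

```rust
/// Bidirectional multi-head attention over unpacked Q/K/V.
/// q is [batch, s_q, heads * d]; k and v are [batch, s_kv, heads * d], row-major.
/// Returns the output as [batch, s_q, heads * d].
fn multi_head_attention(
    q: &[f32],
    k: &[f32],
    v: &[f32],
    batch: usize,
    s_q: usize,
    s_kv: usize,
    heads: usize,
    d: usize,
) -> Vec<f32> {
    let hidden = heads * d;
    assert_eq!(q.len(), batch * s_q * hidden);
    assert_eq!(k.len(), batch * s_kv * hidden);
    assert_eq!(v.len(), batch * s_kv * hidden);
    // Assumed default scale: 1 / sqrt(head_size).
    let scale = 1.0 / (d as f32).sqrt();
    let mut out = vec![0.0f32; batch * s_q * hidden];
    let mut scores = vec![0.0f32; s_kv];
    for b in 0..batch {
        for h in 0..heads {
            for i in 0..s_q {
                let q_off = (b * s_q + i) * hidden + h * d;
                // Scaled dot product of this query row with every key row.
                let mut max = f32::NEG_INFINITY;
                for j in 0..s_kv {
                    let k_off = (b * s_kv + j) * hidden + h * d;
                    let dot: f32 = (0..d).map(|t| q[q_off + t] * k[k_off + t]).sum();
                    scores[j] = dot * scale;
                    max = max.max(scores[j]);
                }
                // Numerically stable softmax over the key axis (no masking).
                let mut sum = 0.0f32;
                for s in scores.iter_mut() {
                    *s = (*s - max).exp();
                    sum += *s;
                }
                // Weighted sum of value rows for this head.
                for j in 0..s_kv {
                    let w = scores[j] / sum;
                    let v_off = (b * s_kv + j) * hidden + h * d;
                    for t in 0..d {
                        out[q_off + t] += w * v[v_off + t];
                    }
                }
            }
        }
    }
    out
}

/// Rearranges [batch, seq, heads, d] into [batch, heads, seq, d], the layout
/// assumed here for the present_key / present_value outputs.
fn to_present_layout(x: &[f32], batch: usize, seq: usize, heads: usize, d: usize) -> Vec<f32> {
    assert_eq!(x.len(), batch * seq * heads * d);
    let mut out = vec![0.0f32; x.len()];
    for b in 0..batch {
        for s in 0..seq {
            for h in 0..heads {
                for t in 0..d {
                    out[((b * heads + h) * seq + s) * d + t] =
                        x[((b * seq + s) * heads + h) * d + t];
                }
            }
        }
    }
    out
}
```

With no past KV cache, present_key and present_value in this sketch are just the key and value inputs passed through `to_present_layout`.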

Validated bit-close against onnxruntime (output + present_key/value).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

czoli1976 committed 20d ago

7b2ea86fb9d7c321632f331ead65bbe996fa180c

Parent: 4876cc7

Committed by Mathieu Poumeyrol <kali@users.noreply.github.com> on 5/27/2026, 3:11:46 PM