ggml-webgpu: address precision issues for multimodal (#22808)

* fix(mixed-types): use f32 for precision and update the shared memory calculation logic for f32

* fix(unary): correct the GELU, GELU quick, and GELU erf functions

* fix(flash-attn-tile): fix the hardcoded V type

* fix(flash_attn): fix tile path

* fix: pass editorconfig and address the type conflicts

* fix: remove redundant pipeline keys

* fix: remove inline min/max group size functions and revert the flash attn path order

* fix: use clamp to avoid NaN for GELU

* fix: use the right range for exp; 80 is safer for f32 exp
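
The last two items can be illustrated with a minimal host-side C++ sketch. It is not the shader code from this change: the helper names, the choice to express tanh through exp, and the exact place where the clamp is applied are all assumptions made for illustration. It only shows why an unclamped f32 exp can turn GELU into NaN and why a bound of 80 stays below the f32 overflow point (around exp(88.7)).

```cpp
#include <algorithm>
#include <cmath>

// Assumed bound taken from the commit text: exp(80) is about 5.5e34,
// which is still finite in f32 (overflow happens near exp(88.7)).
constexpr float kExpLimit = 80.0f;

// Hypothetical helper: tanh written via exp with no clamp.
// For large x, exp(2x) overflows to +inf and inf / inf yields NaN.
inline float tanh_via_exp_naive(float x) {
    const float e = std::exp(2.0f * x);
    return (e - 1.0f) / (e + 1.0f);
}

// Hypothetical helper: same formula, but the exp argument is clamped
// to [-kExpLimit, kExpLimit] so the result saturates at -1 or +1.
inline float tanh_via_exp_clamped(float x) {
    const float t = std::max(-kExpLimit, std::min(kExpLimit, 2.0f * x));
    const float e = std::exp(t);
    return (e - 1.0f) / (e + 1.0f);
}

// GELU (tanh approximation) built on the clamped helper.
inline float gelu_tanh(float x) {
    const float k = 0.7978845608f;  // sqrt(2 / pi)
    const float inner = k * (x + 0.044715f * x * x * x);
    return 0.5f * x * (1.0f + tanh_via_exp_clamped(inner));
}
```

With the clamp in place, large positive inputs pass through unchanged and large negative inputs go to zero, which is the expected asymptotic behavior of GELU.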

Chen Yuan committed 1mo ago

239a497e5f6a19dffcad4d4e601d66b1a8e51895

Parent: 89730c8

Committed by GitHub <noreply@github.com> on 5/12/2026, 2:27:04 PM