[npu] support triangle attention for llama (#5130)

* update fused attn

* update sdpa

* tri attn

* update triangle

* import

* fix

* fix

Author: Xuanlei Zhao

d6df19bae7cdb9e116c1f218a4465855623c80b1

Parent: f4e72c9

Committed by GitHub <noreply@github.com> on 11/30/2023, 6:21:30 AM