A high-throughput and memory-efficient inference and serving engine for LLMs
[Bugfix][Spec Decode] Fix extract_hidden_states for VLM models (#38987)
Signed-off-by: Aaron Batilo <abatilo@coreweave.com>
Aaron Batilo committed
9a528260ef648500262709550807c292098a70c0
Parent: 968ed02
Committed by GitHub <noreply@github.com>
on 4/5/2026, 9:41:54 AM