Add a flag `vocab_as_last_dim` to the MoE model config to control whether activations are reshaped so that `vocab_size` is the last dimension.
PiperOrigin-RevId: 508757132
Chaochao Yan committed
b86b777b88a6c126b8bd8c9035cd089937498a1e
Parent: 7077a83
Committed by A. Unique TensorFlower <gardener@tensorflow.org>
on 2/10/2023, 10:36:57 PM