Add a flag `vocab_as_last_dim` to the MoE model config to control whether activations are reshaped to have `vocab_size` as the last dimension.
PiperOrigin-RevId: 508757132
Chaochao Yan committed
1c99aadd7b0cd97407b80f83b6b760a2d9fb0b73
Parent: 946d989
Committed by A. Unique TensorFlower <gardener@tensorflow.org>
on 2/10/2023, 10:37:18 PM