Transformer multi-GPU support, remove multi_gpu flag, distribution helper functions (#4457)
* Add DistributionStrategy to transformer model
* Add num_gpu flag
* Calculate per-device batch size for transformer
* Remove reference to flags_core
* Add synthetic data option to transformer
* Fix typo
* Add import back in
* Use hierarchical copy
* Address PR comments
* Lint
* Fix spaces
* Group train op together to fix single-GPU error
* Fix translate bug (sorted_keys is a dict, not a list)
* Change params to a default dict (translate.py was throwing errors because params didn't have the TPU parameters)
* Address PR comments; removed multi_gpu flag + more
* Fix lint
* Fix more lints
* Add TODO for synthetic dataset
* Update docs
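The per-device batch size calculation mentioned above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name, signature, and error message are assumptions. The idea is that with DistributionStrategy, the global batch size must divide evenly across GPUs so each replica gets an equal share.

```python
def per_device_batch_size(batch_size, num_gpus):
    # Hypothetical helper (names assumed, not from the repo):
    # split a global batch size evenly across GPUs, raising a
    # helpful error when the split is uneven.
    if num_gpus <= 1:
        return batch_size

    remainder = batch_size % num_gpus
    if remainder:
        raise ValueError(
            "Batch size %d is not divisible by the number of GPUs (%d); "
            "try %d or %d instead."
            % (batch_size, num_gpus,
               batch_size - remainder,
               batch_size + num_gpus - remainder))
    return batch_size // num_gpus


print(per_device_batch_size(4096, 8))  # → 512
```

A helper like this is typically called once when building the input pipeline, so each replica's dataset batches to the per-device size while learning-rate schedules still see the global batch size.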
Katherine Wu committed
29c9f9855711006704b8fa9364f966d67694287e
Parent: e7957b7
Committed by GitHub <noreply@github.com>
on 6/12/2018, 4:54:13 PM