Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Add Github Action to run TPU tests. (#2376)
* Add Github Action to run TPU tests.
* Trigger new Github Actions run.
* Clean up more comments.
* Use different fixed version of ml-testing-accelerators and update config to match.
* use cluster in us-central1-a
* Run 'gcloud logging read' directly without 'echo' to preserve newlines.
* cat coverage.xml on the TPU VM side and upload xml on the Github Action side
* Use new commit on ml-testing-accelerators so command runs fully.
* Preserve newlines in the xml and use if: always() temporarily to upload codecov
* Use pytorch_lightning for coverage instead of pytorch-lightning
* Remove the debug cat of coverage xml
* Apply suggestions from code review
* jsonnet rename
* name
* add codecov flags
* add codecov flags
* codecov
* codecov
* revert codecov
* Clean up after apt-get and remove old TODOs.
* More codefactor cleanups.
* drone
* drone
* disable codecov
* cleaning
* docker py versions
* docker py 3.7
* readme
* bash
* docker
* freeze conda
* py3.6
* Stop using apt-get clean.
* Dont rm pytorch-lightning
* Update docker/tpu/Dockerfile
* Longer timeout in the Github Action to wait for GKE to finish.
* job1
* job2
* job3

Co-authored-by: Jirka Borovec <[email protected]>
Co-authored-by: Jirka <[email protected]>
zcain117 committed
1a40963d1dd64ba61edd91ab552611ba0dc3367c
Parent: dcd6000
Committed by GitHub <[email protected]>
on 7/2/2020, 1:44:19 AM