Mirror of https://github.com/karpathy/nanoGPT.git
Synced 2026-04-19 18:06:55 +00:00
Commit Graph
Branches: autocast_wip, bias_test, init_test, master, multi_node_ddp, tie_weights
- 97e2ab1b8d
  enhance readme, add some todos
  Andrej Karpathy
  2022-12-29 05:23:36 +00:00
- cc11744131
  Add MIT LICENSE file
  Andrej
  2022-12-28 21:11:26 -08:00
- dea1507252
  add support for DDP training. the scaling timings right now do not look good by default, have to dig more into
  Andrej Karpathy
  2022-12-29 05:06:07 +00:00
- ee6459f1d0
  readme tweaks
  Andrej Karpathy
  2022-12-29 02:00:25 +00:00
- 3000cf5dda
  add pytorch profiler support. not sure how to support both profiler and simple benchmarking, a bit gnarly atm hmm
  Andrej Karpathy
  2022-12-29 01:49:53 +00:00
- b760ef1358
  add data loading into benchmarking as well, just for completeness
  Andrej Karpathy
  2022-12-29 00:05:32 +00:00
- 70b5d93aee
  add benchmarking script v0
  Andrej Karpathy
  2022-12-28 23:55:43 +00:00
- 5d2b4807bf
  adding a lightweight configurator that may be a terrible mistake lol. also adding configs to evaluate the baseline GPT2 versions released by OpenAI on OWT. we have some ways to go to match those numbers atm
  Andrej Karpathy
  2022-12-28 23:31:23 +00:00
- c9fe00c0e9
  small readme clarification and training script defaults changes
  Andrej Karpathy
  2022-12-28 01:45:55 +00:00
- fe8042867c
  first very bad commit
  Andrej Karpathy
  2022-12-28 00:58:19 +00:00