Commit Graph

  • e77182c355 revert unrelated changes Arthur 2026-03-19 16:54:53 +01:00
  • 92c625889f Fix glm dsa (#44564) Arthur 2026-03-19 16:13:33 +01:00
  • eb1712baf1 revert deepseek IlyasMoutawwakil 2026-03-19 15:50:44 +01:00
  • 0ad9382e0a style IlyasMoutawwakil 2026-03-19 15:38:43 +01:00
  • 656161b753 Merge branch 'main' into hf-exporters IlyasMoutawwakil 2026-03-19 15:38:29 +01:00
  • 884333368f 🚨🚨 Refactor Image Processors to support different backends (#43514) Yoni Gozlan 2026-03-19 10:33:28 -04:00
  • e3c01db741 update XingweiDeng 2026-03-19 22:28:34 +08:00
  • 17e894bf0d allow doctest of some stuff (dynamo specifically) IlyasMoutawwakil 2026-03-19 15:28:11 +01:00
  • 7805ec28c1 [generate] Never use cache_position anymore in generation (#44816) Cyril Vallez 2026-03-19 15:18:26 +01:00
  • 68ca6b4ac0 Fix passing of args Sai-Suraj-27 2026-03-19 14:11:55 +00:00
  • bf06736b33 Merge branch 'main' of github.com:huggingface/transformers into fix_qwen3_omni_config Sai-Suraj-27 2026-03-19 14:11:15 +00:00
  • a9fb26054d docs IlyasMoutawwakil 2026-03-19 15:10:35 +01:00
  • be8d8a4cae Update AFMoE architecture to use v5-style MoE impl (#44063) AutumnAurelium 2026-03-19 07:00:45 -07:00
  • 25a91051ed Fix KeyError in convert_to_native_format for dict vocab (#44452) Weiguang Li 2026-03-19 21:46:43 +08:00
  • eb61fba3a4 update XingweiDeng 2026-03-19 21:17:51 +08:00
  • 70e454c973 fix: XLNet: relative_positional_encoding computes on CPU every forward (#44782) Zakir Jiwani 2026-03-19 09:16:31 -04:00
  • be6cf08486 Fix annotations reader for python 3.14 in PreTrainedModel (#44672) Wenchen Li 2026-03-19 09:15:25 -04:00
  • e0f69d3810 [CB] Better parametrization for compile (#44578) Rémi Ouazan 2026-03-19 12:37:05 +01:00
  • 2fd4cdf6f1 test(kernels): align kernel mask funciton cleanup kernel-mask-function David Corvoysier 2026-03-19 12:15:22 +01:00
  • 1155918bef final onnx patches and fixes IlyasMoutawwakil 2026-03-19 11:36:40 +01:00
  • cecacd374f Fix KeyError when patching mistral regex (#43376) Leonardo Emili 2026-03-19 08:55:40 +01:00
  • f4a7c27a2e Fix Qwen3OmniMoeConfig has no attribute initializer_range Sai-Suraj-27 2026-03-19 07:06:57 +00:00
  • 529504b2fa Correct code block formatting in weightconverter.md (#44839) Zhu Lin 2026-03-19 14:59:32 +08:00
  • 5f8d1c6bac fixup jina v3 (doesnt use token type ids it seems) and inherit more fixups for nomic bert vasqu 2026-03-19 06:02:04 +01:00
  • a388046f60 Final changes yonigozlan 2026-03-19 03:45:57 +00:00
  • 06519011db Refactor nomic bert to inherit more from Jina V3 Sonny Cooper 2026-03-19 01:01:26 +00:00
  • e5a31a08f7 Merge branch 'main' into bert-rope-model Sonny Cooper 2026-03-18 23:43:36 +00:00
  • 16a5b0936d deepseek_v2, deepseek_v3, and modernbert fix for having incorrect tokenizer class on the hub (#44801) Ita Zaporozhets 2026-03-18 22:21:04 +01:00
  • 84a9dc81f1 Merge remote-tracking branch 'upstream/main' into feat/uvdoc yonigozlan 2026-03-18 21:16:36 +00:00
  • e3eae48dff skip modernbert decoder test 3outeille 2026-03-18 20:52:37 +00:00
  • 3bc60f2e61 Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 21:25:27 +01:00
  • 6ad2cb9277 linting 3outeille 2026-03-18 20:21:35 +00:00
  • 7315a488b6 better way to pass eager by default 3outeille 2026-03-18 20:18:50 +00:00
  • c55f65056b [Model] Add PP-OCRv5_server_rec and PP-OCRv5_mobile_rec models Support (#44808) zhang-prog 2026-03-19 04:11:28 +08:00
  • d2fd9d15bf force eager attn 3outeille 2026-03-18 20:04:06 +00:00
  • b65acdd698 Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 20:47:04 +01:00
  • 2db5f951e9 fucking dist.barrier() 3outeille 2026-03-18 19:46:09 +00:00
  • 28dedc1a35 for save/load test, just test it on 3 steps only 3outeille 2026-03-18 19:06:56 +00:00
  • 21950930a6 Add Jina-Embeddings-V3 Model (#44251) Sai-Suraj-27 2026-03-19 00:35:49 +05:30
  • 3d8d54fbce linting 3outeille 2026-03-18 18:54:08 +00:00
  • 12b4c7596c Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 19:50:53 +01:00
  • d6fcf62131 Remove skipped test for FSDP all-in-one due to recurrent-specific shape mismatch 3outeille 2026-03-18 18:49:49 +00:00
  • dd86a46d08 Merge branch 'fsdp-vs-ddp' of https://github.com/huggingface/transformers into fsdp-vs-ddp 3outeille 2026-03-18 18:47:38 +00:00
  • 421861dba8 undo fsdp2 test all 3outeille 2026-03-18 18:47:04 +00:00
  • 2513237cbe feat(ci): added a network debug report (#44636) Tarek Ziade 2026-03-18 19:34:50 +01:00
  • 814ddd8330 Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 19:11:07 +01:00
  • bd8209dcb5 add logging to profile how long test takes 3outeille 2026-03-18 18:06:01 +00:00
  • 981ca7bc26 Add GreedyLR adaptive learning rate scheduler (#44271) Bala Krishnamoorthy 2026-03-18 09:49:07 -07:00
  • d00640b6e6 Fix unexpected position_ids keys when loading OwlViT models (#44508) Kartik Pawade 2026-03-18 21:57:02 +05:30
  • 24a4dc22b9 Update more modular examples (#44834) Cyril Vallez 2026-03-18 18:11:51 +01:00
  • b49249811a Fix and re-run modular converter on examples (#44833) Cyril Vallez 2026-03-18 18:00:41 +01:00
  • 4ec84a022d Remove cache_position in more models (4 and last one) (#44828) Cyril Vallez 2026-03-18 16:58:42 +01:00
  • 9d4bb26186 Add custom_generate integration analysis to investigation notes static_shape_generation David Corvoysier 2026-03-18 13:14:35 +00:00
  • 7ded19ec5d Update investigation notes with full neuron_sample comparison and benchmarks David Corvoysier 2026-03-18 13:02:41 +00:00
  • e3106956c0 Pre-build static 4D causal mask for decode loop in _static_sample David Corvoysier 2026-03-18 13:02:34 +00:00
  • 8ea9438f32 Move output_ids and EOS masking to CPU in _static_sample David Corvoysier 2026-03-18 13:00:58 +00:00
  • 20cb3c8e11 Move EOS tracking to CPU in _static_sample to avoid blocking device sync David Corvoysier 2026-03-16 17:24:47 +00:00
  • 55fc5dc903 Fix left-padding position_ids in _static_sample for batched generation David Corvoysier 2026-03-16 17:24:31 +00:00
  • 68c6465a7f Add investigation notes for remaining neuron_sample optimizations David Corvoysier 2026-03-16 17:20:34 +00:00
  • 5ea795baec Auto-dispatch to _static_sample when StaticCache is detected David Corvoysier 2026-03-16 16:01:30 +00:00
  • e7ea0ee913 Replace scatter_ with direct indexing for output buffer in _static_sample David Corvoysier 2026-03-16 16:01:08 +00:00
  • ee8603b61e Replace .item() device-host sync with loop index in _static_sample David Corvoysier 2026-03-16 16:00:42 +00:00
  • 2b2bcb016c Remove redundant past_key_values reassignment in _static_sample David Corvoysier 2026-03-16 15:59:57 +00:00
  • be80b096d1 Bypass prepare_inputs_for_generation in _static_sample decode loop David Corvoysier 2026-03-16 15:59:36 +00:00
  • 3dc07ec0fd Add _static_sample method for static-shape generation with StaticCache David Corvoysier 2026-03-16 15:58:50 +00:00
  • 779cd2d692 Fix loading issue in Sam3 (#44831) Raushan Turganbay 2026-03-18 16:36:01 +01:00
  • 16fb5833b4 update XingweiDeng 2026-03-18 23:09:53 +08:00
  • 4b28cb89c2 update XingweiDeng 2026-03-18 23:07:37 +08:00
  • b052b57247 Merge branch 'fsdp-vs-ddp' of https://github.com/huggingface/transformers into fsdp-vs-ddp 3outeille 2026-03-18 14:48:49 +00:00
  • 0046f00ab8 restoring test traning mixin 3outeille 2026-03-18 14:48:37 +00:00
  • 443ebcd6ee Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 15:39:33 +01:00
  • 222e9ac117 cleaning 3outeille 2026-03-18 14:37:04 +00:00
  • c87deb21b9 feat(integration): Add KubeflowCallback to enable automatic progress … (#44487) Abhijeet Dhumal 2026-03-18 20:03:24 +05:30
  • 236b10e746 trigger fsdp mixin only to the 10 most download models in dense and moe category 3outeille 2026-03-18 14:29:18 +00:00
  • aa57e1cd2f Add GGUF support for MiniMax-M2.1 model (#44526) PikaPikachu 2026-03-18 22:25:28 +08:00
  • ec9991c08d Merge branch 'fsdp-vs-ddp' of https://github.com/huggingface/transformers into fsdp-vs-ddp 3outeille 2026-03-18 14:23:57 +00:00
  • 13d8c0ba84 Revert "undo grouped test" 3outeille 2026-03-18 14:22:31 +00:00
  • e0a09ef9b2 Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 15:21:15 +01:00
  • 670f3c85fd Centralize AI agent templates in .ai (#44489) Tarek Ziade 2026-03-18 15:06:58 +01:00
  • 498846b86d Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 14:43:26 +01:00
  • fd568474db unskip FSDP test for BLT 3outeille 2026-03-18 13:43:11 +00:00
  • 0e706081f3 undo grouped test 3outeille 2026-03-18 13:42:45 +00:00
  • 7cd2dd86ce support xxxFast alias in v5 tokenizers (#44766) Ita Zaporozhets 2026-03-18 14:40:02 +01:00
  • 37c4db7a45 Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 14:15:18 +01:00
  • e2ff1b1e6d dont test mixed precision as it is too flaky, End to end results are already proving it's working 3outeille 2026-03-18 13:14:34 +00:00
  • 83a6c5b577 Remove cache_position in more models (3) (#44759) Cyril Vallez 2026-03-18 14:09:35 +01:00
  • a29c3ff7b1 update XingweiDeng 2026-03-18 21:01:55 +08:00
  • 961facfdb3 Merge remote-tracking branch 'official/main' into feat/uvdoc XingweiDeng 2026-03-18 20:39:37 +08:00
  • 9f93b61209 Fix supports_{tp/pp}_plan (#44696) Harry Mellor 2026-03-18 12:22:17 +00:00
  • 09fea1e6e9 [CI] Temporarily skip Mistral4 tests as they almost all fail (#44825) Cyril Vallez 2026-03-18 13:05:50 +01:00
  • 2bbbbee35b update flex attention to use return_aux instead of return_lse when torch verison >= 2.9 (#44684) Neil Tenenholtz 2026-03-18 07:44:18 -04:00
  • a1acc28593 make tests run on CPU only 3outeille 2026-03-18 11:31:57 +00:00
  • 6be343f638 Merge branch 'fsdp-vs-ddp' of https://github.com/huggingface/transformers into fsdp-vs-ddp 3outeille 2026-03-18 10:47:49 +00:00
  • eb2a1c0397 fix RuntimeError: expected data_ptr to be aligned to 16 bytes 3outeille 2026-03-18 10:46:47 +00:00
  • 58c79245bf Cleanup transformers-chat-serve-progress-2 Lysandre 2026-03-18 19:42:46 +09:00
  • c7b71101f2 Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 11:40:16 +01:00
  • 6f30825877 [Gemma] Update conversion scripts for Transformers v5 Comaptibility (#44631) Ryan Mullins 2026-03-18 06:39:52 -04:00
  • 30db83869b Merge branch 'fsdp-vs-ddp' of https://github.com/huggingface/transformers into fsdp-vs-ddp 3outeille 2026-03-18 10:29:58 +00:00
  • 2c285248ad Merge branch 'main' into fsdp-vs-ddp Ferdinand Mom 2026-03-18 11:29:44 +01:00
  • a4fd937224 Merge branch 'fsdp-vs-ddp' of https://github.com/huggingface/transformers into fsdp-vs-ddp 3outeille 2026-03-18 10:24:34 +00:00