
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.

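As quick orientation, here is the library's high-level inference entry point; `pipeline` downloads a default checkpoint for the task, so the snippet makes no model-specific assumptions:

```python
from transformers import pipeline

# The pipeline API picks a sensible default checkpoint for the task.
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers makes state-of-the-art models easy to use."))
```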

blt wip (#38579)

* blt wip

* cpu version

* cpu-friendly with full entropy model (real-time patching)

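For context on what the entropy model does: BLT has no fixed tokenizer; a small byte-level language model (the patcher) scores every position, and a new patch begins wherever next-byte entropy crosses a threshold. A rough sketch of that boundary rule, assuming per-position logits from the patcher; the threshold handling is illustrative, not the library's exact logic:

```python
import torch
import torch.nn.functional as F

def entropy_patch_starts(logits: torch.Tensor, threshold: float) -> torch.Tensor:
    """Mark patch boundaries where next-byte entropy exceeds a threshold.

    logits: (seq_len, vocab_size) next-byte logits from the small patcher LM.
    Returns a bool tensor of shape (seq_len,): True means "a new patch starts here".
    """
    log_probs = F.log_softmax(logits, dim=-1)
    entropy = -(log_probs.exp() * log_probs).sum(dim=-1)  # per-position entropy in nats
    starts = entropy > threshold
    starts[0] = True  # the first byte always opens a patch
    return starts
```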
* adding config file instead of args file

* enable MPS

* refactoring unused code

* single config class in config file

* inherit from PreTrainedModel

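Inheriting from PreTrainedModel is what provides from_pretrained/save_pretrained, weight initialization, and Hub integration for free. A minimal sketch of the pattern; the Toy* names are placeholders, not the classes merged here:

```python
import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel

class ToyConfig(PretrainedConfig):
    model_type = "toy"

    def __init__(self, hidden_size=64, initializer_range=0.02, **kwargs):
        self.hidden_size = hidden_size
        self.initializer_range = initializer_range
        super().__init__(**kwargs)

class ToyPreTrainedModel(PreTrainedModel):
    config_class = ToyConfig
    base_model_prefix = "model"

    def _init_weights(self, module):
        # Called by post_init()/init_weights(); the std comes from the config.
        if isinstance(module, nn.Linear):
            module.weight.data.normal_(mean=0.0, std=self.config.initializer_range)
            if module.bias is not None:
                module.bias.data.zero_()
```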
* refactor LMTransformer --> BLTPatcher

* add conversion script

* load from new checkpoint with from_pretrained

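Once the conversion script has produced a transformers-format checkpoint, loading is the standard one-liner. The repo id below is a placeholder, not the released checkpoint name:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/blt-checkpoint"  # placeholder: substitute the converted checkpoint
model = AutoModelForCausalLM.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)
```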
* fixed demo from_pretrained

* clean up

* clean a few comments

* cleanup folder

* clean up dir

* cleaned up modeling further

* rename classes

* adding transformers Attention class and RotaryEmbedding class

* exchanged blt modules for transformers modules: attention, rotary_emb, create_causal_mask, etc

* separate out patcher config, update modeling and conversion script

* rename vars to be more transformers-like

* rm unused functions

* adding cross attention from transformers

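In BLT the byte-level encoder and decoder exchange information with the patch representations through cross-attention: queries come from one stream, keys/values from the other. A self-contained sketch of that module shape under illustrative names, not the merged implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SketchCrossAttention(nn.Module):
    """Queries from hidden_states attend over cross_attention_states."""

    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        assert hidden_size % num_heads == 0
        self.num_heads, self.head_dim = num_heads, hidden_size // num_heads
        self.q_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.k_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.v_proj = nn.Linear(hidden_size, hidden_size, bias=False)
        self.o_proj = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, hidden_states, cross_attention_states):
        b, q_len, _ = hidden_states.shape
        kv_len = cross_attention_states.shape[1]
        q = self.q_proj(hidden_states).view(b, q_len, self.num_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(cross_attention_states).view(b, kv_len, self.num_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(cross_attention_states).view(b, kv_len, self.num_heads, self.head_dim).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v)  # no causal mask across streams
        return self.o_proj(out.transpose(1, 2).reshape(b, q_len, -1))
```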
* pass arg

* rename weights

* updated conversion script

* overwritten commit! fixing PR

* apply feedback

* adding BLTRMSNorm like Llama

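For reference, the Llama-style RMSNorm this bullet mirrors (the body below follows the LlamaRMSNorm pattern in transformers):

```python
import torch
import torch.nn as nn

class BLTRMSNorm(nn.Module):
    """RMSNorm: normalize by root-mean-square in float32, then scale."""

    def __init__(self, hidden_size, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps

    def forward(self, hidden_states):
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        variance = hidden_states.pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon)
        return self.weight * hidden_states.to(input_dtype)
```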
* add repeat_kv and eager_attention_forward copied from

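repeat_kv is the helper shared across transformers' grouped-query-attention models; it expands the key/value heads so they line up with the query heads before the attention product:

```python
import torch

def repeat_kv(hidden_states: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Expand (batch, num_kv_heads, seq_len, head_dim) to
    (batch, num_kv_heads * n_rep, seq_len, head_dim)."""
    batch, num_key_value_heads, slen, head_dim = hidden_states.shape
    if n_rep == 1:
        return hidden_states
    hidden_states = hidden_states[:, :, None, :, :].expand(
        batch, num_key_value_heads, n_rep, slen, head_dim
    )
    return hidden_states.reshape(batch, num_key_value_heads * n_rep, slen, head_dim)
```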
* BLTMLP identical to MllamaTextMLP

* clean up some args

* more like mllama, but busier inits

* BLTTransformerLayer config

* decoder, encoder, global configs

* wip working on modular file

* cleaning up patch and configs

* clean up patcher helpers

* clean up patcher helpers further

* clean up

* some config renaming

* clean up unused configs

* clean up configs

* clean up configs

* update modular

* clean

* update demo

* config more like mllama, separated subconfigs from subdicts

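The Mllama-style arrangement being adopted: the top-level config owns real sub-config objects and rehydrates them from plain dicts when deserialized, instead of carrying raw sub-dicts around. A sketch with illustrative names:

```python
from transformers import PretrainedConfig

class ToyPatcherConfig(PretrainedConfig):
    model_type = "toy_patcher"

    def __init__(self, hidden_size=512, num_hidden_layers=8, **kwargs):
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers
        super().__init__(**kwargs)

class ToyCompositeConfig(PretrainedConfig):
    model_type = "toy_composite"

    def __init__(self, patcher_config=None, **kwargs):
        # Accept a config object or its serialized dict, like MllamaConfig does.
        if patcher_config is None:
            patcher_config = ToyPatcherConfig()
        elif isinstance(patcher_config, dict):
            patcher_config = ToyPatcherConfig(**patcher_config)
        self.patcher_config = patcher_config
        super().__init__(**kwargs)
```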
* read from config instead of self args

* update demo file

* model weights to causal lm weights

* missed file

* added tied weights keys

* BLTForCausalLM

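The two bullets above land on the standard *ForCausalLM shape: an lm_head that shares storage with the input embeddings and is declared in _tied_weights_keys so save/load treats the pair as one tensor. A runnable toy version, not the merged BLT class:

```python
import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel

class ToyLMConfig(PretrainedConfig):
    model_type = "toy_lm"

    def __init__(self, vocab_size=260, hidden_size=64, tie_word_embeddings=True, **kwargs):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs)

class ToyForCausalLM(PreTrainedModel):
    config_class = ToyLMConfig
    # Tells the save/load machinery that lm_head shares its weight with the embeddings.
    _tied_weights_keys = ["lm_head.weight"]

    def __init__(self, config):
        super().__init__(config)
        self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
        self.post_init()  # ties lm_head to embed_tokens when tie_word_embeddings=True

    def get_input_embeddings(self):
        return self.embed_tokens

    def set_input_embeddings(self, value):
        self.embed_tokens = value

    def get_output_embeddings(self):
        return self.lm_head
```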
* adding files after add-new-model-like

* update demo

* working on tests

* first running integration tests

* added integration tests

* adding tokenization tests and integration tests, cleaning up the tokenization file, + ruff

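BLT's tokenizer is byte-level: each UTF-8 byte maps to an id offset past a handful of special tokens, so there is no vocabulary to learn. A sketch of that scheme; the offset and special-token ids below are illustrative, not the checkpoint's actual values:

```python
OFFSET = 4  # illustrative: reserve ids 0..3 for special tokens
BOS_ID, EOS_ID = 1, 2

def encode(text: str, add_bos: bool = True, add_eos: bool = False) -> list[int]:
    """Map each UTF-8 byte to byte_value + OFFSET, optionally framed by BOS/EOS."""
    ids = [b + OFFSET for b in text.encode("utf-8")]
    if add_bos:
        ids.insert(0, BOS_ID)
    if add_eos:
        ids.append(EOS_ID)
    return ids

def decode(ids: list[int]) -> str:
    """Drop special ids and reverse the byte offset."""
    data = bytes(i - OFFSET for i in ids if i >= OFFSET)
    return data.decode("utf-8", errors="replace")
```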
* tokenizer clean up

* modular file

* fixing rebase

* ruff

* adding correct basemodel output and updating config with checkpoint vals (for testing)

* BLTModelTests git status

* enabling inputs_embeds, although results won't match an input_ids run, since the ids are still needed for the patching logic

* fix sdpa == causal tests

* fix small model test and some gradient checkpointing

* skip training GC tests

* fix test

* updated modular

* update modular

* ruff

* adding modular + modeling

* modular

* more modern is_causal check

* cleaning up modular

* more modular reduction

* ruff

* modular fix

* fix styling

* return 2

* return 2

* fix some tests

* fix BLTCrossAttention after modular break

* some fixes / feedback

* try cache generate fix

* try cache generate fix

* fix generate tests

* attn_impl workaround

* refactoring to use recent TransformersKwargs changes

* fix hidden_states shape test

* refactor to new outputs

* simplify outputs a bit

* rm unneeded decoderlayer overwriting

* rename blt

* forgot tokenizer test renamed

* Reorder

* Reorder

* working on modular

* updates from modular

* new modular

* ruff and such

* update pretrainedmodel modular

* using cohere2 apply_rotary_pos_emb

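For orientation, the rotate_half/apply_rotary_pos_emb helper pair as it commonly appears across transformers models (shown here in its widespread Llama-style form; the commit reuses cohere2's variant of the same helpers):

```python
import torch

def rotate_half(x):
    """Rotate half the hidden dims: (x1, x2) -> (-x2, x1)."""
    x1 = x[..., : x.shape[-1] // 2]
    x2 = x[..., x.shape[-1] // 2 :]
    return torch.cat((-x2, x1), dim=-1)

def apply_rotary_pos_emb(q, k, cos, sin, unsqueeze_dim=1):
    """Apply precomputed cos/sin position embeddings to query and key states."""
    cos = cos.unsqueeze(unsqueeze_dim)
    sin = sin.unsqueeze(unsqueeze_dim)
    q_embed = (q * cos) + (rotate_half(q) * sin)
    k_embed = (k * cos) + (rotate_half(k) * sin)
    return q_embed, k_embed
```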
* small changes

* apply feedback r2

* fix cross_attention

* apply more feedback

* update modeling fix

* load submodules from pretrainedmodel

* set initializer_range to subconfigs

* rm cross_attention_states pass when not needed

* add 7b projection layer support

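Reading between the lines here, and explicitly as an assumption rather than anything the commit states: at the 7B scale the submodules presumably use different hidden sizes, so a linear projection bridges them:

```python
import torch.nn as nn

# Assumption: names and sizes are illustrative, not taken from the 7B config.
encoder_hidden, global_hidden = 1024, 2048
patch_projection = nn.Linear(encoder_hidden, global_hidden, bias=False)
```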
* check repo

* make copies

* lost cohere2 rotate_half

* ruff

* copies?

* don't tie weights for submodules

* tie weights setting

* check docstrings

* apply feedback

* rebase

* rebased modeling

* update docs

* applying feedback

* few more fixes

* fix can_record_outputs

* fast tokenizer

* no more modulelist

* tok auto

* rm tokenizers

* fix docs

* ruff

* fix after rebase

* fix test, configs are not subscriptable

---------

Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Lysandre <[email protected]>
Ita Zaporozhets committed
ddfa3d4402915b8aafd82b3135cb37af9a5d6b69
Parent: 46ea7e6
Committed by GitHub <[email protected]> on 9/19/2025, 9:55:55 AM