blt wip (#38579)
* blt wip
* cpu version
* cpu friendly with full entropy model (real-time patching)
* adding config file instead of args file
* enable MPS
* refactoring unused code
* single config class in config file
* inherit from PreTrainedModel
* refactor LMTransformer --> BLTPatcher
* add conversion script
* load from new checkpoint with from_pretrained
* fixed demo from_pretrained
* clean up
* clean a few comments
* cleanup folder
* clean up dir
* cleaned up modeling further
* rename classes
* adding transformers Attention class and RotaryEmbedding class
* exchanged blt modules for transformers modules: attention, rotary_emb, create_causal_mask, etc
* separate out patcher config, update modeling and conversion script
* rename vars to be more transformers-like
* rm unused functions
* adding cross attention from transformers
* pass arg
* rename weights
* updated conversion script
* overwritten commit! fixing PR
* apply feedback
* adding BLTRMSNorm like Llama
* add repeat_kv and eager_attention_forward copied from
* BLTMLP identical to MllamaTextMLP
* clean up some args
* more like mllama, but busier inits
* BLTTransformerLayer config
* decoder, encoder, global configs
* wip working on modular file
* cleaning up patch and configs
* clean up patcher helpers
* clean up patcher helpers further
* clean up
* some config renaming
* clean up unused configs
* clean up configs
* clean up configs
* update modular
* clean
* update demo
* config more like mllama, separated subconfigs from subdicts
* read from config instead of self args
* update demo file
* model weights to causal lm weights
* missed file
* added tied weights keys (see the weight-tying sketch below)
* BLTForCausalLM (see the usage sketch below)
* adding files after add-new-model-like
* update demo
* working on tests
* first running integration tests
* added integration tests
* adding tokenization tests, integration tests, and cleaned up tokenization file, + ruff
* tokenizer clean up
* modular file
* fixing rebase
* ruff
* adding correct basemodel output and updating config with checkpoint vals (for testing)
* BLTModelTests
* enabling inputs_embeds, although won't be equal to input_ids since need ids for patching logic
* fix sdpa == causal tests
* fix small model test and some gradient checkpointing
* skip training GC tests
* fix test
* updated modular
* update modular
* ruff
* adding modular + modeling
* modular
* more modern is_causal check
* cleaning up modular
* more modular reduction
* ruff
* modular fix
* fix styling
* return 2
* return 2
* fix some tests
* fix bltcrossattention after modular break
* some fixes / feedback
* try cache generate fix
* try cache generate fix
* fix generate tests
* attn_impl workaround
* refactoring to use recent TransformersKwargs changes
* fix hidden_states shape test
* refactor to new outputs
* simplify outputs a bit
* rm unneeded decoderlayer overwriting
* rename blt
* forgot tokenizer test renamed
* Reorder
* Reorder
* working on modular
* updates from modular
* new modular
* ruff and such
* update pretrainedmodel modular
* using cohere2 apply_rotary_pos_emb
* small changes
* apply feedback r2
* fix cross_attention
* apply more feedback
* update modeling fix
* load submodules from pretrainedmodel
* set initializer_range to subconfigs
* rm cross_attention_states pass when not needed
* add 7b projection layer support
* check repo
* make copies
* lost cohere2 rotate_half
* ruff
* copies?
* don't tie weights for submodules
* tie weights setting
* check docstrings
* apply feedback
* rebase
* rebased modeling
* update docs
* applying feedback
* few more fixes
* fix can_record_outputs
* fast tokenizer
* no more modulelist
* tok auto
* rm tokenizers
* fix docs
* ruff
* fix after rebase
* fix test, configs are not subscriptable

---------

Co-authored-by: [email protected] <[email protected]>
Co-authored-by: Lysandre <[email protected]>
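A minimal usage sketch for the BLTForCausalLM class this PR adds, following the standard from_pretrained flow mentioned in the commits. The checkpoint id below is a placeholder, not a confirmed Hub repo, and the use of AutoTokenizer assumes the tokenizer registration ("tok auto") from this PR.

```python
import torch
from transformers import AutoTokenizer, BLTForCausalLM

# Placeholder id; substitute the actual converted BLT checkpoint.
checkpoint = "facebook/blt-1b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = BLTForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.float32)

# BLT patches on ids, so pass input_ids rather than inputs_embeds
# (per the commit note that inputs_embeds alone cannot drive the
# patching logic).
inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```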
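The "added tied weights keys" and "don't tie weights for submodules" commits refer to the transformers weight-tying convention, where a causal LM class lists the output-embedding keys to tie to the input embeddings. A minimal sketch of that convention follows; the Toy* names are illustrative, not the actual BLT implementation.

```python
import torch.nn as nn
from transformers import PretrainedConfig, PreTrainedModel


class ToyConfig(PretrainedConfig):
    model_type = "toy"

    def __init__(self, vocab_size=256, hidden_size=64, tie_word_embeddings=True, **kwargs):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        super().__init__(tie_word_embeddings=tie_word_embeddings, **kwargs)


class ToyForCausalLM(PreTrainedModel):
    config_class = ToyConfig
    # Keys listed here are tied to the input embeddings when
    # config.tie_word_embeddings is True; submodules not listed
    # keep their own independent weights.
    _tied_weights_keys = ["lm_head.weight"]

    def __init__(self, config):
        super().__init__(config)
        self.embed_tokens = nn.Embedding(config.vocab_size, config.hidden_size)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
        self.post_init()  # runs weight init and ties weights per the config

    def get_input_embeddings(self):
        return self.embed_tokens

    def set_input_embeddings(self, value):
        self.embed_tokens = value

    def get_output_embeddings(self):
        return self.lm_head
```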
Ita Zaporozhets committed ddfa3d4402915b8aafd82b3135cb37af9a5d6b69
Parent: 46ea7e6
Committed by: GitHub <[email protected]> on 9/19/2025, 9:55:55 AM