🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Add Music Flamingo (#43538)
* Music flamingo
* Fix pos embeddings
* Method arg docstrings
* Add tests & docs
* Fix AF3 dtype bug
* Fix the MF performance issue
* Fix pos embeddings
* Fix embeddings & format
* Remove external deps
* Update processor token names
* Cleanup
* Simplify RotaryEmbedding to lang-only
* Reuse AF3 config classes
* Trim+rename rotary embedding
* Call parent _init_weights first and drop rotary einsum
* Precompute rotary cache at init
* Use modular processor pattern for MusicFlamingo
* Remove audio-only inference example
* Refactor Audio Feature Casting Path
* Clarify private source repo
* Clean up modular
* Move config to modular
* Formatting
* Remove dummy
* Derive musicflamingo timing and rotary config
* Llama style rotary embeddings
* Added reproducer comments
* Expose _init_weights for modular.
* Satisfy repo checks
* Align MusicFlamingo rotary with Llama style
* Move MusicFlamingo _init_weights to encoder
* Keep old behavior
* Move MusicFlamingo rotary settings into encoder rope_parameters
* Use AutoConfig in AF3/MF
* Align MusicFlamingo RoTE with Llama RoPE conventions
* Update outdated fixtures
* init_weights without changing others
* FIx import
* Remove backward compat
* Regenerate modeling for MF
* Fix AF3 batch inference bug
* Simplify config and nit.
* Conform more to transformers convention, e.g. removing unused code paths.
* Add another possible AF3 prefix.
* Use auto_docstring and update docstrings.
* Nits
* Nit for review
* Shift RoTE to main model so that encoder can be directly used from AF3.
* Refactoring nit.
* Fix init
* Fix some failing tests
* Fix AF3 & MF and add batching tests
* Fix audio embedding masking (bad post length)
* Nits and remove since same as GLM was bug in post length computation
* Simplify MF as AF3, and style checks.
* New config after merge and modular update.
* Address music flamingo tests, and some cleanup.
* style check
* Regenerate config.
* Update fixtures.
* Nits
* Nit
* Improve RoTE config
* Refine MusicFlamingo rotary time handling
* Simplification, and update AF3 processor for better modular
* Fix torch export
* Simplify modular, including upstreaming input_ids input to get_audio_features
* Remove upstreaming of input_ids to get_audio_features, and remove audio_rotary_dim.
* Switch to MoonshineRotaryEmbedding, and cleanup.
* Remove hardcoded MusicFlamingo partial_rotary_factor
* Update fixtures
* Compile re.sub
* Update src/transformers/models/musicflamingo/modular_musicflamingo.py

  Co-authored-by: Arthur <[email protected]>
* Update src/transformers/models/musicflamingo/modular_musicflamingo.py

  Co-authored-by: Arthur <[email protected]>
* Style
* Update fixtures.
* Conditional torch import for processor.

---------

Co-authored-by: Eric B <[email protected]>
Co-authored-by: Eric Bezzam <[email protected]>
Co-authored-by: Arthur <[email protected]>
Lasha Koroshinadze committed
a9c6700a5078e8a9276656a0d0b82b32958624b7
Parent: b7074b1
Committed by GitHub <[email protected]>
on 3/30/2026, 6:04:21 PM