🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Add Music Flamingo (#43538)

* Music flamingo

* Fix pos embeddings

* Method arg docstrings

* Add tests & docs

* Fix AF3 dtype bug

* Fix the MF performance issue

* Fix pos embeddings

* Fix embeddings & format

* Remove external deps

* Update processor token names

* Cleanup

* Simplify RotaryEmbedding to lang-only

* Reuse AF3 config classes

* Trim+rename rotary embedding

* Call parent _init_weights first and drop rotary einsum

* Precompute rotary cache at init

* Use modular processor pattern for MusicFlamingo

* Remove audio-only inference example

* Refactor Audio Feature Casting Path

* Clarify private source repo

* Clean up modular

* Move config to modular

* Formatting

* Remove dummy

* Derive musicflamingo timing and rotary config

* Llama style rotary embeddings

* Added reproducer comments

* Expose _init_weights for modular.

* Satisfy repo checks

* Align MusicFlamingo rotary with Llama style

* Move MusicFlamingo _init_weights to encoder

* Keep old behavior

* Move MusicFlamingo rotary settings into encoder rope_parameters

* Use AutoConfig in AF3/MF

* Align MusicFlamingo RoTE with Llama RoPE conventions

* Update outdated fixtures

* init_weights without changing others

* Fix import

* Remove backward compat

* Regenerate modeling for MF

* Fix AF3 batch inference bug

* Simplify config and nit.

* Conform more to transformers convention, e.g. removing unused code paths.

* Add another possible AF3 prefix.

* Use auto_docstring and update docstrings.

* Nits

* Nit for review

* Shift RoTE to main model so that encoder can be directly used from AF3.

* Refactoring nit.

* Fix init

* Fix some failing tests

* Fix AF3 & MF and add batching tests

* Fix audio embedding masking (bad post length)

* Nits and remove since same as GLM was bug in post length computation

* Simplify MF as AF3, and style checks.

* New config after merge and modular update.

* Address music flamingo tests, and some cleanup.

* style check

* Regenerate config.

* Update fixtures.

* Nits

* Nit

* Improve RoTE config

* Refine MusicFlamingo rotary time handling

* Simplification, and update AF3 processor for better modular

* Fix torch export

* Simplify modular, including upstreaming input_ids input to get_audio_features

* Remove upstreaming of input_ids to get_audio_features, and remove audio_rotary_dim.

* Switch to MoonshineRotaryEmbedding, and cleanup.

* Remove hardcoded MusicFlamingo partial_rotary_factor

* Update fixtures

* Compile re.sub

* Update src/transformers/models/musicflamingo/modular_musicflamingo.py

Co-authored-by: Arthur <[email protected]>

* Update src/transformers/models/musicflamingo/modular_musicflamingo.py

Co-authored-by: Arthur <[email protected]>

* Style

* Update fixtures.

* Conditional torch import for processor.

---------

Co-authored-by: Eric B <[email protected]>
Co-authored-by: Eric Bezzam <[email protected]>
Co-authored-by: Arthur <[email protected]>
Lasha Koroshinadze committed
a9c6700a5078e8a9276656a0d0b82b32958624b7
Parent: b7074b1
Committed by GitHub <[email protected]> on 3/30/2026, 6:04:21 PM