Update MagpieTTS model with latest changes #15031
Signed-off-by: Jason <jasoli@nvidia.com>
Pull Request Overview
This PR updates the MagpieTTS model with the latest development changes, including enhanced transformer architecture, new preference optimization methods, improved testing infrastructure, and expanded utility modules for audio codec processing and evaluation.
Key Changes:
- Introduced online (GRPO) and offline (DPO/RPO) preference optimization training modes
- Enhanced transformer architecture with improved attention mechanisms and masking support
- Added comprehensive evaluation scripts and metrics (FCD, UTMOSv2)
- Expanded audio codec modules with new quantizers and encoders
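
As a rough illustration of the two preference optimization families named above, the sketch below shows the textbook DPO loss for a single (chosen, rejected) pair and the group-relative reward normalization commonly used in GRPO. It is not taken from this PR's code: the function names `dpo_loss` and `grpo_advantages`, the default `beta=0.1`, and the use of the population standard deviation are all assumptions made for the example, and RPO is omitted.

```python
import math
from statistics import mean, pstdev


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Textbook DPO loss for one (chosen, rejected) pair of sequence log-probs.

    Hypothetical helper for illustration; not the PR's implementation.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # sample over the rejected one, relative to the frozen reference model.
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantages for samples generated from the same prompt.

    Assumes population standard deviation; implementations differ on this.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Policy agrees with the reference: the loss equals log(2).
    print(dpo_loss(-3.0, -3.0, -3.0, -3.0))
    # Rewards for three sampled generations of one prompt.
    print(grpo_advantages([1.0, 2.0, 3.0]))
```

In the offline setting the pairs are fixed ahead of time, so only `dpo_loss`-style objectives are needed; in the online setting the model generates a group of candidates per prompt, scores them with a reward, and weights each candidate by its normalized advantage.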
Reviewed Changes
Copilot reviewed 53 out of 54 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/functional_tests/*.sh | New functional test scripts for MagpieTTS inference and training modes |
| tests/collections/tts/modules/test_transformer_2501.py | Added mask parameters and batched inference tests for transformer |
| tests/collections/tts/modules/test_fcd_metric.py | New tests for Frechet Codec Distance metric |
| tests/collections/common/test_lhotse_*.py | Tests for Lhotse data filtering and duplicate removal |
| scripts/magpietts/*.py | New evaluation, inference, and data processing scripts |
| scripts/magpietts/README_magpie_po.md | Documentation for preference optimization workflows |
| requirements/requirements_tts.txt | Added UTMOSv2 dependency |
| nemo/utils/nemo_logging.py | Added stacklevel parameter to logging calls and docstrings |
| nemo/collections/tts/parts/utils/helpers.py | Enhanced masking with pad_to_factor and attention prior visualization |
| nemo/collections/tts/parts/utils/callbacks.py | Removed experimental decorator |
| nemo/collections/tts/parts/preprocessing/*.py | Removed experimental decorators and improved formatting |
| nemo/collections/tts/modules/*.py | New modules for UTMOSv2, FCD metric, and MagpieTTS components |
| nemo/collections/tts/modules/transformer_2501.py | Enhanced with masking support and improved attention mechanisms |
| nemo/collections/tts/modules/encodec_modules.py | Added properties for codebook metadata |
| nemo/collections/tts/modules/audio_codec_modules.py | Extensive additions including new encoders, decoders, and quantizers |
| nemo/collections/tts/models/magpietts_preference_optimization.py | New preference optimization model implementations |
| nemo/collections/tts/models/__init__.py | Updated imports for renamed MagpieTTS models |
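
The `stacklevel` change listed for `nemo/utils/nemo_logging.py` can be illustrated with the standard library alone. The sketch below does not use NeMo's logger; the helper names (`warn_without_stacklevel`, `warn_with_stacklevel`, `training_step`) are hypothetical. It shows why a logging wrapper would want to forward `stacklevel`: without it, every record is attributed to the wrapper itself rather than to the code that called it.

```python
import logging


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


def warn_without_stacklevel(logger, message):
    # The record is attributed to this wrapper function.
    logger.warning(message)


def warn_with_stacklevel(logger, message):
    # stacklevel=2 skips one frame, so the record is attributed
    # to whoever called this wrapper (available since Python 3.8).
    logger.warning(message, stacklevel=2)


def training_step(logger):
    warn_without_stacklevel(logger, "plain wrapper")
    warn_with_stacklevel(logger, "wrapper forwarding stacklevel")


logger = logging.getLogger("stacklevel_demo")
logger.propagate = False
logger.setLevel(logging.WARNING)
handler = ListHandler()
logger.addHandler(handler)

training_step(logger)
callers = [record.funcName for record in handler.records]
print(callers)
```

The first record should name the wrapper and the second should name `training_step`, which is usually the more useful location when reading logs.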
…in parakeet inference to test segmentation fault Signed-off-by: Jason <jasoli@nvidia.com>
tests/functional_tests/L2_TTS_Fast_dev_runs_Magpietts_DecoderContext.sh
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
... and remove some debug code. Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com>
[🤖]: Hi @blisc 👋, We wanted to let you know that a CICD pipeline for this PR just finished successfully. So it might be time to merge this PR or get some approvals.
Pull Request Overview
Copilot reviewed 59 out of 60 changed files in this pull request and generated no new comments.
Hi @pablo-garay, @ko3n1g, @thomasdhc. This PR is nearly ready for merge. If all looks good to you, could we get your approval? Thank you - Roy
subhankar-ghosh left a comment
LGTM, left some minor comments.
Signed-off-by: Fejgin, Roy <rfejgin@nvidia.com> Signed-off-by: Roy Fejgin <rfejgin@nvidia.com>
shehzeen left a comment
I have reviewed the preference optimization changes, and they all look OK to me. My changes from the 2508 branch are here, with a few more checks.
The CI issue is known and is due to a change that would be reverted. The author mentioned it is fine to fast-merge.
What does this PR do?
Updates MagpieTTS with latest dev changes.
Collection: tts
Changelog