[DRAFT][TTS] Magpietts Simple API and loading audiocodec from Huggingface #15172
Merged
subhankar-ghosh merged 63 commits into main on Dec 17, 2025
… example Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: subhankar-ghosh <subhankar-ghosh@users.noreply.github.com>
Removed multiple long manifest configurations from evalset_config.py. Signed-off-by: Subhankar Ghosh <subhankarg@nvidia.com>
…to magpietts_opensource
…o/NeMo into magpietts_opensource_longform
blisc reviewed on Dec 10, 2025
Collaborator, Author
Fixed the UTMOS requirements and the JSON in magpietts_inference. Switched to a better way of using has_text_context in magpietts.py.
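For readers unfamiliar with how an optional metric dependency such as UTMOS can be guarded, here is a minimal sketch of the general pattern. It is not the code from this PR: the package name `hypothetical_utmos_scorer`, the helper `load_optional`, and the `score` call are all assumptions made for illustration.

```python
import importlib
import importlib.util


def load_optional(module_name: str):
    """Return the imported module, or None if it is not installed."""
    if importlib.util.find_spec(module_name) is None:
        return None
    return importlib.import_module(module_name)


# Hypothetical package name, used only to illustrate the pattern.
utmos = load_optional("hypothetical_utmos_scorer")


def score_naturalness(audio_path: str):
    """Return a score when the optional scorer is available, else None."""
    if utmos is None:
        # The metric is skipped; the rest of the evaluation can still run.
        return None
    return utmos.score(audio_path)  # hypothetical call, not a real API
```

With this shape, callers check for `None` instead of failing at import time when the optional package is absent.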
rfejgin (Collaborator) reviewed on Dec 16, 2025
This may be a good time to remove feature_dir. I believe we don't need it anymore (@paarthneekhara, could you confirm?), and removing it would simplify both the JSON file and the code. We'd have to remove it in a few places in the code, though.
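To make the suggestion concrete, the following is a rough sketch of how a loader could treat feature_dir as obsolete while still accepting older JSON files. Only feature_dir comes from the discussion above; the key names `manifest_path` and `audio_dir`, the function `load_evalset_config`, and the overall file layout are assumptions, not the actual format of evalset_config.json.

```python
import json

# Hypothetical required keys; the real file may use different names.
REQUIRED_KEYS = {"manifest_path", "audio_dir"}


def load_evalset_config(text: str) -> dict:
    """Parse a dataset-name -> settings mapping and check required keys."""
    config = json.loads(text)
    for name, entry in config.items():
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"{name}: missing keys {sorted(missing)}")
        # feature_dir is tolerated but dropped, so older files keep loading.
        entry.pop("feature_dir", None)
    return config


example = (
    '{"demo_set": {"manifest_path": "m.json", '
    '"audio_dir": "wavs", "feature_dir": null}}'
)
print(load_evalset_config(example))
```

Dropping the key at load time is one way to simplify downstream code first and clean up the JSON files later.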
rfejgin (Collaborator) reviewed on Dec 16, 2025 and left a comment:
No major comments on my end, but see one inline about the format of evalset_config.json.
blisc previously approved these changes on Dec 16, 2025
blisc approved these changes on Dec 16, 2025
Contributor
[🤖]: Hi @subhankar-ghosh 👋, we wanted to let you know that a CICD pipeline for this PR just finished successfully, so it might be time to merge this PR or get some approvals.
AkCodes23 pushed a commit to AkCodes23/NeMo that referenced this pull request on Jan 28, 2026
…face (NVIDIA-NeMo#15172)

* Modularize magpie inference code, move inference code from scripts to example
* Modify magpie CI with inference changes
* Renaming magpietts inference scripts from magpie to magpietts
* infer_batch returns dataclass object
* Fixed context embedding without context encoder
* Apply isort and black reformatting
* Remove unnecessary configurations: Removed multiple long manifest configurations from evalset_config.py.
* Removing unused imports
* Copilot suggested changes
* do_tts method, load audiocodec from huggingface
* Move inference helper modules from examples to tts collection
* Review changes
* Changes suggested in compute_mean_with_confidence_interval
* Linting issue
* register_tokenizer_artifacts to store tokenizer files in .nemo file
* rebase with main issues
* changed datasets to json input, moved json file to examples/tts
* Remove unwanted dataconfig.
* optional utmos import, text_normalization cache and check, test updated
* Update nemo/collections/tts/models/magpietts.py
* Linting errors
* Refactored prepare_context_tensors, removed dummy context audio/text from do_tts
* remove utmos, make dataset path required
* remove unused imports
* Enable loading MagpieTTS from HF
* Support speaker index in do_tts api

---------

Signed-off-by: subhankar-ghosh <subhankar2321@gmail.com>
Signed-off-by: subhankar-ghosh <subhankar-ghosh@users.noreply.github.com>
Signed-off-by: Subhankar Ghosh <subhankarg@nvidia.com>
Signed-off-by: Subhankar Ghosh <subhankar2321@gmail.com>
Co-authored-by: subhankar-ghosh <subhankar-ghosh@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Akhil Varanasi <akhilvaranasi23@gmail.com>
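The commit list mentions compute_mean_with_confidence_interval without showing it. As a rough illustration of what such a helper typically computes, here is a sketch that uses a normal approximation for the interval. The function name `mean_with_confidence_interval`, its signature, and the choice of a normal rather than a Student's t interval are assumptions, not the implementation merged in this PR.

```python
import math
import statistics
from statistics import NormalDist


def mean_with_confidence_interval(values, confidence=0.95):
    """Return (mean, half_width) using a normal approximation."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    mean = statistics.fmean(values)
    if n == 1:
        # A single sample gives no spread estimate.
        return mean, float("nan")
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * statistics.stdev(values) / math.sqrt(n)
    return mean, half_width


mean, half_width = mean_with_confidence_interval([1.0, 2.0, 3.0])
print(f"{mean:.3f} +/- {half_width:.3f}")
```

For small sample counts, such as a handful of repeated evaluation runs, a t-based interval would be wider than this approximation.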
nune-tadevosyan pushed a commit to nune-tadevosyan/NeMo that referenced this pull request on Mar 13, 2026
…face (NVIDIA-NeMo#15172)
Important
The Update branch button must only be pressed on very rare occasions. An outdated branch never blocks the merge of a PR.
Please reach out to the automation team before pressing that button.
What does this PR do?
Add a one line overview of what this PR aims to accomplish.
Collection: [Note which collection this PR will affect]
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.
Additional Information