Support qwen3-vl for THD format and CP #1943

Merged
ko3n1g merged 39 commits into NVIDIA-NeMo:main from wplf:jinliang/qwen3-vl
Feb 10, 2026

Conversation

@wplf
Contributor

@wplf wplf commented Jan 14, 2026

What does this PR do ?

This PR ports @ISEEKYAN's qwen3-vl work from mbridge into megatron-bridge, including his additions such as THD format and CP support.

For now, training in both THD format and BSHD format is ready.

  • bshd example script
python -m torch.distributed.run --nproc_per_node=8 \
    finetune_qwen_vl.py \
    --dataset-type hf \
    --data-path llava_video_178k \
    --recipe qwen3_vl_30b_a3b_finetune_config \
    --config-file conf/qwen3_vl_30b_a3b_pretrain_mfsdp_override_example.yaml dataset.pack_sequences_in_batch=false
  • thd example script
cd $HOME2/repos/Megatron-Bridge/examples/recipes/qwen_vl
python -m torch.distributed.run --nproc_per_node=8 \
    finetune_qwen_vl.py \
    --dataset-type hf \
    --data-path llava_video_178k \
    --recipe qwen3_vl_8b_finetune_config  \
    --config-file conf/qwen3_vl_pretrain_override_example.yaml dataset.pack_sequences_in_batch=false
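
For readers unfamiliar with the two layouts named above, here is a minimal pure-Python sketch of the difference. It is an illustration only, not code from this PR: `to_bshd`, `to_thd`, and `pad_id` are hypothetical names, and real implementations operate on tensors. BSHD pads every sequence to a common length, while THD concatenates all tokens and records sequence boundaries as cumulative lengths (commonly called `cu_seqlens`).

```python
from itertools import accumulate


def to_bshd(seqs, pad_id=0):
    """Pad every sequence to the longest one: a [batch, seq] grid (hypothetical helper)."""
    max_len = max(len(s) for s in seqs)
    return [s + [pad_id] * (max_len - len(s)) for s in seqs]


def to_thd(seqs):
    """Concatenate all tokens and keep cumulative boundaries (hypothetical helper)."""
    tokens = [t for s in seqs for t in s]
    cu_seqlens = [0] + list(accumulate(len(s) for s in seqs))
    return tokens, cu_seqlens


seqs = [[11, 12, 13], [21], [31, 32]]
padded = to_bshd(seqs)            # 3 x 3 grid, 3 padding slots
tokens, cu_seqlens = to_thd(seqs)  # 6 tokens, no padding

# Sequence i can be recovered from the packed layout with its boundaries.
second = tokens[cu_seqlens[1]:cu_seqlens[2]]
```

In this toy batch the padded layout stores 9 slots for 6 real tokens, which is the overhead that packing avoids.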

Model forward Validation

image

The output from Megatron-Bridge's Qwen3VL is now bitwise identical to that of M-Bridge's Qwen3VL.
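
As a note on what "bitwise identical" implies, the sketch below contrasts an exact bit-pattern comparison with a tolerance-based one. It is a standalone illustration using plain Python floats, not the validation script used for this PR; `bitwise_equal` is a hypothetical helper.

```python
import math
import struct


def bitwise_equal(xs, ys):
    """True only if both float sequences have identical IEEE-754 bit patterns."""
    if len(xs) != len(ys):
        return False
    return struct.pack(f"<{len(xs)}d", *xs) == struct.pack(f"<{len(ys)}d", *ys)


reference = [0.3, 1.0]
candidate = [0.1 + 0.2, 1.0]  # differs from 0.3 only in the last bits

# A tolerance check passes, but the stricter bitwise check does not.
close = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(reference, candidate))
exact = bitwise_equal(reference, candidate)
```

With PyTorch tensors, `torch.equal(a, b)` is a comparable exact-equality check, whereas `torch.allclose` only tests agreement within a tolerance.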

MOE model

image

Dense model

image

Remaining work

  • Add BSHD format
  • Add THD format
  • Add CP support for BSHD format and THD format
  • Support Vision module's HF2Mcore and Mcore2HF ckpt conversion and verify it
  • BSHD format forward output align with mbridge's forward output
  • THD format forward output align with BSHD's forward output
  • THD E2E training validation
  • BSHD + CP E2E training validation
  • THD + CP E2E training validation
  • Vision model DP and text model CP support and validation
  • Code review and refactor
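
To make the CP (context parallel) items above more concrete, here is a rough sketch of one common way to shard a sequence across CP ranks for causal attention: split it into `2 * cp_size` chunks and give each rank one chunk from the front and its mirror from the back, so the attention work is roughly balanced. Whether this PR uses exactly this layout is an assumption; `cp_shard` is a hypothetical helper operating on plain lists.

```python
def cp_shard(tokens, cp_size, cp_rank):
    """Return the tokens owned by one CP rank under a mirrored two-chunk split."""
    n_chunks = 2 * cp_size
    if len(tokens) % n_chunks != 0:
        raise ValueError("sequence length must be divisible by 2 * cp_size")
    chunk = len(tokens) // n_chunks
    mirror = n_chunks - 1 - cp_rank  # chunk index counted from the end
    front = tokens[cp_rank * chunk:(cp_rank + 1) * chunk]
    back = tokens[mirror * chunk:(mirror + 1) * chunk]
    return front + back


tokens = list(range(8))
rank0 = cp_shard(tokens, cp_size=2, cp_rank=0)
rank1 = cp_shard(tokens, cp_size=2, cp_rank=1)
```

In the THD case the same idea would be applied per packed sequence rather than to the whole batch, which is one reason THD + CP needs its own validation.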

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced vision-language model support with improved architecture for multimodal processing
    • Added profiling capabilities including memory snapshots and performance tracking
    • Extended logging and monitoring options for training visibility
  • Configuration Updates

    • Updated example configurations with expanded training parameters, optimization settings, and vision model options

Signed-off-by: jinliangl <jinliangl@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Jan 14, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@wplf wplf marked this pull request as draft January 14, 2026 12:04
@wplf wplf changed the title from "qwen3-vl migration [wip]" to "Support qwen3-vl for THD format and CP [wip]" Jan 14, 2026
Signed-off-by: jinliangl <jinliangl@nvidia.com>
@wplf wplf changed the title from "Support qwen3-vl for THD format and CP [wip]" to "[Draft] support qwen3-vl for THD format and CP" Jan 15, 2026
@wplf wplf marked this pull request as ready for review January 22, 2026 10:27
@wplf wplf changed the title from "[Draft] support qwen3-vl for THD format and CP" to "Support qwen3-vl for THD format and CP" Jan 22, 2026
@cuichenx cuichenx self-requested a review January 22, 2026 23:04
Contributor

@cuichenx cuichenx left a comment

Sorry, I accidentally clicked approve. Please check my comments above.

wplf added 4 commits January 23, 2026 14:43
Signed-off-by: jinliangl <jinliangl@nvidia.com>
…ision_model=true to enable it

Signed-off-by: jinliangl <jinliangl@nvidia.com>
Signed-off-by: jinliangl <jinliangl@nvidia.com>
@wplf wplf force-pushed the jinliang/qwen3-vl branch from a3b8dda to 8ae2c86 Compare February 9, 2026 10:45
@yaoyu-33
Contributor

yaoyu-33 commented Feb 9, 2026

/ok to test 1cfa2dc

@wplf
Contributor Author

wplf commented Feb 10, 2026

/ok to test 1cfa2dc

@wplf
Contributor Author

wplf commented Feb 10, 2026

/ok to test e18e64d

@shifangx
Contributor

/ok to test e18e64d

shifangx previously approved these changes Feb 10, 2026
@shifangx
Contributor

/ok to test 3eb6ebc

1 similar comment
@wplf
Contributor Author

wplf commented Feb 10, 2026

/ok to test 3eb6ebc

Contributor

@ko3n1g ko3n1g left a comment

Tests will be submitted soon in a follow-up.

6 participants