Qwen2.5 VL new ViT #1
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Thanks for this great PR! I left a few comments; please take a look at them.
In addition, it would be great if you could show some results on:
- Correctness verification on the ViT: we don't need to add a unit test for it, but we should at least check that the embeddings generated from the same image match the `transformers` implementation, in both the TP=1 and TP>1 cases.
- Speed verification: our implementation should be at least as fast as the `transformers` implementation.
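The correctness and speed checks requested above could be sketched roughly as follows. This is a minimal, framework-free illustration, not the procedure used in this PR: the helper names (`max_abs_diff`, `cosine_similarity`, `embeddings_match`, `mean_latency`) and the thresholds `atol=1e-3` and `min_cosine=0.999` are assumptions chosen for the example, and the embeddings are assumed to have already been flattened into plain lists of floats.

```python
import math
import time
from typing import Callable, Sequence


def max_abs_diff(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flat embeddings."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    return max((abs(x - y) for x, y in zip(a, b)), default=0.0)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two flat embeddings."""
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def embeddings_match(
    ref: Sequence[float],
    test: Sequence[float],
    atol: float = 1e-3,          # assumed tolerance, tune per dtype
    min_cosine: float = 0.999,   # assumed threshold
) -> bool:
    """True if `test` is close to `ref` both element-wise and in direction."""
    return (
        max_abs_diff(ref, test) <= atol
        and cosine_similarity(ref, test) >= min_cosine
    )


def mean_latency(fn: Callable[[], object], warmup: int = 2, iters: int = 10) -> float:
    """Mean wall-clock seconds per call of `fn`, after a few warm-up calls."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


# Toy stand-ins for the reference and new ViT outputs on the same image.
reference = [0.1, -0.2, 0.3]
candidate = [0.1002, -0.1999, 0.3001]
print(embeddings_match(reference, candidate))
```

The same comparison would be repeated for the TP=1 and TP>1 runs against a single reference output. When timing on a GPU, the callable passed to `mean_latency` would also need to synchronize the device before returning; otherwise the timer mostly measures kernel launch overhead.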
@yixqiao Thank you for the work! I will take it over from here.
Add a new ViT class in vLLM for Qwen2.5-VL, removing the Hugging Face pretrained dependency.
Compared to Qwen2-VL, this includes changes to the MLP, window-based partial attention, and RMSNorm. Operations are parallelized where appropriate.
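For readers unfamiliar with two of the components named above, here is a small pure-Python sketch of the general techniques. It is not the code in this PR: `rms_norm` follows the standard RMSNorm definition (scale by the reciprocal root mean square, then by a learned weight), and `window_ids` / `window_attention_mask` illustrate one simple way to restrict attention to non-overlapping square windows of patches. The function names, the `eps` default, and the row-major patch layout are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def rms_norm(
    x: Sequence[float],
    weight: Sequence[float],
    eps: float = 1e-6,  # assumed value for numerical stability
) -> List[float]:
    """Standard RMSNorm over one feature vector: x / sqrt(mean(x^2) + eps) * weight."""
    if len(x) != len(weight):
        raise ValueError(f"length mismatch: {len(x)} vs {len(weight)}")
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def window_ids(grid_h: int, grid_w: int, window: int) -> List[int]:
    """Window index of each patch (row-major) for non-overlapping window x window tiles."""
    windows_per_row = math.ceil(grid_w / window)
    return [
        (r // window) * windows_per_row + (c // window)
        for r in range(grid_h)
        for c in range(grid_w)
    ]


def window_attention_mask(ids: Sequence[int]) -> List[List[bool]]:
    """mask[q][k] is True when patch q may attend to patch k (same window)."""
    return [[qi == ki for ki in ids] for qi in ids]


# A 2 x 4 patch grid split into 2 x 2 windows.
ids = window_ids(grid_h=2, grid_w=4, window=2)
mask = window_attention_mask(ids)
print(ids)
print(rms_norm([3.0, 4.0], [1.0, 1.0]))
```

A dense boolean mask like this is only meant to make the attention pattern visible; an efficient implementation would more likely group patches by window and attend within each group.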