[Kernel] Pipe attn_logits_soft_cap through paged attention TPU kernels #12482
alexm-redhat merged 1 commit into vllm-project:main
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. 🚀
Signed-off-by: Fenghui Zhang <fhzhang@google.com>
Force-pushed from 30a6814 to ef40b6f
cc @vanbasten23
alexm-redhat left a comment
@fenghuizhang thanks for doing this! LGTM
    context_lens,
    block_tables,
    pages_per_compute_block,
    megacore_mode=megacore_mode,
megacore_mode is checked internally for None?
Yep, we fixed it on the ptxla side. No workaround needed here.
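For context, a minimal sketch of what the call site looks like with the new argument piped through. Only the arguments shown in the diff above come from the PR itself; the function signature, tensor names (query, key_cache, value_cache), and shapes are assumptions for illustration:

```python
# Hypothetical call-site sketch, not the exact vLLM code.
output = paged_attention(
    query,                   # assumed: [num_seqs, num_heads, head_size]
    key_cache,
    value_cache,
    context_lens,
    block_tables,
    pages_per_compute_block,
    megacore_mode=megacore_mode,                # may be None; handled inside the kernel, per the thread above
    attn_logits_soft_cap=attn_logits_soft_cap,  # the new parameter this PR pipes through
)
```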
Pipe attn_logits_soft_cap through paged_attention; this will unblock adoption for some of our models.
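For readers unfamiliar with the parameter: attn_logits_soft_cap applies a tanh-based cap that bounds the attention logits to (-cap, cap) before the softmax, which models such as Gemma 2 rely on. Below is a minimal standalone sketch of the math; the helper name is hypothetical, and in this PR the equivalent transform is applied inside the TPU kernel that receives the value:

```python
from typing import Optional

import torch

def soft_cap_logits(logits: torch.Tensor,
                    soft_cap: Optional[float]) -> torch.Tensor:
    """Tanh soft-capping: bounds attention logits to (-soft_cap, soft_cap).

    Hypothetical reference helper; the kernel receives the cap as
    `attn_logits_soft_cap` and applies the same transform internally.
    """
    if soft_cap is None:
        return logits  # capping disabled
    return soft_cap * torch.tanh(logits / soft_cap)
```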
Note that the changed code currently doesn't have unit tests. I will add one later.
This is the same as #12294. We created a new PR as I messed up the sign-off and the previous commit chain.
Thanks,