Add Production Metrics in Prometheus format #1890
Conversation
Add /metrics to the OpenAI endpoint, exposing the metrics that were already logged.
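To illustrate the idea, here is a minimal sketch of a /metrics route. This is not the PR's actual implementation (vLLM's OpenAI server uses a real web framework, and the metric name below is invented for illustration); a stdlib `http.server` handler just shows the shape of the Prometheus text exposition format.

```python
# Hypothetical sketch of a /metrics endpoint in Prometheus text format.
# METRICS and the metric name are illustrative, not the PR's actual state.
import http.server
import threading

METRICS = {"vllm:num_requests_running": 0}

class MetricsHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        lines = []
        for name, value in METRICS.items():
            lines.append(f"# TYPE {name} gauge")
            lines.append(f"{name} {value}")
        body = ("\n".join(lines) + "\n").encode()
        self.send_response(200)
        # Content type used by the Prometheus text exposition format.
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging

def serve(port=0):
    """Start the server on an ephemeral port in a background thread."""
    server = http.server.HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A Prometheus server would then be configured to scrape this endpoint on an interval; the handler only ever reports the current values, so there is nothing to accumulate server-side.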
That looks great! Sorry for the delay.
WoosukKwon left a comment
@simon-mo Thanks for the awesome work! Left some minor comments.
@WoosukKwon Updated!
@simon-mo A dumb question: Can you provide an example script that I can use to test the PR?
I see. I was testing …
It's probably not something you want to expose to users, nor is it part of the OpenAI API spec. It's better to leave it at …
@simon-mo WDYT?
Hi @simon-mo, it seems like it's not working properly when …
Thank you for the PR! I noticed that … Here are a few considerations:
I'm curious to understand the rationale behind choosing …
@ichernev @simon-mo @WoosukKwon @Yard1 Could you please share your thoughts on this matter?
I don't have a particular preference. aioprometheus seems to offer both good integration and an all-in-one lightweight package.
This one builds on #1662. It adds aioprometheus as a dependency, which is very lightweight. It exposes the metrics as we perform the regular logging pass in the engine step. The memory usage is small and constant (there is no history, only current state). The metrics are designed to be scraped and stored by an external service. We currently just add metrics in the engine step. Follow-up: #1870
Here's an example of the metrics endpoint output.