Currently, dstack Models work with the Chat Completions API, but in March 2025 OpenAI introduced the Responses API (see the migration guide). OpenAI states that the Chat Completions API will not be deprecated, but the Responses API is now the recommended default, with new features and improved efficiency. Ecosystem projects (vLLM, Ollama, etc.) are actively adding Responses API support, largely because Codex does not work with Chat Completions.
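For context, a minimal sketch of how the two request formats differ (payload shapes only, sent against a generic OpenAI-compatible base URL; the model name and prompt are placeholders, not dstack specifics):

```python
# Chat Completions API: POST /v1/chat/completions takes a `messages` list
# of role/content turns.
chat_completions_request = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Hello"}],
}

# Responses API: POST /v1/responses takes an `input` field instead, which
# may be a plain string or a list of structured items.
responses_request = {
    "model": "my-model",
    "input": "Hello",
}

print(sorted(chat_completions_request))  # ['messages', 'model']
print(sorted(responses_request))         # ['input', 'model']
```

A server can support both formats on the same base URL, since they are distinguished by path, but clients like Codex only speak the Responses shape.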
In our docs, a chat model is said to have an OpenAI-compatible endpoint, which is confusing, since there are now several OpenAI API formats, with the Responses API being the new default.
From `dstack/docs/docs/concepts/services.md`, lines 163 to 165 in 38e66bc:

> If the service is running a chat model with an OpenAI-compatible interface,
> set the [`model`](#model) property to make the model accessible via `dstack`'s
> global OpenAI-compatible endpoint, and also accessible via `dstack`'s UI.
Consider adding Responses API support to dstack if we continue to provide OpenAI-compatible endpoints.