LLaVA has various quantized models available in GGUF format, so it should be usable with llama.cpp (https://github.com/ggerganov/llama.cpp/pull/3436). Is this possible?
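For context, a minimal sketch of how the LLaVA example added in that PR might be invoked, assuming the `llava-cli` binary has been built from llama.cpp; the model and projector filenames below are placeholders, not actual download paths:

```shell
# Sketch: run a quantized LLaVA model via llama.cpp's llava-cli example.
# -m is the quantized GGUF language model; --mmproj is the multimodal
# projector weights that ship alongside LLaVA GGUF conversions.
# Filenames here are placeholders for illustration only.
./llava-cli \
  -m models/llava-v1.5-7b.Q4_K_M.gguf \
  --mmproj models/mmproj-model-f16.gguf \
  --image ./example.jpg \
  -p "Describe this image in detail."
```

Whether a given quantized LLaVA GGUF works depends on it including (or being paired with) the separate `mmproj` projector file, since the vision encoder is not part of the language-model GGUF itself.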