Kimi CLI
Connect Kimi CLI to Yolo Router with OpenAI-compatible models.
- Kimi CLI is installed and available as
kimi. - A working Yolo Router endpoint, for example
https://api.yolorouter.com. - A Yolo Router API Key generated from the console.
- The model ID you want to use, for example
deepseek-v4-pro. It must exactly match the model ID in the Yolo Router console.
Kimi CLI supports manually configured third-party providers. For OpenAI Chat Completions compatible services such as Yolo Router, use provider type
openai_legacyand setbase_urlto the/v1endpoint.
Setup
Check Kimi CLI
Run:
kimi --versionIf it is not installed yet, install Kimi CLI using the official installation flow before continuing.
Set API Key environment variables
Because the provider type below is openai_legacy, Kimi CLI reads OPENAI_API_KEY and OPENAI_BASE_URL for environment overrides.
export OPENAI_API_KEY="sk-yolo-..."
export OPENAI_BASE_URL="https://api.yolorouter.com/v1"$env:OPENAI_API_KEY = "sk-yolo-..."
$env:OPENAI_BASE_URL = "https://api.yolorouter.com/v1"set OPENAI_API_KEY=sk-yolo-...
set OPENAI_BASE_URL=https://api.yolorouter.com/v1Persist them in your shell profile or system environment for regular use. Do not commit the real API Key to your repository.
Edit the config file
Kimi CLI's default config file is:
~/.kimi/config.tomlOn Windows this is usually:
%USERPROFILE%\.kimi\config.tomlYou can also type /config inside Kimi CLI to open the config file quickly.
Add the Yolo Router provider
Add or merge the following config:
default_model = "deepseek-v4-pro"
[providers.yolorouter]
type = "openai_legacy"
base_url = "https://api.yolorouter.com/v1"
api_key = ""
[models.deepseek-v4-pro]
provider = "yolorouter"
model = "deepseek-v4-pro"
max_context_size = 128000
capabilities = ["thinking"]
display_name = "DeepSeek V4 Pro"
[models.deepseek-v4-flash]
provider = "yolorouter"
model = "deepseek-v4-flash"
max_context_size = 128000
display_name = "DeepSeek V4 Flash"Field notes:
default_model: the model Kimi CLI uses by default; it must be a key defined under[models.*]type: useopenai_legacyfor OpenAI Chat Completions compatible servicesbase_url: your Yolo Router OpenAI-compatible endpoint, with/v1api_key: left empty in this example because the real Key is injected throughOPENAI_API_KEYmodels.*.provider: must point to theyolorouterprovider abovemodels.*.model: must exactly match the model ID in the Yolo Router consolecapabilities = ["thinking"]: only set this for models that support reasoning/thinking mode; remove it for normal models
Start and select a model
Open your project directory and run:
kimiInside Kimi CLI, type:
/modelSelect deepseek-v4-pro or another Yolo Router model you configured.
You can also select the model at startup:
kimi --model deepseek-v4-proTest the connection
In interactive mode, send:
Introduce the model you are using in one sentence.Or run a one-off test with Print mode:
kimi --print "Introduce the model you are using in one sentence"If Kimi CLI returns a normal response, the integration is working. You can also check the Yolo Router console for the corresponding request log.
Troubleshooting
Yolo Router models do not appear in /model
Check that [models.deepseek-v4-pro] is written to ~/.kimi/config.toml, and that provider = "yolorouter" matches [providers.yolorouter]. Restart Kimi CLI after changing config.
401 or authentication error
Check that OPENAI_API_KEY is set to your Yolo Router API Key and has no extra spaces. If you are not using environment variables, you can write the Key directly into [providers.yolorouter] as api_key.
404 or endpoint not found
Check that base_url or OPENAI_BASE_URL is https://api.yolorouter.com/v1; do not use only https://api.yolorouter.com.
Model not found
Confirm that models.*.model, default_model, and --model exactly match the model ID in the Yolo Router console. Do not use a display name, alias, or value with extra spaces.
Thinking mode behaves unexpectedly
If the model does not support reasoning/thinking, remove capabilities = ["thinking"] from that model. If the model supports reasoning but behaves unexpectedly, first test basic connectivity with thinking mode disabled.