Agent Tools

Kimi CLI

Connect Kimi CLI to Yolo Router with OpenAI-compatible models.

iBefore you start
  • Kimi CLI is installed and available as kimi.
  • A working Yolo Router endpoint, for example https://api.yolorouter.com.
  • A Yolo Router API Key generated from the console.
  • The model ID you want to use, for example deepseek-v4-pro. It must exactly match the model ID in the Yolo Router console.

Kimi CLI supports manually configured third-party providers. For OpenAI Chat Completions compatible services such as Yolo Router, use provider type openai_legacy and set base_url to the /v1 endpoint.

Setup

Check Kimi CLI

Run:

kimi --version

If it is not installed yet, install Kimi CLI using the official installation flow before continuing.

Set API Key environment variables

Because the provider type below is openai_legacy, Kimi CLI reads OPENAI_API_KEY and OPENAI_BASE_URL for environment overrides.

export OPENAI_API_KEY="sk-yolo-..."
export OPENAI_BASE_URL="https://api.yolorouter.com/v1"
$env:OPENAI_API_KEY = "sk-yolo-..."
$env:OPENAI_BASE_URL = "https://api.yolorouter.com/v1"
set OPENAI_API_KEY=sk-yolo-...
set OPENAI_BASE_URL=https://api.yolorouter.com/v1

Persist them in your shell profile or system environment for regular use. Do not commit the real API Key to your repository.

Edit the config file

Kimi CLI's default config file is:

~/.kimi/config.toml

On Windows this is usually:

%USERPROFILE%\.kimi\config.toml

You can also type /config inside Kimi CLI to open the config file quickly.

Add the Yolo Router provider

Add or merge the following config:

default_model = "deepseek-v4-pro"

[providers.yolorouter]
type = "openai_legacy"
base_url = "https://api.yolorouter.com/v1"
api_key = ""

[models.deepseek-v4-pro]
provider = "yolorouter"
model = "deepseek-v4-pro"
max_context_size = 128000
capabilities = ["thinking"]
display_name = "DeepSeek V4 Pro"

[models.deepseek-v4-flash]
provider = "yolorouter"
model = "deepseek-v4-flash"
max_context_size = 128000
display_name = "DeepSeek V4 Flash"

Field notes:

  • default_model: the model Kimi CLI uses by default; it must be a key defined under [models.*]
  • type: use openai_legacy for OpenAI Chat Completions compatible services
  • base_url: your Yolo Router OpenAI-compatible endpoint, with /v1
  • api_key: left empty in this example because the real Key is injected through OPENAI_API_KEY
  • models.*.provider: must point to the yolorouter provider above
  • models.*.model: must exactly match the model ID in the Yolo Router console
  • capabilities = ["thinking"]: only set this for models that support reasoning/thinking mode; remove it for normal models

Start and select a model

Open your project directory and run:

kimi

Inside Kimi CLI, type:

/model

Select deepseek-v4-pro or another Yolo Router model you configured.

You can also select the model at startup:

kimi --model deepseek-v4-pro

Test the connection

In interactive mode, send:

Introduce the model you are using in one sentence.

Or run a one-off test with Print mode:

kimi --print "Introduce the model you are using in one sentence"

If Kimi CLI returns a normal response, the integration is working. You can also check the Yolo Router console for the corresponding request log.

Troubleshooting

Yolo Router models do not appear in /model

Check that [models.deepseek-v4-pro] is written to ~/.kimi/config.toml, and that provider = "yolorouter" matches [providers.yolorouter]. Restart Kimi CLI after changing config.

401 or authentication error

Check that OPENAI_API_KEY is set to your Yolo Router API Key and has no extra spaces. If you are not using environment variables, you can write the Key directly into [providers.yolorouter] as api_key.

404 or endpoint not found

Check that base_url or OPENAI_BASE_URL is https://api.yolorouter.com/v1; do not use only https://api.yolorouter.com.

Model not found

Confirm that models.*.model, default_model, and --model exactly match the model ID in the Yolo Router console. Do not use a display name, alias, or value with extra spaces.

Thinking mode behaves unexpectedly

If the model does not support reasoning/thinking, remove capabilities = ["thinking"] from that model. If the model supports reasoning but behaves unexpectedly, first test basic connectivity with thinking mode disabled.

On this page