A CLI tool for interacting with remote LLM APIs, with support for multiple backends and a flexible configuration system.
$ echo "red and yellow" | clai --prompt "Does mixing these colors yield orange?"
Yes, mixing red and yellow typically yields orange.
$ echo "red and yellow" | clai --bool --prompt "Mixing these colors yields orange."
{"answer":true,"reason":"Mixing red and yellow colors indeed yields orange, which is a basic principle of color theory. The context provided is sufficient as it directly states the colors involved an the resulting color."}
$ echo $?
0
From the root of the repository:
python -m pip install .
The CLI requires a config file in which the various LLM backends are
configured. The location of the config file can be set through --config.
There is no default location, so it must be set explicitly. To avoid passing
--config on every invocation, you can instead set the location in the
CLAI_CONFIG environment variable.
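For example (the config path below is just a placeholder):

clai --config ./clai.yaml --prompt "Greetings!"

or, once per shell session:

export CLAI_CONFIG=./clai.yaml
clai --prompt "Greetings!"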
The config file has the following format:
backends:
  openai:
    default:
      token: ${{CLAI_OPENAI_TOKEN}}
      max_tokens: 4096
      model: gpt-4o
      system: You are a helpful assistant.
  azure_openai:
    default:
      token: ${{CLAI_AZURE_OPENAI_TOKEN}}
      endpoint: "https://api.chatgpt.mycorp.com/"
      api_version: "2024-10-21"
      model: "gpt-4o"
      max_tokens: 4096
      system: You are a helpful assistant.
  mistral:
    default:
      token: ${{CLAI_MISTRAL_TOKEN}}
      max_tokens: 4096
      model: mistral-small-latest
      system: You are a helpful assistant.
The config file contains the connection details for various backends.
Multiple instances per backend are possible and can be referred to using the
--backend and --instance parameters:
clai --prompt "Greetings!" --backend openai --instance default
Both --backend and --instance can also be set via the environment variables
CLAI_BACKEND and CLAI_INSTANCE, respectively.
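For example, to make the Mistral instance from the example config above the default for a shell session:

export CLAI_BACKEND=mistral
export CLAI_INSTANCE=default
clai --prompt "Greetings!"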
See the backends section below for a detailed explanation of the parameters each backend supports.
Values in the configuration file, such as ${{ENV_VAR}}, will be replaced
with the corresponding environment variable. This approach helps avoid
storing sensitive data in the configuration file, though it is not limited to
this use.
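With the example config above, the OpenAI token would be supplied like this (the token value is a placeholder):

export CLAI_OPENAI_TOKEN="sk-placeholder"
clai --prompt "Greetings!" --backend openai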
The CLI has a number of built-in conveniences.
STDIN input is concatenated after the --prompt value:
cat script.py | clai --prompt "Explain why this code produces the following error: can only concatenate tuple (not str) to tuple"
Enabling --bool ensures that the LLM returns a JSON-formatted response that
includes a boolean answer of either true or false and the reason for that
conclusion. Additionally, the exit code will be 0 for true and 1 for false.
false example:
$ echo "red and blue" | clai --bool --prompt "Mixing these colors yields orange."
{"answer":false,"reason":"Mixing red and blue yields purple, not orange. The context was sufficient as it clearly stated the colors to be mixed."}
$ echo $?
1
true example:
$ echo "red and yellow" | clai --bool --prompt "Mixing these colors yields orange."
{"answer":true,"reason":"Mixing red and yellow colors indeed yields orange, which is a basic principle of color theory. The context provided is sufficient as it directly states the colors involved an the resulting color."}
$ echo $?
0
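Because the exit code mirrors the boolean answer, --bool can drive shell conditionals directly, for example:

if echo "red and yellow" | clai --bool --prompt "Mixing these colors yields orange."; then
    echo "confirmed"
else
    echo "rejected"
fi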
The OpenAI backend supports the following parameters:
token: The OpenAI API token
max_tokens: The maximum number of tokens the prompt is allowed to generate
model: The model to use
system: The system prompt
temperature: The prompt temperature (default: 0)
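As an illustration, an additional OpenAI instance with a non-default temperature could be configured like this (the instance name creative is hypothetical):

backends:
  openai:
    creative:
      token: ${{CLAI_OPENAI_TOKEN}}
      max_tokens: 4096
      model: gpt-4o
      system: You are a helpful assistant.
      temperature: 0.7

and selected with clai --prompt "Greetings!" --backend openai --instance creative.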
The Azure OpenAI backend (see https://learn.microsoft.com/en-us/azure/ai-services/openai/) supports the following parameters:
token: The Azure OpenAI API token
endpoint: The endpoint to connect to
api_version: The API version to use
model: The model to use
base_model (default: None): The name of the underlying OpenAI model, needed to tokenize the input properly. When model has a non-standard name, the actual OpenAI model it is based on cannot be inferred, so specify it here.
max_tokens: The maximum number of tokens the prompt is allowed to generate
system: The system prompt
temperature: The prompt temperature (default: 0)
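For example, when the Azure deployment has a non-standard name, base_model tells the CLI which OpenAI model to tokenize for (the deployment name below is hypothetical):

backends:
  azure_openai:
    default:
      token: ${{CLAI_AZURE_OPENAI_TOKEN}}
      endpoint: "https://api.chatgpt.mycorp.com/"
      api_version: "2024-10-21"
      model: "mycorp-gpt-4o-deployment"
      base_model: "gpt-4o"
      max_tokens: 4096
      system: You are a helpful assistant.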
The Mistral backend supports the following parameters:
token: The Mistral API token
max_tokens: The maximum number of tokens the prompt is allowed to generate. (Caveat: the token calculation is not accurate, since there is no equivalent of tiktoken that allows tokenizing locally without external dependencies.)
model: The model to use
system: The system prompt
temperature: The prompt temperature (default: 0)
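With the example config above, the same query can be sent to the Mistral instance like this:

echo "red and yellow" | clai --prompt "Does mixing these colors yield orange?" --backend mistral --instance default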