Documentation of server command line parameters. #635

Open
@arthurwolf

Description

I run python3 -m llama_cpp.server in order to call the API from my scripts.
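
For reference, my invocation looks roughly like this (the model path is just an example from my setup; --host and --port are the usual server options):

    # example invocation; the model path is specific to my machine
    python3 -m llama_cpp.server --model ./models/llama-2-7b-chat.Q4_K_M.gguf --host 0.0.0.0 --port 8000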

I'd like to enable prompt caching (like I can do in llama.cpp), but the command line options that work for the llama.cpp server don't work for this project.
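
For comparison, this is the kind of prompt caching I mean in llama.cpp (flag names from its main example, as I remember them, so they may not be exact or apply to its server):

    # llama.cpp's main example can save the evaluated prompt to a file and reuse it
    ./main -m ./models/llama-2-7b-chat.Q4_K_M.gguf -f prompt.txt --prompt-cache prompt-cache.bin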

I searched the docs and couldn't find any documentation of the command line options that would work.

After an error from trying random command line options, I did get this usage output on the command line:

/home/arthur/.local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
  warnings.warn(
usage: __main__.py [-h] [--model MODEL] [--model_alias MODEL_ALIAS] [--n_ctx N_CTX] [--n_gpu_layers N_GPU_LAYERS] [--tensor_split TENSOR_SPLIT]
                   [--rope_freq_base ROPE_FREQ_BASE] [--rope_freq_scale ROPE_FREQ_SCALE] [--seed SEED] [--n_batch N_BATCH] [--n_threads N_THREADS]
                   [--f16_kv F16_KV] [--use_mlock USE_MLOCK] [--use_mmap USE_MMAP] [--embedding EMBEDDING] [--low_vram LOW_VRAM]
                   [--last_n_tokens_size LAST_N_TOKENS_SIZE] [--logits_all LOGITS_ALL] [--cache CACHE] [--cache_type CACHE_TYPE] [--cache_size CACHE_SIZE]
                   [--vocab_only VOCAB_ONLY] [--verbose VERBOSE] [--host HOST] [--port PORT] [--interrupt_requests INTERRUPT_REQUESTS] [--n_gqa N_GQA]
                   [--rms_norm_eps RMS_NORM_EPS] [--mul_mat_q MUL_MAT_Q]

From that output I can see these look like what I'm looking for:

[--cache CACHE] [--cache_type CACHE_TYPE] [--cache_size CACHE_SIZE]

However:

  1. I have no idea what the format for CACHE, CACHE_TYPE and CACHE_SIZE is, or the precise meaning/effect of each option (my best guess is sketched below, but it is unverified).
  2. I would also be very interested in knowing what the other options mean.
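
My best guess, purely from the option names (so the values below are assumptions, not something I found in any docs), would be an invocation like:

    # guessed values: cache as a boolean, cache_type naming a backend, cache_size in bytes
    python3 -m llama_cpp.server --model ./models/llama-2-7b-chat.Q4_K_M.gguf \
        --cache true --cache_type ram --cache_size 2147483648

but I can't tell whether any of that is right, which is why documentation would help a lot.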

Is there any documentation anywhere of what these mean/how to use them?

(By the way, following the exact same format/names as llama.cpp wherever possible might be a good idea; it would have enabled me to get this working without bothering you, since using the llama.cpp formats/options was the first thing I tried.)

Thanks a lot for any possible help.

Best regards.

Labels: documentation (Improvements or additions to documentation)
