Description
Is your feature request related to a problem? Please describe.
Given the rapid evolution of LLMs and the variety of prompt formats they use, we need a way to specify a prompt at runtime, not only in code, e.g. mistral -> mistral+orca -> mistral-zephyr within a week
Describe the solution you'd like
An endpoint to override a prompt or add a runtime prompt for the model I am currently evaluating
Describe alternatives you've considered
submitting a pull request for each and every LLM I encounter, OR polling the chat templating feature ("huggingface.co/docs/transformers/main/en/chat_templating") for the model in question, but that introduces one more dependency
Additional context
An example: First, I run everything in a docker/k8s setup. I call an endpoint to specify the prompt, if it is not already specified in code. That populates a runtime version of
```python
@register_chat_format("mistral")
def format_mistral(
    messages: List[llama_types.ChatCompletionRequestMessage],
    **kwargs: Any,
) -> ChatFormatterResponse:
    _roles = dict(user="[INST] ", assistant="[/INST]")
    _sep = " "
    system_template = """{system_message}"""
    system_message = _get_system_message(messages)
    system_message = system_template.format(system_message=system_message)
    _messages = _map_roles(messages, _roles)
    _messages.append((_roles["assistant"], None))
    _prompt = _format_no_colon_single(system_message, _messages, _sep)
    return ChatFormatterResponse(prompt=_prompt)
```
that I can use for calling the model.
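To make the idea concrete, here is a minimal, self-contained sketch of what such a runtime registry could look like. All names here (`PromptSpec`, `register_runtime_format`, `format_prompt`) are hypothetical and not part of llama-cpp-python's API; the endpoint handler would simply validate a posted payload and call `register_runtime_format`:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical runtime registry: maps a model name to a prompt spec
# supplied via an HTTP endpoint instead of a code change and release.
_RUNTIME_FORMATS: Dict[str, "PromptSpec"] = {}


@dataclass
class PromptSpec:
    """Role prefixes and separator posted at runtime (hypothetical shape)."""
    roles: Dict[str, str]  # e.g. {"user": "[INST] ", "assistant": "[/INST]"}
    sep: str = " "


def register_runtime_format(name: str, spec: PromptSpec) -> None:
    """What the endpoint handler would call after validating the payload."""
    _RUNTIME_FORMATS[name] = spec


def format_prompt(name: str, messages: List[Tuple[str, Optional[str]]]) -> str:
    """Render messages with the runtime spec, mirroring format_mistral above.

    A (role, None) pair emits only the role prefix, which leaves the
    prompt open for the model to continue (as the appended assistant
    role with a None message does in format_mistral).
    """
    spec = _RUNTIME_FORMATS[name]
    parts = []
    for role, content in messages:
        prefix = spec.roles.get(role, "")
        parts.append(prefix + content if content is not None else prefix)
    return spec.sep.join(parts)


# Example: override "mistral" at runtime, without a new release
register_runtime_format(
    "mistral",
    PromptSpec(roles={"user": "[INST] ", "assistant": "[/INST]"}),
)
prompt = format_prompt("mistral", [("user", "Hello"), ("assistant", None)])
```

With something like this in place, switching from mistral to a mistral-zephyr finetune is a single POST to the endpoint rather than a pull request.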