Unable to use any Azure Models other than OpenAI ones #7275

Open
semidark opened this issue Dec 17, 2024 Discussed in #7098 · 7 comments

Comments

@semidark

Discussed in #7098

Originally posted by semidark December 9, 2024
Hi Folks,

I just got started with LiteLLM as a proxy because I needed an OpenAI-compatible API for my Azure AI Service. I began by working with OpenAI models (gpt-4o and gpt-4o-mini), and everything worked perfectly. So I thought, why not try some other models? I attempted to use Mistral-large.

Here’s my configuration:

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: azure/gpt-4o
      api_base: "https://censored.openai.azure.com/"
      api_key: "censored"
      api_version: "2024-05-01-preview"
  - model_name: gpt-4o-mini
    litellm_params:
      model: azure/gpt-4o-mini
      api_base: "https://censored.openai.azure.com/"
      api_key: "censored"
      api_version: "2024-05-01-preview"
  - model_name: Mistral-large
    litellm_params:
      model: azure_ai/Mistral-large
      api_base: "https://censored.services.ai.azure.com/"
      api_key: "censored"
      api_version: "2024-05-01-preview"

I tried a few variations of the model string (mistral-large, Mistral-large-latest, mistral-large), but nothing worked. I always receive the following error:

23:01:35 - LiteLLM Proxy:DEBUG: proxy_server.py:3494 - An error occurred: litellm.NotFoundError: NotFoundError: Azure_aiException - Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}
Received Model Group=Mistral-large
Available Model Group Fallbacks=None
Model: azure_ai/Mistral-large
API Base: `https://censored.services.ai.azure.com/`
Messages: `[{'role': 'user', 'content': '### Task:\nGenerate 1-3 broad tags categorizing the main themes of the'}`
model_group: `Mistral-large`
deployment: `azure_ai/Mistral-large`

I checked the metrics of the deployment under ai.azure.com and it shows a few successful requests, but no input or output tokens measured.

Any ideas on what I might be doing wrong here?

@emerzon
Contributor

emerzon commented Dec 17, 2024

Are you using the correct endpoints?

Non-OpenAI models normally have an endpoint in the format <name>.<region>.models.ai.azure.com

You need to create this endpoint for each model in Azure AI Foundry.
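
For reference, a direct call against such a per-model endpoint would look roughly like the sketch below (using the litellm Python SDK; the endpoint host and key are placeholders, not a verified setup):

import litellm

# Rough sketch: call a per-model serverless endpoint directly.
# The endpoint host and key below are placeholders, not a real deployment.
response = litellm.completion(
    model="azure_ai/Mistral-large",
    api_base="https://my-mistral-large.eastus2.models.ai.azure.com",  # hypothetical regional endpoint
    api_key="censored",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)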

@semidark
Author

semidark commented Dec 18, 2024

I obtained my endpoint from the overview at https://ai.azure.com/, specifically from the project I created for this task. The project has the capabilities: Azure AI Inference, Azure OpenAI, and Azure AI Services.

I attempted to use the endpoints listed there, but I received the 404 error I mentioned earlier when using the Azure AI Services endpoint:
https://ai-myusername3214234dsf12334.cognitiveservices.azure.com/

While writing this, I realized I made an error when copying the Azure AI Inference endpoint. It seems I accidentally removed the trailing /models string. When I include the /models path, I encounter a different error:

https://ai-myusername3214234dsf12334.services.ai.azure.com/models

This time, I receive a 401 error:

litellm.exceptions.AuthenticationError: litellm.AuthenticationError: AuthenticationError: Azure_aiException - Error code: 401 - {'statusCode': 401, 'message': 'Unauthorized. Access token is missing, invalid, audience is incorrect (https://cognitiveservices.azure.com), or has expired.'}

I'm not sure if this counts as progress. I have quadruple-checked the API key configured in the litellm config.
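
For what it's worth, the rough sketch below is how I would try to test the key against the /models endpoint outside of LiteLLM (assuming the inference endpoint accepts the key via an api-key header; the URL is the same example as above):

import requests

# Rough sketch: verify the key against the Azure AI inference endpoint directly.
# Assumes the /models base accepts key auth via an "api-key" header
# (rather than an Authorization: Bearer token); URL and key are placeholders.
resp = requests.post(
    "https://ai-myusername3214234dsf12334.services.ai.azure.com/models/chat/completions",
    params={"api-version": "2024-05-01-preview"},
    headers={"api-key": "censored"},
    json={
        "model": "Mistral-large",
        "messages": [{"role": "user", "content": "Hello"}],
    },
)
print(resp.status_code, resp.text)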

@emerzon
Contributor

emerzon commented Dec 18, 2024

Yes, the issue seems to be that you're trying to use an OpenAI endpoint instead of an AI Studio (or AI Foundry) endpoint. Their naming is confusing and they keep changing it.

You will need an endpoint ending with models.ai.azure.com.

You will also have one endpoint per model.

@semidark
Author

semidark commented Dec 19, 2024

Hi @emerzon, thank you for your patience. I'm sure I've never tried to use the Azure OpenAI endpoint for other models like Mistral-large. I understand these are two separate components in LiteLLM and Azure.

My endpoint looks something like this: https://ai-myusername3214234dsf12334.services.ai.azure.com/models (just an example). As I mentioned earlier, I'm now receiving a 401 error instead of the previous 404 error.

I've explored all the options in the Azure Web Consoles (https://ai.azure.com and https://portal.azure.com), but they consistently provide the same endpoint URL and API Key everywhere. To test whether I have a fundamental misunderstanding of the configuration or setup of liteLLM, I added the Amazon Bedrock serverless inference endpoints without any issues.

I'm really stuck with the Azure AI backend.

@emerzon
Contributor

emerzon commented Dec 19, 2024

Hi @semidark - My bad. I just checked Azure and noticed that they now offer global endpoints for Mistral models.
This was not the case when I deployed, which is why I ended up with regional endpoints like <name>.<region>.models.ai.azure.com

My current regional endpoints work fine with LiteLLM.
It's possible that there are some differences in how the global endpoint should be used, so I guess this needs further investigation.

@semidark
Author

semidark commented Dec 20, 2024

Ah, I see. So it really is a bug. Good to know. How can I assist with the investigation? As mentioned in my original post, I've just started working with litellm. The bug is probably located somewhere around here: https://github.com/BerriAI/litellm/tree/888b3a25afa514847deae0307d4b7cc495206564/litellm/llms/azure.

Additionally, this seems to be the official documentation for creating and using Serverless Inference Endpoints: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-models-serverless?view=azureml-api-2&tabs=azure-studio

@danrr

danrr commented Dec 20, 2024

I think I just ran into this and may have found the cause: the handler for azure_ai models defers to the OpenAI handler, which initializes a regular OpenAI client that sends a bearer-token auth header, rather than the AzureOpenAI client, which sends the api-key header.
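
To illustrate the difference (assuming the openai Python SDK), the two clients send the key in different headers:

from openai import OpenAI, AzureOpenAI

# Plain OpenAI client: sends the key as "Authorization: Bearer <key>".
client_openai = OpenAI(
    base_url="https://example.services.ai.azure.com/models",  # placeholder
    api_key="censored",
)

# AzureOpenAI client: sends the same key as an "api-key: <key>" header instead.
client_azure = AzureOpenAI(
    azure_endpoint="https://example.openai.azure.com/",  # placeholder
    api_key="censored",
    api_version="2024-05-01-preview",
)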
