Conversation


@Sameerlite Sameerlite commented Oct 27, 2025

Title

Support for Custom Vertex AI Models via PSC Endpoint with api_base

Relevant issues

Fixes LIT-1096

Pre-Submission checklist

  • I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement (see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix
🆕 New Feature

Changes

This PR adds comprehensive support for Vertex AI Private Service Connect (PSC) endpoints, allowing users to use custom api_base URLs for both completion and embedding requests. This enables access to privately deployed Vertex AI models through internal network endpoints.

Key Features Added

  1. PSC Endpoint URL Construction: Enhanced _check_custom_proxy() to properly construct full PSC URLs with the format:

    {api_base}/v1/projects/{project}/locations/{location}/endpoints/{model}:{endpoint}
    
  2. Numeric Model ID Support: Modified routing logic to ensure numeric endpoint IDs (common for custom deployments) properly use the HTTP-based handler that respects api_base.

  3. Comprehensive Parameter Passing: Updated all Vertex AI handlers to pass necessary parameters (vertex_project, vertex_location, vertex_api_version) for proper PSC URL construction.

  4. Bug Fix: Fixed a pre-existing JSON serialization bug in Vertex AI embeddings where non-serializable objects were being passed to TypedDict constructors.
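
As a rough sketch of the URL construction in point 1 (the helper name and signature are illustrative, not the actual LiteLLM internals):

```python
def build_psc_url(api_base: str, project: str, location: str,
                  endpoint_id: str, verb: str = "generateContent") -> str:
    """Construct a full PSC endpoint URL in the format
    {api_base}/v1/projects/{project}/locations/{location}/endpoints/{model}:{endpoint}."""
    api_base = api_base.rstrip("/")  # avoid a double slash when joining
    return (f"{api_base}/v1/projects/{project}/locations/{location}"
            f"/endpoints/{endpoint_id}:{verb}")
```

For example, an internal PSC IP plus a numeric endpoint ID expands into the full predict/generate URL that the HTTP handler can call directly.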

Technical Changes

Core URL Construction (litellm/llms/vertex_ai/vertex_llm_base.py)

  • Enhanced _check_custom_proxy() to detect PSC endpoints and construct full URL paths
  • Added logic to handle both PSC endpoints and standard proxy configurations
  • Updated function signatures to accept additional Vertex AI parameters

Routing Logic (litellm/llms/vertex_ai/common_utils.py)

  • Modified get_vertex_ai_model_route() to route numeric model IDs with api_base to the HTTP-based handler
  • Ensures PSC endpoints use the correct code path that respects custom api_base
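
A minimal sketch of that routing rule (simplified; the real get_vertex_ai_model_route handles more cases):

```python
from typing import Optional

def route_for_model(model: str, api_base: Optional[str] = None) -> str:
    """Numeric endpoint IDs combined with a custom api_base go to the
    HTTP-based handler so the PSC base URL is respected; everything
    else falls through to the default handler."""
    model_id = model.split("/")[-1]  # strip a "vertex_ai/" prefix if present
    if model_id.isdigit() and api_base:
        return "http_handler"
    return "sdk_handler"
```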

Handler Updates

Updated all Vertex AI handlers to pass required parameters:

  • vertex_gemma_models/main.py
  • vertex_model_garden/main.py
  • context_caching/vertex_ai_context_caching.py
  • batches/handler.py

Bug Fix (litellm/llms/vertex_ai/vertex_embeddings/transformation.py)

  • Fixed JSON serialization issue by filtering optional_params to only include valid TypedDict fields
  • Prevents ClientSession and other non-serializable objects from being passed to JSON serialization
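
The filtering fix can be illustrated like this (the field names are illustrative, not the actual TypedDict in transformation.py):

```python
from typing import Any, Dict, TypedDict

class EmbeddingParams(TypedDict, total=False):
    # illustrative subset of valid embedding request fields
    task_type: str
    title: str
    auto_truncate: bool

def filter_to_typed_dict(optional_params: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only declared TypedDict fields, so non-serializable objects
    (e.g. an aiohttp ClientSession) never reach JSON serialization."""
    allowed = EmbeddingParams.__annotations__
    return {k: v for k, v in optional_params.items() if k in allowed}
```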

Usage Example

import litellm

# PSC endpoint configuration
response = litellm.completion(
    model="vertex_ai/1234567890",  # Numeric endpoint ID
    messages=[{"role": "user", "content": "Hello"}],
    api_base="http://10.96.32.8",  # PSC endpoint
    vertex_project="my-project-id",
    vertex_location="us-central1"
)

# Embeddings also supported
response = litellm.embedding(
    model="vertex_ai/1234567890",  # numeric endpoint ID (e.g. serving bge-small-en-v1.5)
    input=["Hello", "World"],
    api_base="http://10.96.32.8"
)

Or specify in config.yaml:

model_list:
  - model_name: bge-small-en-v1.5
    litellm_params:
      model: vertex_ai/1234567890
      api_base: http://10.96.32.8  # Your PSC IP
      vertex_project: my-project-id  # optional
      vertex_location: us-central1  # optional


@Sameerlite Sameerlite changed the title Litellm pcs bge endpoint support Support for Custom Vertex AI Models via PSC Endpoint with api_base Oct 27, 2025

@ishaan-jaff ishaan-jaff left a comment



Sameerlite commented Oct 28, 2025

@ishaan-jaff there’s no strict format for the api_base when using a PSC endpoint. It can be something like custom_api_base or even an internal IP (e.g., 10.x.x.x).
Because of this variability, it’s not possible to reliably detect whether an api_base is a PSC endpoint or a standard public endpoint.

Difference between normal and PSC endpoints:

  • Normal endpoint:

    api_base:endpoint
    
  • PSC endpoint:

    https://api_base/v1/projects/pathrise-convert-1606954137718/locations/us-central1/publishers/google/models/text-bison:predict
    

To handle this cleanly, we can introduce a boolean variable such as vertex_psc_endpoint.
If set to true, it will explicitly indicate that the provided api_base refers to a PSC endpoint, allowing us to adjust the request logic accordingly.
Otherwise, tests like the one above will keep breaking due to the endpoint format differences.
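
The flag-based approach could look roughly like this (a sketch; vertex_psc_endpoint is the proposed name, not an existing LiteLLM option):

```python
def build_request_url(api_base: str, project: str, location: str,
                      model: str, verb: str,
                      vertex_psc_endpoint: bool = False) -> str:
    """With the flag set, expand api_base into the full Vertex AI path;
    otherwise treat api_base as an already-complete endpoint and only
    append the verb, matching the 'normal endpoint' format above."""
    if vertex_psc_endpoint:
        return (f"{api_base}/v1/projects/{project}/locations/{location}"
                f"/publishers/google/models/{model}:{verb}")
    return f"{api_base}:{verb}"
```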

Should I follow the approach mentioned, or do you have a better solution?

@ishaan-jaff ishaan-jaff changed the base branch from main to litellm_bge_staging_branch October 29, 2025 00:51
@ishaan-jaff ishaan-jaff merged commit 525b79b into litellm_bge_staging_branch Oct 29, 2025
47 of 53 checks passed