Merged
25 commits
ef9dca5
Edits
glenn-rodgers-sf Mar 23, 2026
b9c465f
Update flex-gateway-llm-proxy.adoc
glenn-rodgers-sf Mar 23, 2026
acae00e
Update flex-gateway-llm-proxy.adoc
glenn-rodgers-sf Mar 25, 2026
2517dba
edits
glenn-rodgers-sf Mar 25, 2026
7cd17d1
Update flex-gateway-llm-proxy-create-llm-proxy.adoc
glenn-rodgers-sf Mar 25, 2026
b47e653
Regex policy
glenn-rodgers-sf Mar 25, 2026
01cfb70
Edits
glenn-rodgers-sf Mar 25, 2026
2a245b2
Create flex-gateway-llm-proxy-client-name.adoc
glenn-rodgers-sf Mar 25, 2026
ec1c593
Update flex-gateway-llm-proxy-request.adoc
glenn-rodgers-sf Mar 26, 2026
c70bb3b
edit
glenn-rodgers-sf Mar 26, 2026
48fe051
Update flex-gateway-llm-proxy-request.adoc
glenn-rodgers-sf Mar 26, 2026
098a27c
Update flex-gateway-llm-proxy-request.adoc
glenn-rodgers-sf Mar 26, 2026
87e5464
Update policy-title-headers.adoc
glenn-rodgers-sf Mar 27, 2026
c70d2e2
edits
glenn-rodgers-sf Mar 27, 2026
b5e7eae
LLM Polcies
glenn-rodgers-sf Mar 27, 2026
63c1984
Change name
glenn-rodgers-sf Mar 27, 2026
0f08805
Edits
glenn-rodgers-sf Mar 27, 2026
41d6acf
Update flex-gateway-llm-proxy-request.adoc
glenn-rodgers-sf Mar 28, 2026
4e69331
Update flex-gateway-llm-proxy-policies.adoc
glenn-rodgers-sf Mar 30, 2026
02d41e7
Update flex-gateway-llm-proxy-policies.adoc
glenn-rodgers-sf Mar 30, 2026
aab1ef9
Update flex-gateway-llm-proxy.adoc
glenn-rodgers-sf Mar 30, 2026
079c811
Update flex-gateway-llm-proxy.adoc
glenn-rodgers-sf Mar 30, 2026
aeb2ff5
Edits
glenn-rodgers-sf Mar 30, 2026
c718360
Update flex-gateway-llm-proxy-policies.adoc
glenn-rodgers-sf Mar 30, 2026
89f8482
Merge branch 'latest' into LLM-proxy-GA
glenn-rodgers-sf Mar 30, 2026
9 changes: 9 additions & 0 deletions gateway/1.12/modules/ROOT/nav.adoc
Original file line number Diff line number Diff line change
@@ -30,6 +30,13 @@
* xref:flex-gateway-managed-set-up.adoc[]
* xref:flex-gateway-managed-ingress-egress.adoc[]

// LLM Proxy
* xref:flex-gateway-llm-proxy.adoc[]
** xref:flex-gateway-llm-proxy-create-llm-proxy.adoc[]
** xref:flex-gateway-llm-proxy-request.adoc[]
** xref:flex-gateway-llm-proxy-token-reports.adoc[]
** xref:flex-gateway-llm-proxy-policies.adoc[]

// Setting Up Self-Managed Flex Gateway
* xref:flex-gateway-set-up.adoc[]
** xref:flex-install.adoc[Download Flex Gateway]
@@ -89,6 +96,7 @@
*** xref:policies-included-a2a-prompt-decorator.adoc[A2A Prompt Decorator]
*** xref:policies-included-a2a-schema-validation.adoc[A2A Schema Validation]
*** xref:policies-included-a2a-token-rate-limit.adoc[A2A Token Based Rate Limit]
*** xref:policies-included-llm-token-rate-limit.adoc[LLM Token Based Rate Limit]
*** xref:policies-included-agent-connection-telemetry.adoc[Agent Connection Telemetry Policy]
*** xref:policies-included-basic-auth-ldap.adoc[Basic Authentication: LDAP]
*** xref:policies-included-basic-auth-simple.adoc[Basic Authentication: Simple]
@@ -119,6 +127,7 @@
*** xref:policies-included-openid-token-enforcement.adoc[OpenID Connect OAuth 2.0 Token Enforcement]
*** xref:policies-included-rate-limiting.adoc[Rate Limiting]
*** xref:policies-included-rate-limiting-sla.adoc[Rate Limiting: SLA-Based]
*** xref:policies-included-regex-prompt-guard.adoc[Regex Prompt Guard]
*** xref:policies-included-response-timeout.adoc[Response Timeout]
*** xref:policies-included-schema-validation.adoc[Schema Validation]
*** xref:policies-included-soap-schema-validation.adoc[SOAP Schema Validation]
@@ -2,15 +2,15 @@
[[connected-mode]]
=== Managed Flex Gateway and Flex Gateway Connected Mode

When you apply the policy to your API instance from the UI, the following parameters are displayed:
When you apply the policy from the UI, the following parameters are displayed:

// end::ui[]

// tag::configFile[]
[[local-mode]]
=== Flex Gateway Local Mode

In Local Mode, you apply the policy to your API via declarative configuration files. Refer to the following policy definition and table of parameters:
When you apply the policy via declarative configuration files, refer to the following policy definition and table of parameters:

// end::configFile[]

@@ -0,0 +1,151 @@
= Creating an LLM Proxy
ifndef::env-site,env-github[]
include::_attributes.adoc[]
endif::[]
:imagesdir: ../assets/images

You can configure the LLM Proxy to use different models and different routes.

NOTE: A large Flex Gateway supports up to 50 LLM Proxies.

[[before-you-begin]]
== Before You Begin

. Deploy Flex Gateway version 1.11.4 or later to host your LLM Proxy.
+
See xref:flex-gateway-managed-set-up.adoc[].
. Ensure you have the API Manager *API Creator* permission.
. Retrieve your API keys from your LLM Providers.

[[create-an-llm-proxy]]
== Create an LLM Proxy

. From API Manager, click *LLM Proxies*.
. Click *+ Add LLM Proxy*.
. Configure the *Inbound Endpoint* of the LLM Proxy:
.. Define an *LLM Proxy Name*.
.. Select an endpoint *Format*:
+
** OpenAI: Select the OpenAI API format to send requests to all supported LLM Providers (including Gemini).
** Gemini: Select the Gemini API format to send requests to only Gemini.
.. Define a *Base path*.
.. Select *Advanced options* if necessary.
.. Click *Next*.
. From *Select a gateway*, select the Flex Gateway to deploy the proxy instance to.
. Configure the routes that comprise the *Outbound Endpoint*:
.. Select your *LLM Provider*.
.. Ensure the *URL* for your provider is correct. Edit if necessary.
.. Configure access details for the provider endpoint.
.. Select a *Target Model* to override the model version specified in the payload. Selecting *Not Applicable* sends the request to the model specified in the payload. A *Target Model* is required for semantic routing.
+
[NOTE]
====
To configure a target model for Amazon Bedrock Claude models, you must enter the provider and model ID formatted as `[provider_prefix]/[internal_model_id]`.

To learn how to find the model ID, see xref:flex-gateway-llm-proxy-request.adoc#amazon-bedrock-model-names[Amazon Bedrock Model Names].
====

.. Click *Add LLM Route* to add additional routes. Complete the previous steps to configure the new route.
+
NOTE: Each LLM Provider can support one route.
. If adding multiple routes, select a *Routing strategy*. To configure your routing strategy, see:
.. <<configure-model-based-routing>>
.. <<configure-semantic-routing>>
. Click *Save & Deploy*.
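
The inbound *Format* chosen above determines the request shape a client sends to the proxy. As a minimal sketch of an OpenAI-format request (the gateway host, base path, and model name are hypothetical placeholders):

```python
import json

# Hypothetical values: replace with your gateway host and the Base path
# configured for the inbound endpoint.
GATEWAY_HOST = "https://my-flex-gateway.example.com"
BASE_PATH = "/llm"

def build_chat_request(prompt, model=None):
    """Build an OpenAI-format chat completions request for the LLM Proxy.

    If a Target Model is configured on the route, the gateway overrides
    the payload's `model` field, so `model` can be omitted (None).
    """
    payload = {"messages": [{"role": "user", "content": prompt}]}
    if model is not None:
        payload["model"] = model
    url = f"{GATEWAY_HOST}{BASE_PATH}/chat/completions"
    return url, json.dumps(payload)

url, body = build_chat_request("Summarize our Q3 report", model="gpt-4o")
```

This only constructs the request; send it with any HTTP client, authenticated as required by the policies applied to the proxy.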

[[configure-model-based-routing]]
== Configure Model-Based Routing

. Configure multiple routes. Click *Add LLM Route* to create new routes.
. Select *Model-based* for *Routing strategy*.
. Optionally enable a *Fallback route* to receive requests when the provider or model is incorrectly specified. If you enable a fallback route:
.. Select a *Route* to fall back to.
.. Select a target model for the fallback route to use.
. If no fallback route is configured and a route fails, an error response is returned.
. Return to <<create-an-llm-proxy>> step 7 to finish configuring your LLM Proxy.
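
The model-based strategy above can be pictured as a lookup from the model named in the payload to a route, with an optional fallback. This is a simplified sketch with a hypothetical route table, not the gateway's internal implementation:

```python
# Hypothetical route table mapping payload model names to configured routes.
ROUTES = {
    "gpt-4o": "openai-route",
    "claude-3-sonnet": "bedrock-route",
}

def route_request(requested_model, fallback_route=None):
    """Pick a route by the model named in the request payload.

    If the model matches no route and no fallback route is configured,
    an error is raised instead, mirroring the behavior described above.
    """
    route = ROUTES.get(requested_model)
    if route is not None:
        return route
    if fallback_route is not None:
        return fallback_route
    raise LookupError(f"No route for model {requested_model!r} and no fallback configured")
```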

[[configure-semantic-routing]]
== Configure Semantic Routing

For semantic routing, define and apply prompt topics to each route. Define deny list topics to block certain requests.

To configure semantic routing:

. Configure multiple routes. Click *Add LLM Route* to create new routes.
. Select *Semantic* for *Routing strategy*.
. If you haven't already, click *Configure Semantic Service*.
+
To create a semantic service, see <<create-a-semantic-service>>.
. Select a *Target Model* for each route.
. Define prompt topics for the routes:
.. Click *Select prompt topics*.
.. Click *+ Create prompt topic*.
.. Define a *Prompt topic name*.
.. Define *Prompt utterances* or click *Upload utterances* to upload a plain text file containing your prompt utterances.
.. Click *Create*.
.. Create multiple prompt topics for each route as needed.
. Configure a *Fallback route* for the request to be sent to if it doesn't match a semantic route:
.. Specify an accuracy threshold. When the accuracy of the semantic match is less than this threshold, traffic is sent to the fallback route.
.. Select a *Route* to fall back to.
.. Select a *Target model* for the fallback route to use.
. Create a *Semantic prompt guard* to block users from asking the server about specific topics:
.. Click *+ Create deny list*.
.. Define a *Prompt topic name*.
.. Define *Prompt utterances* or click *Upload utterances* to upload a plain text file containing your prompt utterances.
.. Click *Create*.
.. Create multiple deny list topics to better protect your LLM Proxy.
+
NOTE: Creating a semantic prompt guard automatically applies the Semantic Prompt Guard policy.
. Return to <<create-an-llm-proxy>> step 7 to finish configuring your LLM Proxy.
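
Conceptually, semantic routing embeds the incoming prompt and compares it to the topic utterance embeddings, as in this toy sketch. Hand-made vectors stand in for real embeddings; the actual matching is performed by the configured semantic service:

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def semantic_route(request_vec, topic_vecs, deny_vecs, threshold, fallback):
    """Toy illustration of the flow above.

    request_vec: embedding of the incoming prompt.
    topic_vecs: {route_name: [utterance embeddings]} per route.
    deny_vecs: utterance embeddings for deny-list topics.
    """
    # Deny-list topics block the request outright (semantic prompt guard).
    if any(cosine(request_vec, v) >= threshold for v in deny_vecs):
        return None  # blocked
    best_route, best_score = fallback, 0.0
    for route, vecs in topic_vecs.items():
        score = max(cosine(request_vec, v) for v in vecs)
        if score > best_score:
            best_route, best_score = route, score
    # Below the accuracy threshold, traffic goes to the fallback route.
    return best_route if best_score >= threshold else fallback
```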

=== Semantic Routing Limits

[%header%autowidth.spread,cols="a,a"]
|===
| Limit | Value
| Prompt topics (across all routes of an LLM Proxy) | 6
| Utterances per prompt topic | 10
| Deny list topics | 6
| Utterances per deny list topic | 10
| Maximum characters per utterance | 500
|===
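
A topic configuration can be checked against these limits before you create it; a minimal sketch (hypothetical helper, not part of any MuleSoft API):

```python
# Limits from the table above.
MAX_TOPICS = 6            # prompt topics across all routes (or deny list topics)
MAX_UTTERANCES = 10       # utterances per topic
MAX_UTTERANCE_CHARS = 500 # maximum characters per utterance

def check_topics(topics):
    """topics: {topic_name: [utterance strings]}. Returns a list of violations."""
    problems = []
    if len(topics) > MAX_TOPICS:
        problems.append(f"too many topics: {len(topics)} > {MAX_TOPICS}")
    for name, utterances in topics.items():
        if len(utterances) > MAX_UTTERANCES:
            problems.append(f"{name}: too many utterances ({len(utterances)})")
        for u in utterances:
            if len(u) > MAX_UTTERANCE_CHARS:
                problems.append(f"{name}: utterance over {MAX_UTTERANCE_CHARS} chars")
    return problems
```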

[[create-a-semantic-service]]
=== Create and Edit a Semantic Service

A semantic service compares the request to the defined prompt topic utterances and sends the request to the route that best matches it. The semantic service also compares the request to deny list topic utterances to block certain requests. Only one semantic service is supported for each environment.

To define a semantic service:

. From API Manager, click *Semantic Service Setup*.
. Click *+ Create Semantic Service*.
. Configure the semantic service parameters:
** *Embedding Service Provider*: The provider of the embedding model, either *OpenAI* or *Hugging Face*.
** *URL*: The URL of the embedding service.
** *Model*: The embedding model to use.
** *Auth key*: The API authentication key for the embedding service.
. Click *Deploy*.

To edit a semantic service:

. From *Semantic Service Setup*, click the three-dots menu (image:three-dots-menu.png[3%,3%]) of the semantic service you want to edit.
. Make the necessary edits.
. Click *Redeploy*.

== Edit and Delete an LLM Proxy

To edit an LLM Proxy:

. From API Manager, click *LLM Proxies*.
. Click the name of the LLM Proxy you want to edit.
. Click *Configuration*.
. Switch between the *Inbound*, *Gateway*, and *Outbound* configurations to make the necessary edits.
. Click *Save & Deploy*.

To delete an LLM Proxy:

. From API Manager, click *LLM Proxies*.
. Click the three-dots menu (image:three-dots-menu.png[3%,3%]) of the LLM Proxy you want to delete.
. Click *Delete LLM Proxy*.
. Click *Yes, Delete*.
@@ -0,0 +1,58 @@
= LLM Proxy Policies
ifndef::env-site,env-github[]
include::_attributes.adoc[]
endif::[]
:imagesdir: ../assets/images

By default, LLM Proxy applies these policies:

* Client ID Enforcement
* LLM Proxy Core Policy
* Model Based Routing Policy or Semantic Routing Policy (policy name depends on the embedding service provider)

You don't need to modify these policies.

LLM Proxy supports most xref:policies-included-directory.adoc[included policies], but doesn't support outbound policies.

These policies are specifically useful for LLM Proxies:

* xref:policies-included-regex-prompt-guard.adoc[]
* xref:policies-included-llm-token-rate-limit.adoc[]

== Apply Policies to LLM Proxies

. From API Manager, click *LLM Proxies*.
. Click the name of the LLM Proxy you want to apply a policy to.
. Click *AI Policies*.
. Click *+ Add inbound policy*.
. Select the policy to apply.
. Configure the required parameters.
+
For policy configuration parameters, see xref:policies-included-directory.adoc[].
. If necessary, configure *Advanced options*.
. Click *Apply*.

== LLM Proxy Authentication Policy

By default, the LLM Proxy has the Client ID Enforcement policy applied.
This policy is required because Client ID Enforcement populates the `Authentication.clientName` variable in the `Authentication` object, which is used as the unique identifier for LLM metrics.

Before removing the Client ID Enforcement policy, ensure that you either:

* Apply a policy that populates `Authentication.clientName`:
+
** xref:policies-included-client-id-enforcement.adoc[]
** xref:policies-included-rate-limiting-sla.adoc[]
** xref:policies-included-oauth-token-introspection.adoc[] (with client ID enforcement configured: `skipClientIdValidation=false`)
** xref:policies-included-openid-token-enforcement.adoc[] (with client ID enforcement configured: `skipClientIdValidation=false`)
** xref:policies-included-jwt-validation.adoc[] (with client ID enforcement configured: `skipClientIdValidation=false`)
** A custom policy that populates `Authentication.clientName`

* Edit the DataWeave variable in the LLM Proxy Core policy to extract a different unique identifier, such as `clientid`, `userid`, or `departmentid`.
+
NOTE: You can't filter by this unique identifier in Usage Reports.
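
The identifier selection described above can be pictured as follows. This is a Python sketch of the fallback logic only; the actual extraction is a DataWeave expression in the LLM Proxy Core policy, and the claim names other than `clientName` are illustrative examples:

```python
def metrics_identifier(authentication):
    """Pick the unique identifier used for LLM metrics.

    By default the gateway uses Authentication.clientName; this sketch
    falls back to other hypothetical identifiers (clientid, userid,
    departmentid) when clientName is absent.
    """
    for key in ("clientName", "clientid", "userid", "departmentid"):
        value = authentication.get(key)
        if value:
            return value
    return None  # no usable identifier found
```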

== See Also

* xref:flex-gateway-secure-apis.adoc[]
* xref:policies-included-directory.adoc[]