Description
- Package Name: azure.ai.evaluation
- Package Version: 1.15.3
- Operating System: macOS
- Python Version: 3.12
Describe the bug
There appears to be a problem with IndirectAttackEvaluator. After data has been simulated as query/response pairs and the evaluation has been run and uploaded to AI Foundry, no results appear in the Foundry portal, even though the results returned programmatically show that the evaluation ran correctly.
It is unclear whether the problem is in the SDK or in Foundry. This is a blocker for all RAI evaluations that rely on indirect jailbreaking via the IndirectAttackEvaluator class.
To Reproduce
```python
import os
from typing import Any, Dict, List, Optional

from azure.ai.evaluation import IndirectAttackEvaluator, evaluate
from azure.ai.evaluation.simulator import IndirectAttackSimulator
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

azure_ai_project_endpoint = "<ai-foundry-project-endpoint>"
azure_endpoint = "<azure_endpoint>"
deployment = "gpt-5.1"
api_version = "2025-03-01-preview"

# sample application
def call_llm(query: str) -> str:
    token_provider = get_bearer_token_provider(
        DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
    )
    client = AzureOpenAI(
        api_version=api_version,
        azure_endpoint=azure_endpoint,
        azure_ad_token_provider=token_provider,
    )
    result = client.responses.create(
        model=deployment,
        input=query,
    )
    return result.output_text

async def callback(
    messages: List[Dict],
    stream: bool = False,
    session_state: Any = None,
    context: Optional[Dict[str, Any]] = None,
) -> dict:
    messages_list = messages["messages"]
    query = messages_list[-1]["content"]
    context = None
    # Send message to application and get a response
    try:
        response = call_llm(query)
    except Exception:
        response = None
    # Format response in OpenAI message protocol
    message = {"content": response, "role": "assistant", "context": context}
    messages["messages"].append(message)
    return {"messages": messages_list, "stream": stream, "session_state": session_state, "context": context}

# set up and run simulator
indirect_simulator = IndirectAttackSimulator(
    azure_ai_project=azure_ai_project_endpoint,
    credential=DefaultAzureCredential(),
)
sim_results = await indirect_simulator(
    target=callback,
    max_conversation_turns=3,
    max_simulation_results=5,
)

# save simulated results to file
with open("indirect_jailbreak_example.jsonl", "w") as file:
    file.write(sim_results.to_eval_qr_json_lines())

# set up evaluator and evaluate the simulated jailbreak conversations
indirect_evaluator = IndirectAttackEvaluator(
    azure_ai_project=azure_ai_project_endpoint,
    credential=DefaultAzureCredential(),
)
eval_results = evaluate(
    evaluation_name="example-indirect-jailbreak-evaluation",
    data="indirect_jailbreak_example.jsonl",
    evaluators={"indirect_attack": indirect_evaluator},
    azure_ai_project=azure_ai_project_endpoint,
)
```
Expected behavior
I expect the evaluation results/scores to be reported and summarized correctly in Foundry. Currently no scores are recorded there, even though the eval_results object returned by evaluate() shows clearly that the evaluator ran correctly.
After further testing: this class worked up through the v1.14.0 release; the problem began with the v1.15.0 release.
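To separate an SDK failure from a portal/logging failure, it helps to sanity-check the returned object locally. The sketch below assumes the `EvaluationResult` shape documented for `evaluate()` (a dict with `"metrics"`, `"rows"`, and `"studio_url"` keys); the sample values and the metric/column names are illustrative only, not real output from my run:

```python
# Illustrative stand-in for the dict returned by evaluate(); the key names
# follow the documented EvaluationResult shape, but the values and the
# "indirect_attack.*" metric/column names here are assumptions, not real output.
eval_results = {
    "metrics": {"indirect_attack.xpia_defect_rate": 0.2},
    "rows": [
        {"inputs.query": "example query", "outputs.indirect_attack.xpia_label": False},
    ],
    "studio_url": "https://ai.azure.com/example-run",
}

# Non-empty aggregate metrics plus per-row "outputs.*" columns prove the
# evaluation ran client-side; if the portal page at studio_url still shows
# no scores, the gap is in the upload/logging step, not the evaluator.
assert eval_results["metrics"], "no aggregate metrics returned"
assert all(
    any(key.startswith("outputs.") for key in row) for row in eval_results["rows"]
), "rows are missing evaluator outputs"
print("local scores present; portal run at:", eval_results["studio_url"])
```

In my case these local checks pass, which is why I believe the regression is in the result-upload path rather than in the evaluator itself.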
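Since v1.14.0 is the last release where portal reporting worked for this evaluator, a possible interim workaround (untested beyond my own runs, and assuming nothing from 1.15.x is otherwise required) is to pin the package:

```shell
# Pin to the last known-good release until the regression is fixed
pip install "azure-ai-evaluation==1.14.0"
```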