Add benchmark results for baidu/ernie-4.5-21b-a3b-thinking #127

github-actions · 2025-10-10T00:43:42Z

This PR adds benchmark results for the baidu/ernie-4.5-21b-a3b-thinking model.

The following files have been updated:

src/benchmark/results.json - Raw benchmark results
src/benchmark/validation-results.json - Validation results against human baseline

This PR was automatically generated by the benchmark workflow.

Note: If you don't want to merge this PR, close it and the model will be added to the untested list to prevent re-processing.

@alrocar

cursor · 2025-10-10T00:43:46Z

You have run out of free Bugbot PR reviews for this billing cycle. This will reset on November 2.

To receive reviews on all of your PRs, visit the Cursor dashboard to activate Pro and start your 14-day free trial.

vercel · 2025-10-10T00:43:47Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
llm-benchmark	Ready	Preview	Comment	Oct 10, 2025 0:44am

feat: add benchmark results for baidu/ernie-4.5-21b-a3b-thinking

4bc8069

vercel bot deployed to Preview October 10, 2025 00:44 View deployment
