@hiyuchang
Collaborator

Description

As the title says.

The evaluation output now looks like this:

Step 0: {'eval/gsm8k-eval/finished_task_count': 1319, 'eval/gsm8k-eval/accuracy/mean@4': 0.34154662623199394, 'eval/gsm8k-eval/accuracy/std@4': 0.2833624429507763, 'eval/gsm8k-eval/accuracy/best@2/mean': 0.46970053070507956, 'eval/gsm8k-eval/accuracy/best@2/std': 0.2601951934227904, 'eval/gsm8k-eval/accuracy/worst@2/mean': 0.21339878695981807, 'eval/gsm8k-eval/accuracy/worst@2/std': 0.22732260946377386, 'eval/gsm8k-eval/accuracy/best@4/mean': 0.582379833206975, 'eval/gsm8k-eval/accuracy/best@4/std': 0.187753964514041, 'eval/gsm8k-eval/accuracy/worst@4/mean': 0.12428809704321456, 'eval/gsm8k-eval/accuracy/worst@4/std': 0.13311398785210515, 'eval/gsm8k-eval/format_score/mean@4': 0.04791508718726308, 'eval/gsm8k-eval/format_score/std@4': 0.060430579521288934, 'eval/gsm8k-eval/format_score/best@2/mean': 0.07519408642911297, 'eval/gsm8k-eval/format_score/best@2/std': 0.043823375514568795, 'eval/gsm8k-eval/format_score/worst@2/mean': 0.020637604245640644, 'eval/gsm8k-eval/format_score/worst@2/std': 0.06012139017838808, 'eval/gsm8k-eval/format_score/best@4/mean': 0.09088036391205459, 'eval/gsm8k-eval/format_score/best@4/std': 0.02081875469787524, 'eval/gsm8k-eval/format_score/worst@4/mean': -0.006707808946171337, 'eval/gsm8k-eval/format_score/worst@4/std': 0.04779745171526119, 'eval/gsm8k-eval/time/run_execution/mean': 1.8486472846342814, 'eval/gsm8k-eval/time/run_execution/std': 0.7957768482777444, 'eval/gsm8k-eval/time/task_execution/mean': 2.0765281061587504, 'eval/gsm8k-eval/time/task_execution/std': 0.8317681867952945, 'time/eval': 77.17239356040955}
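The metric names in the log above (`mean@4`, `std@4`, `best@2`, `worst@2`, `best@4`, `worst@4`) suggest that each eval task is run 4 times and the repeated scores are aggregated per task before averaging across tasks. The sketch below shows one way such statistics could be computed. It is not taken from this PR: the function name `detailed_stats`, the use of population standard deviation, and the choice to compute `best@k` and `worst@k` as the exact average max/min over all size-`k` subsets of the repeats are assumptions made for illustration, so the numbers it produces need not match the project's own implementation.

```python
from itertools import combinations
from statistics import mean, pstdev


def detailed_stats(scores_per_task, ks=(2, 4)):
    """Aggregate repeated scores into mean@n, std@n, best@k and worst@k.

    scores_per_task: list of per-task score lists, each with the same
    number of repeats n (hypothetical input layout, assumed here).
    """
    n = len(scores_per_task[0])
    stats = {
        # Average over tasks of the per-task mean and per-task spread.
        f"mean@{n}": mean(mean(s) for s in scores_per_task),
        f"std@{n}": mean(pstdev(s) for s in scores_per_task),
    }
    for k in ks:
        # Assumption: best@k / worst@k is the expected max / min when k of
        # the n repeats are drawn without replacement, computed exactly by
        # enumerating every size-k subset.
        best = [mean(max(c) for c in combinations(s, k)) for s in scores_per_task]
        worst = [mean(min(c) for c in combinations(s, k)) for s in scores_per_task]
        stats[f"best@{k}/mean"] = mean(best)
        stats[f"best@{k}/std"] = pstdev(best)
        stats[f"worst@{k}/mean"] = mean(worst)
        stats[f"worst@{k}/std"] = pstdev(worst)
    return stats


if __name__ == "__main__":
    # Two toy tasks with 4 repeats each (made-up scores, not real eval data).
    toy_scores = [[1.0, 0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0]]
    for name, value in detailed_stats(toy_scores).items():
        print(f"eval/toy/accuracy/{name}: {value:.4f}")
```

Under these assumptions, `best@k/mean` can never fall below `mean@n` and `worst@k/mean` can never exceed it, which is consistent with the ordering of the accuracy values in the log above.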

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has passed all tests
  • Docstrings have been added/updated in Google Style
  • Documentation has been updated
  • Code is ready for review

@pan-x-c
Collaborator

pan-x-c commented Jan 20, 2026

/unittest-module-trainer

@github-actions

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
26 | 18 | 3 | 5 | 0 | 0 | 37m 27s

Failed Tests

Failed Tests ❌ Fail Message
❌ tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer The test failed in the call phase
❌ tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer The test failed in the call phase
❌ tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer The test failed in the call phase

Skipped

Tests Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class skipped ⏭️
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner skipped ⏭️

Tests

Test Name Status Flaky Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer 3m 27s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer 3m 43s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 1m 40s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer 1m 19s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer 47.8s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer 54.6s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer 59.0s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 1m 58s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 34.8s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer 31.7s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools 30.0s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode 1m 31s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode 1m 29s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode 2m 20s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer 2m 4s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer 5m 13s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer 1m 33s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer 1m 47s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer ⏭️ 554ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer ⏭️ 548ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer 2m 50s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer 48.9s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer 1m 13s
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer ⏭️ 1ms
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class ⏭️ 1ms
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner ⏭️ 1ms

Github Test Reporter by CTRF 💚

@hiyuchang
Collaborator Author

/unittest-module-explorer

@hiyuchang
Collaborator Author

/unittest-module-trainer

@github-actions

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
48 | 47 | 0 | 1 | 0 | 0 | 12m 26s

Skipped

Tests Status
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1 skipped ⏭️

Tests

Test Name Status Flaky Duration
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 1m 31s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer 1m 25s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer 3m 32s
tests/explorer/explorer_test.py::ServeTest::test_serve 1m 26s
tests/explorer/proxy_test.py::RecorderTest::test_recorder 66ms
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow 6.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 5.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout 13.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 20.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0 5.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1 5.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0 5.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1 5.3s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution 6.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow 5.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait 9.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 15.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 10.1s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 8.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid 25.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 8.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 14.3s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection 10.5s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1 602ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1 1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps 1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 28ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 16ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 129ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow 3ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 10ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 7ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow 138ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1 101ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1 201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow 20.6s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow 20.6s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording 4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v0 725ms
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1 ⏭️ 2ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner 153ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state 8.0s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai 22.8s

@hiyuchang
Collaborator Author

/unittest-module-trainer

@github-actions

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
26 | 18 | 3 | 5 | 0 | 0 | 39m 5s

Failed Tests

Failed Tests ❌ Fail Message
❌ tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer The test failed in the call phase
❌ tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer The test failed in the call phase
❌ tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer The test failed in the call phase

Skipped

Tests Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class skipped ⏭️
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner skipped ⏭️

Tests

Test Name Status Flaky Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer 3m 32s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer 3m 43s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 1m 43s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer 51.6s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer 1m 19s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer 51.1s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer 58.9s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 2m 2s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 35.3s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer 29.8s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools 29.4s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode 1m 35s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode 1m 31s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode 2m 19s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer 2m 35s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer 5m 22s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer 1m 32s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer 1m 49s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer ⏭️ 550ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer ⏭️ 2.4s
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer 3m 32s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer 47.5s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer 1m 12s
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer ⏭️ 1ms
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class ⏭️ 1ms
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner ⏭️ 1ms

@hiyuchang
Collaborator Author

/unittest-module-trainer

@hiyuchang
Collaborator Author

/unittest-submodule-trainer/trainer_test

@hiyuchang
Collaborator Author

/unittest-module-trainer

@github-actions

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
26 | 21 | 0 | 5 | 0 | 0 | 38m 28s

Skipped

Tests Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class skipped ⏭️
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner skipped ⏭️

Tests

Test Name Status Flaky Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer 3m 47s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer 4m 3s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 1m 39s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer 1m 22s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer 47.6s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer 53.7s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer 56.5s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 1m 58s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 34.9s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer 32.3s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools 29.9s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode 1m 31s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode 1m 36s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode 2m 17s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer 2m 3s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer 5m 33s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer 1m 33s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer 1m 49s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer ⏭️ 556ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer ⏭️ 550ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer 2m 50s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer 45.8s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer 1m 13s
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer ⏭️ 1ms
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class ⏭️ 1ms
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner ⏭️ 1ms

@hiyuchang
Collaborator Author

/unittest-module-trainer

@hiyuchang
Collaborator Author

/unittest-module-explorer

@hiyuchang
Collaborator Author

/unittest-module-explorer

@hiyuchang
Collaborator Author

/unittest-module-trainer

@github-actions

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
49 | 48 | 0 | 1 | 0 | 0 | 13m 59s

Skipped

Tests Status
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1 skipped ⏭️

Tests

Test Name Status Flaky Duration
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer 1m 35s
tests/explorer/explorer_test.py::TestExplorerEvalDetailedStats::test_explorer 1m 25s
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer 1m 26s
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer 3m 32s
tests/explorer/explorer_test.py::ServeTest::test_serve 1m 25s
tests/explorer/proxy_test.py::RecorderTest::test_recorder 64ms
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow 6.2s
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations 5.8s
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout 13.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results 21.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0 6.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1 5.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0 5.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1 5.4s
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution 6.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow 5.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait 9.5s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods 16.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop 10.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks 9.0s
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid 25.7s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all 8.6s
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch 14.4s
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection 10.5s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1 602ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1 1.0s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error 1ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps 1.0s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow 27ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow 17ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow 135ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow 3ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow 11ms
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow 7ms
tests/explorer/workflow_test.py::WorkflowTest::test_rm_gallery_workflow 137ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1 101ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0 1ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1 201ms
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow 21.0s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow 20.8s
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording 4.0s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v0 713ms
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1 ⏭️ 1ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner 136ms
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state 8.1s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai 22.8s

@github-actions

Summary

Tests 📝 | Passed ✅ | Failed ❌ | Skipped ⏭️ | Other ❓ | Flaky 🍂 | Duration ⏱️
26 | 21 | 0 | 5 | 0 | 0 | 39m 5s

Skipped

Tests Status
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer skipped ⏭️
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class skipped ⏭️
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner skipped ⏭️

Tests

Test Name Status Flaky Duration
tests/trainer/trainer_test.py::TestTrainerCountdown_0_fsdp::test_trainer 3m 13s
tests/trainer/trainer_test.py::TestTrainerCountdown_1_megatron::test_trainer 4m 6s
tests/trainer/trainer_test.py::TestStepAheadAsyncRL::test_trainer 1m 45s
tests/trainer/trainer_test.py::TestTrainerGSM8K_0_fsdp::test_trainer 50.6s
tests/trainer/trainer_test.py::TestTrainerGSM8K_1_fsdp2::test_trainer 49.9s
tests/trainer/trainer_test.py::TestTrainerGSM8K_2_fsdp::test_trainer 54.1s
tests/trainer/trainer_test.py::TestTrainerGSM8K_3_fsdp2::test_trainer 1m 28s
tests/trainer/trainer_test.py::TestTrainerSFTWarmupGSM8K::test_trainer 1m 59s
tests/trainer/trainer_test.py::TestTrainerDPO::test_trainer 33.1s
tests/trainer/trainer_test.py::TestTrainerSFT::test_trainer 30.6s
tests/trainer/trainer_test.py::TestTrainerToolsSFT::test_trainer_tools 30.0s
tests/trainer/trainer_test.py::TestFullyAsyncMode_0_fsdp::test_fully_async_mode 1m 39s
tests/trainer/trainer_test.py::TestFullyAsyncMode_1_fsdp::test_fully_async_mode 1m 32s
tests/trainer/trainer_test.py::TestFullyAsyncMode_2_megatron::test_fully_async_mode 2m 23s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_0_fsdp::test_trainer 2m 13s
tests/trainer/trainer_test.py::TestTrainerCheckpointSave_1_megatron::test_trainer 5m 23s
tests/trainer/trainer_test.py::TestTrainerMIX::test_trainer 1m 36s
tests/trainer/trainer_test.py::TestServeWithTrainer::test_serve_with_trainer 1m 50s
tests/trainer/trainer_test.py::TestMultiModalGRPO::test_trainer ⏭️ 556ms
tests/trainer/trainer_test.py::TestMultiModalSFT::test_trainer ⏭️ 552ms
tests/trainer/trainer_test.py::TestTrainerLoRA::test_trainer 3m 35s
tests/trainer/trainer_test.py::TestOverRollout::test_trainer 49.6s
tests/trainer/trainer_test.py::TestTrainerPromptTruncation::test_trainer 1m 13s
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer ⏭️ 1ms
tests/trainer/trainer_test.py::TestTinkerTrainer::test_trainer_class ⏭️ 1ms
tests/trainer/trainer_test.py::AgentScopeTunerTest::test_agentscope_tuner ⏭️ 1ms

@pan-x-c pan-x-c merged commit 4cb7e4a into agentscope-ai:main Jan 23, 2026
2 checks passed

3 participants