Score revamp #345

48 changes: 28 additions & 20 deletions spec/2023-07-draft.md
@@ -1103,51 +1103,59 @@ The default value of `aggregation` is `sum` for the `secret` group and `pass-fai

#### Maximum Score Inference

The `secret` group, its subgroups, and every test case in these groups have a maximum possible score.
The `secret` group, and every subgroup and test case in a group with `sum` or `min` aggregation, have a maximum possible score.
The `secret` group's score may be any positive integer or `unbounded`.
Subgroups of `secret` may only have `unbounded` maximum score if `secret` is unbounded.
The default value of `score` for the `secret` group is 100.

The default `score` for other test data groups is inferred from the `score` value of its parent and siblings,
as is the maximum score of each test case in the group:
The default `score` for subgroups and test cases of groups with `sum` or `min` aggregation is inferred from the `score` value of that group and its children:

Group Maximum Score | Aggregation Type | Maximum Score of Test Case / Subgroup
Group Maximum Score | Aggregation Type | Default Maximum Score of Test Case / Subgroup
------------------- | -------------------- | -------------------------------------
`unbounded` | any | `unbounded`
bounded value `M` | `sum` or `pass-fail` | `(M - S)/(A + T)`
bounded value `M` | `min` | `M - S`
`unbounded` | `sum` or `min` | `unbounded`
bounded value `M` | `sum` | `(M - S)/(A + T)`
bounded value `M` | `min` | `M`

where the group has `T` test cases, `A` subgroups without a provided `score`, and whose other subgroups have maximum scores that sum to `S`.
It is a judge error if `S > M`. This formula evenly distributes a group's leftover maximum points to its test cases and subgroups with unspecified maximum score.
where the group has `T` test cases, `A` subgroups without a provided `score`, and whose other subgroups have maximum scores that sum to `S`.
This formula evenly distributes a group's leftover maximum points to its test cases and subgroups with unspecified maximum score.
It is a judge error if `S > M` for a group with bounded maximum score and `sum` aggregation.
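
The proposed inference table can be sketched in Python as follows. This is only an illustration of the revised rows (`sum` and `min` aggregation), not part of the specification: the function name, the `UNBOUNDED` sentinel, and the choice to return `None` when a group has nothing to distribute points to (`A + T = 0`) are assumptions made for the example.

```python
from fractions import Fraction
from typing import Optional, Union

# Hypothetical sentinel standing in for an `unbounded` maximum score.
UNBOUNDED = "unbounded"

MaxScore = Union[int, Fraction, str]


def default_child_max_score(
    group_max: MaxScore,
    aggregation: str,
    num_test_cases: int,                        # T in the table
    num_unscored_subgroups: int,                # A in the table
    scored_subgroup_sum: Union[int, Fraction] = 0,  # S in the table
) -> Optional[MaxScore]:
    """Default maximum score of a test case or subgroup without a `score`."""
    if aggregation not in ("sum", "min"):
        raise ValueError("defaults are only inferred for sum or min aggregation")
    if group_max == UNBOUNDED:
        return UNBOUNDED
    if aggregation == "min":
        # Every child may score as much as the group itself.
        return Fraction(group_max)
    # `sum` aggregation: split the leftover points M - S evenly.
    if scored_subgroup_sum > group_max:
        raise ValueError("judge error: S > M")
    recipients = num_unscored_subgroups + num_test_cases
    if recipients == 0:
        return None  # assumption: nothing to infer when A + T = 0
    return Fraction(group_max - scored_subgroup_sum, recipients)
```

For example, a `sum` group with `M = 100`, two test cases, one subgroup without a `score`, and one subgroup with `score` 40 gives each of the three remaining children `(100 - 40)/(1 + 2) = 20`. Exact fractions are used here only to avoid rounding in the illustration; the draft does not prescribe a numeric representation.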

### Scoring Test Cases

Only test cases in test case groups with `sum` or `min` aggregation receive a score.

The score of a failed test case is always 0.
By default, the score of an accepted test case is its maximum score, computed as described above.
A custom output validator may produce a `score.txt` file for a test case:

- for test cases in a group with bounded maximum score, `score.txt` must contain a single floating-point number in the range `[0,1]`.
The score of the test case is this number _multiplied_ by the test case maximum score.
A custom output validator may produce a `score.txt` or `score_multiplier.txt` file for an accepted test case:

- for test cases in unbounded groups, `score.txt` must contain a non-negative floating-point number.
- for test cases with bounded maximum score, `score_multiplier.txt`, if produced, must contain a single floating-point number in the range `[0,1]`.
The score of the test case is this number _multiplied_ by the test case maximum score. If no `score_multiplier.txt` is produced, the test case score is its maximum score.
- for test cases with unbounded maximum score, `score.txt` must be produced and must contain a non-negative floating-point number.
The score of the test case is that number.

It is a judge error if an output validator accepts a test case in an unbounded group and does not produce a `score.txt`.
It is also a judge error if an output validator produces a `score.txt` for a test case in a group with `pass-fail` aggregation.
It is a judge error if:
- an output validator accepts a test case in an unbounded group and does not produce a `score.txt`;
- an output validator produces a `score_multiplier.txt` for a test case with unbounded maximum score;
- an output validator produces a `score.txt` for a test case with bounded maximum score;
- an output validator produces a `score.txt` or `score_multiplier.txt` for a test case in a group with `pass-fail` aggregation;
- an output validator produces a `score.txt` or `score_multiplier.txt` with invalid contents.
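
A minimal sketch of the proposed per-test-case rules, for test cases in groups with `sum` or `min` aggregation. The function and exception names are hypothetical, the file contents are passed in as strings (`None` meaning the file was not produced), and rejecting `nan` and infinite values as "invalid contents" is an assumption the draft does not state explicitly.

```python
import math
from typing import Optional, Union

# Hypothetical sentinel standing in for an `unbounded` maximum score.
UNBOUNDED = "unbounded"


class JudgeError(Exception):
    """Raised when the output validator violates the scoring rules."""


def _parse_number(text: str, name: str) -> float:
    """Parse a file that must contain a single floating-point number."""
    try:
        value = float(text.strip())
    except ValueError:
        raise JudgeError(f"{name} has invalid contents") from None
    if not math.isfinite(value):  # assumption: nan/inf count as invalid
        raise JudgeError(f"{name} has invalid contents")
    return value


def score_test_case(
    accepted: bool,
    max_score: Union[float, str],
    score_txt: Optional[str] = None,
    score_multiplier_txt: Optional[str] = None,
) -> float:
    """Score one test case in a group with `sum` or `min` aggregation."""
    if not accepted:
        return 0.0  # a failed test case always scores 0
    if max_score == UNBOUNDED:
        if score_multiplier_txt is not None:
            raise JudgeError("score_multiplier.txt with unbounded maximum score")
        if score_txt is None:
            raise JudgeError("score.txt is required with unbounded maximum score")
        score = _parse_number(score_txt, "score.txt")
        if score < 0:
            raise JudgeError("score.txt must be non-negative")
        return score
    # Bounded maximum score from here on.
    if score_txt is not None:
        raise JudgeError("score.txt with bounded maximum score")
    if score_multiplier_txt is None:
        return float(max_score)  # default: the full maximum score
    multiplier = _parse_number(score_multiplier_txt, "score_multiplier.txt")
    if not 0.0 <= multiplier <= 1.0:
        raise JudgeError("score_multiplier.txt must be in [0,1]")
    return multiplier * max_score
```

With a bounded maximum of 20, a `score_multiplier.txt` containing `0.5` yields a score of 10, and an accepted test case with no file scores the full 20.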

### Scoring Test Groups

The score of a test group is determined by its subgroups and test cases.
If it has no subgroups or test cases, then its score is 0.
Otherwise, the score depends on the aggregation mode, which is either `pass-fail`, `sum`, or `min`.
If a group uses `pass-fail` aggregation, the group must have bounded maximum score and all subgroups must also use pass-fail aggregation.

- If a group uses `pass-fail` aggregation, the group must have bounded maximum score and all subgroups must also use pass-fail aggregation.
If the submission receives an accept verdict for all test cases in the group and its subgroups,
the score of the group is equal to its maximum possible score.
Otherwise the group score is 0.
If a group uses `sum` aggregation, the group score is the sum of the scores of its test cases and subgroups.
If a group uses `min` aggregation, then the group score is the minimum of these scores.
- If a group uses `sum` aggregation, the group score is the sum of the scores of its test cases and subgroups.
- If a group uses `min` aggregation, then the group score is the minimum of these scores.

The submission score is the score of the `secret` group.
It is a judge error if the score of any group or subgroup exceeds its maximum score.
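
The three aggregation modes can be sketched as a single function. This is an illustration under stated assumptions rather than a reference implementation: the names are hypothetical, `child_scores` holds the already-computed scores of the group's test cases and subgroups, and `child_verdicts` holds one accepted/failed flag per test case in the group and its subgroups (used only for `pass-fail`, whose test cases receive no score).

```python
from typing import Sequence, Union

# Hypothetical sentinel standing in for an `unbounded` maximum score.
UNBOUNDED = "unbounded"


class JudgeError(Exception):
    """Raised when a group violates the scoring rules."""


def group_score(
    aggregation: str,
    max_score: Union[float, str],
    child_scores: Sequence[float] = (),
    child_verdicts: Sequence[bool] = (),
) -> float:
    """Aggregate the results of a group's test cases and subgroups."""
    if aggregation == "pass-fail":
        if max_score == UNBOUNDED:
            raise JudgeError("pass-fail groups must have bounded maximum score")
        if not child_verdicts:
            return 0.0  # no subgroups or test cases
        # All-or-nothing: full marks only if every test case was accepted.
        return float(max_score) if all(child_verdicts) else 0.0
    if aggregation not in ("sum", "min"):
        raise ValueError(f"unknown aggregation mode: {aggregation}")
    if not child_scores:
        return 0.0  # no subgroups or test cases
    score = float(sum(child_scores) if aggregation == "sum" else min(child_scores))
    if max_score != UNBOUNDED and score > max_score:
        raise JudgeError("group score exceeds its maximum score")
    return score
```

Under this sketch the submission score would be `group_score` applied to the `secret` group. A real judge would probably need a tolerance when comparing floating-point sums against the maximum; the draft does not specify one, so the comparison here is exact.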

### Required Dependent Groups
