Task HumanEval/092 has contradictory tests in Rust #142

geajack · 2024-05-20T15:29:19Z

The Rust version of HumanEval/092 contains the following lines:

assert_eq!(candidate(3.0, 4.0, 7.0), true);
assert_eq!(candidate(3.0, 4.0, 7.0), false);

(I think this is row 67 of the Hugging Face dataset for MultiPL-E, but I haven't checked)

This obviously makes the tests unsatisfiable. It seems like this was a type-casting issue when translating from Python; the original tests read:

assert candidate(3,4,7)==True, "This prints if this assert fails 9 (also good for debugging!)"
assert candidate(3.0,4,7)==False, "This prints if this assert fails 10 (also good for debugging!)"
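
A minimal sketch of how the two Python assertions can collapse into the same Rust call. The helpers `to_rust_float_literal` and `render_call` are hypothetical and are not MultiPL-E's actual translator. The sketch only assumes that every argument is rendered as a float literal because the Rust signature uses a single numeric type:

```python
def to_rust_float_literal(value):
    # Hypothetical helper: render any Python number as a Rust-style float literal.
    return repr(float(value))


def render_call(args):
    # Hypothetical helper: build the text of the translated call.
    rendered = ", ".join(to_rust_float_literal(a) for a in args)
    return "candidate(" + rendered + ")"


int_call = render_call((3, 4, 7))      # from the Python test expecting True
mixed_call = render_call((3.0, 4, 7))  # from the Python test expecting False

print(int_call)
print(mixed_call)
print(int_call == mixed_call)
```

Once the int/float distinction is erased, the two calls are textually identical, so one of the two expected values can never be satisfied.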

arjunguha · 2024-05-21T03:22:35Z

wow, thanks. yeah, we should make a decision on how to fix this. I'm going to guess that this affects other typed languages too.

arjunguha · 2024-05-21T03:23:24Z

have you seen HumanEval+ btw? Does that address this?

geajack · 2024-05-21T09:40:50Z

No, I haven't looked into Eval+

arjunguha · 2024-05-21T12:32:41Z

The original Python problem barely makes sense in a typed language such as Rust:

https://github.com/nuprl/MultiPL-E/blob/main/datasets/originals/HumanEval_92_any_int.py

It's not clear to me if this should be fixed by changing the problem, removing the problem from MultiPL-E, or just left as something that fails.

Randl · 2024-05-27T06:35:48Z

I would read the problem as "the number is an integer" rather than "the type of variable is integer", i.e., I'd expect

assert candidate(3.0,4,7)==True

That, however, would mean the problem doesn't match the original HumanEval, so maybe it's better to just drop it.
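
For concreteness, here is a rough sketch of the two readings side by side. The function names are made up for illustration, and the bodies are only assumed to match the behavior the tests quoted above imply; neither is the reference solution:

```python
def any_int_by_type(x, y, z):
    # Reading 1: every argument must be an `int` instance
    # (this is what the original Python tests imply).
    if not all(isinstance(v, int) for v in (x, y, z)):
        return False
    return x + y == z or x + z == y or y + z == x


def any_int_by_value(x, y, z):
    # Reading 2: every argument must be a whole number, whatever its type.
    if not all(float(v).is_integer() for v in (x, y, z)):
        return False
    return x + y == z or x + z == y or y + z == x


print(any_int_by_type(3, 4, 7), any_int_by_type(3.0, 4, 7))
print(any_int_by_value(3, 4, 7), any_int_by_value(3.0, 4, 7))
```

Only the second reading can be expressed in a language where all three arguments share one numeric type, and it disagrees with the original test on `candidate(3.0,4,7)`.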
