Accuracy benchmarks and official QA process?

I don't think this is an immediate concern, but once the community has some momentum, it could make sense to have an official process and recommendation for verifying improvements and regressions.

I'm thinking of something along these lines:

1. Automated testing against a static stash of images with known outcomes
2. A manual QA checklist with instructions for IRL swing testing, accompanied by non-interfering third-party LMs
3. Automated performance benchmarks
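
To make items 1 and 3 a bit more concrete, here is a rough sketch of what a combined check could look like. Everything in it is an assumption for illustration: the `Case`, `Report`, and `run_suite` names, the `speed` and `angle` metrics, the tolerances, and the image path are hypothetical placeholders, and `fake_analyze` stands in for whatever function the project actually uses to process an image.

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Case:
    image_path: str             # hypothetical path into the static image stash
    expected: Dict[str, float]  # known outcome for that image


@dataclass
class Report:
    failures: List[str]    # one human-readable message per out-of-tolerance metric
    median_seconds: float  # median wall-clock time per analyzed image


def run_suite(
    analyze: Callable[[str], Dict[str, float]],
    cases: List[Case],
    tolerances: Dict[str, float],
) -> Report:
    """Run `analyze` on every case, compare against known outcomes, and time it."""
    failures: List[str] = []
    timings: List[float] = []
    for case in cases:
        start = time.perf_counter()
        actual = analyze(case.image_path)
        timings.append(time.perf_counter() - start)
        for metric, want in case.expected.items():
            got = actual[metric]
            # A metric passes only if it is within its allowed absolute error.
            if abs(got - want) > tolerances[metric]:
                failures.append(
                    f"{case.image_path}: {metric} expected {want}, got {got}"
                )
    median = statistics.median(timings) if timings else 0.0
    return Report(failures, median)


# Stand-in analyzer (assumption): a real run would call the project's own code.
def fake_analyze(image_path: str) -> Dict[str, float]:
    return {"speed": 120.4, "angle": 14.0}


cases = [Case("stash/shot_001.png", {"speed": 120.0, "angle": 12.5})]
report = run_suite(fake_analyze, cases, {"speed": 1.0, "angle": 0.5})
for message in report.failures:
    print(message)
print(f"median seconds per image: {report.median_seconds:.6f}")
```

In this toy run the `speed` value is within tolerance and the `angle` value is not, so the report lists a single failure. A CI job could fail the build when `failures` is non-empty and track `median_seconds` over time to spot performance regressions.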

I imagine this could create a powerful feedback loop for community QA contribution if the process is laid out in an easy-to-follow formula.

Accuracy benchmarks and official QA process? #17
