[meta] General translation quality improvements #216

Open
16 of 44 tasks
Tracked by #891
eu9ene opened this issue Oct 4, 2023 · 4 comments
Labels
meta: A collection of sub-issues that uses a tasklist
quality: Improving robustness and translation quality

Comments

Collaborator

eu9ene commented Oct 4, 2023

This is a meta issue for brainstorming ideas and tracking issues to improve the translation quality in general:

General improvements

  1. quality
  2. quality
  3. quality
    gregtatum
  4. bug quality
  5. quality
  6. experiment quality
    eu9ene
  7. bug quality
  8. quality
  9. quality
  10. quality
  11. quality
  12. quality
  13. quality
    eu9ene
  14. quality
    eu9ene
  15. LLM
    eu9ene
  16. quality
    gregtatum
  17. cost & perf enhancement quality
    eu9ene

Better cleaning

  1. quality
    eu9ene
  2. language-coverage quality
    eu9ene
  3. quality
    eu9ene
  4. quality
  5. quality
  6. quality
  7. enhancement quality
  8. quality
  9. quality
  10. quality
  11. good first issue quality
  12. quality
  13. quality
    eu9ene
  14. quality

More data

  1. data sources quality
  2. data sources help wanted
  3. data sources good first issue
  4. data sources good first issue
  5. data sources
    gregtatum
  6. data sources good first issue meta
  7. bug data sources
  8. bug data sources
  9. data sources
    gregtatum
  10. data sources

Evals

  1. evals
    gregtatum
  2. evals
  3. evals
eu9ene added the meta and quality labels Oct 4, 2023
Collaborator

marco-c commented Oct 5, 2023

Identify other classes of quality problems by comparing translations with a “good” known one and sort by BLEU

https://github.com/neulab/compare-mt could be another alternative to investigate classes of quality problems.
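
A minimal sketch of the "compare with a known-good translation and sort by BLEU" idea, assuming parallel lists of source sentences, candidate translations, and reference translations. The function names are hypothetical, and the scorer is a simplified, smoothed sentence-level BLEU over whitespace tokens, not the sacreBLEU or compare-mt implementation:

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU in [0, 1] with add-one smoothing.

    Whitespace tokenization only; scores are not comparable to sacreBLEU.
    """
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        # Clipped n-gram matches: Counter intersection takes the minimum count.
        overlap = sum((hyp_counts & ref_counts).values())
        total = max(len(hyp) - n + 1, 0)
        # Add-one smoothing keeps short sentences from collapsing to zero.
        log_precision += math.log((overlap + 1) / (total + 1))
    # Brevity penalty for hypotheses shorter than the reference.
    brevity = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity * math.exp(log_precision / max_n)


def rank_by_bleu(sources, candidates, references):
    """Return (score, source, candidate, reference) tuples, worst first."""
    rows = [
        (sentence_bleu(cand, ref), src, cand, ref)
        for src, cand, ref in zip(sources, candidates, references)
    ]
    return sorted(rows, key=lambda row: row[0])
```

Reading the lowest-scoring rows first is one way to spot recurring failure classes by hand; a real run would more likely use an established BLEU implementation for the scoring step.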

Collaborator

marco-c commented Oct 25, 2023

Identify other classes of quality problems by comparing translations with a “good” known one and sort by BLEU

https://github.com/neulab/compare-mt could be another alternative to investigate classes of quality problems.

Filed #228.

Collaborator

marco-c commented Oct 31, 2023

See also #238.

gregtatum changed the title from "General quality improvements" to "[meta] General translation quality improvements" Dec 16, 2023
eu9ene self-assigned this Mar 5, 2024
Collaborator

marco-c commented Apr 28, 2024

  • Using some sort of “fuzzing”/“genetic algorithm” to choose the rules (where the oracle is an LLM)

Interesting approach partially related to this: https://huggingface.co/papers/2309.08532.
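
A rough sketch of what a genetic search over rules could look like, under heavy assumptions: the rule names are placeholders, and `oracle` is a hypothetical callable that returns a numeric quality score for a set of enabled rules (in the idea above it would wrap an LLM judgement). This is not the method from the linked paper, only a plain elitist genetic algorithm over on/off rule masks:

```python
import functools
import random


def evolve_rules(rule_names, oracle, generations=20, population_size=12, seed=0):
    """Search for a subset of rules that scores well under oracle(subset).

    `oracle` is a hypothetical scoring callable (higher is better).
    Requires at least one rule and population_size >= 4.
    """
    rng = random.Random(seed)
    n = len(rule_names)

    @functools.lru_cache(maxsize=None)
    def fitness(mask):
        # Cache scores, since each oracle call may be expensive.
        return oracle([name for name, on in zip(rule_names, mask) if on])

    population = [
        tuple(rng.random() < 0.5 for _ in range(n)) for _ in range(population_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < population_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n) if n > 1 else 0
            child = list(a[:cut] + b[cut:])  # one-point crossover
            flip = rng.randrange(n)          # mutate by toggling one rule
            child[flip] = not child[flip]
            children.append(tuple(child))
        population = parents + children
    best = max(population, key=fitness)
    return [name for name, on in zip(rule_names, best) if on]
```

Because the better half of each generation is carried over unchanged, the best score found never decreases between generations; the cache matters most when the oracle is slow or costly to call.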

eu9ene removed their assignment Jul 16, 2024