Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(i18n): Translated charts and filters into Russian #29377

Merged
merged 4 commits into from
Jul 3, 2024

Conversation

goldjee
Copy link
Contributor

@goldjee goldjee commented Jun 26, 2024

SUMMARY

I have checked and fixed Russian translations for the components of the chart editor that are related to up-to-date charts. These are the ones affected:

  • Area Chart
  • Big Number
  • Big Number with Trendline
  • Box Plot
  • Bubble Chart
  • Funnel Chart
  • Gauge Chart
  • Generic Chart
  • Graph Chart
  • Handlebars
  • Heatmap
  • Histogram
  • Line Chart
  • Mixed Chart
  • Nightingale Rose Chart
  • Pie Chart
  • Pivot Table
  • Radar Chart
  • Sunburst
  • Table
  • Tree Chart
  • Treemap
  • Waterfall Chart
  • Word Cloud

Any charts that were marked as Legacy or Deprecated were left out of scope.

In addition, I have noticed that #28637 introduced "Current" option to dashboard filters. So I have extracted new strings and translated them as well. This is the reason why other i18n files are affected.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@github-actions github-actions bot added i18n Namespace | Anything related to localization i18n:spanish Translation related to Spanish language i18n:italian Translation related to Italian language i18n:french Translation related to French language i18n:chinese Translation related to Chinese language i18n:japanese Translation related to Japanese language i18n:russian Translation related to Russian language i18n:korean Translation related to Korean language i18n:dutch i18n:slovak i18n:ukrainian i18n:portuguese i18n:brazilian i18n:traditional-chinese labels Jun 26, 2024
@dosubot dosubot bot added the viz:charts Namespace | Anything related to viz types label Jun 26, 2024
Copy link

codecov bot commented Jun 26, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 83.70%. Comparing base (76d897e) to head (b3e6c03).
Report is 409 commits behind head on master.

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #29377       +/-   ##
===========================================
+ Coverage   60.48%   83.70%   +23.21%     
===========================================
  Files        1931      521     -1410     
  Lines       76236    37492    -38744     
  Branches     8568        0     -8568     
===========================================
- Hits        46114    31381    -14733     
+ Misses      28017     6111    -21906     
+ Partials     2105        0     -2105     
Flag Coverage Δ
hive 49.05% <ø> (-0.11%) ⬇️
javascript ?
mysql 77.00% <ø> (?)
postgres 77.10% <ø> (?)
presto 53.67% <ø> (-0.13%) ⬇️
python 83.70% <ø> (+20.21%) ⬆️
sqlite 76.58% <ø> (?)
unit 59.64% <ø> (+2.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rusackas
Copy link
Member

@gooroodev please review

@gooroodev
Copy link

🤔 For now, this change is too large for me to review. Consider splitting it into smaller parts.

@goldjee
Copy link
Contributor Author

goldjee commented Jun 26, 2024

@gooroodev Could you please consider reviewing this following commit history? I've tried to make the edits atomic.

I'm afraid that splitting this into several PRs would not help reviewing because in order to provide translations for new strings we have to run babel scripts. And they shake up all the language files, resulting in a huge diff.

@gooroodev
Copy link

🤔 For now, this change is too large for me to review. Consider splitting it into smaller parts.

@rusackas
Copy link
Member

I tried to have Claude (AI) review this, but it's too large. Even the smallest of the individual files' diffs was too large :P

@goldjee
Copy link
Contributor Author

goldjee commented Jun 29, 2024

@rusackas, in this case I need your advice on how I should proceed.

In accordance with the comment by @mistercrunch, I could try to get rid of the effects of ./scripts/translations/babel_update.sh by following these steps:

  1. Back up my latest version of superset/translations/ru/LC_MESSAGES/messages.po
  2. Roll back all my changes to the upstream state
  3. Restore the backed up version of messages.po
  4. Commit and force push the changes to this branch

I believe the diff for messages.po will still be huge because of strings rearrangement. But would it somehow help to review the translation? Anyway, I believe these commands would still be required to run outside of this PR:

./scripts/translations/babel_update.sh
 pybabel update -i superset/translations/messages.pot -d superset/translations --ignore-obsolete

@mistercrunch
Copy link
Member

I think the issue is while we know (actually I'm not 100% on this, but maybe +95%) that rebuilding .po files is safe and non-destructive (as in we'll never loose a translation doing this), we don't know that rebasing/merging is safe as git seems really confused with all the edits. Ideally it should anchor on the msgid, but it just doesn't, we'd almost need a custom diffing engine that's .po file aware for this.

Knowing this, in theory that means that if/when there are many competing PRs with an older base, the first one to merge could make it a significant burden on the other ones as the diff/merge conflict doesn't actually make sense and can't be resolved line-by-line. I think no-one should ever try to resolve a merge conflict in those .po files. You'd always:

  • put your po file asside
  • rebase
  • move your po file back
  • re-run pybabel update
  • submit this

Now if there were 2 big PRs open on the same language file, that wouldn't work and we'd have to get more creative as to how to merge the 2. Unlikely but a bit of a headache if/when it does. We can fix that with a bit of coordination between people who care about the same language.

About translation PRs in general, personally as a review I'd prefer if there were 2 types of PR:

  • one that just runs/updates ALL po files using ./scripts/translations/babel_update.sh, doesn't contain any new translations. In theory could be automated to run say weekly, monthly or on-demand.
  • one that would touch a single translation files to add more translations

@mistercrunch
Copy link
Member

mistercrunch commented Jul 1, 2024

About this particular PR, if you kept just that one file for Russian file I'd feel better about approving/merging it.

The issue being as reviewers we cannot really validate that this PR isn't removing some other new translations in some other language. If you did everything in the right order it should be ok, but we don't know this for a fact. I think there has been some issues of translations lost in the shuffle in the past (I'm not 100% on that, but pretty sure it did happen before), and if you keep only your one po file this PR is safer to merge.

Some greater truths here:

  • merging a single po file is always safe, or at least safer than merging a bunch of them without actually reviewing them
  • running ./scripts/translations/babel_update.sh is also always safe

…r selector, placeholders for dashboard filters
@github-actions github-actions bot removed the i18n:spanish Translation related to Spanish language label Jul 2, 2024
@github-actions github-actions bot removed i18n:italian Translation related to Italian language i18n:french Translation related to French language i18n:chinese Translation related to Chinese language i18n:japanese Translation related to Japanese language i18n:korean Translation related to Korean language i18n:dutch i18n:slovak i18n:ukrainian i18n:portuguese i18n:brazilian i18n:traditional-chinese labels Jul 2, 2024
@goldjee
Copy link
Contributor Author

goldjee commented Jul 2, 2024

Thank you for the guidance. I've excluded all the changes except those done to the superset/translations/ru/LC_MESSAGES/messages.po file.

That Python-Integration test has failed again. Could you please take a look?

@rusackas
Copy link
Member

rusackas commented Jul 2, 2024

Re-running that test. Hopefully it's just flaky. Thank you for simplifying the PR!

@rusackas
Copy link
Member

rusackas commented Jul 2, 2024

Ok, here's some AI-generated feedback. Let me know your thoughts :)

"% calculation" is translated as "Расчет долей" which means "Share calculation". It seems like an odd translation for "% calculation". I would expect something more literal like "% вычисление" or "Вычисление процентов".

"2/98 percentiles" and "9/91 percentiles" are translated as "2/98 процентиль" and "9/91 процентиль". The word should be "перцентиль" not "процентиль". Also, it should probably be plural "перцентили" since it refers to a range.

"Adaptive formatting" is translated as "Адаптивный" which just means "Adaptive". A more complete translation might be "Адаптивное форматирование".

The translations for the singular, dual and plural forms of "1 option", "1 row", etc. need review. The different forms don't seem to match the numbers. For example:
msgid "%s row"
msgid_plural "%s rows"
msgstr[0] "%s строка"
msgstr[1] "%s строки"
msgstr[2] "%s строк"

"%s строка" is singular but should match 1 row. "%s строки" is dual but should match 2-4 rows. And "%s строк" is plural but should match 5+ rows.

"Bins" is translated as "в" which just means "in". I would expect something like "Категории" or "Интервалы".

"Bar" is translated as "Гистограмма" which means "Histogram". "Столбчатая диаграмма" is a more direct translation of "Bar chart".

"Cell bars" was removed in the English text but still exists in the Russian translation as "Гистограммы в ячейках". This translation should be removed.

"Change order of columns." and "Change order of rows." are translated as imperatives ("Сменить порядок столбцов", "Сменить порядок строк") rather than infinitives like the English text. I would suggest "Изменить порядок столбцов." and "Изменить порядок строк." to match the English grammar.

"Child label position" is translated as "Положение метки дочернего элемента" which is very literal. A shorter translation like "Расположение меток листьев" might sound more natural.

"Choose already exists" and "Choose columns to read" have fuzzy translations that don't match the English and need to be updated.

"Decrease" is translated as "Уменьшить" which means "Reduce". In the context of a waterfall chart, a better translation might be "Падение" or "Снижение".

"Determined by..." is translated as "Определяет..." which means "Determines..." rather than "Determined by...". The translation should be adjusted to match the passive voice of the original, something like "Определяется..."

"Dimensions" has a translation "Измерения" which is correct, but the accompanying description is translated a bit awkwardly. I would suggest rephrasing it to sound more natural in Russian while preserving the meaning.

"Display Name" is translated as "Отображаемое имя" which is correct. However, "Display" on its own is translated as the same thing, which is incorrect. "Display" in that context likely refers to the display/appearance of something, so a word like

"Отображение" would be more appropriate.

"Display all" is marked as fuzzy and needs a proper translation. Something like "Показать все" would work.

"Layout type of graph" is translated as just "Формат графа" which means "Graph format". A better translation might be "Способ отрисовки графа" to convey the meaning of layout.

Similarly, "Layout type of tree" is translated as just "Внешний вид дерева" which means "Tree appearance". A better translation could be "Способ отрисовки дерева" to mean the layout method.

"Line" in the context of a line chart is translated as "Линейный" which means "Linear". A more appropriate translation for a line chart would be "График".

"Line chart is used to visualize..." has an overly literal and clunky translation. I would suggest rephrasing it to sound more natural in Russian while preserving the meaning.

"Loading" and "Loading..." are both translated as "Загрузка", but "Загрузка..." with the ellipsis might be more appropriate for the "Loading..." case to match the English punctuation.

The translations for the "Main" string seem fuzzy/incomplete. "Main" was translated as just "минимум" meaning "minimum", which doesn't seem right. This may need a more thorough translation.

Other than that, Claude thought you did an outstanding job :D Let me know if you think it's right or not on any of the above.

@goldjee
Copy link
Contributor Author

goldjee commented Jul 2, 2024

Thank you, this feedback is very useful. I'll try to locate these strings in the UI and figure out how to deal with them. Let me convert this PR to draft, so it won't get accidentally merged while I'm reviewing these suggestions.

@goldjee goldjee marked this pull request as draft July 2, 2024 19:54
@goldjee
Copy link
Contributor Author

goldjee commented Jul 3, 2024

I've reviewed the feedback and made some adjustments to the translation. You can find the breakdown below.

"% calculation" is translated as "Расчет долей" which means "Share calculation". It seems like an odd translation for "% calculation". I would expect something more literal like "% вычисление" or "Вычисление процентов".

Funnel Chart. I felt like that literal translation is way too... literal. My thought process was that each segment of the funnel is some part of total. That part is represented with percentage value. The key difference of the selector values is not how we calculate percentages themselves, it's what we take as a base to get the share from. That's why I have picked this adaptation.

"2/98 percentiles" and "9/91 percentiles" are translated as "2/98 процентиль" and "9/91 процентиль". The word should be "перцентиль" not "процентиль". Also, it should probably be plural "перцентили" since it refers to a range.

IIRC, originally it was "2/98 перцентиль" (singular in Russian as well). I haven't found this particular word in dictionaries (some people say these two versions are equivalent). Changed it because the original one felt too closely borrowed from the English counterpart and Russian Wikipedia page refers to this term as "процентиль". Furthermore, "percent" in Russian is "процент" and there is no apparent reason why would the child word should change its letter order.

"Adaptive formatting" is translated as "Адаптивный" which just means "Adaptive". A more complete translation might be "Адаптивное форматирование".

I think this one refers to datetime formatting dropdown. The suggested translation is too long for the UI element and doesn't fit well into the context.

The translations for the singular, dual and plural forms of "1 option", "1 row", etc. need review. The different forms don't seem to match the numbers. For example:
msgid "%s row"
msgid_plural "%s rows"
msgstr[0] "%s строка"
msgstr[1] "%s строки"
msgstr[2] "%s строк"

I'm not sure why the AI highlights this as a potential problem. These strings look alright to me, both in Poedit and in the UI itself.

"%s строка" is singular but should match 1 row. "%s строки" is dual but should match 2-4 rows. And "%s строк" is plural but should match 5+ rows.

I think this one was picked for msgid "Show". This translation is marked as fuzzy and I haven't seen the text in the UI. It looks very incorrect and I will fix it.

"Bins" is translated as "в" which just means "in". I would expect something like "Категории" or "Интервалы".

Good point, I've missed this fuzzy and some other strings for the histogram chart as well. I tend to use more specialized term for this entity when it comes to a histogram, "Бины". Will fix.

"Bar" is translated as "Гистограмма" which means "Histogram". "Столбчатая диаграмма" is a more direct translation of "Bar chart".

Quick search indicates that this one is used in legacy charts that are out of scope of this PR.

#: superset-frontend/plugins/legacy-preset-chart-nvd3/src/Bar/index.js:39
#: superset-frontend/plugins/legacy-preset-chart-nvd3/src/DistBar/index.js:42
#: superset-frontend/plugins/plugin-chart-echarts/src/MixedTimeseries/controlPanel.tsx:150
#: superset-frontend/plugins/plugin-chart-echarts/src/Timeseries/Regular/Bar/index.ts:82
#: superset-frontend/plugins/plugin-chart-echarts/src/Timeseries/Regular/Line/controlPanel.tsx:84
msgid "Bar"
msgstr "Гистограмма"

"Cell bars" was removed in the English text but still exists in the Russian translation as "Гистограммы в ячейках". This translation should be removed.

I can't find this particular string in the file.

"Change order of columns." and "Change order of rows." are translated as imperatives ("Сменить порядок столбцов", "Сменить порядок строк") rather than infinitives like the English text. I would suggest "Изменить порядок столбцов." and "Изменить порядок строк." to match the English grammar.

That was exactly as how I have translated these strings. But I've noticed I have forgot to remove full stops at their ends. Update incoming :)

"Child label position" is translated as "Положение метки дочернего элемента" which is very literal. A shorter translation like "Расположение меток листьев" might sound more natural.

Again, "Расположение меток листьев" was my version.

"Choose already exists" and "Choose columns to read" have fuzzy translations that don't match the English and need to be updated.

I agree that these ones should be updated, but they are related to superset-frontend/src/features/databases/UploadDataModel/index.tsx which I haven't touched yet.

"Decrease" is translated as "Уменьшить" which means "Reduce". In the context of a waterfall chart, a better translation might be "Падение" or "Снижение".

Not sure why the AI has picked the old version of the translation, but I have interpreted this as "Падение" and its counterpart as "Рост". It looks good on waterfall charts.

"Determined by..." is translated as "Определяет..." which means "Determines..." rather than "Determined by...". The translation should be adjusted to match the passive voice of the original, something like "Определяется..."

Looks like this one is related to superset-frontend/src/explore/components/controls/ColorSchemeControl/index.tsx, that was out of scope. The current translation ("Цветовая схема определена соответствующим дашбордом. Измените цветовую схему в свойствах дашборда.") indeed uses passive voice.

"Dimensions" has a translation "Измерения" which is correct, but the accompanying description is translated a bit awkwardly. I would suggest rephrasing it to sound more natural in Russian while preserving the meaning.

I think that the accompanying description could use some improvements indeed, but it feels natural in its current state.

"Display Name" is translated as "Отображаемое имя" which is correct. However, "Display" on its own is translated as the same thing, which is incorrect. "Display" in that context likely refers to the display/appearance of something, so a word like "Отображение" would be more appropriate.

"Отображение" is how "Display" string is currently translated.

"Display all" is marked as fuzzy and needs a proper translation. Something like "Показать все" would work.

Will fix.

"Layout type of graph" is translated as just "Формат графа" which means "Graph format". A better translation might be "Способ отрисовки графа" to convey the meaning of layout.

In my translation I've used "Формат графа" because of its brevity and placed "Способ отрисовки графа" into the tooltip. Just tried the longer version and it looked fine. I'll switch to the suggested variant.

Similarly, "Layout type of tree" is translated as just "Внешний вид дерева" which means "Tree appearance". A better translation could be "Способ отрисовки дерева" to mean the layout method.

Agreed.

"Line" in the context of a line chart is translated as "Линейный" which means "Linear". A more appropriate translation for a line chart would be "График".

Looks like that comment is related to a fuzzy string for superset-frontend/plugins/legacy-preset-chart-nvd3/src/NVD3Controls.tsx which was left out of the scope of this translation.

"Line chart is used to visualize..." has an overly literal and clunky translation. I would suggest rephrasing it to sound more natural in Russian while preserving the meaning.

Yeah, I'll change that one.

"Loading" and "Loading..." are both translated as "Загрузка", but "Загрузка..." with the ellipsis might be more appropriate for the "Loading..." case to match the English punctuation.

"Loading..." is currently translated as "Загрузка..."

The translations for the "Main" string seem fuzzy/incomplete. "Main" was translated as just "минимум" meaning "minimum", which doesn't seem right. This may need a more thorough translation.

Missed that one. It's very tough though. There is no close short analog that could describe a "main" value as a value of an indicator for current period. Moreover, when I provide any translation for this string, no matter how short it is, the table just breaks. The values in the corresponding column just disappear. I'll provide the most basic viable translation "Значение" for this string and mark it as fuzzy for now.

@goldjee goldjee marked this pull request as ready for review July 3, 2024 19:37
@mistercrunch mistercrunch merged commit 48f6fe6 into apache:master Jul 3, 2024
60 of 61 checks passed
@mistercrunch
Copy link
Member

Now that this is merged I ran a global .po update here #29476

@goldjee goldjee deleted the i18n-ru-charts branch July 3, 2024 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
i18n:russian Translation related to Russian language i18n Namespace | Anything related to localization size/XXL viz:charts Namespace | Anything related to viz types
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants