fin-w commented Oct 8, 2024

This adds support for Welsh, based on trigrams generated from the 11+ million Welsh words in the Corpws Cenedlaethol Cymraeg Cyfoes (CorCenCC, https://corcencc.org/). The corpus is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Citation:
Knight, D., Morris, S., Fitzpatrick, T., Rayson, P., Spasić, I., Thomas, E-M., Lovell, A., Morris, J., Evas, J., Stonelake, M., Arman, L., Davies, J., Ezeani, I., Neale, S., Needs, J., Piao, S., Rees, M., Watkins, G., Williams, L., Muralidaran, V., Tovey-Walsh, B., Anthony, L., Cobb, T., Deuchar, M., Donnelly, K., McCarthy, M. and Scannell, K. (2020). CorCenCC: Corpws Cenedlaethol Cymraeg Cyfoes – the National Corpus of Contemporary Welsh. Cardiff University, http://doi.org/10.17035/d.2020.0119878310
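
For readers unfamiliar with how a trigram list can be derived from a corpus, here is a minimal sketch of the general idea. It is not the script used for this PR: the function name `top_trigrams`, the normalisation rules (lowercasing, treating non-letters as spaces) and the tie-breaking order are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical helper: count character trigrams in `text` and return the
/// `top_n` most frequent ones, most frequent first. Ties are broken
/// alphabetically so the output order is stable.
fn top_trigrams(text: &str, top_n: usize) -> Vec<String> {
    // Assumed normalisation: lowercase, and turn every non-letter into a space.
    let cleaned: String = text
        .to_lowercase()
        .chars()
        .map(|c| if c.is_alphabetic() { c } else { ' ' })
        .collect();

    // Collapse runs of whitespace and pad with one space on each side so that
    // word boundaries show up in the trigrams.
    let words: Vec<&str> = cleaned.split_whitespace().collect();
    let padded = format!(" {} ", words.join(" "));
    let chars: Vec<char> = padded.chars().collect();

    // Slide a window of three characters over the text and count each trigram.
    let mut counts: HashMap<String, usize> = HashMap::new();
    for window in chars.windows(3) {
        let trigram: String = window.iter().collect();
        *counts.entry(trigram).or_insert(0) += 1;
    }

    // Rank by descending count, then alphabetically.
    let mut ranked: Vec<(String, usize)> = counts.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.into_iter().take(top_n).map(|(t, _)| t).collect()
}

fn main() {
    // A single line of Welsh stands in for the full corpus here.
    let sample = "Mae hen wlad fy nhadau yn annwyl i mi";
    for trigram in top_trigrams(sample, 5) {
        println!("{:?}", trigram);
    }
}
```

Run over a whole corpus rather than one line, a ranking like this yields the kind of per-language trigram profile a detector can compare input text against.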

fin-w commented Oct 8, 2024

One test fails when running `cargo test` and `make test`:

---- core::detect::tests::test_detect_with_options_with_filter_list_except stdout ----
thread 'core::detect::tests::test_detect_with_options_with_filter_list_except' panicked at src/core/detect.rs:148:9:
assertion `left == right` failed
  left: Cym
 right: Eng

I'm not sure what the correct fix is for this. I'll have a look at the test again and try to correct things though.

fin-w force-pushed the support_cymraeg_welsh branch from 81a9361 to 09635c0 on October 8, 2024 17:54

fin-w commented Oct 8, 2024

It seems like I just had to filter out Welsh in the test to make it pass, so I've fixed that. Hopefully this is ready to merge now?
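
To illustrate why adding a language can flip the result of a test that filters candidates, and why filtering out the new language restores the old expectation, here is a toy sketch. It is not the library's detection code or its test: the two-variant `Lang` enum, the hand-written profiles, the hit-count scoring and the `detect` signature are all hypothetical, and the sample text is made up.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang {
    Eng,
    Cym,
}

const CANDIDATES: [Lang; 2] = [Lang::Eng, Lang::Cym];

/// Hypothetical toy profiles: a handful of trigrams per language, not real data.
fn profile(lang: Lang) -> &'static [&'static str] {
    match lang {
        Lang::Eng => &["the", "ing", "and", "ion"],
        Lang::Cym => &[" yn", "yn ", "dd ", "wyd"],
    }
}

/// Split `text` into lowercase character trigrams, padded with spaces.
fn trigrams(text: &str) -> Vec<String> {
    let padded = format!(" {} ", text.to_lowercase());
    let chars: Vec<char> = padded.chars().collect();
    chars.windows(3).map(|w| w.iter().collect()).collect()
}

/// Pick the candidate whose profile matches the most trigrams of `text`,
/// skipping every language in `denylist`. Returns None if nothing matches.
fn detect(text: &str, denylist: &[Lang]) -> Option<Lang> {
    let grams = trigrams(text);
    CANDIDATES
        .iter()
        .copied()
        .filter(|lang| !denylist.contains(lang))
        .map(|lang| {
            let hits = grams
                .iter()
                .filter(|g| profile(lang).iter().any(|p| *p == g.as_str()))
                .count();
            (lang, hits)
        })
        .filter(|&(_, hits)| hits > 0)
        .max_by_key(|&(_, hits)| hits)
        .map(|(lang, _)| lang)
}

fn main() {
    let text = "the bedd yn";
    // With Welsh among the candidates, it out-scores English on this text.
    println!("{:?}", detect(text, &[]));
    // With Welsh on the denylist, English is the best remaining candidate.
    println!("{:?}", detect(text, &[Lang::Cym]));
}
```

In this toy setup the new language wins only because it is a candidate at all, which is consistent with the failure above (`left: Cym`, `right: Eng`) going away once Welsh is added to the test's filter list.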

fin-w force-pushed the support_cymraeg_welsh branch from af65af5 to fda82f3 on October 8, 2024 18:28
fin-w force-pushed the support_cymraeg_welsh branch from fda82f3 to 1f9cf0c on October 9, 2024 23:58
fin-w force-pushed the support_cymraeg_welsh branch from 1f9cf0c to 574b33a on October 10, 2024 00:02