fix: unicode encoding regression #1733

hrideshmg · 2025-08-19T15:55:45Z

In raising this pull request, I confirm the following (please check boxes):

I have read and understood the contributors guide.
I have checked that another pull request for this purpose does not exist.
I have considered, and confirmed that this submission will be valuable to others.
I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
I give this submission freely, and claim no ownership to its content.
I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

I have never used CCExtractor.
I have used CCExtractor just a couple of times.
I absolutely love CCExtractor, but have not contributed previously.
I am an active contributor to CCExtractor.

CCExtractor currently produces incorrect output when subtitles are encoded as Unicode (by passing --unicode). This regression is currently reported by the sample platform under the Options category.

The problem seems to be caused by a difference in entry order between the C and Rust enums. The C enum has the Unicode entry at position 0, but the Rust enum has it at position 3.

I'm not exactly sure why this fixes the issue: we use explicit match-by-value statements when converting between Rust and C, so the order shouldn't make a difference. It does yield the correct output nonetheless.
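To illustrate how declaration order could matter even when an explicit match exists, here is a minimal sketch. The enum names, variant names, and every position other than the Unicode entry (0 on the C side, 3 on the Rust side) are assumptions made up for illustration, not the actual CCExtractor definitions. It only shows that a conversion by raw integer value goes wrong when the two orders differ, while a match on variants does not. Whether some code path in CCExtractor actually passes the raw value across the boundary is not confirmed here.

```rust
// Hypothetical mirror of the C enum: the Unicode entry is at position 0.
// All other names and positions are assumptions for illustration.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum CEncoding {
    Unicode = 0,
    Latin1 = 1,
    Utf8 = 2,
    Ascii = 3,
}

// Hypothetical Rust-side enum: the Unicode entry is at position 3.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum RustEncoding {
    Ascii = 0,
    Latin1 = 1,
    Utf8 = 2,
    Unicode = 3,
}

// Conversion by explicit match on variants: independent of declaration order.
fn to_c_by_match(e: RustEncoding) -> CEncoding {
    match e {
        RustEncoding::Ascii => CEncoding::Ascii,
        RustEncoding::Latin1 => CEncoding::Latin1,
        RustEncoding::Utf8 => CEncoding::Utf8,
        RustEncoding::Unicode => CEncoding::Unicode,
    }
}

// Conversion by raw integer value: only correct if both enums share the same order.
fn to_c_by_value(e: RustEncoding) -> Option<CEncoding> {
    match e as u32 {
        0 => Some(CEncoding::Unicode),
        1 => Some(CEncoding::Latin1),
        2 => Some(CEncoding::Utf8),
        3 => Some(CEncoding::Ascii),
        _ => None,
    }
}

fn main() {
    let requested = RustEncoding::Unicode;
    // Variant-based conversion keeps the meaning.
    println!("{:?}", to_c_by_match(requested)); // prints "Unicode"
    // Value-based conversion silently picks the wrong entry.
    println!("{:?}", to_c_by_value(requested)); // prints "Some(Ascii)"
}
```

If any place on either side reads or writes the encoding as a plain integer, aligning the two declaration orders would make both conversions agree, which would be consistent with the behaviour observed in this PR.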

ccextractor-bot · 2025-08-19T18:07:47Z

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 39e051b...:

Report Name	Tests Passed
Broken	13/13
CEA-708	13/14
DVB	7/7
DVD	3/3
DVR-MS	2/2
General	27/27
Hauppage	3/3
MP4	3/3
NoCC	10/10
Options	86/86
Teletext	21/21
WTV	2/13
XDS	34/34

Your PR breaks these cases:

ccextractor --out=srt --latin1 f23a544ba8...
ccextractor --out=srt --latin1 10f0f77cf4...
ccextractor --out=srt --latin1 df3b4d62d3...
ccextractor --out=srt --latin1 d7e7dbdf68...
ccextractor --out=srt --latin1 76734ac4a7...
ccextractor --out=srt --latin1 c791382c94...
ccextractor --out=srt --latin1 f673b2f916...
ccextractor --out=srt --latin1 da75bdee47...
ccextractor --out=srt --latin1 bd6f33a669...
ccextractor --out=srt --latin1 0e5e6b26be...
ccextractor --out=srt --latin1 a226cc302d...

NOTE: The following tests have been failing on the master branch as well as the PR:

ccextractor --service 1 --out=txt f17524b53f..., Last passed: Never

Congratulations: Merging this PR would fix the following tests:

ccextractor --unicode c83f765c66..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

ccextractor-bot · 2025-08-19T18:11:02Z

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 39e051b...:

Report Name	Tests Passed
Broken	13/13
CEA-708	13/14
DVB	4/7
DVD	3/3
DVR-MS	2/2
General	27/27
Hauppage	3/3
MP4	3/3
NoCC	10/10
Options	86/86
Teletext	21/21
WTV	10/13
XDS	34/34

NOTE: The following tests have been failing on the master branch as well as the PR:

ccextractor --service 1 --out=txt f17524b53f..., Last passed: Never
ccextractor --stdout --quiet --no-fontcolor 79a51f3500..., Last passed: Never
ccextractor --stdout --quiet --no-fontcolor 767b546f96..., Last passed: Never
ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
ccextractor --out=srt --latin1 f23a544ba8..., Last passed: Never
ccextractor --out=srt --latin1 10f0f77cf4..., Last passed: Test 5993
ccextractor --out=srt --latin1 df3b4d62d3..., Last passed: Never

Congratulations: Merging this PR would fix the following tests:

ccextractor --autoprogram --out=srt --latin1 f1422b8bfe..., Last passed: Never
ccextractor --datapid 5603 --autoprogram --out=srt --latin1 --teletext 85c7fc1ad7..., Last passed: Never
ccextractor --out=spupng c83f765c66..., Last passed: Never
ccextractor --unicode c83f765c66..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 c0d2fba8c0..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 006fdc391a..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 e92a1d4d2a..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 7e4ebf7fd7..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 9256a60e4b..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 27d7a43dd6..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 297a44921a..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 efbe129086..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 eae0077731..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 e2e2b501e0..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 c6407fb294..., Last passed: Never
ccextractor --autoprogram --out=ttxt --latin1 --datets dcada745de..., Last passed: Never
ccextractor --autoprogram --out=srt --latin1 --tpage 398 5d5838bde9..., Last passed: Never
ccextractor --autoprogram --out=srt --latin1 --teletext --tpage 398 3b276ad8bf..., Last passed: Never

All tests passing on the master branch were passed completely.

Check the result page for more info.

fix: unicode encoding regression

1ce23ab
