Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support changing frame size during encoding #1069

Draft
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

tongyuantongyu
Copy link
Contributor

This is the second part for #761.

@wantehchang, I wrote a simple test ChangeSettingTest.ChangeDimension that encodes a 256x256 frame following by a 512x512 frame, with seq header dimension set to 512x512, but it crashes in the aom_codec_destroy call. Could you take a look if there's something wrong in my usage, or it's a bug in libaom?

@wantehchang
Copy link
Collaborator

Yuan: I just merged my cleanup pull request #1070. Please rebase this pull request and then I will take a look at the crash. Thanks.

@tongyuantongyu
Copy link
Contributor Author

I tried to run asan and here is the report:

=================================================================
==6200==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x121cbf221290 at pc 0x7ff65da76b28 bp 0x0026a0dedb00 sp 0x0026a0dedb48
WRITE of size 8 at 0x121cbf221290 thread T0
    #0 0x7ff65da76b27 in av1_set_mb_ssim_rdmult_scaling D:/Cpp/libavif/ext/aom/av1/encoder/encoder_utils.c:1304:47
    #1 0x7ff65da3f157 in encode_frame_to_data_rate D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:3581:5
    #2 0x7ff65da3f157 in av1_encode D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:3830:9
    #3 0x7ff65e36e4d5 in av1_encode_strategy D:/Cpp/libavif/ext/aom/av1/encoder/encode_strategy.c:1605:9
    #4 0x7ff65da48d65 in av1_get_compressed_data D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:4532:22
    #5 0x7ff65d946785 in encoder_encode D:/Cpp/libavif/ext/aom/av1/av1_cx_iface.c:3029:20
    #6 0x7ff65d938bdb in aom_codec_encode D:/Cpp/libavif/ext/aom/aom/src/aom_encoder.c:176:11
    #7 0x7ff65d90d423 in aomCodecEncodeImage D:/Cpp/libavif/src/codec_aom.c:974:33
    #8 0x7ff65d8fd2d4 in avifEncoderAddImageInternal D:/Cpp/libavif/src/write.c:915:17
    #9 0x7ff65d8fb670 in avifEncoderAddImage D:/Cpp/libavif/src/write.c:934:12
    #10 0x7ff65d8d42b3 in libavif::(anonymous namespace)::ChangeSettingTest_DISABLED_ChangeDimension_Test::TestBody() D:/Cpp/libavif/tests/gtest/avifchangesettingtest.cc:186:5
    #11 0x7ffbebd58308 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (D:\Tools\msys64\clang64\bin\libgtest.dll+0x180038308)
    #12 0x7ffbebd3ce8a in testing::Test::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001ce8a)
    #13 0x7ffbebd3e16e in testing::TestInfo::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001e16e)
    #14 0x7ffbebd3ec57 in testing::TestSuite::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001ec57)
    #15 0x7ffbebd4f2ab in testing::internal::UnitTestImpl::RunAllTests() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18002f2ab)
    #16 0x7ffbebd59168 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (D:\Tools\msys64\clang64\bin\libgtest.dll+0x180039168)
    #17 0x7ffbebd4eadf in testing::UnitTest::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18002eadf)
    #18 0x7ffbefab1434 in main (D:\Tools\msys64\clang64\bin\libgtest_main.dll+0x180001434)
    #19 0x7ff65d8d13ad in __tmainCRTStartup C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:329:15
    #20 0x7ff65d8d14e5 in mainCRTStartup C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:206:9
    #21 0x7ffc008f54df  (C:\WINDOWS\System32\KERNEL32.DLL+0x1800154df)
    #22 0x7ffc0224485a  (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18000485a)

0x121cbf221297 is located 0 bytes to the right of 2071-byte region [0x121cbf220a80,0x121cbf221297)
allocated by thread T0 here:
    #0 0x7ffb7dec100d in malloc (D:\Tools\msys64\clang64\bin\libclang_rt.asan_dynamic-x86_64.dll+0x18004100d)
    #1 0x7ff65d9ab06a in aom_memalign D:/Cpp/libavif/ext/aom/aom_mem/aom_mem.c:59:22
    #2 0x7ff65d9ab06a in aom_malloc D:/Cpp/libavif/ext/aom/aom_mem/aom_mem.c:67:40
    #3 0x7ff65d9ab06a in aom_calloc D:/Cpp/libavif/ext/aom/aom_mem/aom_mem.c:72:19
    #4 0x7ff65da38265 in av1_create_compressor D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:1400:5
    #5 0x7ff65d94389e in av1_create_context_and_bufferpool D:/Cpp/libavif/ext/aom/av1/av1_cx_iface.c:2487:12
    #6 0x7ff65d94389e in encoder_init D:/Cpp/libavif/ext/aom/av1/av1_cx_iface.c:2600:13
    #7 0x7ff65d938732 in aom_codec_enc_init_ver D:/Cpp/libavif/ext/aom/aom/src/aom_encoder.c:80:11
    #8 0x7ff65d90bcb0 in aomCodecEncodeImage D:/Cpp/libavif/src/codec_aom.c:726:13
    #9 0x7ff65d8fd2d4 in avifEncoderAddImageInternal D:/Cpp/libavif/src/write.c:915:17
    #10 0x7ff65d8fb670 in avifEncoderAddImage D:/Cpp/libavif/src/write.c:934:12
    #11 0x7ff65d8d4158 in libavif::(anonymous namespace)::ChangeSettingTest_DISABLED_ChangeDimension_Test::TestBody() D:/Cpp/libavif/tests/gtest/avifchangesettingtest.cc:183:5
    #12 0x7ffbebd58308 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (D:\Tools\msys64\clang64\bin\libgtest.dll+0x180038308)
    #13 0x7ffbebd3ce8a in testing::Test::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001ce8a)
    #14 0x7ffbebd3e16e in testing::TestInfo::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001e16e)
    #15 0x7ffbebd3ec57 in testing::TestSuite::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001ec57)
    #16 0x7ffbebd4f2ab in testing::internal::UnitTestImpl::RunAllTests() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18002f2ab)
    #17 0x7ffbebd59168 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (D:\Tools\msys64\clang64\bin\libgtest.dll+0x180039168)
    #18 0x7ffbebd4eadf in testing::UnitTest::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18002eadf)
    #19 0x7ffbefab1434 in main (D:\Tools\msys64\clang64\bin\libgtest_main.dll+0x180001434)
    #20 0x7ff65d8d13ad in __tmainCRTStartup C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:329:15
    #21 0x7ff65d8d14e5 in mainCRTStartup C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:206:9
    #22 0x7ffc008f54df  (C:\WINDOWS\System32\KERNEL32.DLL+0x1800154df)
    #23 0x7ffc0224485a  (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18000485a)

SUMMARY: AddressSanitizer: heap-buffer-overflow D:/Cpp/libavif/ext/aom/av1/encoder/encoder_utils.c:1304:47 in av1_set_mb_ssim_rdmult_scaling
Shadow bytes around the buggy address:
  0x042657044200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x042657044210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x042657044220: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x042657044230: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x042657044240: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x042657044250: 00 00[07]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x042657044260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x042657044270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x042657044280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x042657044290: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0426570442a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==6200==ABORTING

The function in question seems related to ssim, but if I set tune to psnr, it fails in another function:

=================================================================
==62176==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x126071520997 at pc 0x7ffb7dec0b8c bp 0x00cb7aefcb00 sp 0x00cb7aefcb40
WRITE of size 4096 at 0x126071520997 thread T0
    #0 0x7ffb7dec0b8b in __asan_memset (D:\Tools\msys64\clang64\bin\libclang_rt.asan_dynamic-x86_64.dll+0x180040b8b)
    #1 0x7ff7eabdef80 in encode_without_recode D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:2406:5
    #2 0x7ff7eabdef80 in encode_with_recode_loop_and_filter D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:3014:11
    #3 0x7ff7eabc0d05 in encode_frame_to_data_rate D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:3678:9
    #4 0x7ff7eabc0d05 in av1_encode D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:3830:9
    #5 0x7ff7eb4ee4f5 in av1_encode_strategy D:/Cpp/libavif/ext/aom/av1/encoder/encode_strategy.c:1605:9
    #6 0x7ff7eabc8d85 in av1_get_compressed_data D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:4532:22
    #7 0x7ff7eaac67a5 in encoder_encode D:/Cpp/libavif/ext/aom/av1/av1_cx_iface.c:3029:20
    #8 0x7ff7eaab8bfb in aom_codec_encode D:/Cpp/libavif/ext/aom/aom/src/aom_encoder.c:176:11
    #9 0x7ff7eaa8d443 in aomCodecEncodeImage D:/Cpp/libavif/src/codec_aom.c:974:33
    #10 0x7ff7eaa7d2f4 in avifEncoderAddImageInternal D:/Cpp/libavif/src/write.c:915:17
    #11 0x7ff7eaa7b690 in avifEncoderAddImage D:/Cpp/libavif/src/write.c:934:12
    #12 0x7ff7eaa542cb in libavif::(anonymous namespace)::ChangeSettingTest_DISABLED_ChangeDimension_Test::TestBody() D:/Cpp/libavif/tests/gtest/avifchangesettingtest.cc:186:5
    #13 0x7ffbebd58308 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (D:\Tools\msys64\clang64\bin\libgtest.dll+0x180038308)
    #14 0x7ffbebd3ce8a in testing::Test::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001ce8a)
    #15 0x7ffbebd3e16e in testing::TestInfo::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001e16e)
    #16 0x7ffbebd3ec57 in testing::TestSuite::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001ec57)
    #17 0x7ffbebd4f2ab in testing::internal::UnitTestImpl::RunAllTests() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18002f2ab)
    #18 0x7ffbebd59168 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (D:\Tools\msys64\clang64\bin\libgtest.dll+0x180039168)
    #19 0x7ffbebd4eadf in testing::UnitTest::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18002eadf)
    #20 0x7ffbefab1434 in main (D:\Tools\msys64\clang64\bin\libgtest_main.dll+0x180001434)
    #21 0x7ff7eaa513ad in __tmainCRTStartup C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:329:15
    #22 0x7ff7eaa514e5 in mainCRTStartup C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:206:9
    #23 0x7ffc008f54df  (C:\WINDOWS\System32\KERNEL32.DLL+0x1800154df)
    #24 0x7ffc0224485a  (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18000485a)

0x126071520997 is located 0 bytes to the right of 1047-byte region [0x126071520580,0x126071520997)
allocated by thread T0 here:
    #0 0x7ffb7dec100d in malloc (D:\Tools\msys64\clang64\bin\libclang_rt.asan_dynamic-x86_64.dll+0x18004100d)
    #1 0x7ff7eab2b08a in aom_memalign D:/Cpp/libavif/ext/aom/aom_mem/aom_mem.c:59:22
    #2 0x7ff7eab2b08a in aom_malloc D:/Cpp/libavif/ext/aom/aom_mem/aom_mem.c:67:40
    #3 0x7ff7eab2b08a in aom_calloc D:/Cpp/libavif/ext/aom/aom_mem/aom_mem.c:72:19
    #4 0x7ff7eabb8152 in av1_create_compressor D:/Cpp/libavif/ext/aom/av1/encoder/encoder.c:1387:3
    #5 0x7ff7eaac38be in av1_create_context_and_bufferpool D:/Cpp/libavif/ext/aom/av1/av1_cx_iface.c:2487:12
    #6 0x7ff7eaac38be in encoder_init D:/Cpp/libavif/ext/aom/av1/av1_cx_iface.c:2600:13
    #7 0x7ff7eaab8752 in aom_codec_enc_init_ver D:/Cpp/libavif/ext/aom/aom/src/aom_encoder.c:80:11
    #8 0x7ff7eaa8bcd0 in aomCodecEncodeImage D:/Cpp/libavif/src/codec_aom.c:726:13
    #9 0x7ff7eaa7d2f4 in avifEncoderAddImageInternal D:/Cpp/libavif/src/write.c:915:17
    #10 0x7ff7eaa7b690 in avifEncoderAddImage D:/Cpp/libavif/src/write.c:934:12
    #11 0x7ff7eaa54170 in libavif::(anonymous namespace)::ChangeSettingTest_DISABLED_ChangeDimension_Test::TestBody() D:/Cpp/libavif/tests/gtest/avifchangesettingtest.cc:183:5
    #12 0x7ffbebd58308 in void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) (D:\Tools\msys64\clang64\bin\libgtest.dll+0x180038308)
    #13 0x7ffbebd3ce8a in testing::Test::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001ce8a)
    #14 0x7ffbebd3e16e in testing::TestInfo::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001e16e)
    #15 0x7ffbebd3ec57 in testing::TestSuite::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18001ec57)
    #16 0x7ffbebd4f2ab in testing::internal::UnitTestImpl::RunAllTests() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18002f2ab)
    #17 0x7ffbebd59168 in bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) (D:\Tools\msys64\clang64\bin\libgtest.dll+0x180039168)
    #18 0x7ffbebd4eadf in testing::UnitTest::Run() (D:\Tools\msys64\clang64\bin\libgtest.dll+0x18002eadf)
    #19 0x7ffbefab1434 in main (D:\Tools\msys64\clang64\bin\libgtest_main.dll+0x180001434)
    #20 0x7ff7eaa513ad in __tmainCRTStartup C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:329:15
    #21 0x7ff7eaa514e5 in mainCRTStartup C:/M/mingw-w64-crt-git/src/mingw-w64/mingw-w64-crt/crt/crtexe.c:206:9
    #22 0x7ffc008f54df  (C:\WINDOWS\System32\KERNEL32.DLL+0x1800154df)
    #23 0x7ffc0224485a  (C:\WINDOWS\SYSTEM32\ntdll.dll+0x18000485a)

SUMMARY: AddressSanitizer: heap-buffer-overflow (D:\Tools\msys64\clang64\bin\libclang_rt.asan_dynamic-x86_64.dll+0x180040b8b) in __asan_memset
Shadow bytes around the buggy address:
  0x047a7f7a40e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x047a7f7a40f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x047a7f7a4100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x047a7f7a4110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x047a7f7a4120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x047a7f7a4130: 00 00[07]fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x047a7f7a4140: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x047a7f7a4150: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x047a7f7a4160: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x047a7f7a4170: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x047a7f7a4180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==62176==ABORTING

Copy link
Collaborator

@wantehchang wantehchang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yuan: I reviewed everything except the avifRWStreamWrite parts of src/write.c and the decoder part of the new test in avifchangesettingtest.cc. I believe your changes to src/codec_aom.c. So this looks like a libaom bug that certain arrays are allocated to the size of the initial values of cfg.g_w and cfg.g_h.

src/write.c Outdated
lastEncoder->minQuantizer = encoder->minQuantizer;
lastEncoder->maxQuantizer = encoder->maxQuantizer;
lastEncoder->minQuantizerAlpha = encoder->minQuantizerAlpha;
lastEncoder->maxQuantizerAlpha = encoder->maxQuantizerAlpha;
lastEncoder->tileRowsLog2 = encoder->tileRowsLog2;
lastEncoder->tileColsLog2 = encoder->tileColsLog2;
lastEncoder->speed = encoder->speed;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this. This line has been moved to line 334.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1046,6 +1046,9 @@ struct avifCodecSpecificOptions;
// image in less bytes. AVIF_SPEED_DEFAULT means "Leave the AV1 codec to its default speed settings"./
// If avifEncoder uses rav1e, the speed value is directly passed through (0-10). If libaom is used,
// a combination of settings are tweaked to simulate this speed range.
// * Width and height: width and height of encoded image. Default value 0 means infer from first frame.
// For grid image, this is the size of one cell. Value must not be smaller than the largest frame
// to be encoded.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to think about what this value means for grid images.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the easiest choice: encoder->width and encoder-height overrides image->width and image->height, if present.

Alternatively this can be defined as size of the whole grid, and we verify and compute size of each cell internally.

src/write.c Outdated
imageWidth = imageMetadata->width * item->gridCols;
imageHeight = imageMetadata->height * item->gridRows;
imageWidth = imageWidth * item->gridCols;
imageHeight = imageHeight * item->gridRows;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: Should we use the *= notation?

            imageWidth *=  item->gridCols;
            imageHeight *= item->gridRows;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks better. Done.

@wantehchang
Copy link
Collaborator

Yuan: Please test this libaom patch. It is likely to be incomplete (e.g., the VMAF and BUTTERAUGLI code may also need the same changes).

diff --git a/av1/encoder/encoder.c b/av1/encoder/encoder.c
index 65d3a10..8e866a4 100644
--- a/av1/encoder/encoder.c
+++ b/av1/encoder/encoder.c
@@ -1355,8 +1355,21 @@ AV1_COMP *av1_create_compressor(AV1_PRIMARY *ppi, const AV1EncoderConfig *oxcf,
   av1_set_speed_features_framesize_independent(cpi, oxcf->speed);
   av1_set_speed_features_framesize_dependent(cpi, oxcf->speed);
 
+  int max_mi_cols = mi_params->mi_cols;
+  int max_mi_rows = mi_params->mi_rows;
+  if (oxcf->frm_dim_cfg.forced_max_frame_width) {
+    const int aligned_width =
+        ALIGN_POWER_OF_TWO(oxcf->frm_dim_cfg.forced_max_frame_width, 3);
+    max_mi_cols = aligned_width >> MI_SIZE_LOG2;
+  }
+  if (oxcf->frm_dim_cfg.forced_max_frame_height) {
+    const int aligned_height =
+        ALIGN_POWER_OF_TWO(oxcf->frm_dim_cfg.forced_max_frame_height, 3);
+    max_mi_rows = aligned_height >> MI_SIZE_LOG2;
+  }
+
   CHECK_MEM_ERROR(cm, cpi->consec_zero_mv,
-                  aom_calloc((mi_params->mi_rows * mi_params->mi_cols) >> 2,
+                  aom_calloc((max_mi_rows * max_mi_cols) >> 2,
                              sizeof(*cpi->consec_zero_mv)));
 
   cpi->mb_weber_stats = NULL;
@@ -1366,8 +1379,8 @@ AV1_COMP *av1_create_compressor(AV1_PRIMARY *ppi, const AV1EncoderConfig *oxcf,
     const int bsize = BLOCK_16X16;
     const int w = mi_size_wide[bsize];
     const int h = mi_size_high[bsize];
-    const int num_cols = (mi_params->mi_cols + w - 1) / w;
-    const int num_rows = (mi_params->mi_rows + h - 1) / h;
+    const int num_cols = (max_mi_cols + w - 1) / w;
+    const int num_rows = (max_mi_rows + h - 1) / h;
     CHECK_MEM_ERROR(cm, cpi->ssim_rdmult_scaling_factors,
                     aom_calloc(num_rows * num_cols,
                                sizeof(*cpi->ssim_rdmult_scaling_factors)));

@wantehchang
Copy link
Collaborator

I discussed this issue with my colleague Marco Paniconi, who works on real-time encoding and SVC in libaom. Marco told me that they don't use g_forced_max_frame_width and g_forced_max_frame_height. Instead, they set g_w and g_h to the fixed size of the frames (the frames come from the camera and therefore have the same size) and then let libaom downscale the frames internally for certain spatial layers. Please see examples/svc_encoder_rtc.c in libaom. We should switch to this approach.

Although AVIF allows the images in an image sequence to have different sizes, the libavif encoder does not need to support that. (The libavif decoder supports that.)

Copy link
Contributor Author

@tongyuantongyu tongyuantongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please test this libaom patch

Thaks for the patch. I confirm that after patch libaom encodes correctly.

let libaom downscale the frames internally for certain spatial layers

Actually I'm planning to support both. Use libaom internal scaler is more convenient for basic usage with only one input image, but if user wants more flexibility (e.g. using small blurred "thumbnail" as first layer) input image of different size can be both more efficient and convenient.

Earlier this year I reported two bugs in libaom's internal scaler. I see aomedia:3210 is fixed by a recent commit https://aomedia-review.googlesource.com/c/aom/+/161961. For aomedia:3203 I have an old CL, but I can't figure out how to update it because it's targeting master branch, so I created a new one instead.

@@ -1046,6 +1046,9 @@ struct avifCodecSpecificOptions;
// image in less bytes. AVIF_SPEED_DEFAULT means "Leave the AV1 codec to its default speed settings"./
// If avifEncoder uses rav1e, the speed value is directly passed through (0-10). If libaom is used,
// a combination of settings are tweaked to simulate this speed range.
// * Width and height: width and height of encoded image. Default value 0 means infer from first frame.
// For grid image, this is the size of one cell. Value must not be smaller than the largest frame
// to be encoded.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the easiest choice: encoder->width and encoder-height overrides image->width and image->height, if present.

Alternatively this can be defined as size of the whole grid, and we verify and compute size of each cell internally.

# Conflicts:
#	include/avif/avif.h
#	src/write.c
@tongyuantongyu
Copy link
Contributor Author

With the latest tip of libaom all added test cases passes.

to the fixed size of the frames (the frames come from the camera and therefore have the same size) then let libaom downscale the frames internally for certain spatial layers. Please see examples/svc_encoder_rtc.c in libaom. We should switch to this approach.

AVIF use case is different here: we want to use different images for each layer (like the low resolution blurry preview example I mentioned) - instead of sending the same frame many times to encode scalable video. So the SVC approach puts user into an awkward situation: they need to upscale the first layer to match frame size, and then libaom downscales it back. This is inefficient, and upscale-downscale roundtrip can be lossy.

@tongyuantongyu tongyuantongyu marked this pull request as ready for review January 15, 2023 13:25
jbeich pushed a commit to jbeich/aom that referenced this pull request Mar 16, 2023
Based on the libavif avifchangedimensiontest
AOMDecreasing/ChangeDimensionTest.EncodeDecode/0 in
AOMediaCodec/libavif#1069
by Yuan Tong.

Test:
test_libaom --gtest_filter=*DimensionDecreasing \
    --gtest_also_run_disabled_tests

Bug: aomedia:3348
Change-Id: I79aea87012cc3bea43d565911ac88816c860e737
@tongyuantongyu tongyuantongyu marked this pull request as draft September 13, 2023 14:25
cyh5272 pushed a commit to cyh5272/aom that referenced this pull request May 6, 2024
Based on the libavif avifchangedimensiontest
AOMDecreasing/ChangeDimensionTest.EncodeDecode/0 in
AOMediaCodec/libavif#1069
by Yuan Tong.

Test:
test_libaom --gtest_filter=*DimensionDecreasing \
    --gtest_also_run_disabled_tests

Bug: aomedia:3348
Change-Id: I8be3c092234f99402c02aa03340c916f629ce0f2
cyh5272 pushed a commit to cyh5272/aom that referenced this pull request May 6, 2024
Based on the libavif avifchangedimensiontest
AOMDecreasing/ChangeDimensionTest.EncodeDecode/0 in
AOMediaCodec/libavif#1069
by Yuan Tong.

Test:
test_libaom --gtest_filter=*DimensionDecreasing \
    --gtest_also_run_disabled_tests

Bug: aomedia:3348
Change-Id: I79aea87012cc3bea43d565911ac88816c860e737
cyh5272 pushed a commit to cyh5272/aom that referenced this pull request May 6, 2024
Based on the libavif avifchangedimensiontest
AOMDecreasing/ChangeDimensionTest.EncodeDecode/0 in
AOMediaCodec/libavif#1069
by Yuan Tong.

Test:
test_libaom --gtest_filter=*DimensionDecreasing \
    --gtest_also_run_disabled_tests

Bug: aomedia:3348
Change-Id: I79aea87012cc3bea43d565911ac88816c860e737
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants