improvement: Add support for keys and numbered ilst items BoxTypes. #159

Merged: 9 commits, Jan 16, 2024


dtrejod
Contributor

@dtrejod dtrejod commented Dec 29, 2023

@sunfish-shogi

Before this PR

There was no support for the keys box type (reference: #13). Additionally there was no support for numbered items under the ilst box type.

Before PR changes

As a demonstration, I've updated the testdata/sample_qt.mp4 file to include these new BoxTypes (atoms). The mp4tool dump of the updated testdata/sample_qt.mp4 file, taken before this change, is shown below. Lines [54-56] of the output show the keys box and the numbered item under ilst reported as "unsupported".

$ mp4tool dump testdata/sample_qt.mp4 | cat -n
     1	[ftyp] Size=20 MajorBrand="qt  " MinorVersion=512 CompatibleBrands=[{CompatibleBrand="qt  "}]
     2	[free] Size=42 Data=[...] (use "-full free" to show all)
     3	[ftyp] Size=20 MajorBrand="qt  " MinorVersion=512 CompatibleBrands=[{CompatibleBrand="qt  "}]
     4	[free] Size=42 Data=[...] (use "-full free" to show all)
     5	[moov] Size=340357
     6	  [mvhd] Size=108 ... (use "-full mvhd" to show all)
     7	  [trak] Size=115889
     8	    [tkhd] Size=92 ... (use "-full tkhd" to show all)
     9	    [mdia] Size=115789
    10	      [mdhd] Size=32 Version=0 Flags=0x000000 CreationTimeV0=2082844800 ModificationTimeV0=2082844800 Timescale=24 DurationV0=14315 Language="und" PreDefined=0
    11	      [hdlr] Size=45 Version=0 Flags=0x000000 PreDefined=1835560050 HandlerType="vide" Name="VideoHandler"
    12	      [minf] Size=115704
    13	        [vmhd] Size=20 Version=0 Flags=0x000001 Graphicsmode=0 Opcolor=[0, 0, 0]
    14	        [dinf] Size=36
    15	          [dref] Size=28 Version=0 Flags=0x000000 EntryCount=1
    16	            [url ] Size=12 Version=0 Flags=0x000001
    17	        [stbl] Size=115596
    18	          [stsd] Size=148 Version=0 Flags=0x000000 EntryCount=1
    19	            [avc1] Size=132 DataReferenceIndex=1 PreDefined=0 PreDefined2=[1179012432, 512, 512] Width=424 Height=240 Horizresolution=4718592 Vertresolution=4718592 FrameCount=1 Compressorname="libx264" Depth=24 PreDefined3=-1
    20	              [avcC] Size=46 ... (use "-full avcC" to show all)
    21	          [stts] Size=24 Version=0 Flags=0x000000 EntryCount=1 Entries=[{SampleCount=14315 SampleDelta=1}]
    22	          [stss] Size=832 ... (use "-full stss" to show all)
    23	          [stsc] Size=28 Version=0 Flags=0x000000 EntryCount=1 Entries=[{FirstChunk=1 SamplesPerChunk=1 SampleDescriptionIndex=1}]
    24	          [stsz] Size=57280 ... (use "-full stsz" to show all)
    25	          [stco] Size=57276 ... (use "-full stco" to show all)
    26	        [hdlr] Size=44 Version=0 Flags=0x000000 PreDefined=1684565106 HandlerType="url " Name="DataHandler"
    27	  [trak] Size=224196
    28	    [tkhd] Size=92 ... (use "-full tkhd" to show all)
    29	    [mdia] Size=224096
    30	      [mdhd] Size=32 Version=0 Flags=0x000000 CreationTimeV0=2082844800 ModificationTimeV0=2082844800 Timescale=48000 DurationV0=28628992 Language="und" PreDefined=0
    31	      [hdlr] Size=45 Version=0 Flags=0x000000 PreDefined=1835560050 HandlerType="soun" Name="SoundHandler"
    32	      [minf] Size=224011
    33	        [smhd] Size=16 Version=0 Flags=0x000000 Balance=0
    34	        [dinf] Size=36
    35	          [dref] Size=28 Version=0 Flags=0x000000 EntryCount=1
    36	            [url ] Size=12 Version=0 Flags=0x000001
    37	        [stbl] Size=223907
    38	          [stsd] Size=147 Version=0 Flags=0x000000 EntryCount=1
    39	            [mp4a] Size=131 DataReferenceIndex=1 EntryVersion=1 ChannelCount=2 SampleSize=16 PreDefined=65534 SampleRate=48000 QuickTimeData=[0x0, 0x0, 0x4, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2]
    40	              [wave] Size=79
    41	                [frma] Size=12 DataFormat="mp4a"
    42	                [mp4a] Size=12 QuickTimeData=[0x0, 0x0, 0x0, 0x0]
    43	                [esds] Size=39 ... (use "-full esds" to show all)
    44	                [0x00000000] (unsupported box type) Size=8 Data=[...] (use "-full 0x00000000" to show all)
    45	          [stts] Size=24 Version=0 Flags=0x000000 EntryCount=1 Entries=[{SampleCount=27958 SampleDelta=1024}]
    46	          [stsc] Size=28 Version=0 Flags=0x000000 EntryCount=1 Entries=[{FirstChunk=1 SamplesPerChunk=1 SampleDescriptionIndex=1}]
    47	          [stsz] Size=111852 ... (use "-full stsz" to show all)
    48	          [stco] Size=111848 ... (use "-full stco" to show all)
    49	        [hdlr] Size=44 Version=0 Flags=0x000000 PreDefined=1684565106 HandlerType="url " Name="DataHandler"
    50	  [udta] Size=156
    51	    [(c)enc] (unsupported box type) Size=23 Data=[...] (use "-full (c)enc" to show all)
    52	    [meta] Size=125 Version=0 Flags=0x000000
    53	      [hdlr] Size=33 Version=0 Flags=0x000000 PreDefined=0 HandlerType="mdta" Name=""
    54	      [keys] (unsupported box type) Size=43 Data=[...] (use "-full keys" to show all)
    55	      [ilst] Size=37
    56	        [0x00000001] (unsupported box type) Size=29 Data=[...] (use "-full 0x00000001" to show all)

After this PR

Adds support for both keys and numbered items under the ilst box type.

After PR changes

In the output below, lines [54-56] show that the keys box and the numbered item under ilst are now parsed properly.

$ ./mp4tool dump testdata/sample_qt.mp4 | cat -n
     1	[ftyp] Size=20 MajorBrand="qt  " MinorVersion=512 CompatibleBrands=[{CompatibleBrand="qt  "}]
     2	[free] Size=42 Data=[...] (use "-full free" to show all)
     3	[ftyp] Size=20 MajorBrand="qt  " MinorVersion=512 CompatibleBrands=[{CompatibleBrand="qt  "}]
     4	[free] Size=42 Data=[...] (use "-full free" to show all)
     5	[moov] Size=340357
     6	  [mvhd] Size=108 ... (use "-full mvhd" to show all)
     7	  [trak] Size=115889
     8	    [tkhd] Size=92 ... (use "-full tkhd" to show all)
     9	    [mdia] Size=115789
    10	      [mdhd] Size=32 Version=0 Flags=0x000000 CreationTimeV0=2082844800 ModificationTimeV0=2082844800 Timescale=24 DurationV0=14315 Language="und" PreDefined=0
    11	      [hdlr] Size=45 Version=0 Flags=0x000000 PreDefined=1835560050 HandlerType="vide" Name="VideoHandler"
    12	      [minf] Size=115704
    13	        [vmhd] Size=20 Version=0 Flags=0x000001 Graphicsmode=0 Opcolor=[0, 0, 0]
    14	        [dinf] Size=36
    15	          [dref] Size=28 Version=0 Flags=0x000000 EntryCount=1
    16	            [url ] Size=12 Version=0 Flags=0x000001
    17	        [stbl] Size=115596
    18	          [stsd] Size=148 Version=0 Flags=0x000000 EntryCount=1
    19	            [avc1] Size=132 DataReferenceIndex=1 PreDefined=0 PreDefined2=[1179012432, 512, 512] Width=424 Height=240 Horizresolution=4718592 Vertresolution=4718592 FrameCount=1 Compressorname="libx264" Depth=24 PreDefined3=-1
    20	              [avcC] Size=46 ... (use "-full avcC" to show all)
    21	          [stts] Size=24 Version=0 Flags=0x000000 EntryCount=1 Entries=[{SampleCount=14315 SampleDelta=1}]
    22	          [stss] Size=832 ... (use "-full stss" to show all)
    23	          [stsc] Size=28 Version=0 Flags=0x000000 EntryCount=1 Entries=[{FirstChunk=1 SamplesPerChunk=1 SampleDescriptionIndex=1}]
    24	          [stsz] Size=57280 ... (use "-full stsz" to show all)
    25	          [stco] Size=57276 ... (use "-full stco" to show all)
    26	        [hdlr] Size=44 Version=0 Flags=0x000000 PreDefined=1684565106 HandlerType="url " Name="DataHandler"
    27	  [trak] Size=224196
    28	    [tkhd] Size=92 ... (use "-full tkhd" to show all)
    29	    [mdia] Size=224096
    30	      [mdhd] Size=32 Version=0 Flags=0x000000 CreationTimeV0=2082844800 ModificationTimeV0=2082844800 Timescale=48000 DurationV0=28628992 Language="und" PreDefined=0
    31	      [hdlr] Size=45 Version=0 Flags=0x000000 PreDefined=1835560050 HandlerType="soun" Name="SoundHandler"
    32	      [minf] Size=224011
    33	        [smhd] Size=16 Version=0 Flags=0x000000 Balance=0
    34	        [dinf] Size=36
    35	          [dref] Size=28 Version=0 Flags=0x000000 EntryCount=1
    36	            [url ] Size=12 Version=0 Flags=0x000001
    37	        [stbl] Size=223907
    38	          [stsd] Size=147 Version=0 Flags=0x000000 EntryCount=1
    39	            [mp4a] Size=131 DataReferenceIndex=1 EntryVersion=1 ChannelCount=2 SampleSize=16 PreDefined=65534 SampleRate=48000 QuickTimeData=[0x0, 0x0, 0x4, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x2]
    40	              [wave] Size=79
    41	                [frma] Size=12 DataFormat="mp4a"
    42	                [mp4a] Size=12 QuickTimeData=[0x0, 0x0, 0x0, 0x0]
    43	                [esds] Size=39 ... (use "-full esds" to show all)
    44	                [0x00000000] (unsupported box type) Size=8 Data=[...] (use "-full 0x00000000" to show all)
    45	          [stts] Size=24 Version=0 Flags=0x000000 EntryCount=1 Entries=[{SampleCount=27958 SampleDelta=1024}]
    46	          [stsc] Size=28 Version=0 Flags=0x000000 EntryCount=1 Entries=[{FirstChunk=1 SamplesPerChunk=1 SampleDescriptionIndex=1}]
    47	          [stsz] Size=111852 ... (use "-full stsz" to show all)
    48	          [stco] Size=111848 ... (use "-full stco" to show all)
    49	        [hdlr] Size=44 Version=0 Flags=0x000000 PreDefined=1684565106 HandlerType="url " Name="DataHandler"
    50	  [udta] Size=156
    51	    [(c)enc] (unsupported box type) Size=23 Data=[...] (use "-full (c)enc" to show all)
    52	    [meta] Size=125 Version=0 Flags=0x000000
    53	      [hdlr] Size=33 Version=0 Flags=0x000000 PreDefined=0 HandlerType="mdta" Name=""
    54	      [keys] Size=43 Version=0 Flags=0x000000 EntryCount=1 Entries=[{KeySize=27 KeyNamespace="mdta" KeyValue="com.android.version"}]
    55	      [ilst] Size=37
    56	        [0x00000001] Size=29 Version=0 Flags=0x000000 ItemName="data" Data={DataType=UTF8 DataLang=0 Data="1.0.0"}
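
The keys line in the dump can be checked by hand: 8 bytes of box header, 4 bytes of version and flags, 4 bytes of entry count, and one 27-byte entry add up to Size=43. Below is a minimal Go sketch of decoding that payload, assuming the layout implied by the fields printed above. The names keyEntry and parseKeysPayload are hypothetical and are not part of go-mp4.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// keyEntry mirrors the fields printed in the dump; the type name is hypothetical.
type keyEntry struct {
	KeySize      uint32
	KeyNamespace string
	KeyValue     string
}

// parseKeysPayload decodes the bytes that follow the 8-byte header of a keys box.
func parseKeysPayload(p []byte) ([]keyEntry, error) {
	if len(p) < 8 {
		return nil, errors.New("keys payload too short")
	}
	// p[0] is the version and p[1:4] the flags; both are zero in the sample.
	count := binary.BigEndian.Uint32(p[4:8])
	p = p[8:]

	var entries []keyEntry
	for i := uint32(0); i < count; i++ {
		if len(p) < 8 {
			return nil, errors.New("truncated key entry")
		}
		// KeySize covers the size field, the namespace, and the value.
		size := binary.BigEndian.Uint32(p[0:4])
		if size < 8 || uint64(size) > uint64(len(p)) {
			return nil, errors.New("invalid key size")
		}
		entries = append(entries, keyEntry{
			KeySize:      size,
			KeyNamespace: string(p[4:8]),
			KeyValue:     string(p[8:size]),
		})
		p = p[size:]
	}
	return entries, nil
}

func main() {
	// Payload rebuilt from the values shown in the dump above.
	payload := []byte{0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 27}
	payload = append(payload, "mdta"...)
	payload = append(payload, "com.android.version"...)

	entries, err := parseKeysPayload(payload)
	fmt.Println(entries, err)
}
```

The entry count returned here is the upper bound for the numbered item types that may appear under ilst.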

@dtrejod dtrejod marked this pull request as ready for review December 29, 2023 21:01
@sunfish-shogi
Contributor

sunfish-shogi commented Jan 1, 2024

@dtrejod
Thank you for the pull request.

I have a suggestion.

The value of Item.Type will never be larger than Keys.EntryCount, so we can fully validate the value with a small amount of additional logic instead of registering 1024 box type entries.

The AddBoxDef(Ex) and AddAnyTypeBoxDef(Ex) functions build boxMap, and the getBoxDef function in mp4.go looks up an entry in boxMap and returns it.
However, this map-based resolver is not suited to this use case.

We can resolve Apple metadata boxes with the following code instead:

func (boxType BoxType) getBoxDef(ctx Context) *boxDef {
  // Existing behavior: look up registered definitions, latest first.
  boxDefs := boxMap[boxType]
  for i := len(boxDefs) - 1; i >= 0; i-- {
    boxDef := &boxDefs[i]
    if boxDef.isTarget == nil || boxDef.isTarget(ctx) {
      return boxDef
    }
  }

  // Fallback: a numbered item under ilst is valid when its type is a
  // 1-based index into the keys table.
  if ctx.UnderIlst {
    typeID := /* TODO: convert boxType to uint32 */
    if typeID >= 1 && typeID <= ctx.QuickTimeKeysMetaEntryCount {
      return &boxDef{
        /* TODO */
      }
    }
  }

  return nil
}

For this approach, we need to add a QuickTimeKeysMetaEntryCount field to Context and pass the entry count through that field to the sibling atoms of the keys atom.
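
The range check in the snippet above can be sketched on its own as follows. This assumes BoxType is a four-byte array read as a big-endian integer, and it uses a cut-down Context that holds only the two fields involved. The helper name isNumberedIlstItem is hypothetical, and QuickTimeKeysMetaEntryCount is the proposed field, not an existing one.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// BoxType is assumed to be a four-byte box identifier.
type BoxType [4]byte

// Context holds only the fields this sketch needs.
// QuickTimeKeysMetaEntryCount is the field proposed in this comment.
type Context struct {
	UnderIlst                   bool
	QuickTimeKeysMetaEntryCount uint32
}

// isNumberedIlstItem reports whether boxType, read as a big-endian
// integer, is a 1-based index into the keys table.
func isNumberedIlstItem(boxType BoxType, ctx Context) bool {
	if !ctx.UnderIlst {
		return false
	}
	typeID := binary.BigEndian.Uint32(boxType[:])
	return typeID >= 1 && typeID <= ctx.QuickTimeKeysMetaEntryCount
}

func main() {
	ctx := Context{UnderIlst: true, QuickTimeKeysMetaEntryCount: 1}
	fmt.Println(isNumberedIlstItem(BoxType{0, 0, 0, 1}, ctx))             // index 1 of 1
	fmt.Println(isNumberedIlstItem(BoxType{0, 0, 0, 2}, ctx))             // beyond the keys table
	fmt.Println(isNumberedIlstItem(BoxType{'k', 'e', 'y', 's'}, ctx))     // ordinary four-character type
}
```

Ordinary four-character types decode to very large integers, so they fall outside the range as long as the entry count is realistic.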

reference:

Android libstagefright
https://android.googlesource.com/platform/frameworks/av/+/e7142a0703bc93f75e213e96ebc19000022afed9/media/libstagefright/MPEG4Extractor.cpp#2329

status_t MPEG4Extractor::parseQTMetaVal(
  int32_t keyId, off64_t offset, size_t size) {
  ssize_t index = mMetaKeyMap.indexOfKey(keyId);
  if (index < 0) {
    // corresponding key is not present, ignore
    return ERROR_MALFORMED;
  }

@dtrejod
Contributor Author

dtrejod commented Jan 7, 2024

@sunfish-shogi Thank you for the feedback. Agree that approach is much more sensible. I pushed a commit with your suggestion.

UPDATE: After some further testing, I found that the approach here does not work. The latest commit demonstrates that the numbered items under ilst are not handled properly, because the keys box is not a parent of the ilst box. Since there isn't a nested relationship, the Context isn't preserved across box types.

I'll revisit this PR when I have time and a better understanding of this repository.

@dtrejod dtrejod marked this pull request as draft January 8, 2024 00:15
@sunfish-shogi
Contributor

@dtrejod

The latest commit demonstrates the number items under ilst are not properly handled because the keys box is not a parent of the ilst box. Since there isn't a nested relationship, the Context isn't preserved across box types.

For example, read.go detects ftyp box and sets IsQuickTimeCompatible flag.

https://github.com/abema/go-mp4/blob/v1.1.1/read.go#L53-L65

It then propagates the flag to the following boxes at the same level.

https://github.com/abema/go-mp4/blob/v1.1.1/read.go#L172-L174

It is important to implement this in read.go rather than in a specific handler (such as the (k *Keys) OnReadField handler).
Users can skip reading the fields of the keys box by using the Seek function, so the handlers are not necessarily called.
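
The sibling propagation described above can be illustrated with a small stand-alone walker. This is only a sketch under simplifying assumptions: the box and context types and the walk function are hypothetical, the tree is already in memory, and EntryCount is given up front. In read.go the count would have to come from the stream itself, even when the caller skips the keys box.

```go
package main

import "fmt"

// box is a hypothetical in-memory node; EntryCount only matters for keys.
type box struct {
	Type       string
	EntryCount uint32
	Children   []box
}

// context is a hypothetical, cut-down stand-in for the reader's Context.
type context struct {
	UnderIlst                   bool
	QuickTimeKeysMetaEntryCount uint32
}

// walk visits boxes in order. ctx is a local copy per level, so a value
// recorded at one sibling is seen by later siblings and their children
// but does not leak back to the parent level.
func walk(boxes []box, ctx context, visit func(b box, ctx context)) {
	for _, b := range boxes {
		visit(b, ctx)
		if b.Type == "keys" {
			// Recorded by the walker, not by a field handler.
			ctx.QuickTimeKeysMetaEntryCount = b.EntryCount
		}
		child := ctx
		child.UnderIlst = b.Type == "ilst"
		walk(b.Children, child, visit)
	}
}

func main() {
	meta := []box{
		{Type: "hdlr"},
		{Type: "keys", EntryCount: 1},
		{Type: "ilst", Children: []box{{Type: "item"}}},
	}
	walk(meta, context{}, func(b box, ctx context) {
		fmt.Printf("%s underIlst=%t keys=%d\n",
			b.Type, ctx.UnderIlst, ctx.QuickTimeKeysMetaEntryCount)
	})
}
```

Because the walker updates the context after visiting keys, the item under ilst is visited with the count already set, even though keys is a sibling of ilst and not its parent.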

@dtrejod dtrejod marked this pull request as ready for review January 15, 2024 18:05
@dtrejod
Contributor Author

dtrejod commented Jan 15, 2024

Thanks for the pointers; they got me headed in the right direction.

@sunfish-shogi sunfish-shogi merged commit c058e0e into abema:master Jan 16, 2024
6 checks passed