MP4

This entry covers the following topics.

MP4 Basic file structure
Compatibility (with QuickTime)
Sample table / Chunks
Must box (mandatory MP4 boxes)

MP4 Basic file structure

Box (MP4Box)
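- An MP4 file is a tree-structured container
  - Each node of the tree is called a Box (or an ATOM)
- A Box is a structure made of its total length (BoxSize), a four-character string (BoxType), and a payload (BoxData)
  - BoxSize is the length of the whole box, including the BoxSize and BoxType fields themselves
    - If BoxSize == 0, the box extends to the end of the MP4 file (the end of the binary stream)
    - If BoxSize == 1, the real size is stored as an 8-byte value (largesize) immediately after BoxType, at the head of what would otherwise be BoxData
  - BoxType holds an ASCII string such as ftyp, moov, mvhd, or trak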
```
+------------+------------+----------------------+
| BoxSize(4) | BoxType(4) | BoxData(BoxSize - 8) |
+------------+------------+----------------------+
```
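- Boxes can be nested: a Box can hold other Boxes
  - Some Boxes can be nested and some cannot. A Box that can hold other Boxes is called a container (container Box)
  - Most containers exist only to hold child Boxes and rarely carry information of their own
    - stsd and avc1 are exceptions: they are containers that also carry fields of their own (avc1 holds avcC, the AVCDecoderConfigurationRecord)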
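
The BoxSize rules above can be sketched as a small header reader. This is a minimal sketch, not a complete parser: `readBoxHeader`, `listBoxes`, and the `sample` bytes are hypothetical names and data made up for illustration, and only top-level boxes are walked.

```javascript
// Hypothetical helper: read one box header from a Uint8Array at `offset`.
function readBoxHeader(bytes, offset) {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    let size = view.getUint32(offset);            // BoxSize, big-endian
    const type = String.fromCharCode(
        bytes[offset + 4], bytes[offset + 5], bytes[offset + 6], bytes[offset + 7]);
    let headerSize = 8;

    if (size === 1) {
        // largesize: the real 64-bit size follows BoxType
        size = Number(view.getBigUint64(offset + 8));
        headerSize = 16;
    } else if (size === 0) {
        // the box runs to the end of the file
        size = bytes.byteLength - offset;
    }
    return { type, size, headerSize };
}

// List the top-level boxes of a buffer.
function listBoxes(bytes) {
    const boxes = [];
    let offset = 0;
    while (offset + 8 <= bytes.byteLength) {
        const box = readBoxHeader(bytes, offset);
        if (box.size < box.headerSize) { break; } // malformed size; stop
        boxes.push({ type: box.type, offset, size: box.size });
        offset += box.size;
    }
    return boxes;
}

// Made-up bytes: a 12-byte ftyp followed by an mdat whose BoxSize is 0.
const sample = new Uint8Array([
    0, 0, 0, 12, 0x66, 0x74, 0x79, 0x70, 0x69, 0x73, 0x6f, 0x6d,
    0, 0, 0, 0,  0x6d, 0x64, 0x61, 0x74, 1, 2, 3,
]);
console.log(listBoxes(sample));
```

Container boxes could be walked by calling the same loop on the payload range of a box, starting at `offset + headerSize`.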

The following is an example of an MP4 that consists of only a video track and the minimum set of Boxes.

mp4: {
    ftyp: { major_brand, minor_version, compatible_brands },
    moov: {
        mvhd: { creation_time, modification_time, timescale,
                duration, rate, volume, matrix, next_track_ID },
        trak: {
            tkhd: { creation_time, modification_time, track_ID,
                    duration, layer, alternate_group, volume,
                    matrix, width, height },
            edts: {
                elst: { entry_count, segment_duration, media_time,
                        media_rate_integer, media_rate_fraction },
            },
            mdia: {
                mdhd: { creation_time, modification_time, timescale,
                        duration, language },
                hdlr: { handler_type, handler_type2, name },
                minf: {
                    vmhd: { graphicsmode, opcolor },
                    dinf: {
                        dref: { entry_count, url },
                    },
                    stbl: {
                        stsd: {
                            avc1: {
                                avcC: { ... },
                            },
                        },
                        stts: { ... },
                        stss: { ... },
                        stsc: { ... },
                        stsz: { ... },
                        stco: { ... },
                    },
                },
            },
        },
        udta: {
            meta: { ... },
            hdlr: { ... },
            ilst: { ... },
        },
    },
    free: { data },
    mdat: { size, data... },
}

Table 1 — Box types, structure, and cross-reference

via ISO/IEC 14496-12

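Box names are always exactly four characters. A box whose name is prefixed with ! in the table cannot be omitted (the ! is not part of the actual box name). These are the mandatory boxes
- moof/mfhd, mfra/mfro, and meta/hdlr are marked mandatory, but they are only required when their parent container (moof, mfra, meta) is present
Because some Boxes appear more than once in a single mp4, a Box must not be interpreted from its type alone; the parent Box (container) has to be taken into account (the meaning is context dependent)
- hdlr, dinf, dref, and others may appear several times in one mp4
  - They must be interpreted with regard to which Box (container) is their parent
- trak also appears once per track. To find out whether a track is a video track or an audio track, refer to ... (TODO: add an explanation here)
udta can be placed either as moov/udta or as moov/trak/udta
avc1 and avcC are required when storing AVC (H.264)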

Lv1	Lv2	Lv3	Lv4	Lv5	Lv6	Lv7	Lv8	Spec
!ftyp								4.3 file type and compatibility
pdin								8.1.3 progressive download information
!moov								8.2.1 container for all the metadata
	!mvhd							8.2.2 movie header, overall declarations
	!trak							8.3.1 container for an individual track or stream
		!tkhd						8.3.2 track header, overall information about the track
		tref						8.3.3 track reference container
		trgr						8.3.4 track grouping indication
		!edts						8.6.4 ❗ edit list container
			!elst					8.6.6 ❗ an edit list
		!mdia						8.4 container for the media information in a track
			!mdhd					8.4.2 media header, overall information about the media
			!hdlr					8.4.3 handler, declares the media (handler) type
			!minf					8.4.4 media information container
				vmhd				8.4.5.2 video media header, overall information (video track only)
				smhd				8.4.5.3 sound media header, overall information (sound track only)
				hmhd				8.4.5.4 hint media header, overall information (hint track only)
				nmhd				8.4.5.5 Null media header, overall information (some tracks only)
				!dinf				8.5 data information box, container
					!dref			8.7.2 data reference box, declares source(s) of media data in track
				!stbl				8.5 sample table box, container for the time/space map
					!stsd			8.5.2 sample descriptions (codec types, initialization etc.)
						!avc1		ISO/IEC 14496-15 AVC file format
							!avcC	ISO/IEC 14496-15 AVC file format
					!stts			8.6.1.2 (decoding) time-to-sample
					ctts			8.6.1.3 (composition) time to sample
					cslg			8.6.1.4 composition to decode timeline mapping
					!stsc			8.7.4 sample-to-chunk, partial data-offset information
					stsz			8.7.3.2 sample sizes (framing)
					stz2			8.7.3.3 compact sample sizes (framing)
					!stco			8.7.5 chunk offset, partial data-offset information
					co64			8.7.5 64-bit chunk offset
					stss			8.6.2 sync sample table
					stsh			8.6.3 shadow sync sample table
					padb			8.7.6 sample padding bits
					stdp			8.7.6 sample degradation priority
					sdtp			8.6.4 independent and disposable samples
					sbgp			8.9.2 sample-to-group
					sgpd			8.9.3 sample group description
					subs			8.7.7 sub-sample information
					saiz			8.7.8 sample auxiliary information sizes
					saio			8.7.9 sample auxiliary information offsets
	!udta	!udta						8.10.1 user-data
			meta					QuickTime File Format Specification
			hdlr					QuickTime File Format Specification
			ilst					QuickTime File Format Specification
	mvex							8.8.1 movie extends box
		mehd						8.8.2 movie extends header box
		!trex						8.8.3 track extends defaults
		leva						8.8.13 level assignment
moof								8.8.4 movie fragment
	!mfhd							8.8.5 movie fragment header
	traf							8.8.6 track fragment
		tfhd						8.8.7 track fragment header
		trun						8.8.8 track fragment run
		sbgp						8.9.2 sample-to-group
		sgpd						8.9.3 sample group description
		subs						8.7.7 sub-sample information
		saiz						8.7.8 sample auxiliary information sizes
		saio						8.7.9 sample auxiliary information offsets
		tfdt						8.8.12 track fragment decode time
mfra								8.8.9 movie fragment random access
	tfra							8.8.10 track fragment random access
	!mfro							8.8.11 movie fragment random access offset
mdat								8.2.2 media data container
free								8.1.2 free space
skip								8.1.2 free space
	udta							8.10.1 user-data
		cprt						8.10.2 copyright etc.
			tsel					8.10.3 track selection box
			strk					8.14.3 sub track box
				stri				8.14.4 sub track information box
				strd				8.14.5 sub track definition box
meta								8.11.1 metadata
	!hdlr							8.4.3 handler, declares the metadata (handler) type
	dinf							8.5 data information box, container
		dref						8.7.2 data reference box, declares source(s) of metadata items
	iloc							8.11.3 item location
	ipro							8.11.5 item protection
		sinf						8.12.1 protection scheme information box
			frma					8.12.2 original format box
			schm					8.12.5 scheme type box
			schi					8.12.6 scheme information box
	iinf							8.11.6 item information
	xml							8.11.2 XML container
	bxml							8.11.2 binary XML container
	pitm							8.11.4 primary item reference
	fiin							8.13.2 file delivery item information
		paen						8.13.2 partition entry
			fire					8.13.7 file reservoir
			fpar					8.13.3 file partition
			fecr					8.13.4 FEC reservoir
		segr						8.13.5 file delivery session group
		gitn						8.13.6 group id to name
	idat							8.11.11 item data
	iref							8.11.12 item reference
meco								8.11.7 additional metadata container
	mere							8.11.8 metabox relation
styp								8.16.2 segment type
sidx								8.16.3 segment index
ssix								8.16.4 subsegment index
prft								8.16.5 producer reference time

Compatibility

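MP4 is derived from Apple's QuickTime File Format Specification (QTFF)
- In the QTFF world an MP4 Box is called an ATOM. An MP4 Box and an ATOM are almost the same thing
- To create an MP4 that plays in QuickTime and on Mac OS, you also need to generate Boxes that are not in the specification
Without edts and elst, the MP4 cannot be played by QuickTime or on Mac OS
Without meta, hdlr, and ilst under moov/trak/udta, the MP4 cannot be played by QuickTime or on Mac OS
- QTFF Metadata Structure
dref/url looks omittable because at first glance it carries no information, but this Box declares that the media data is contained in the mp4 file itself, so omitting it produces an MP4 that cannot be played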

Sample table

Among the Boxes, the hardest to understand are the SampleTable (stbl) and the Boxes below it (stsz, stco, stsc, stsd, stts). These Boxes are defined in MPEG-4 Part 12 (ISO/IEC 14496-12).

mdat box

mdat (Media Data Box) is the box that stores the media data.

This box contains the media data. In video tracks, this box would contain video frames. A presentation may contain zero or more Media Data Boxes. The actual media data follows the type field; its structure is described by the metadata (see particularly the sample table, subclause 8.14, and the item location box, subclause 8.44.3).

In large presentations, it may be desirable to have more data in this box than a 32-bit size would permit. In this case, the large variant of the size field, above in subclause 6.2, is used. There may be any number of these boxes in the file (including zero, if all the media data is in other files). The metadata refers to media data by its absolute offset within the file (see subclause 8.19, the Chunk Offset Box); so Media Data Box headers and free space may easily be skipped, and files without any box structure may also be referenced and used.

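An mdat box can contain both Audio and Video data. To understand the internal layout of mdat, you have to read the metadata under stbl (SampleTable). Reading stco (Chunk Offset) gives the offsets of the media data. The offsets are recorded as absolute values within the file.

TODO: What does "absolute value within the file" mean? → It is the offset when the file is viewed as a byte array. In other words, when generating a stco box, you first put dummy data into the stco box, dump the mp4 file image (byte array) into memory in that state, compute the position (offset) of the media data within that byte array, and then overwrite the dummy data in the stco box with exactly that value. This post-processing step is required (It is painful. It nearly killed me. How could this processing have been made simpler...)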
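
One possible way to avoid the dummy-then-patch step is to compute the offsets before serializing. The sketch below assumes that every box placed before mdat has a size that is known up front, that stco is version 0 (32-bit offsets), and that mdat uses the normal 8-byte header. `stcoBoxSize` and `chunkOffsets` are hypothetical names, and the numbers in the usage lines are made up.

```javascript
// A version-0 stco box has 16 bytes of fixed fields plus 4 bytes per entry,
// so its size is known as soon as the number of chunks is known.
function stcoBoxSize(chunkCount) {
    return 8 /* size + type */ + 4 /* version + flags */ + 4 /* entry_count */ + 4 * chunkCount;
}

// Absolute file offset of each chunk, given the total size of every box
// placed before mdat and the byte length of each chunk inside mdat.
function chunkOffsets(bytesBeforeMdat, chunkLengths, mdatHeaderSize = 8) {
    const offsets = [];
    let position = bytesBeforeMdat + mdatHeaderSize; // first byte of the mdat payload
    for (const length of chunkLengths) {
        offsets.push(position);
        position += length;
    }
    return offsets;
}

console.log(stcoBoxSize(1));                  // 20
console.log(chunkOffsets(1000, [300, 200]));  // [ 1008, 1308 ]
```

Because the size of stco depends only on the number of chunks, the size of moov can be summed first and the real offsets written in a single pass.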

mdhd box

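mdhd presents the basic information of a track, such as its actual playback duration.

duration / timeScale gives the actual playback duration in seconds
- Example 1. { duration: 81920, timeScale: 16384 } -> 81920 / 16384 -> 5 sec
- Example 2. { duration: 450000, timeScale: 90000 } -> 450000 / 90000 -> 5 sec
language is not examined in detail here, so "und" (meaning undefined) is used
creation_time is the time the media was created, expressed as the number of seconds elapsed since midnight, January 1, 1904 (UTC)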

mdhd: {
    language:           "und",
    duration:           duration,  // 81920
    timescale:          timeScale, // 16384
    creation_time:      0,
    modification_time:  0,
}
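
The duration arithmetic and the 1904-based time can be sketched as follows. The helper names are hypothetical; the constant is the number of seconds between 1904-01-01 and 1970-01-01 (UTC), which is what separates MP4 time from JavaScript's epoch.

```javascript
// Seconds between 1904-01-01T00:00:00Z and 1970-01-01T00:00:00Z.
const MP4_EPOCH_OFFSET = 2082844800;

// Playback time in seconds from an mdhd-like object.
function durationInSeconds(mdhd) {
    return mdhd.duration / mdhd.timescale;
}

// Convert a JavaScript Date to a creation_time / modification_time value.
function toMp4Time(date) {
    return Math.floor(date.getTime() / 1000) + MP4_EPOCH_OFFSET;
}

// Convert an MP4 time value back to a JavaScript Date.
function fromMp4Time(seconds) {
    return new Date((seconds - MP4_EPOCH_OFFSET) * 1000);
}

console.log(durationInSeconds({ duration: 81920, timescale: 16384 }));  // 5
console.log(durationInSeconds({ duration: 450000, timescale: 90000 })); // 5
console.log(fromMp4Time(0).toISOString()); // 1904-01-01T00:00:00.000Z
```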

dinf box

dinf is the container for dref.

dref box

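dref declares where the track's media data is located. At first glance it looks like a worthless box that carries no information, but without it the file cannot be played. The content of the box can be fixed. flags must be set to 1, which marks the media data as being in the same file.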

dref: {
    entry_count:  1,
    "url ": [{
        "version":  0,
        "flags":    1, // [!]
        "url":      "",
    }]
}

vmhd box

vmhd presents information specific to video tracks. The content of the box can be fixed. flags must be set to 1.

vmhd: {
    flags:        1, // [!]
    graphicsmode: 0,
    opcolor:      [0, 0, 0],
}

tkhd box

tkhd is the basic information of a track. It presents the playback duration, the resolution, and so on.

flags is 3 for AVC (H.264), that is track_enabled (1) | track_in_movie (2). With 0 the mp4 cannot be played in the Mac Finder.
alternate_group can be 0
creation_time is the elapsed time since 1904. 0 is fine
modification_time is the elapsed time since 1904. 0 is fine
width is a 4-byte (32-bit) value in 16.16 fixed-point format. For 128px it is 128 shifted left by 16 bits, 8388608
height is a 4-byte (32-bit) value in 16.16 fixed-point format. For 128px it is 128 shifted left by 16 bits, 8388608
matrix can be [65536, 0, 0, 0, 65536, 0, 0, 0, 1073741824]
track_ID starts from 1. The second track (probably the audio track) gets 2
volume is 0 for a video track
layer can be 0

tkhd: {
    flags:             3, // [!]
    alternate_group:   0,
    creation_time:     0,
    modification_time: 0,
    duration:          duration, // 5000
    width:             0,        // 8388608 = 128 << 16
    height:            0,        // 8388608 = 128 << 16
    matrix:            [65536, 0, 0, 0, 65536, 0, 0, 0, 1073741824],
    track_ID:          1,
    volume:            0,
    layer:             0,
}
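
The fixed-point values used by width, height, and matrix can be sketched like this. The helper names are hypothetical. The matrix entries at positions 3, 6, and 9 are stored as 2.30 fixed point and the rest as 16.16, which is why the last value of the identity matrix is 1073741824 rather than 65536.

```javascript
// 16.16 fixed point: upper 16 bits = integer part, lower 16 bits = fraction.
function toFixed16_16(value) {
    return Math.round(value * 65536) >>> 0;
}

function fromFixed16_16(raw) {
    return raw / 65536;
}

// 2.30 fixed point: upper 2 bits = integer part, lower 30 bits = fraction.
function toFixed2_30(value) {
    return Math.round(value * 1073741824) >>> 0;
}

// Identity matrix as stored in tkhd.
const identityMatrix = [
    toFixed16_16(1), 0, 0,
    0, toFixed16_16(1), 0,
    0, 0, toFixed2_30(1),
];

console.log(toFixed16_16(128));       // 8388608
console.log(fromFixed16_16(8388608)); // 128
console.log(identityMatrix);
```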

hdlr box

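hdlr declares the handler type of its parent container.

handler_type is "mdir" in the example below, which is the value used by the hdlr under udta/meta (iTunes-style metadata). For the hdlr under mdia, ISO/IEC 14496-12 defines "vide" for a video track such as H.264 and "soun" for an audio track
handler_type2 is not defined by the specification, but it is needed to build an mp4 that plays in the Mac Finder. The value 1634758764 is the four characters "appl" read as a big-endian 32-bit integer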

hdlr: {
    handler_type: "mdir",
    handler_type2: 1634758764,
    name: "",
}
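
The numeric handler_type2 value can be checked by unpacking it into four ASCII characters. A minimal sketch with hypothetical helper names:

```javascript
// Pack a four-character code into a big-endian unsigned 32-bit integer.
function fourccToUint32(text) {
    return ((text.charCodeAt(0) << 24) | (text.charCodeAt(1) << 16) |
            (text.charCodeAt(2) << 8)  |  text.charCodeAt(3)) >>> 0;
}

// Unpack a big-endian unsigned 32-bit integer into a four-character code.
function uint32ToFourcc(value) {
    return String.fromCharCode(
        (value >>> 24) & 0xff, (value >>> 16) & 0xff,
        (value >>> 8) & 0xff, value & 0xff);
}

console.log(uint32ToFourcc(1634758764)); // appl
console.log(fourccToUint32("appl"));     // 1634758764
```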

ilst box

ilst is the iTunes metadata. It stores information such as the album and the artist.

ilst: {
    "data": [ 0, 0, 0, 37, 169, 116, 111, 111,
              0, 0, 0, 29, 100, 97, 116, 97,
              0, 0, 0, 1, 0, 0, 0, 0,
              76, 97, 118, 102, 53, 54, 46, 52,
              48, 46, 49, 48, 49 ]
}
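
The byte array above is itself a nested atom, and decoding it by hand shows what it stores. The sketch below assumes the layout seen in these bytes (one item atom wrapping one "data" atom with a 4-byte type indicator and a 4-byte locale) and decodes the text as plain ASCII; `readIlstItem` and the other helpers are hypothetical names.

```javascript
const ilstData = [ 0, 0, 0, 37, 169, 116, 111, 111,
                   0, 0, 0, 29, 100, 97, 116, 97,
                   0, 0, 0, 1, 0, 0, 0, 0,
                   76, 97, 118, 102, 53, 54, 46, 52,
                   48, 46, 49, 48, 49 ];

function readUint32(bytes, offset) {
    return ((bytes[offset] << 24) | (bytes[offset + 1] << 16) |
            (bytes[offset + 2] << 8) | bytes[offset + 3]) >>> 0;
}

function readType(bytes, offset) {
    return String.fromCharCode(...bytes.slice(offset, offset + 4));
}

// Decode one ilst item that wraps a single "data" atom holding text.
function readIlstItem(bytes) {
    const itemSize = readUint32(bytes, 0);       // size of the item atom
    const itemType = readType(bytes, 4);         // item key, e.g. "\u00A9too"
    const dataSize = readUint32(bytes, 8);       // size of the nested data atom
    const dataType = readType(bytes, 12);        // "data"
    const typeIndicator = readUint32(bytes, 16); // 1 = UTF-8 text
    // bytes 20..23 hold the locale; the text runs to the end of the data atom
    const text = String.fromCharCode(...bytes.slice(24, 8 + dataSize));
    return { itemSize, itemType, dataSize, dataType, typeIndicator, text };
}

console.log(readIlstItem(ilstData));
```

For this array the item key is "©too" (the encoding tool) and the text is "Lavf56.40.101".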

edts box

edts is the container for elst.

elst box

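elst specifies the playback range and the playback rate.

The content of elst is not very important, but without elst QuickTime apparently cannot identify the silent portions and presents the user with a total duration that includes them.

Setting media_time to a value other than 0 changes the default start time of stts (TODO:)
media_rate_integer is used as a pair with media_rate_fraction. media_rate_integer is the integer part and media_rate_fraction is the fractional part
- The defaults are media_rate_integer = 1 and media_rate_fraction = 0
- With media_rate_integer = 1 and media_rate_fraction = 0, the rate evaluates to 1.0000
segment_duration works for now if it is set to the same value as tkhd.duration (TODO:)
entry_count is set to the number of elements in entries.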

elst: {
    "entry_count": 1, // entries.length
    "entries": [{
        "media_time": 0,
        "media_rate_fraction": 0,
        "media_rate_integer": 1,
        "segment_duration": 5000, // playback duration, with 1000 units = 1 sec
    }],
}

stbl box

Sample Table box, container for the time/space map

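stbl is the container for stsz, stco, stsc, stsd, and stts.

A Sample is the basic unit for storing data. A Sample in MP4 corresponds to an AccessUnit in MPEG-4.

A group of Samples is called a Chunk, and Chunks are in turn grouped to form a video or audio Track.

The size of each Sample is stored in stsz (Sample size)
- sample.size equals the size of the AccessUnit's binary data
Which Sample is stored in which Chunk is recorded in stsc (Sample-to-chunk)
From the information in the three Boxes stsc, stco, and stsz, a decoder can determine the position and the size of each Frame's data
By referring to stts, a decoder can obtain the playback times of video and audio. The stts information can be used for lip sync
- With careful use of stts it is also possible to create a variable frame rate movie in which each Frame has a different duration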

stsz box

stsz (Sample sizes, framing) is the list of Sample lengths.

stsz: {
    version:      0,
    flags:        0,
    sample_size:  0,
    sample_count: 0,
    samples:      [{ entry_size: 0 }, ...]
}

sample_size specifies the default sample size.
- When every sample has the same size, set that size here and leave samples empty. sample_count still holds the number of samples in the track
- When samples have different sizes, do the following
  - Add { entry_size: size of the sample } to samples, once per sample
  - Set sample_count to samples.length
  - Set sample_size to 0

When the first AccessUnit ( AUD(6) + SPS(23) + PPS(9) + SEI(611) + IDR(173) ) totals 822 bytes and each of the four following AccessUnits totals 282 bytes, the table below is produced.

index	`entry_size`	Note
1	822	AUD(6) + SPS(23) + PPS(9) + SEI(611) + IDR(173) = 822
2	282	AUD(6) + SPS(23) + PPS(9) + IDR(244) = 282
3	282	AUD(6) + SPS(23) + PPS(9) + IDR(244) = 282
4	282	AUD(6) + SPS(23) + PPS(9) + IDR(244) = 282
5	282	AUD(6) + SPS(23) + PPS(9) + IDR(244) = 282

This table is turned into the following data.

stsz: {
    version:      0,
    flags:        0,
    sample_size:  0,
    sample_count: 5,
    samples: [
        { entry_size: 822 },
        { entry_size: 282 },
        { entry_size: 282 },
        { entry_size: 282 },
        { entry_size: 282 },
    ]
}
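
The two cases (uniform size versus per-sample table) can be sketched as one builder. `buildStsz` is a hypothetical helper; following ISO/IEC 14496-12, it keeps sample_count equal to the number of samples in both cases. The NAL unit byte counts are the ones quoted in the table above.

```javascript
// Hypothetical helper: build an stsz-like object from per-sample byte sizes.
function buildStsz(sizes) {
    const uniform = sizes.length > 0 && sizes.every(size => size === sizes[0]);
    return {
        version:      0,
        flags:        0,
        sample_size:  uniform ? sizes[0] : 0,
        sample_count: sizes.length,
        samples:      uniform ? [] : sizes.map(size => ({ entry_size: size })),
    };
}

// Size of one AccessUnit = sum of the sizes of its NAL units.
const firstAccessUnit = 6 + 23 + 9 + 611 + 173; // 822
const otherAccessUnit = 6 + 23 + 9 + 244;       // 282

console.log(buildStsz([firstAccessUnit, otherAccessUnit, otherAccessUnit,
                       otherAccessUnit, otherAccessUnit]));
```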

The same data can be seen by disassembling the assets/ff/png.all.mp4.00.ts.mp4 file.

What used to be a mystery

The stsz of assets/ff/png.all.mp4.00.ts.mp4 contains five samples, but the IDR NALUnitSize values do not match samples.entry_size

samples = [
 { entry_size: 822 },
 { entry_size: 282 },
 { entry_size: 282 },
 { entry_size: 282 },
 { entry_size: 282 },
]
IDR NALUnitSize = [
 173 bytes,
 240 bytes,
 240 bytes,
 240 bytes,
 240 bytes,
]

From this I suspected that a Sample is not just the IDR NALUnit that makes up the picture but the whole AccessUnit.

Adding up the last AccessUnit ( AUD(6) + SPS(23) + PPS(9) + IDR(244) ) gives 282 bytes, which matches.

From this I assumed that samples.entry_size is recorded per AccessUnit (the series of NALUnits from AUD to IDR).

At first I thought that a Sample placed in mdat did not include AUD, SEI, SPS, or PPS, but that is not the case: they may be included, and then entry_size = AccessUnit.length.

In other words, an AccessUnit in the NALUnit world appears to be equal to the unit called Sample in the MP4 world.

stsc box

stsc (Sample-to-Chunk, partial data-offset information) describes how many Samples each Chunk contains. It works together with stco.

samples holds [{ first_chunk, samples_per_chunk, sample_description_index }, ...]
entry_count is samples.length. If samples has one element, entry_count is also 1
first_chunk is the number of the first chunk of a run of chunks that share the same settings
samples_per_chunk is the number of samples in each of those chunks
sample_description_index is the number (index) of the entry that describes how to decode the samples. The entries it refers to are defined in stsd

The following is an example where samples is [{1,3,23}, {3,1,23}, {5,1,24}].

chunk_index	samples_per_chunk	sample_description_index	Note
1	3	23	{1,3,23}
2(*)	3	23	{1,3,23} copy
3	1	23	{3,1,23}
4(*)	1	23	{3,1,23} copy
5	1	24	{5,1,24}

The rows for chunk_index 2 and 4 have no entry of their own in samples; the preceding entry is reused for them as is.

This table shows that chunks 1 through 2 each contain 3 Samples. In addition, stco (Chunk Offset) gives the byte offset of each Chunk.
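
Expanding the run-length entries into one row per chunk can be sketched as follows. `expandStsc` is a hypothetical helper, and the total number of chunks (5 here) is assumed to come from stco's entry_count.

```javascript
// Expand stsc entries into one row per chunk.
function expandStsc(entries, chunkCount) {
    const rows = [];
    for (let i = 0; i < entries.length; i++) {
        const entry = entries[i];
        // A run ends just before the next entry's first_chunk,
        // or at the last chunk for the final entry.
        const lastChunk = i + 1 < entries.length
            ? entries[i + 1].first_chunk - 1
            : chunkCount;
        for (let chunk = entry.first_chunk; chunk <= lastChunk; chunk++) {
            rows.push({
                chunk_index: chunk,
                samples_per_chunk: entry.samples_per_chunk,
                sample_description_index: entry.sample_description_index,
            });
        }
    }
    return rows;
}

const stscEntries = [
    { first_chunk: 1, samples_per_chunk: 3, sample_description_index: 23 },
    { first_chunk: 3, samples_per_chunk: 1, sample_description_index: 23 },
    { first_chunk: 5, samples_per_chunk: 1, sample_description_index: 24 },
];
console.log(expandStsc(stscEntries, 5));
```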

stco box

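stco (Chunk Offset, partial data-offset information) provides a table that records the offset of each Chunk.

As in the specification text quoted in the mdat section, each offset is an absolute position in the file, counted in bytes from the beginning of the file rather than from the beginning of mdat.

samples is the list of chunk offsets
- chunk_offset points to the first byte of a chunk of the VideoStream (consecutive AccessUnits) stored in mdat (TODO: verify)
- The chunk_offset value is an offset to each individual Chunk, not an offset to a Sample.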

stco: {
    entry_count: 1,
    samples:     [{ chunk_offset: 0 }, ...]
}
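
Because stco only records where each Chunk starts, the position of an individual Sample has to be derived by adding up the sizes of the Samples that precede it in the same Chunk. A minimal sketch, assuming the per-chunk sample counts have already been worked out from stsc; `locateSamples` and the offsets in the usage line are made up for illustration.

```javascript
// chunkOffsets:    chunk_offset values from stco
// samplesPerChunk: number of samples in each chunk, one value per chunk
// sampleSizes:     entry_size values from stsz, one value per sample
function locateSamples(chunkOffsets, samplesPerChunk, sampleSizes) {
    const located = [];
    let sampleIndex = 0;
    for (let chunk = 0; chunk < chunkOffsets.length; chunk++) {
        let offset = chunkOffsets[chunk]; // first sample starts at the chunk offset
        for (let n = 0; n < samplesPerChunk[chunk]; n++) {
            const size = sampleSizes[sampleIndex];
            located.push({ sample: sampleIndex + 1, offset, size });
            offset += size;               // next sample follows immediately
            sampleIndex++;
        }
    }
    return located;
}

console.log(locateSamples([1000, 5000], [3, 2], [822, 282, 282, 282, 282]));
```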

Mystery

In png.all.mp4.00.ts.mp4, stco.samples.chunk_offset = 812 points to a position 10 bytes before the end of the IDR chunk, which is a mystery.

source

stco: {
    entry_count: 1,
    samples:     [{ chunk_offset: 812 }]
}

stts box

stts (decoding Time-to-Sample) presents the list of sample durations.

stts: {
    "entry_count":  samples.length, // 1
    "samples":      samples,        // [{ sample_count: 1, sample_delta: 81920 }]
}

entry_count is set to the number of elements in samples
samples is an array of { sample_count, sample_delta }.
- See the example below for sample_count and sample_delta.

Given the list
[{ sample_count:4, sample_delta:3 },
 { sample_count:2, sample_delta:1 },
 { sample_count:3, sample_delta:2 }]
the following table is generated

Sample number	`sample_delta`	Note
1	3	`{ sample_count:4, sample_delta:3 }` means `sample_delta` = 3 repeats 4 times
2	3
3	3
4	3
5	1	`{ sample_count:2, sample_delta:1 }` means `sample_delta` = 1 repeats 2 times
6	1
7	2	`{ sample_count:3, sample_delta:2 }` means `sample_delta` = 2 repeats 3 times
8	2
9	2

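This table is generated from samples, and a decoder looks at it to determine, for example, that the Time (duration) of sample 6 is 1.

By accumulating the values in this table, a per-sample timeline table can be generated.

Sample number	sample_delta	beginTime	endTime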
1	3	0 (*)	3
2	3	3	6
3	3	6	9
4	3	9	12
5	1	12	13
6	1	13	14
7	2	14	16
8	2	16	18
9	2	18	20

With the table above, seeking becomes possible. For example, seeking to beginTime = 12 shows that the picture of Sample 5 is displayed

The 0 of beginTime is not fixed; it can be changed through the value of elst.media_time.
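
The expansion into a timeline and the seek lookup can be sketched as follows. `buildTimeline` and `seek` are hypothetical helpers; times are in timescale units and the timeline is assumed to start at 0.

```javascript
// Expand stts entries into one row per sample with begin and end times.
function buildTimeline(entries) {
    const timeline = [];
    let time = 0;
    let number = 1;
    for (const { sample_count, sample_delta } of entries) {
        for (let i = 0; i < sample_count; i++) {
            timeline.push({
                sample: number++,
                sample_delta,
                beginTime: time,
                endTime: time + sample_delta,
            });
            time += sample_delta;
        }
    }
    return timeline;
}

// Return the number of the sample shown at `time`, or -1 if out of range.
function seek(timeline, time) {
    const hit = timeline.find(row => row.beginTime <= time && time < row.endTime);
    return hit ? hit.sample : -1;
}

const sttsEntries = [
    { sample_count: 4, sample_delta: 3 },
    { sample_count: 2, sample_delta: 1 },
    { sample_count: 3, sample_delta: 2 },
];
const timeline = buildTimeline(sttsEntries);
console.log(timeline);
console.log(seek(timeline, 12)); // 5
```

The endTime of the last row (20 here) is the total duration in timescale units, which corresponds to the numerator of the formula in the memo below.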

memo

// Based on http://wiki.multimedia.cx/?title=QuickTime_container#stts

movieTotalDuration = (sample_count1 * sample_time_delta1 + ... + sample_countN * sample_time_deltaN ) / timescale

stsd box

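stsd (Sample Descriptions (codec types, initialization etc.)) presents the header information needed to play back the track data.

When H.264 is stored, stsd is the container that holds the avc1 MP4Box. Containers normally hold no data of their own, but stsd is an exception.

stsd: {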
    entry_count:  1,
    avc1: _root_moov_trak_mdia_minf_stbl_stsd_avc1
}

avc1 box

avc1 presents the header information of the H.264 Stream. It also serves as the container for avcC.

avc1: {
    compressorname:         "",
    frame_count:            frame_count,    // 1
    data_reference_index:   1,
    depth:                  0x18,           // 0x0018
    width:                  width,
    height:                 height,
    horizresolution:        0x480000,       // 72dpi = 4718592
    vertresolution:         0x480000,       // 72dpi = 4718592
    avcC:                   _root_moov_trak_mdia_minf_stbl_stsd_avc1_avcC
}

frame_count is always 1 for video frames
compressorname is a field for recording the encoder name. It can be omitted
depth is always 0x0018
data_reference_index is always 1
width and height are set to the values taken from the sample with the largest width and height in the Stream
horizresolution and vertresolution are always 0x480000 (72 dpi in 16.16 fixed point)

avcC box

avcC (AVC decoder configuration record) is an extension defined in MPEG-4 Part 15 (ISO/IEC 14496-15).

avcC has fields that store the very first SPS and PPS.

AVCProfileIndication is the Profile. For Baseline profile it is 66
AVCLevelIndication is the Level. For Level 3.0 it is 30
profile_compatibility is set to 192 here, which matches the third byte of the SPS in the example below (the constraint_set flags)
configurationVersion is always 1
lengthSizeMinusOne is always 3. Changing this value affects the number of bytes used for NALUnitSize
numOfSequenceParameterSets is always 1.
numOfPictureParameterSets is always 1.
SPS is a copy of the first SPS
PPS is a copy of the first PPS

// AVCDecoderConfigurationRecord
avcC: {
    AVCProfileIndication:         66,     // 66 = Baseline profile
    AVCLevelIndication:           30,     // 30 = Level 3.0
    profile_compatibility:        192,    // 192 = 0xC0, same as the third byte of the SPS
    configurationVersion:         1,
    lengthSizeMinusOne:           3,      // NALUnitSize = 3 + 1 = 4byte
    numOfSequenceParameterSets:   sps.length,
    numOfPictureParameterSets:    pps.length,
    SPS: [{
        sequenceParameterSetLength:  22,
        sequenceParameterSetNALUnit: [
            103, 66, 192, 30, 217, 2, 4, 104,
            64, 0, 0, 3, 1, 64, 0, 0,
            3, 0, 131, 197, 139, 146
        ]
    }],
    PPS: [{
        pictureParameterSetLength:  5,
        pictureParameterSetNALUnit: [
            104, 203, 131, 203, 32
        ],
    }],
}
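
The profile, compatibility, and level fields do not have to be hard-coded: in the example above they equal bytes 1 to 3 of the first SPS NAL unit (66, 192, 30). The sketch below fills an avcC-like object that way. `buildAvcC` is a hypothetical helper, and the SPS and PPS bytes are the ones listed above.

```javascript
const sps = [103, 66, 192, 30, 217, 2, 4, 104,
             64, 0, 0, 3, 1, 64, 0, 0,
             3, 0, 131, 197, 139, 146];
const pps = [104, 203, 131, 203, 32];

// Build an avcC-like object from lists of SPS and PPS NAL units.
function buildAvcC(spsList, ppsList) {
    const firstSps = spsList[0];
    return {
        configurationVersion:       1,
        AVCProfileIndication:       firstSps[1], // profile_idc
        profile_compatibility:      firstSps[2], // constraint_set flags
        AVCLevelIndication:         firstSps[3], // level_idc
        lengthSizeMinusOne:         3,           // NALUnitSize = 3 + 1 = 4 bytes
        numOfSequenceParameterSets: spsList.length,
        numOfPictureParameterSets:  ppsList.length,
        SPS: spsList.map(nal => ({
            sequenceParameterSetLength:  nal.length,
            sequenceParameterSetNALUnit: nal,
        })),
        PPS: ppsList.map(nal => ({
            pictureParameterSetLength:  nal.length,
            pictureParameterSetNALUnit: nal,
        })),
    };
}

console.log(buildAvcC([sps], [pps]));
```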

Link

Overview
