It reads more correctly in the edited post.
Just try, for example, cutting a piece out of a movie or clip packed this way, and you will see at once that the problem is real.
[more=Here is the relevant piece of the Dub help]Why is using VBR MP3 in AVIs a bad idea?
We discussed this issue at length in irc and Cyrius (suiryc) gave a pretty good explanation:
[21:33] <Belgabor|Home> cyrius, what did your experimets tell?
[21:33] <Suiryc> Belgabor|Home : I think I know know why VBR is not good, and also why Nando's hack works (somehow)
[21:33] <Suiryc> s/know/now
[21:33] <Belgabor|Home> ok, tell me
[21:33] <Suiryc>
[21:34] <Suiryc> first of all there are 2 'headers' in the AVI (audio) stream
[21:34] <Belgabor|Home> I have the feeling i need to hammer that down some throat soon
[21:34] <ChristianHJW> lol
[21:34] <Suiryc> first one is a general one (the same struture is used for each track)
[21:35] <Suiryc> AVISTREAMINFO
[21:35] <Suiryc> (IIRC ... there should use shorter names ...)
[21:35] <Belgabor|Home> lol
[21:37] <spyder482> ChristianHJW: I won't be moving for a few months still though
[21:37] <Suiryc> this one tell how many frames there are in the stream
[21:37] <Suiryc> and what is the rate of the frames
[21:37] <Suiryc> thanks to dwRate & dwScale fields
[21:37] <Belgabor|Home> got that
[21:38] <Suiryc> it also contains a field saying the size of 1 frame
[21:38] <Suiryc> if VBR, then it is set to 0, otherwise it is set to the correct value
[21:38] <Belgabor|Home> dwSampleSize
[21:39] <Suiryc> yep
[21:39] <Suiryc> then there is a header specific to the audio stream (based on WAVEFORMATEX)
[21:39] <Suiryc> this one tell the samplerate (44100, 48000, ...)
[21:39] <Suiryc> the byterate
[21:39] <Suiryc> the format (wFormatTag)
[21:40] <Suiryc> and especially contains a field names nBlockAlign
[21:40] <Suiryc> nBlockAlign tell how many bytes an audio frame contains
[21:40] <Suiryc> _BUT_
[21:40] <Belgabor|Home> And that musnt be 0
[21:40] <Suiryc> cannot be set to 0
[21:40] <spyder482> so much work for AVI...
[21:40] <Suiryc>
[21:40] <Belgabor|Home> ok, i think i get the picture
[21:40] <Suiryc> ok so let's continue
[21:41] <Belgabor|Home> ok
[21:41] <ChristianHJW> all with you guys ...
[21:41] <Suiryc> in Nandub here is what happens with an MP3 stream (VBR one)
[21:42] <Suiryc> Nando set dwRate to the samplerate (44100, 48000, ...)
[21:42] <spyder482> don't you two have a channel for this?
[21:42] <Suiryc> spyder482 : shut up
[21:42] <Suiryc> and set dwScale to 1152
[21:42] <spyder482> lol
[21:42] <Belgabor|Home> no, the other one is just for lurking
[21:42] <Belgabor|Home>
[21:42] <Suiryc> :]
[21:42] <spyder482> hehe
[21:43] <Suiryc> and set nBlockAlign to 1152 too
[21:43] <Suiryc> then, when muxing it only treat whole MP3 frames
[21:43] <Suiryc> (i.e. each MP3 frame is in its own Chunk)
[21:44] <Suiryc> you still follow ?
[21:44] <md`> who has done the mpeg2 import part of vdmod?
[21:44] <Belgabor|Home> ok, one mp3 frame is what?
[21:44] <Belgabor|Home> pulco-citron
[21:44] <md`> hmpf
[21:44] <spyder482> pulco-citron
[21:44] <spyder482> oh
[21:44] <spyder482>
[21:45] <md`> why does he generate d2v and dont let the user decide to pick one...
[21:45] <Belgabor|Home> dunno
[21:45] <md`> if there is one already
[21:45] <md`> hmmm
[21:45] <Suiryc> Belgabor|Home : an Mpeg1-Layer3 frame is the shorter block of data you can use
[21:45] <ChristianHJW> let Suiryc finish guys .. please
[21:45] <md`> yes ok
[21:45] <Belgabor|Home> ok
[21:45] <spyder482> ChristianHJW: check #virtualdub
[21:45] <Suiryc> it contains an header saying what is in the frame, and then the data (audio)
[21:46] <ChristianHJW> we have to know whats wrong in AVI to be able to advertise matroska
[21:46] <Belgabor|Home> this is how much data?
[21:46] <Suiryc> somehow 1 MP3 frame ~ 1 video frame
[21:46] <Belgabor|Home> ChristianHJW: lol
[21:46] <Suiryc> the size of a frame depends on the MP3 settings
[21:46] <Suiryc> (i.e. bitrate, ...)
[21:46] <Belgabor|Home> ok
[21:47] <Belgabor|Home> is it fixed for a file or varible in vbr?
[21:47] <Suiryc> however a Mpeg1-layer3 frame conatins 1152 samples
[21:47] <Suiryc> the size of a frame is variable
[21:47] <Suiryc> even in CBR
[21:48] <Suiryc> (e.g. frames will be of 417 or 418 bytes)
[21:48] <Belgabor|Home> ok, but 1152 is the upper limit?
[21:48] <Suiryc> because a fixed btrate must be achieved
[21:48] <Suiryc> 1152 is the number of samples a frame contains
[21:48] <Suiryc> each frame (whatever its size may be) contains 1152 samples
[21:49] <Belgabor|Home> oic
[21:49] <Suiryc> so let's continue
[21:49] <Suiryc> each frame contains 1152 samples
[21:49] <Belgabor|Home> ok
[21:49] <Suiryc> and the rate of the stream (in AVISTREAMINFO) has been set to :
[21:49] <Suiryc> dwRate / dwScale = SampleRate/1152
[21:50] <Suiryc> since each Frame contains 1152 it is equal to the 'framerate'
[21:50] <Suiryc> (as for video)
[21:50] <Belgabor|Home> ok, i think i got that
[21:50] <Suiryc> now you must recall that each frame is in its own AVI chunk
[21:50] <Belgabor|Home> ok
[21:50] <Suiryc> so it is also the 'chunkrate'
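The rate arithmetic described up to this point can be sketched in a few lines of Python. This is only an illustration of the numbers in the log, not Nandub code: the function names are made up here, and the only inputs are the sample rate and the 1152 samples per frame that Suiryc mentions.

```python
from fractions import Fraction

# Samples per MPEG-1 Layer III frame, as stated in the log.
SAMPLES_PER_FRAME = 1152

def stream_rate(sample_rate):
    """Frames (= chunks) per second when dwRate = sample rate, dwScale = 1152.

    Hypothetical helper; mirrors dwRate / dwScale from the discussion.
    """
    dw_rate = sample_rate
    dw_scale = SAMPLES_PER_FRAME
    return Fraction(dw_rate, dw_scale)

def frame_duration_ms(sample_rate):
    """Playing time of one MP3 frame in milliseconds."""
    return float(1000 / stream_rate(sample_rate))

print(float(stream_rate(44100)))   # a bit over 38 frames per second
print(frame_duration_ms(48000))    # 24.0
```

At 48 kHz one frame lasts exactly 24 ms, which is in the same range as the duration of one video frame, hence the remark that 1 MP3 frame is roughly 1 video frame.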
[21:51] <Suiryc> so here is now what happens (it is most likely what happens) when playing the file in Window Media Player
[21:51] <Belgabor|Home> ic
[21:51] <Suiryc> WMP will get both headers
[21:52] <Suiryc> which will say to it that the rate of the stream is SampleRate/1152
[21:52] <Belgabor|Home> gimme a sec, brb
[21:52] <Suiryc> and that each audio frame is 1152 bytes long (nBlockAlign)
[21:52] <Suiryc> k
[21:53] <Belgabor|Home> back
[21:54] <Suiryc> ok so WMP believe each frame is 1152 bytes long
[21:54] <Belgabor|Home> yeah
[21:54] <Suiryc> which is not the case (generally frames are around 400 bytes long with 128kbps stream)
[21:55] <Suiryc> but
[21:55] <Belgabor|Home> yeah, got that much
[21:55] <Suiryc> now you are reading data in the file
[21:55] <Suiryc> and WMP needs to know when to read the audio
[21:55] <Suiryc> (i.e. to which time correspond an audio frame)
[21:56] <Suiryc> to do so it will look at all the previous audio chunks in the file
[21:56] <Suiryc> for each shunk it divide the size (in bytes) of the chunk by nBlockAlign to know how many frames there were in the chunk
[21:56] <Belgabor|Home> ok
[21:56] <Suiryc> s/shunk/chunk
[21:57] <Belgabor|Home> ok
[21:57] <Suiryc> (since every tools dealing with the stream must cut on nBlockAlign boundaries)
[21:57] <Suiryc> since each chunk is shorter than 1152 bytes (nBlockAling) it shoul get 0
[21:57] <Suiryc> but this is not possible
[21:58] <Suiryc> since tools work on blocks of nBlockAlign bytes, it must assume than there is at least 1 frame in the chunk
[21:58] <Suiryc> (even if the chunk is shorter)
[21:59] <Suiryc> so for each chunk it find there is 1 frame in it
[21:59] <Suiryc> which is really the case (each mp3 frame is in its own chunk)
[21:59] <Suiryc> so WMP got the correct number of mp3 frames played so far
[22:00] <Suiryc> and since it has the correct rate (each frame contains 1152 samples, and the rate of the stream is SampleRate/1152)
[22:00] <Suiryc> it also got the correct timecode for the frame
[22:00] <Belgabor|Home> ok
[22:00] <Suiryc> resulting in a perfectly synched MP3 stream
[22:01] <Suiryc> I was lead to this conclusion without debugging WMP while playing but with some tests I made :
[22:02] <Suiryc> I changed the dwScale value (with or without the nBlockAlign value)
[22:02] <Suiryc> but this resulted in otu of synch issues (audio playing too fast/slow)
[22:02] <Suiryc> out*
[22:02] <Suiryc> I changed the nBlockAlign valuie :
[22:03] <Suiryc> setting it to 1 and then I have out of synch issues too
[22:03] <Suiryc> but setting it 2304 and I stil have a perfectly synched stream
[22:03] <Belgabor|Home> ok
[22:04] <Suiryc> so in fact the 1152 value in nBlockAlign could be anything else
[22:04] <Suiryc> _but_
[22:04] <Suiryc> must be higher than the size of an mp3 frame
[22:04] <Belgabor|Home> ok, what happens if you set it to 0?
[22:04] <Suiryc> lol
[22:05] <Suiryc> if you set it to 0 then WMP won't play the stream (the icon for audio is disabled like if there is no audio in the file)
[22:05] <Suiryc> so no VBR
[22:05] <Belgabor|Home> ok
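The behaviour Suiryc infers for WMP can be modelled as follows. This is a sketch of his guess, not of WMP's actual code: the function names are invented, floor division is an assumption (the log itself says the rounding mode is unknown), and the chunk sizes are made-up sample values below 1152 bytes.

```python
from fractions import Fraction

def frames_in_chunk(chunk_bytes, n_block_align):
    """Guessed player logic: size / nBlockAlign, but never less than 1."""
    if n_block_align <= 0:
        # The log reports that WMP refuses to play audio when nBlockAlign is 0.
        raise ValueError("nBlockAlign must be positive")
    return max(1, chunk_bytes // n_block_align)

def chunk_start_times(chunk_sizes, n_block_align, dw_rate, dw_scale):
    """Start time (in seconds) the player would assign to each chunk."""
    seconds_per_frame = Fraction(dw_scale, dw_rate)
    times = []
    frames_so_far = 0
    for size in chunk_sizes:
        times.append(frames_so_far * seconds_per_frame)
        frames_so_far += frames_in_chunk(size, n_block_align)
    return times

chunks = [417, 418, 209, 627]  # one MP3 frame per chunk, VBR-like sizes
good = chunk_start_times(chunks, 1152, 44100, 1152)
also_good = chunk_start_times(chunks, 2304, 44100, 1152)
bad = chunk_start_times(chunks, 1, 44100, 1152)
print(good == also_good, good == bad)
```

Under this model 1152 and 2304 give identical timecodes, because every chunk is counted as exactly one frame, while nBlockAlign = 1 counts each byte as a frame and the timing runs away. That matches the three tests described in the log.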
[22:06] <Belgabor|Home> so the failure is in priciple not in avi, but in the WAVEFORMATEX header
[22:06] <Suiryc> yep
[22:06] <Suiryc> but since the AVI will use WAVEFORMATEX for audio headers, it is still a failure in AVI specs
[22:07] <Belgabor|Home> do you have the resemblance of an idea why vbr mp3 fails?
[22:07] <Belgabor|Home> yep
[22:07] <Suiryc> <Belgabor|Home> do you have the resemblance of an idea why vbr mp3 fails? <-- you mean why it is not good ?
[22:08] <Belgabor|Home> yep, why it fails sometimes
[22:08] <ChristianHJW> thats what i am interested in also
[22:08] <Suiryc> well in the case of WMP, it will divide the chunk size by nBlockAlign
[22:08] <Suiryc> (that's what I think, since the synch is good)
[22:08] <Suiryc> and will set it to 1 if the chunk size is too small
[22:09] <Suiryc> but there is another way to compute timecode
[22:09] <Suiryc> (assuming that you have CBR of course)
[22:09] <Suiryc> you take the total bytes in previous chunks
[22:09] <Suiryc> and divide it by nblockAlign
[22:10] <Belgabor|Home> which fails miserably for the vbr hack
[22:10] <Suiryc> of course in this case you get a completly wrong value since mp3 frames are not 1152 bytes lnog
[22:10] <Suiryc> yep
[22:10] <Suiryc> otehr tools may also assume that the chunk is not valid (corrputed) since its size is shorter than nBlockAlign
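The other way of computing the timecode, from the total byte count, can be sketched like this to show why it breaks on the VBR hack. The function names and the sample stream (100 chunks of 418 bytes) are assumptions for illustration, and floor division is again a guess.

```python
from fractions import Fraction

def time_by_total_bytes(chunk_sizes, n_block_align, dw_rate, dw_scale):
    """CBR-style timecode: total bytes so far divided by nBlockAlign."""
    blocks = sum(chunk_sizes) // n_block_align
    return blocks * Fraction(dw_scale, dw_rate)

def true_time(chunk_sizes, dw_rate, dw_scale):
    """Real playing time when every chunk holds exactly one MP3 frame."""
    return len(chunk_sizes) * Fraction(dw_scale, dw_rate)

chunks = [418] * 100
estimated = time_by_total_bytes(chunks, 1152, 44100, 1152)
actual = true_time(chunks, 44100, 1152)
print(float(estimated), float(actual))
```

For these 100 frames the byte-based estimate comes out at roughly 0.94 s while the audio really lasts about 2.61 s, so a tool working this way drifts out of sync almost immediately.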
[22:11] <Belgabor|Home> ok, thats the failure in principle, but why are some files broken?
[22:12] <Suiryc> what files ?
[22:12] <Suiryc> broken ? what do you mean by broken ?
[22:13] <Belgabor|Home> i had some vbr mp3 avis which seemed like having divx3 freeze frames but where ok when demuxed
[22:13] <Suiryc> dunno
[22:13] <Suiryc> maybe a problem with the decoder
[22:14] <Belgabor|Home> ok, well that cleared things up a bit
[22:14] <Belgabor|Home> thx
[22:14] <Suiryc>
[22:14] <Suiryc> btw there may be problems with Nandub code
[22:14] <Suiryc> because :
[22:14] <Suiryc> 1. layer1 streams only have 384 samples per frame
[22:15] <Suiryc> 2. IIRC with very high bitrates an mp3 frame can be higher than 1152 bytes
[22:15] <Suiryc> s/higher/bigger
[22:16] <Suiryc> (the max size is near 2000 bytes IIRC)
[22:16] <Belgabor|Home> ok, so nBlockAlign should be >2000
[22:17] <Suiryc> so depending on the way dividing is used (rounding to floor or ceil or nearest value)
[22:17] <Suiryc> and the max size of a frame, it may find there are 2 frames in a chunk where there is only 1 frame
[22:17] <Belgabor|Home> ok, i got that
[22:18] <Suiryc> but this is for really high bitrates ...
[22:18] <Suiryc> lemme check ...
[22:19] <Belgabor|Home> what would happen if we put two frames in one chunk? aka set dwRate = 2* sample rate and so on?
[22:20] <Belgabor|Home> no, not two, just double the values?
[22:21] <Suiryc> if you double the value the rate of the audio will be changed accordingly
[22:21] <Suiryc> so to keep it correct you would have to put 2 mp3 frames in each chunk
[22:22] <Suiryc> but then you would most likely go beyond the 1152 bytes per chunk
[22:22] <Suiryc> and increase the chances to generate out of synch problems
[22:23] <Belgabor|Home> let me rethink
[22:24] <Suiryc> changing dwRate and dwScale only affects the rate of the stream
[22:25] <Suiryc> multiplying dwRate by 2 => audio play 2 times faster
[22:25] <Suiryc> multiplying dwScale by 2 => audio play 2 times slower
[22:25] <Belgabor|Home> if we double dwrate, dwscale, nblockalign and dwsamplesize?
[22:25] <Suiryc> multipyling both => no change
[22:25] <Suiryc> dwSampleSize is set to 0
[22:26] <Belgabor|Home> ah ok, so skip that
[22:26] <Suiryc> (dwRate, dwScale) and nBlockAlign are not linked
[22:26] <Suiryc> you can use a higher value in nBlockAlign
[22:27] <Suiryc> (like the 2304 I tested)
[22:27] <Belgabor|Home> nvertheless, if we double all three, shouldnt it be safe for larger mp3 frames?
[22:27] <Suiryc> this won't change anything in the case of WMP because something lower than 1152 divided by 1152 or 2304 will still be rounded to 0
[22:27] <Suiryc> Belgabor|Home : this would be safer
[22:28] <Suiryc> but would cause even more troubles in apps that don't work the same way than Nandub & WMP
[22:28] <Suiryc> I think some apps sometimes check a value of 1152 to know it was made by Nandub
[22:28] <Belgabor|Home> ok, i see the point
[22:29] <Belgabor|Home> faulty concept stays faulty
[22:33] <Suiryc> k I checked
[22:33] <Suiryc> keeping 1152 shoudln't cause too much problems
[22:33] <Suiryc> for Mpeg1-Layer2/3 the mas is near 1750 bytes long
[22:34] <Suiryc> there could be problems with Mpeg2/2.5-layer2/3
[22:34] <Suiryc> where a 160kbps stream of 8kHz have frames of 2881 bytes long at most
[22:35] <Suiryc> anyway I don't think people use this kind of stream
[22:36] <ChristianHJW> highly unlikely ..
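The frame sizes quoted in the log can be checked with the usual frame length formula for streams that carry 1152 samples per frame. This is a sketch under that assumption: the helper names are invented, and the coefficient 144 (1152 / 8) applies to the 1152-sample case only, which is the one that reproduces the 2881-byte figure above.

```python
def frame_bytes(bitrate_bps, sample_rate_hz, padding=False):
    """Frame length in bytes for a 1152-sample frame (coefficient 144)."""
    return 144 * bitrate_bps // sample_rate_hz + (1 if padding else 0)

def fits_block_align(size_bytes, n_block_align=1152):
    """True if the frame is shorter than nBlockAlign, as the hack relies on."""
    return size_bytes < n_block_align

# (bitrate, sample rate) pairs taken from or suggested by the discussion.
cases = [(128000, 44100), (384000, 32000), (160000, 8000)]
for bitrate, rate in cases:
    largest = frame_bytes(bitrate, rate, padding=True)
    print(bitrate, rate, largest, fits_block_align(largest))
```

A 128 kbps, 44.1 kHz stream gives 417 or 418 bytes per frame, 384 kbps at 32 kHz gives up to 1729 bytes (the "near 1750" case), and 160 kbps at 8 kHz gives up to 2881 bytes. Only the first stays below an nBlockAlign of 1152.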
[22:51] <Suiryc> nite
[22:52] * Suiryc has left #matroska
[/more]
Added: If you don't feel like digging into the text, it is enough to remember the main point: if you don't want to get hit over the head with a dusty sack in a dark alley one evening, don't use VBR when encoding video.