Describe the issue
Related to #929 and pull request #938. Part of the problem appears to come from how create_hls is written and called: it appears to run on the same video ID every time any MP4 encoding finishes. This means that if you have 4-5 different resolutions, create_hls appears to be intentionally running 4-5 times. With huge videos, this results in atrocious amounts of disk I/O. We need to figure out a way to get create_hls to run only after the very last MP4 encoding has finished.
See here for reference, walking up the call stack to where this is coming from:
mediacms/files/tasks.py, line 408 in c5047d8
mediacms/files/models.py, line 642 in c5047d8
mediacms/files/models.py, line 1564 in c5047d8
Note that in the last one, the call is ideally not being made per chunk (which is what #938 should fix), but it is still made on every successful encoding. This means every enabled MP4 encoding profile ultimately results in a call to create_hls if the encoding is successful.
To Reproduce
Steps to reproduce the issue:
1. Make sure your Celery is configured to allow more concurrent tasks than you have MP4 encoding profiles, so that this can all happen concurrently.
2. Upload a video that takes at least an hour to transcode, such as a high-resolution video that is several hours long.
3. Watch your tasks in ps or top. You should see Bento running multiple times, and if your processing is long enough or your disk access slow enough, you should eventually see those Bento processes finish and the cp processes start building up and colliding.
Expected behavior
create_hls should only run once per Media record, when all the MP4 encoding processes for that video have finished.
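One way to express this expected behavior is as a gate that is checked whenever an encoding finishes. The following is only a rough sketch in plain Python, not MediaCMS code: the `Encoding` dataclass, its `status` values, the `chunk` flag, and the `should_create_hls` helper are all hypothetical names chosen for illustration, and the real models may look different.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical status values; assumed here for illustration only.
FINISHED_STATUSES = {"success", "fail"}


@dataclass
class Encoding:
    """Hypothetical stand-in for one MP4 encoding of a Media record."""
    profile: str
    status: str          # e.g. "pending", "running", "success", "fail"
    chunk: bool = False  # True for a partial (chunk) encoding


def should_create_hls(encodings: Iterable[Encoding]) -> bool:
    """Return True only when every full (non-chunk) encoding has finished
    and at least one of them succeeded."""
    full = [e for e in encodings if not e.chunk]
    if not full:
        return False
    # Any encoding still pending or running means it is too early.
    if any(e.status not in FINISHED_STATUSES for e in full):
        return False
    return any(e.status == "success" for e in full)


if __name__ == "__main__":
    in_progress = [
        Encoding("h264-240", "success"),
        Encoding("h264-720", "running"),
    ]
    all_done = [
        Encoding("h264-240", "success"),
        Encoding("h264-720", "success"),
    ]
    print(should_create_hls(in_progress))  # False
    print(should_create_hls(all_done))     # True
```

A check like this does not by itself prevent two encodings that finish at nearly the same moment from both passing the gate, so a real fix would probably also need a lock or an atomic flag per Media record.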
Screenshots
N/A
Environment (please complete the following information):
OS: Ubuntu Linux
Installation method: single server install
Browser, if applicable: N/A
Additional context
N/A
The description is valid. This was done on purpose, without accounting for the extra overhead it adds on large videos (I hadn't noticed that it takes time and resources; I thought it would be a very light process).
@mgogoulos Thanks for reviewing this! So, for an example of what we're dealing with, looking at one video with a duration of about 3 hours (this is not an unusual case - typically around one of these per day and it's legitimate user-created video):
Media record shows size 4174.4MB
HLS directory shows usage of 5.3GB
Searching through the encoded directory for MP4 files matching this video's filename yields 24.1GB of aggregate data
So with five MP4 encode profiles enabled (assuming it's working as intended and is only running create_hls once per successful encoding and not multiple times based on multiple chunks ending and triggering it), you're still looking at somewhere in the range of about 5*(24.1+5.3)=147GB of data transfer. If, instead, it's 31 concurrent instances (one Bento + 30 cp) like in #938 then it'd be somewhere in the range of 911GB, and that was already nearing the end of the encoding process, so we're looking at 4GB video files resulting in well over a terabyte of data transfer to/from the disks.
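For anyone who wants to plug in their own numbers, the estimate above can be reproduced with a few lines of Python. This is just the same back-of-the-envelope arithmetic, assuming each create_hls run touches all of the MP4 data plus the HLS output once; the function name is made up for illustration.

```python
def estimated_transfer_gb(mp4_total_gb: float, hls_gb: float, runs: int) -> float:
    """Rough disk transfer estimate: each run is assumed to read all MP4
    renditions and write the full HLS output once."""
    return runs * (mp4_total_gb + hls_gb)


if __name__ == "__main__":
    mp4_total_gb = 24.1  # aggregate size of the encoded MP4 files
    hls_gb = 5.3         # size of the HLS directory

    # One run per successful encoding profile (5 profiles): about 147 GB
    print(f"{estimated_transfer_gb(mp4_total_gb, hls_gb, 5):.0f} GB")
    # 31 concurrent instances, as seen in #938: about 911 GB
    print(f"{estimated_transfer_gb(mp4_total_gb, hls_gb, 31):.0f} GB")
    # A single run after the last encoding finishes: about 29 GB
    print(f"{estimated_transfer_gb(mp4_total_gb, hls_gb, 1):.0f} GB")
```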
Just want to bring up that there were so many instances of this running concurrently on a recent (very long) video that it OOM'd the Celery instance, which had to be manually restarted.