Old "pre-live" caption ended up in harvested video #51
Comments
Hi @taschmidt, you should be able to delete the text directly in your S3 bucket. The first few VTT caption files in the bucket can be opened as text files, and the caption text is visible inside. Delete that text and re-upload the file. The S3 console makes you download and re-upload files by hand. If you want something easier, a program like CyberDuck or Transmit lets you open a file in a text editor and uploads it automatically when you save. Let me know if that works. video_1.vtt
Then change that to this.
So yeah, that's exactly what I ended up doing, but it was definitely a hassle. Any idea why that stale text showed up in the harvested stream, and how we can avoid this happening every time?
Hi @taschmidt, currently that is the easiest way. You could have a script blank out the first VTT text file in S3. If you empty the first VTT file after the harvest job, the stream should still work as planned. I will think about whether there is an easier, automated way.
To empty
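A minimal sketch of what such a blanking script could do, in Python. It assumes the caption file is a standard WebVTT segment whose header block (the `WEBVTT` line plus any header lines such as `X-TIMESTAMP-MAP`) ends at the first blank line. The function name `blank_vtt_cues` is hypothetical, and the download from and upload to S3 are left out.

```python
def blank_vtt_cues(vtt_text: str) -> str:
    """Return only the WebVTT header block, dropping every cue.

    Hypothetical helper: assumes the header ends at the first blank line
    (or, failing that, at the first cue timing line containing "-->").
    """
    # Normalise Windows line endings so the split below is predictable.
    lines = vtt_text.replace("\r\n", "\n").split("\n")

    header = []
    for line in lines:
        # A blank line or a cue timing line marks the end of the header.
        if line.strip() == "" or "-->" in line:
            break
        header.append(line)

    # Refuse to touch anything that does not look like WebVTT.
    if not header or not header[0].lstrip("\ufeff").startswith("WEBVTT"):
        raise ValueError("input does not start with a WEBVTT header")

    # Header lines, then the blank line that terminates the header.
    return "\n".join(header) + "\n\n"
```

The header is kept rather than writing a zero-byte file, on the assumption that players expect every segment to remain a valid WebVTT document. The result could then be written back to the bucket with whatever S3 client you already use.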
Right, but is there a reason why the stream started with an old caption? When we run a MediaPackage harvest job, what controls what subtitles are used? Will the stream always start with the last thing said no matter how long ago?
Hi @taschmidt. Some background: everything that gets sent to AWS MediaPackage, including the captions, is saved when a VOD asset is created, and there is no way to edit what has already been ingested into MediaPackage. To remove the initial caption you mention and get the behavior you want, I believe reducing the TTL for items in the DynamoDB table that is created may have the desired effect. You will have to test this. What is happening is that if someone says something before the stream starts, it gets saved to DynamoDB, and the Lambda@Edge function that inserts the captions keeps that caption up on the screen. With a lower TTL in DynamoDB, older captions get cleared out faster, and you would not see them on the screen when watching the AWS MediaPackage HLS stream. There could be other ways to achieve this in Lambda@Edge, such as removing the caption from DynamoDB after Lambda@Edge has used it. Putting this in the backlog for now.
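To make the TTL idea concrete, here is a rough Python sketch of how a caption item could carry an expiry time and how a reader could ignore expired items. DynamoDB's TTL feature expects a Number attribute holding a Unix epoch time in seconds, and it deletes expired items in the background rather than at the exact expiry moment, so a read-side check like `is_live` may still be needed. The attribute names (`text`, `created_at`, `expires_at`), the function names, and the 120-second value are all assumptions, not what this project actually uses.

```python
import time

# Assumed value for illustration; the project's real TTL may differ.
TTL_SECONDS = 120


def build_caption_item(text, now=None, ttl_seconds=TTL_SECONDS):
    """Build a caption record with a TTL attribute in epoch seconds.

    Hypothetical item shape: "expires_at" stands in for whatever
    attribute the table's TTL setting points at.
    """
    now = time.time() if now is None else now
    return {
        "text": text,
        "created_at": int(now),
        "expires_at": int(now) + ttl_seconds,
    }


def is_live(item, now=None):
    """Return True while the item has not yet reached its expiry time.

    Useful on the read side because expired items can remain readable
    until the background TTL process removes them.
    """
    now = time.time() if now is None else now
    return item["expires_at"] > now
```

With a check like this, the function that inserts captions could skip anything already past its expiry even if the table has not purged it yet.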
Looks like the current TTL is 10 minutes? (link) I'm wondering whether lowering it would actually help, since the comments we would like excluded could have been spoken as little as 10 seconds before the start time. Even if we cut it in half to 5 minutes, that comment would still appear, right? I'm not terribly familiar with the logic in the edge lambda, but is there some logic that could prevent "in process" captions from being sent (i.e. those where the start time is BEFORE the current time)? Or if that's not doable, it looks like our VTTs start at zero. Could that first one be omitted if the start time is prior to that?
Hi @taschmidt, I didn't see the reply; sorry for the late response. The Lambda@Edge function takes the newest captions from DynamoDB. So if the TTL is set to 2 minutes, that should clear out any captions older than 2 minutes, if that makes sense.
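The filtering asked about here could look roughly like the Python sketch below: keep only captions whose start time falls inside the window being served, so anything spoken before the window begins is dropped regardless of TTL. The `Caption` record, its fields, and `captions_for_window` are hypothetical; the real edge function may store and select captions differently.

```python
from dataclasses import dataclass


@dataclass
class Caption:
    """Hypothetical caption record; times are epoch seconds."""
    text: str
    start: float
    end: float


def captions_for_window(captions, window_start, window_end):
    """Keep captions that begin inside [window_start, window_end).

    A caption that started before the window is excluded even if it is
    still "in process" (its end time falls inside the window).
    """
    return [c for c in captions if window_start <= c.start < window_end]


# Example: a remark made 10 seconds before going live is filtered out.
pre_live = Caption("Don't ask me hard questions", start=90.0, end=95.0)
on_air = Caption("Welcome, everyone", start=101.0, end=104.0)
kept = captions_for_window([pre_live, on_air], window_start=100.0, window_end=106.0)
```

The trade-off is that a sentence genuinely straddling a segment boundary would lose its on-screen carry-over, so this may only be appropriate for the first segment of a harvest.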
We have an automated post-process step that runs a MediaPackage harvest job. The subtitles embedded in the resulting harvested HLS stream included someone's comment that was said before we went live (a couple of minutes before, according to one of our editors):
Note the undesired "Don't ask me hard questions" comment. How do we avoid this? I can't even think of an easy way to scrub this comment since it's now embedded in several of the *.TS files in our S3 bucket.