This is the intended e2e pipeline of RBPN on RnB, and this issue is relevant to #44.
There are several issues to consider/discuss for implementation.
Step-by-step Pipeline explanation
1. Load sampled segments of videos on the first runner (Loader0).
Multiple runners for loading videos can exist, but only one is shown here. The mechanism/rule for sampling frames can be discussed further; for now, we sample frames based on the intended number of segments. For example, if a video has 300 frames, we want 5 segments, and we use 3 former and 3 latter neighboring frames, the following frame indices (1-indexed) will be sampled: 1~63 (to make SR of frames 1-60), 58~123 (to make SR of frames 61-120), 118~183, 178~243, and 238~300.
The first and last three frames, which lack former/latter neighbors, have to duplicate themselves when generating optical flows. This detail of the sampler will be implemented soon.
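As a rough illustration, here is a minimal Python sketch of how the sampler could compute these padded frame ranges (the function name and signature are hypothetical, not existing code):

```python
def sample_segment_ranges(num_frames, num_segments, num_neighbors):
    """Compute 1-indexed (start, end) frame ranges per segment, padded by
    `num_neighbors` frames on each side and clamped to the video boundaries.
    Boundary frames that lack neighbors are duplicated later, when the
    optical flows are generated."""
    seg_len = num_frames // num_segments  # e.g. 300 // 5 = 60
    ranges = []
    for i in range(num_segments):
        target_start = i * seg_len + 1
        target_end = (i + 1) * seg_len
        start = max(1, target_start - num_neighbors)
        end = min(num_frames, target_end + num_neighbors)
        ranges.append((start, end))
    return ranges

# sample_segment_ranges(300, 5, 3)
# => [(1, 63), (58, 123), (118, 183), (178, 243), (238, 300)]
```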
FYI: the current e2e RBPN, which is yet to be ported to RnB, does not use the NVVL sampler; instead, it loads all frames into memory at once and retrieves the necessary segments (1 segment = 7 frames, used to SR one target image) by indexing.
The sampled segments will then be put into a queue for the next runner to retrieve and process.
2. Execute FlowNet2.0 and RBPN
There can be multiple runners, each running the two neural networks (FlowNet & RBPN) sequentially as long as there is an item (a sampled LR segment) in the queue. This step can later be modified for further optimization (e.g., running FlowNet & RBPN concurrently instead of sequentially). The runner process in this step receives a sampled LR segment and outputs the corresponding SR images.
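A minimal sketch of such a runner loop (the queue item layout `(video_id, segment_idx, lr_frames)` and the model call signatures are assumptions, not the actual RnB or model APIs):

```python
import torch

def runner_loop(in_queue, out_queue, flownet, rbpn):
    """Pull sampled LR segments from the queue and run FlowNet and RBPN
    sequentially on each one. The item layout and model signatures are
    hypothetical placeholders."""
    while True:
        item = in_queue.get()
        if item is None:  # sentinel: no more segments
            break
        video_id, segment_idx, lr_frames = item
        with torch.no_grad():
            flows = flownet(lr_frames)          # optical flows between targets and neighbors
            sr_frames = rbpn(lr_frames, flows)  # super-resolve the segment's target frames
        out_queue.put((video_id, segment_idx, sr_frames))
```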
3. Encode SR frames to video
This step also needs more discussion, as there might be a better way of gathering SR images per video. Ideally, we want to gather the SR images generated from different segments of the same video, encode those frames into the respective video, and save each video to disk storage.
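One possible sketch, assuming the hypothetical `(video_id, segment_idx, sr_frames)` layout from the runner sketch above, pipes raw frames into ffmpeg in segment order (the frame rate, pixel format, and codec here are placeholder choices):

```python
import subprocess

def encode_video(sr_segments, out_path, width, height, fps=25):
    """Encode the gathered SR segments of one video with ffmpeg.
    `sr_segments` maps segment_idx -> list of raw RGB24 frames (bytes)."""
    cmd = [
        "ffmpeg", "-y",
        "-f", "rawvideo", "-pix_fmt", "rgb24",
        "-s", f"{width}x{height}", "-r", str(fps),
        "-i", "-",                    # read raw frames from stdin
        "-c:v", "libx264", out_path,
    ]
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    for idx in sorted(sr_segments):   # restore original segment order
        for frame in sr_segments[idx]:
            proc.stdin.write(frame)
    proc.stdin.close()
    proc.wait()
```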
ISSUES ON DATASET
The SR task is normally tested on the Vid4 dataset, which contains only four videos, each already extracted into frames (roughly 30 to 40 frames per video). In order to accept video as input, I have manually encoded the frames into videos using ffmpeg (a step-by-step how-to is described in Haabibi/RBPN-PyTorch). Since Vid4 does not contain enough videos or frames per video, we could consider other datasets for testing, to better measure the latency and throughput effects of RnB.
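For reference, the encoding command was along these lines (the frame-name pattern, frame rate, and output name are illustrative; the exact steps are documented in Haabibi/RBPN-PyTorch):

```
ffmpeg -framerate 25 -i %03d.png -c:v libx264 -pix_fmt yuv420p calendar.mp4
```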
ISSUES WITH NVVL LOADING VIDEO FRAMES
Would it be possible to stream segments as frames are sampled with NVVL, so that the downstream runners are kept occupied?
ISSUES WITH SEGMENT-INDEX
Segments need to be identifiable at the end of the pipeline, when frames from different segments are compiled, so that the relevant frames can be encoded into the right video.
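For example, the hypothetical `(video_id, segment_idx)` pair attached to each queue item in the sketches above could serve as that identifier; the encoding step would then sort each video's segments by `segment_idx` before writing frames.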
Abigail's To Do List
Port model-specific scripts into the RnB repo
Rewrite the sampler logic: frame-based -> segment-based
Let a runner execute two models (FlowNet2 & RBPN) sequentially after receiving a segment from a queue
(Effectively) Compile video segments and encode them into a video
Any suggestions/comments/further discussion on design & implementation are welcome!