
Improve FrameInfo handling #8099

Merged
merged 11 commits on Nov 12, 2024
3 changes: 2 additions & 1 deletion crates/store/re_video/src/decode/av1.rs
@@ -253,9 +253,10 @@ fn create_frame(debug_name: &str, picture: &dav1d::Picture) -> Result<Frame> {
format,
},
info: FrameInfo {
is_sync: None, // TODO(emilk)
presentation_timestamp: Time(picture.timestamp().unwrap_or(0)),
duration: Time(picture.duration()),
..Default::default()
latest_decode_timestamp: None,
},
})
}
7 changes: 7 additions & 0 deletions crates/store/re_video/src/decode/ffmpeg_h264/ffmpeg.rs
@@ -93,6 +93,11 @@ impl From<Error> for crate::decode::Error {
/// ffmpeg does not tell us the timestamp/duration of a given frame, so we need to remember it.
#[derive(Clone)]
struct FfmpegFrameInfo {
/// The start of a new [`crate::demux::GroupOfPictures`]?
///
/// If true, an entire frame can be decoded from this one sample,
Member:
I thought for a very long time that this was true. But now I'm not so sure anymore that we are guaranteed to get a frame back from a decoder when we enqueue such a sample.
I would have expected PTS == DTS on those, but that part I now know for sure is not the case.

One subtle problem with this statement that I'm a bit more sure about: while we do get a sample → frame mapping, a frame may still need meta information from other samples even though it is supposed to be an IDR frame.
For instance in AV1, the information we currently append to our annex_b stream in the ffmpeg decoder is dispersed over the samples of the video, and I have no idea if there are any guarantees at all.

With that in mind I'd be in favor of leaving this out of the doc comment & tooltip (this affects a bunch of places where it is copy-pasted) until we understand better what we're dealing with.

Member Author:
FWIW, this is what Claude 3.5 believes:

[image]

…but I'm sure it's mostly trained on common misconceptions

Member:
I've had mixed success with LLMs in this area, and by now I've been burned too often by believing the simple answers :/

Member:
Not a strong conviction here, as described.
What bugs me about all this is that `is_sync` is derived in quite a strange way in the mp4 crate, which I don't follow; I'd rather have per-codec definitions that are easier to look up :(

Member:
We obviously depend on it meaning something, though, since it governs our enqueuing strategy.

Member:
Maybe just slap a "We generally assume" in front of it 🤷

/// otherwise it needs the context of other samples.
is_sync: bool,
presentation_timestamp: Time,
duration: Time,
decode_timestamp: Time,
@@ -575,6 +580,7 @@ fn read_ffmpeg_output(
format: pixel_format.clone(),
},
info: FrameInfo {
is_sync: Some(frame_info.is_sync),
presentation_timestamp: frame_info.presentation_timestamp,
duration: frame_info.duration,
latest_decode_timestamp: Some(frame_info.decode_timestamp),
@@ -690,6 +696,7 @@ impl AsyncDecoder for FfmpegCliH264Decoder {
// We send the information about this chunk first.
// Chunks are defined to always yield a single frame.
let frame_info = FfmpegFrameInfo {
is_sync: chunk.is_sync,
presentation_timestamp: chunk.presentation_timestamp,
decode_timestamp: chunk.decode_timestamp,
duration: chunk.duration,
28 changes: 11 additions & 17 deletions crates/store/re_video/src/decode/mod.rs
@@ -215,6 +215,9 @@ pub fn new_decoder(
/// For details on how to interpret the data, see [`crate::Sample`].
pub struct Chunk {
/// The start of a new [`crate::demux::GroupOfPictures`]?
///
/// If true, an entire frame can be decoded from this one sample,
/// otherwise it needs the context of other samples.
pub is_sync: bool,

pub data: Vec<u8>,
@@ -250,17 +253,18 @@ pub type FrameContent = webcodecs::WebVideoFrame;
/// Meta information about a decoded video frame, as reported by the decoder.
#[derive(Debug, Clone)]
pub struct FrameInfo {
/// The presentation timestamp of the frame.
/// The start of a new [`crate::demux::GroupOfPictures`]?
///
/// Decoders are required to report this.
/// A timestamp of [`Time::MAX`] indicates that the frame is invalid or not yet available.
/// If true, an entire frame can be decoded from this one sample,
/// otherwise it needs the context of other samples.
///
/// None indicates that the information is not available.
pub is_sync: Option<bool>,

/// The presentation timestamp of the frame.
pub presentation_timestamp: Time,

/// How long the frame is valid.
///
/// Decoders are required to report this.
/// A duration of [`Time::MAX`] indicates that the frame is invalid or not yet available.
// Implementation note: unlike with the presentation timestamp, we may be fine with making this optional.
Wumpf marked this conversation as resolved.
pub duration: Time,

/// The decode timestamp of the last chunk that was needed to decode this frame.
@@ -269,16 +273,6 @@ pub struct FrameInfo {
pub latest_decode_timestamp: Option<Time>,
}

impl Default for FrameInfo {
fn default() -> Self {
Self {
presentation_timestamp: Time::MAX,
duration: Time::MAX,
latest_decode_timestamp: None,
}
}
}

impl FrameInfo {
/// Presentation timestamp range in which this frame is valid.
pub fn presentation_time_range(&self) -> std::ops::Range<Time> {
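The mod.rs change above removes the sentinel-based `Default` impl (which used `Time::MAX`) and makes `is_sync` an `Option<bool>`, so every call site must now state each field explicitly. A minimal, self-contained sketch of the resulting shape, with `Time` simplified to an `i64` wrapper (everything not in the diff is an assumption):

```rust
/// Simplified stand-in for re_video's `Time` (an integer in the video's timescale).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Time(i64);

impl std::ops::Add for Time {
    type Output = Time;
    fn add(self, rhs: Time) -> Time {
        Time(self.0 + rhs.0)
    }
}

#[derive(Debug, Clone)]
struct FrameInfo {
    /// `None` means the decoder did not report whether this frame starts a GOP,
    /// replacing the old sentinel-based `Default`.
    is_sync: Option<bool>,
    presentation_timestamp: Time,
    duration: Time,
    latest_decode_timestamp: Option<Time>,
}

impl FrameInfo {
    /// Presentation timestamp range in which this frame is valid.
    fn presentation_time_range(&self) -> std::ops::Range<Time> {
        self.presentation_timestamp..self.presentation_timestamp + self.duration
    }
}

fn main() {
    // A decoder that reports neither sync status nor DTS (e.g. the av1/webcodecs paths):
    let info = FrameInfo {
        is_sync: None,
        presentation_timestamp: Time(100),
        duration: Time(33),
        latest_decode_timestamp: None,
    };
    assert_eq!(info.presentation_time_range(), Time(100)..Time(133));
}
```

Dropping `Default` trades a little verbosity for the compiler catching any call site that forgets to think about `is_sync`.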
3 changes: 2 additions & 1 deletion crates/store/re_video/src/decode/webcodecs.rs
@@ -204,9 +204,10 @@ fn init_video_decoder(
on_output(Ok(Frame {
content: WebVideoFrame(frame),
info: FrameInfo {
is_sync: None,
presentation_timestamp,
duration,
..Default::default()
latest_decode_timestamp: None,
},
}));
}) as Box<dyn Fn(web_sys::VideoFrame)>)
5 changes: 4 additions & 1 deletion crates/store/re_video/src/demux/mod.rs
@@ -457,7 +457,10 @@ impl GroupOfPictures {
/// > The decoding of each access unit results in one decoded picture.
#[derive(Debug, Clone)]
pub struct Sample {
/// Is t his the start of a new [`GroupOfPictures`]?
/// Is this the start of a new [`GroupOfPictures`]?
///
/// If true, an entire frame can be decoded from this one sample,
/// otherwise it needs the context of other samples.
pub is_sync: bool,

/// Time at which this sample appears in the decoded bitstream, in time units.
3 changes: 2 additions & 1 deletion crates/utils/re_log/src/setup.rs
@@ -39,7 +39,8 @@ pub fn setup_logging() {
use std::io::Write as _;
writeln!(
buf,
"{}:{} {}",
"{} {}:{} {}",
record.level(),
Comment on lines +42 to +43
Member:
what is this? I'm lacking context

Member Author:
context

Member:
I meant:

  • why does it make things better (before/after)?
  • what does this have to do with anything in this PR?

Member Author:
I like to flip that switch so I can see where a log message comes from.
Before I couldn't see the log level when I did so. Now I can.

[image]

Member Author:
It was completely unrelated to the PR, except that I wanted to debug "what the hell is wrong with that annoying log message".

record.file().unwrap_or_default(),
record.line().unwrap_or_default(),
record.args()
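The `setup.rs` tweak above just prepends the log level to each formatted line. A tiny sketch of the before/after format strings, with hypothetical stand-in values for env_logger's `Record` fields:

```rust
// Sketch of the format-string change in setup_logging(); the values passed in
// here are hypothetical stand-ins for `record.level()`, `record.file()`, etc.
fn format_line_before(file: &str, line: u32, args: &str) -> String {
    // Old format: file:line, then the message - the level was invisible.
    format!("{}:{} {}", file, line, args)
}

fn format_line_after(level: &str, file: &str, line: u32, args: &str) -> String {
    // New format: level first, so it survives when the file/line prefix is enabled.
    format!("{} {}:{} {}", level, file, line, args)
}

fn main() {
    assert_eq!(format_line_before("src/lib.rs", 42, "hello"), "src/lib.rs:42 hello");
    assert_eq!(
        format_line_after("WARN", "src/lib.rs", 42, "hello"),
        "WARN src/lib.rs:42 hello"
    );
}
```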
53 changes: 36 additions & 17 deletions crates/viewer/re_data_ui/src/video.rs
@@ -166,14 +166,16 @@ pub fn show_decoded_frame_info(
}) => {
re_ui::list_item::list_item_scope(ui, "decoded_frame_ui", |ui| {
let default_open = false;
ui.list_item_collapsible_noninteractive_label(
"Current decoded frame",
default_open,
|ui| {
frame_info_ui(ui, &frame_info, video.data());
source_image_data_format_ui(ui, &source_pixel_format);
},
);
if let Some(frame_info) = frame_info {
ui.list_item_collapsible_noninteractive_label(
"Current decoded frame",
default_open,
|ui| {
frame_info_ui(ui, &frame_info, video.data());
source_image_data_format_ui(ui, &source_pixel_format);
},
);
}
});

let response = crate::image::texture_preview_ui(
@@ -226,14 +228,31 @@ fn samples_statistics_ui(ui: &mut egui::Ui, samples_statistics: &SamplesStatisti
}

fn frame_info_ui(ui: &mut egui::Ui, frame_info: &FrameInfo, video_data: &re_video::VideoData) {
let time_range = frame_info.presentation_time_range();
let FrameInfo {
is_sync,
presentation_timestamp,
duration,
latest_decode_timestamp,
} = *frame_info;

if let Some(is_sync) = is_sync {
ui.list_item_flat_noninteractive(PropertyContent::new("Sync").value_bool(is_sync))
.on_hover_text(
"The start of a new GOP (Group of Pictures)?\n\
\n\
If true, an entire frame can be decoded from this one sample, \
otherwise it needs the context of other samples.",
);
}

let presentation_time_range = presentation_timestamp..presentation_timestamp + duration;
ui.list_item_flat_noninteractive(PropertyContent::new("Time range").value_text(format!(
"{} - {}",
re_format::format_timestamp_seconds(time_range.start.into_secs_since_start(
re_format::format_timestamp_seconds(presentation_time_range.start.into_secs_since_start(
video_data.timescale,
video_data.samples_statistics.minimum_presentation_timestamp
)),
re_format::format_timestamp_seconds(time_range.end.into_secs_since_start(
re_format::format_timestamp_seconds(presentation_time_range.end.into_secs_since_start(
video_data.timescale,
video_data.samples_statistics.minimum_presentation_timestamp
)),
@@ -255,7 +274,7 @@ fn frame_info_ui(ui: &mut egui::Ui, frame_info: &FrameInfo, video_data: &re_vide
}
}

if let Some(dts) = frame_info.latest_decode_timestamp {
if let Some(dts) = latest_decode_timestamp {
ui.list_item_flat_noninteractive(
PropertyContent::new("DTS").value_fn(value_fn_for_time(dts, video_data)),
)
@@ -264,7 +283,7 @@ fn frame_info_ui(ui: &mut egui::Ui, frame_info: &FrameInfo, video_data: &re_vide
}

ui.list_item_flat_noninteractive(
PropertyContent::new("PTS").value_fn(value_fn_for_time(frame_info.presentation_timestamp, video_data)),
PropertyContent::new("PTS").value_fn(value_fn_for_time(presentation_timestamp, video_data)),
)
.on_hover_text("Raw presentation timestamp prior to applying the timescale.\n\
This specifies the time at which the frame should be shown relative to the start of a video stream.");
@@ -277,11 +296,11 @@ fn frame_info_ui(ui: &mut egui::Ui, frame_info: &FrameInfo, video_data: &re_vide
.has_sample_highest_pts_so_far
.as_ref()
{
if let Some(sample_idx) = video_data
.latest_sample_index_at_presentation_timestamp(frame_info.presentation_timestamp)
if let Some(sample_idx) =
video_data.latest_sample_index_at_presentation_timestamp(presentation_timestamp)
{
ui.list_item_flat_noninteractive(
PropertyContent::new("Highest PTS so far").value_text(has_sample_highest_pts_so_far[sample_idx].to_string())
PropertyContent::new("Highest PTS so far").value_bool(has_sample_highest_pts_so_far[sample_idx])
                ).on_hover_text("Whether the presentation timestamp (PTS) at this frame is the highest encountered so far. If false, there are lower PTS values earlier in the list.");
}
}
@@ -290,7 +309,7 @@ fn frame_info_ui(ui: &mut egui::Ui, frame_info: &FrameInfo, video_data: &re_vide
// Information about the current group of pictures this frame is part of.
// Lookup via decode timestamp is faster, but it may not always be available.
if let Some(gop_index) =
video_data.gop_index_containing_presentation_timestamp(frame_info.presentation_timestamp)
video_data.gop_index_containing_presentation_timestamp(presentation_timestamp)
{
ui.list_item_flat_noninteractive(
PropertyContent::new("GOP index").value_text(gop_index.to_string()),
15 changes: 9 additions & 6 deletions crates/viewer/re_renderer/src/video/chunk_decoder.rs
@@ -58,7 +58,7 @@ impl VideoChunkDecoder {
Err(err) => {
// Many of the errors we get from a decoder are recoverable.
// They may be very frequent, but it's still useful to see them in the debug log for troubleshooting.
re_log::debug!("Error during decoding of {debug_name}: {err}");
re_log::debug_once!("Error during decoding of {debug_name}: {err}");

let err = VideoPlayerError::Decoding(err);
let mut output = decoder_output.lock();
@@ -80,7 +80,7 @@ }
}

/// Start decoding the given chunk.
pub fn decode(&mut self, chunk: Chunk, _is_keyframe: bool) -> Result<(), VideoPlayerError> {
pub fn decode(&mut self, chunk: Chunk) -> Result<(), VideoPlayerError> {
self.decoder.submit_chunk(chunk)?;
Ok(())
}
@@ -119,9 +119,12 @@ impl VideoChunkDecoder {

let frame_time_range = frame.info.presentation_time_range();

if frame_time_range.contains(&presentation_timestamp)
&& video_texture.frame_info.presentation_time_range() != frame_time_range
{
let is_up_to_date = video_texture
.frame_info
.as_ref()
.is_some_and(|info| info.presentation_time_range() == frame_time_range);

if frame_time_range.contains(&presentation_timestamp) && !is_up_to_date {
#[cfg(target_arch = "wasm32")]
{
video_texture.source_pixel_format = copy_web_video_frame_to_texture(
@@ -139,7 +142,7 @@ impl VideoChunkDecoder {
)?;
}

video_texture.frame_info = frame.info.clone();
video_texture.frame_info = Some(frame.info.clone());
}

Ok(())
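The chunk_decoder change routes the "is the texture already up to date?" check through `Option::is_some_and`, since `frame_info` can now be absent. A simplified sketch of that logic (the types are stand-ins, not the actual re_renderer structs):

```rust
// Simplified stand-in for `FrameInfo`, using plain i64 timestamps.
#[derive(Clone, Debug)]
struct FrameInfo {
    presentation_timestamp: i64,
    duration: i64,
}

impl FrameInfo {
    fn presentation_time_range(&self) -> std::ops::Range<i64> {
        self.presentation_timestamp..self.presentation_timestamp + self.duration
    }
}

/// Mirrors the new check: a texture with no frame info yet is never up to date;
/// otherwise it is up to date iff its frame covers exactly the new range.
fn is_up_to_date(current: Option<&FrameInfo>, new_range: &std::ops::Range<i64>) -> bool {
    current.is_some_and(|info| info.presentation_time_range() == *new_range)
}

fn main() {
    let current = FrameInfo { presentation_timestamp: 0, duration: 33 };
    assert!(is_up_to_date(Some(&current), &(0..33))); // same frame: skip the copy
    assert!(!is_up_to_date(None, &(0..33)));          // nothing decoded yet
    assert!(!is_up_to_date(Some(&current), &(33..66))); // different frame: copy
}
```

Compared to the old version, the `None` case falls out naturally instead of relying on a `Default` frame info with sentinel timestamps.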
11 changes: 2 additions & 9 deletions crates/viewer/re_renderer/src/video/mod.rs
@@ -1,7 +1,7 @@
mod chunk_decoder;
mod player;

use std::{collections::hash_map::Entry, ops::Range, sync::Arc};
use std::{collections::hash_map::Entry, sync::Arc};

use ahash::HashMap;
use parking_lot::Mutex;
@@ -64,14 +64,7 @@ pub struct VideoFrameTexture {
pub source_pixel_format: SourceImageDataFormat,

/// Meta information about the decoded frame.
pub frame_info: re_video::decode::FrameInfo,
}

impl VideoFrameTexture {
pub fn presentation_time_range(&self) -> Range<re_video::Time> {
self.frame_info.presentation_timestamp
..self.frame_info.presentation_timestamp + self.frame_info.duration
}
pub frame_info: Option<re_video::decode::FrameInfo>,
}

/// Identifier for an independent video decoding stream.
48 changes: 24 additions & 24 deletions crates/viewer/re_renderer/src/video/player.rs
@@ -41,7 +41,7 @@ impl TimedDecodingError {
/// A texture of a specific video frame.
pub struct VideoTexture {
pub texture: GpuTexture2D,
pub frame_info: FrameInfo,
pub frame_info: Option<FrameInfo>,
pub source_pixel_format: SourceImageDataFormat,
}

@@ -101,7 +101,7 @@ impl VideoPlayer {

video_texture: VideoTexture {
texture,
frame_info: FrameInfo::default(),
frame_info: None,
source_pixel_format: SourceImageDataFormat::WgpuCompatible(
wgpu::TextureFormat::Rgba8Unorm,
),
@@ -136,28 +136,25 @@ impl VideoPlayer {
.min(self.data.duration + self.data.samples_statistics.minimum_presentation_timestamp); // Don't seek past the end of the video.

let error_on_last_frame_at = self.last_error.is_some();
let result = self.frame_at_internal(render_ctx, presentation_timestamp, video_data);
self.frame_at_internal(render_ctx, presentation_timestamp, video_data)?;

match result {
Ok(()) => {
let is_active_frame = self
.video_texture
.frame_info
.presentation_time_range()
.contains(&presentation_timestamp);
let frame_info = self.video_texture.frame_info.clone();

if let Some(frame_info) = frame_info {
let time_range = frame_info.presentation_time_range();
let is_active_frame = time_range.contains(&presentation_timestamp);

let is_pending = !is_active_frame;
if is_pending && error_on_last_frame_at {

let show_spinner = if is_pending && error_on_last_frame_at {
// If we switched from error to pending, clear the texture.
// This is important to avoid flickering, in particular when switching from
// benign errors like DecodingError::NegativeTimestamp.
// If we don't do this, we see the last valid texture which can look really weird.
clear_texture(render_ctx, &self.video_texture.texture);
self.video_texture.frame_info = FrameInfo::default();
}

let time_range = self.video_texture.frame_info.presentation_time_range();
let show_spinner = if presentation_timestamp < time_range.start {
self.video_texture.frame_info = None;
true
} else if presentation_timestamp < time_range.start {
// We're seeking backwards and somehow forgot to reset.
true
} else if presentation_timestamp < time_range.end {
Expand All @@ -170,17 +167,21 @@ impl VideoPlayer {
true // Very old frame - show spinner
}
};

Ok(VideoFrameTexture {
texture: self.video_texture.texture.clone(),
is_pending,
show_spinner,
frame_info: self.video_texture.frame_info.clone(),
frame_info: Some(frame_info),
source_pixel_format: self.video_texture.source_pixel_format,
})
} else {
Ok(VideoFrameTexture {
texture: self.video_texture.texture.clone(),
is_pending: true,
show_spinner: true,
frame_info: None,
source_pixel_format: self.video_texture.source_pixel_format,
})
}

Err(err) => Err(err),
}
}

@@ -324,10 +325,9 @@ impl VideoPlayer {

let samples = &self.data.samples[gop.decode_time_range()];

for (i, sample) in samples.iter().enumerate() {
for sample in samples {
let chunk = sample.get(video_data).ok_or(VideoPlayerError::BadData)?;
let is_keyframe = i == 0;
self.chunk_decoder.decode(chunk, is_keyframe)?;
self.chunk_decoder.decode(chunk)?;
}

Ok(())
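The restructured `frame_at` logic in player.rs boils down to: no frame info yet means always show the spinner, otherwise compare the requested timestamp against the frame's presentation range. A hedged sketch with simplified types and a hypothetical staleness threshold (the real code clears the texture and resets `frame_info` in the error-to-pending case, which is omitted here):

```rust
/// Approximation of the spinner decision; `frame_range` is the decoded frame's
/// presentation time range, or `None` if nothing has been decoded yet.
fn show_spinner(
    frame_range: Option<std::ops::Range<i64>>,
    presentation_timestamp: i64,
    had_error_on_last_frame: bool,
) -> bool {
    let Some(range) = frame_range else {
        return true; // no frame decoded yet
    };
    let is_pending = !range.contains(&presentation_timestamp);
    if is_pending && had_error_on_last_frame {
        true // switched from error to pending: clear the texture and spin
    } else if presentation_timestamp < range.start {
        true // seeking backwards and the texture wasn't reset
    } else if presentation_timestamp < range.end {
        false // the shown frame covers this timestamp
    } else {
        // A slightly stale frame is fine; a very old one gets a spinner.
        // The threshold is a made-up stand-in for the real heuristic.
        presentation_timestamp - range.end > 33
    }
}

fn main() {
    assert!(show_spinner(None, 0, false));           // nothing decoded yet
    assert!(!show_spinner(Some(0..100), 50, false)); // frame is current
    assert!(show_spinner(Some(0..100), 200, false)); // frame is very old
}
```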