This repository has been archived by the owner on Jan 7, 2023. It is now read-only.

Pipeline with 'mfxh264dec' gives larger end-to-end latency than 'avdec_h264' #174

Open
Robbie-Juelich opened this issue Jun 3, 2019 · 1 comment

Comments

@Robbie-Juelich

Problem:

With the same H264 RTP sender stream, the receiver pipeline with 'mfxh264dec' shows higher latency (by 30~60 ms) than the pipeline with 'avdec_h264'.

In my application, end-to-end latency is critical.

Since mfxh264dec, with HW acceleration, uses less CPU than avdec_h264, it would be great if mfxh264dec could give (almost) the same latency as avdec_h264.

Platform:
Win10, VS2017, gstreamer-1.16, Media SDK 2017
gstreamer MSDK plugin from:
https://github.com/intel/gstreamer-media-SDK.git
(Built with OpenGL support using VS2017)

Note that 'mfxh264dec' is built WITH OpenGL support:

C:\Users\UI>gst-inspect-1.0 --gst-plugin-load=C:\gst_MSDK_intel\gst-mfx-build_msvc\gst\mfx\gstmfx.dll mfxh264dec
Factory Details:
  Rank                     primary + 3 (259)
  Long-name                MFX H264 decoder
  Klass                    Codec/Decoder/Video
  Description              An MFX-based H264 video decoder
  Author                   Ishmael Sameen[email protected]

Plugin Details:
  Name                     mfx
  Description              MFX encoder/decoder/video post-processing plugins
  Filename                 C:\gst_MSDK_intel\gst-mfx-build_msvc\gst\mfx\gstmfx.dll
  Version                  2.0.2
  License                  LGPL
  Source module            gst_mfx
  Binary package           gst_mfx
  Origin URL               http://www.intel.com

GObject
 +----GInitiallyUnowned
       +----GstObject
             +----GstElement
                   +----GstVideoDecoder
                         +----GstMfxDec_h264

Pad Templates:
  SINK template: 'sink'
    Availability: Always
    Capabilities:
      video/x-h264
              alignment: au
                profile: { (string)constrained-baseline, (string)baseline, (string)main, (string)high }
          stream-format: byte-stream
  SRC template: 'src'
    Availability: Always
    Capabilities:
      video/x-raw(memory:MFXSurface)
                 format: { (string)NV12, (string)BGRA, (string)P010_10LE, (string)YUY2, (string)ENCODED }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw(memory:GLMemory)
                 format: { (string)RGBA }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]
      video/x-raw
                 format: { (string)NV12, (string)BGRA, (string)P010_10LE, (string)YUY2 }
                  width: [ 1, 2147483647 ]
                 height: [ 1, 2147483647 ]
              framerate: [ 0/1, 2147483647/1 ]

Here are sender and receiver pipelines:

  1. H264 RTP sender pipeline on Nvidia TX2:

gst-launch-1.0 v4l2src device=/dev/video0 ! 'video/x-raw, format=UYVY, framerate=30/1' ! \
nvvidconv ! 'video/x-raw(memory:NVMM),format=I420, framerate=30/1' ! \
omxh264enc bitrate=$bitrate MeasureEncoderLatency=true ! 'video/x-h264, stream-format=(string)byte-stream' ! \
rtph264pay ! udpsink host=$ip port=$port sync=false async=false

Using wireshark, I confirmed that the sps.pic_order_cnt_type is 2.
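For anyone who wants to check this field without Wireshark, the following is a minimal Python sketch that reads pic_order_cnt_type from a raw SPS NAL unit. It assumes the NAL unit is supplied without a start code (as in a single-NAL RTP payload), and it deliberately gives up on SPS scaling lists. The names `BitReader`, `strip_emulation_prevention` and `pic_order_cnt_type` are hypothetical helpers, not part of GStreamer or Media SDK.

```python
# Sketch only: reads pic_order_cnt_type from an H.264 SPS NAL unit.
# Assumption: input is one SPS NAL unit with no start code prefix.

# Profiles whose SPS carries chroma/bit-depth fields before log2_max_frame_num_minus4.
HIGH_PROFILES = {100, 110, 122, 244, 44, 83, 86, 118, 128, 138, 139, 134, 135}


def strip_emulation_prevention(data: bytes) -> bytes:
    """Drop each 0x03 byte that follows two consecutive 0x00 bytes."""
    out = bytearray()
    zeros = 0
    for b in data:
        if zeros >= 2 and b == 0x03:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


class BitReader:
    """Reads bits MSB-first from a byte string."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def u(self, n: int) -> int:
        """Unsigned integer of n bits."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def ue(self) -> int:
        """Unsigned Exp-Golomb code, ue(v)."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)


def pic_order_cnt_type(sps_nal: bytes) -> int:
    if sps_nal[0] & 0x1F != 7:
        raise ValueError("not an SPS NAL unit (nal_unit_type != 7)")
    r = BitReader(strip_emulation_prevention(sps_nal[1:]))
    profile_idc = r.u(8)
    r.u(8)  # constraint_set flags and reserved bits
    r.u(8)  # level_idc
    r.ue()  # seq_parameter_set_id
    if profile_idc in HIGH_PROFILES:
        if r.ue() == 3:  # chroma_format_idc
            r.u(1)       # separate_colour_plane_flag
        r.ue()           # bit_depth_luma_minus8
        r.ue()           # bit_depth_chroma_minus8
        r.u(1)           # qpprime_y_zero_transform_bypass_flag
        if r.u(1):       # seq_scaling_matrix_present_flag
            raise NotImplementedError("scaling lists are not handled in this sketch")
    r.ue()  # log2_max_frame_num_minus4
    return r.ue()
```

As a usage example, `pic_order_cnt_type(bytes.fromhex("6742001fd8"))` returns 2. Those five bytes are a hand-built baseline-profile SPS prefix for illustration, not data captured from the stream in this issue.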

  2. Receiver pipeline with avdec_h264:

gst-launch-1.0 udpsrc port=5000 caps="application/x-rtp, encoding-name=H264,payload=96" ! ^
rtph264depay ! h264parse ! avdec_h264 ! glimagesink sync=false async=false

  3. Receiver pipeline with mfxh264dec:

gst-launch-1.0 --gst-plugin-load=C:\gst_MSDK_intel\gst-mfx-build_msvc\gst\mfx\gstmfx.dll udpsrc port=5000 caps="application/x-rtp, encoding-name=H264,payload=96" ! ^
rtph264depay ! h264parse ! mfxh264dec live-mode=true ! glimagesink sync=false async=false

I found that this latency topic has already been discussed here:
https://software.intel.com/en-us/forums/intel-media-sdk/topic/704136
In that thread, Dmitry E. (Intel) suggested possible solutions in his reply of Thu, 06/22/2017 - 14:37.

With 'live-mode=true', the plugin already applies those settings. See gst-libs/mfx/gstmfxdecoder.c, lines 333~339:

if (live_mode) {
    decoder->params.AsyncDepth = 1;
    decoder->bs.DataFlag = MFX_BITSTREAM_COMPLETE_FRAME;
    /* Hack for H264 low-latency streaming */
    if (decoder->params.mfx.CodecId == MFX_CODEC_AVC)
        decoder->params.mfx.DecodedOrder = 1;
}

And remember sender's sps.pic_order_cnt_type == 2.

So ALL possible solutions have been applied,
but the latency is still there.
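To put the 30~60 ms figure in perspective, it can be expressed in frame intervals. The short Python sketch below assumes the 30/1 framerate from the sender caps; `latency_in_frames` is a hypothetical helper used only for this arithmetic.

```python
from fractions import Fraction


def latency_in_frames(latency_ms, framerate=Fraction(30, 1)):
    """Convert a latency in milliseconds to a number of frame intervals."""
    frame_interval_ms = 1000 / framerate  # 100/3 ms per frame at 30 fps
    return float(latency_ms / frame_interval_ms)


for ms in (30, 60):
    print(f"{ms} ms = {latency_in_frames(ms):.1f} frame intervals")
```

At 30 fps the extra latency comes to roughly one to two frame intervals. That would be consistent with one or two frames being held somewhere in the receiver pipeline, although the arithmetic alone does not show where.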

Any suggestions?

@ishmael1985
Contributor

@Robbie-Juelich indeed all options needed for low-latency decoding have been set for your scenario, can you try with mfxsink and let me know of the latency? Using glimagesink incurs an additional NV12->RGBA CSC operation which may contribute to the latency.
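To illustrate what the NV12->RGBA colour space conversion mentioned above involves, here is a minimal CPU-side Python sketch. It assumes BT.601 limited-range coefficients and a tightly packed frame (stride equal to width); the actual matrix and the GPU path used by glimagesink may differ, and `nv12_to_rgba` is a hypothetical name.

```python
def _clamp(x: float) -> int:
    """Round and clamp to the 0..255 byte range."""
    return max(0, min(255, round(x)))


def nv12_to_rgba(width: int, height: int, nv12: bytes) -> bytes:
    """Convert one tightly packed NV12 frame to RGBA (assumed BT.601, limited range)."""
    if width % 2 or height % 2:
        raise ValueError("NV12 needs even width and height")
    y_size = width * height
    if len(nv12) < y_size * 3 // 2:
        raise ValueError("buffer too small for an NV12 frame")
    y_plane = nv12[:y_size]
    uv_plane = nv12[y_size:y_size * 3 // 2]  # interleaved U,V at half resolution
    out = bytearray(y_size * 4)
    for row in range(height):
        for col in range(width):
            uv = (row // 2) * width + (col // 2) * 2
            c = y_plane[row * width + col] - 16
            d = uv_plane[uv] - 128      # U
            e = uv_plane[uv + 1] - 128  # V
            i = (row * width + col) * 4
            out[i] = _clamp(1.164 * c + 1.596 * e)                   # R
            out[i + 1] = _clamp(1.164 * c - 0.392 * d - 0.813 * e)   # G
            out[i + 2] = _clamp(1.164 * c + 2.017 * d)               # B
            out[i + 3] = 255                                         # A
    return bytes(out)
```

Every output pixel needs a luma read, a shared chroma read and a small matrix multiply, and that is the additional per-frame step the decoder output goes through before glimagesink can display it.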
