AV1 frames can't be decoded in chromium (if the browser runs in linux, works properly on windows) #299

Open
3DRX opened this issue Mar 8, 2025 · 7 comments
Assignees
Labels
help wanted Extra attention is needed

Comments

@3DRX

3DRX commented Mar 8, 2025

Your environment.

sender

	github.com/pion/mediadevices v0.7.2-0.20250305024045-959d53708012 // a version that uses the latest pion/rtp and pion/webrtc/v4
	github.com/pion/rtp v1.8.12
	github.com/pion/webrtc/v4 v4.0.13

receiver

Chromium M134, which supports AV1 decoding in WebRTC, on both Windows and Linux.

What did you do?

I'm building a game streaming server whose client runs in the browser. I use ffmpeg (go-astiav) with h264_nvenc and av1_nvenc to encode the video; I wrote a custom mediadevices codec module that implements the interfaces in pion/mediadevices, so I can use it with GetUserMedia. H264 works perfectly, but when I switch the codec to AV1, the client running Chromium on Linux (Debian 12) can only decode the first frame. I've verified that the SDP and signaling work as expected, since a data channel in the same peer connection works normally. When I switch the client to Chrome on Windows, AV1 decoding just works.

This is super weird. I'm not sure whether this is a problem in pion or in Chromium, since I'm not familiar with the AV1 decoder implementation in libwebrtc.

What did you expect?

Decoding shouldn't differ between platforms. Using console.log(RTCRtpReceiver.getCapabilities("video")), I can confirm that Chromium on both Linux and Windows reports AV1 decode support in RTCRtpReceiver.

I'm happy to provide any related information and code if needed. If anyone has a clue about what the problem could be, please help! Also, thanks to @JoeTurki for the AV1 support!

3DRX changed the title from "AV1 frames can't be decoded in chromium (only if the client it linux)" to "AV1 frames can't be decoded in chromium (if the browser runs in linux, works properly on windows)" on Mar 8, 2025
@JoeTurki
Member

JoeTurki commented Mar 8, 2025

Hello @3DRX
Can you share a media file? I would be happy to test. This is most likely a platform issue, though. Do you by any chance use HW decoding? Also, what are your encoder settings? Maybe try setting the GOP size to 2s; some hardware decoders struggled with larger GOP sizes while I was testing.

@JoeTurki JoeTurki self-assigned this Mar 8, 2025
@3DRX
Author

3DRX commented Mar 8, 2025

Hi @JoeTurki, thanks for the reply! For the encoder settings, I've tried several GOP sizes, from 1s to 3s, but the problem persists. Here is the code for my encoder settings.

	codec := astiav.FindEncoderByName(params.codecName)
	if codec == nil {
		return nil, fmt.Errorf("codec not found: %s", params.codecName)
	}

	codecCtx := astiav.AllocCodecContext(codec)
	if codecCtx == nil {
		return nil, fmt.Errorf("failed to allocate codec context")
	}

	// Configure codec context
	codecCtx.SetWidth(p.Width)
	codecCtx.SetHeight(p.Height)
	codecCtx.SetTimeBase(astiav.NewRational(1, int(p.FrameRate)))
	codecCtx.SetFramerate(codecCtx.TimeBase().Invert())
	codecCtx.SetPixelFormat(astiav.PixelFormat(astiav.PixelFormatCuda))
	codecCtx.SetBitRate(int64(params.BitRate))
	codecCtx.SetGopSize(params.KeyFrameInterval)
	codecOptions := codecCtx.PrivateData().Options()
	switch params.codecName {
	case "av1_nvenc":
		codecCtx.SetProfile(astiav.Profile(astiav.ProfileAv1Main))
		codecOptions.Set("tier", "0", 0)
	}
	codecOptions.Set("zerolatency", "1", 0)
	codecOptions.Set("delay", "0", 0)
	codecOptions.Set("tune", "ull", 0)
	codecOptions.Set("preset", "p1", 0)
	codecOptions.Set("rc", "cbr", 0)

By the way, I just found out that Chromium on macOS also doesn't properly decode AV1 sent by pion, so for now only Chromium on Windows works in my case.

As for the media file, do you mean an IVF file created by ivfwriter? I noticed that the input to IVFWriter's WriteRTP function is an *rtp.Packet, but on the send side with pion/mediadevices it's not straightforward for me to access every *rtp.Packet produced. Is there an easier way to reproduce this, such as using the codec module with a default camera driver to create a send side?
Here is the implementation of my codec module using ffmpeg:

package ffmpeg

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"log/slog"
	"os"
	"sync"

	"github.com/asticode/go-astiav"
	"github.com/pion/mediadevices/pkg/codec"
	"github.com/pion/mediadevices/pkg/io/video"
	"github.com/pion/mediadevices/pkg/prop"
	"github.com/pion/rtp/codecs"
	"github.com/pion/webrtc/v4"
)

type Params struct {
	codec.BaseParams
	codecName string
}

type encoder struct {
	codec          *astiav.Codec
	codecCtx       *astiav.CodecContext
	hwFramesCtx    *astiav.HardwareFramesContext
	frame          *astiav.Frame
	hwFrame        *astiav.Frame
	packet         *astiav.Packet
	width          int
	height         int
	r              video.Reader
	nextIsKeyFrame bool

	// for stats
	statsItemChan chan StatsItem

	mu     sync.Mutex
	closed bool
}

type H264Params struct {
	Params
}

func NewH264Params() (H264Params, error) {
	return H264Params{
		Params: Params{
			codecName: "h264_nvenc",
		},
	}, nil
}

// RTPCodec represents the codec metadata
func (p *H264Params) RTPCodec() *codec.RTPCodec {
	return codec.NewRTPH264Codec(90000)
}

func (p *H264Params) BuildVideoEncoder(r video.Reader, property prop.Media) (codec.ReadCloser, error) {
	readCloser, err := newEncoder(r, property, p.Params)
	if err != nil {
		slog.Error("failed to create new encoder", "error", err)
		return nil, err
	}
	slog.Info("successfully created new encoder")
	return readCloser, nil
}

type H265Params struct {
	Params
}

func NewH265Params() (H265Params, error) {
	return H265Params{
		Params: Params{
			codecName: "hevc_nvenc",
		},
	}, nil
}

func (p *H265Params) RTPCodec() *codec.RTPCodec {
	return NewRTPH265Codec(90000)
}

func (p *H265Params) BuildVideoEncoder(r video.Reader, property prop.Media) (codec.ReadCloser, error) {
	readCloser, err := newEncoder(r, property, p.Params)
	if err != nil {
		slog.Error("failed to create new encoder", "error", err)
		return nil, err
	}
	slog.Info("successfully created new encoder")
	return readCloser, nil
}

type AV1Params struct {
	Params
}

func NewAV1Params() (AV1Params, error) {
	return AV1Params{
		Params: Params{
			codecName: "av1_nvenc",
		},
	}, nil
}

func (p *AV1Params) RTPCodec() *codec.RTPCodec {
	return NewRTPAV1Codec(90000)
}

func (p *AV1Params) BuildVideoEncoder(r video.Reader, property prop.Media) (codec.ReadCloser, error) {
	readCloser, err := newEncoder(r, property, p.Params)
	if err != nil {
		slog.Error("failed to create new encoder", "error", err)
		return nil, err
	}
	slog.Info("successfully created new encoder")
	return readCloser, nil
}

func NewRTPH265Codec(clockrate uint32) *codec.RTPCodec {
	return &codec.RTPCodec{
		RTPCodecParameters: webrtc.RTPCodecParameters{
			RTPCodecCapability: webrtc.RTPCodecCapability{
				MimeType:     webrtc.MimeTypeH265,
				ClockRate:    clockrate,
				Channels:     0,
				SDPFmtpLine:  "",
				RTCPFeedback: nil,
			},
			PayloadType: 125,
		},
		Payloader: &codecs.H265Payloader{},
	}
}

func NewRTPAV1Codec(clockrate uint32) *codec.RTPCodec {
	return &codec.RTPCodec{
		RTPCodecParameters: webrtc.RTPCodecParameters{
			RTPCodecCapability: webrtc.RTPCodecCapability{
				MimeType:     webrtc.MimeTypeAV1,
				ClockRate:    clockrate,
				Channels:     0,
				SDPFmtpLine:  "level-idx=5;profile=0;tier=0",
				RTCPFeedback: nil,
			},
			PayloadType: 100,
		},
		Payloader: &codecs.AV1Payloader{},
	}
}

func newEncoder(r video.Reader, p prop.Media, params Params) (*encoder, error) {
	if p.FrameRate == 0 {
		slog.Warn("frame rate is 0, setting to 90")
		p.FrameRate = 90
	}
	slog.Info("creating new encoder", "params", params, "props", p)
	astiav.SetLogLevel(astiav.LogLevel(astiav.LogLevelDebug))

	hwDevice, err := astiav.CreateHardwareDeviceContext(
		astiav.HardwareDeviceType(astiav.HardwareDeviceTypeCUDA),
		"/dev/dri/card1",
		nil,
		0,
	)
	if err != nil {
		return nil, fmt.Errorf("failed to create device: %w", err)
	}

	codec := astiav.FindEncoderByName(params.codecName)
	if codec == nil {
		return nil, fmt.Errorf("codec not found: %s", params.codecName)
	}

	codecCtx := astiav.AllocCodecContext(codec)
	if codecCtx == nil {
		return nil, fmt.Errorf("failed to allocate codec context")
	}

	// Configure codec context
	codecCtx.SetWidth(p.Width)
	codecCtx.SetHeight(p.Height)
	codecCtx.SetTimeBase(astiav.NewRational(1, int(p.FrameRate)))
	codecCtx.SetFramerate(codecCtx.TimeBase().Invert())
	codecCtx.SetPixelFormat(astiav.PixelFormat(astiav.PixelFormatCuda))
	codecCtx.SetBitRate(int64(params.BitRate))
	codecCtx.SetGopSize(params.KeyFrameInterval)
	codecOptions := codecCtx.PrivateData().Options()
	switch params.codecName {
	case "av1_nvenc":
		codecCtx.SetProfile(astiav.Profile(astiav.ProfileAv1Main))
		codecOptions.Set("tier", "0", 0)
	}
	codecOptions.Set("zerolatency", "1", 0)
	codecOptions.Set("delay", "0", 0)
	codecOptions.Set("tune", "ull", 0)
	codecOptions.Set("preset", "p1", 0)
	codecOptions.Set("rc", "cbr", 0)
	// codecOptions.Set("cbr", "1", 0)
	// codecOptions.Set("qp", "5", 0)
	for i, li := range codecOptions.List() {
		fmt.Printf("li %d: %s\n", i, li.Name())
	}

	// Create hardware frames context
	hwFramesCtx := astiav.AllocHardwareFramesContext(hwDevice)
	if hwFramesCtx == nil {
		hwDevice.Free()
		return nil, fmt.Errorf("failed to allocate hw frames context")
	}

	// Set hardware frames context parameters
	hwFramesCtx.SetWidth(p.Width)
	hwFramesCtx.SetHeight(p.Height)
	hwFramesCtx.SetHardwarePixelFormat(astiav.PixelFormat(astiav.PixelFormatCuda))
	hwFramesCtx.SetSoftwarePixelFormat(astiav.PixelFormat(astiav.PixelFormatRgba))
	hwFramesCtx.SetInitialPoolSize(20)

	err = hwFramesCtx.Initialize()
	if err != nil {
		return nil, fmt.Errorf("failed to initialize hw frames context: %w", err)
	}
	codecCtx.SetHardwareFramesContext(hwFramesCtx)

	// Open codec context
	if err := codecCtx.Open(codec, nil); err != nil {
		codecCtx.Free()
		return nil, fmt.Errorf("failed to open codec context: %w", err)
	}

	softwareFrame := astiav.AllocFrame()
	if softwareFrame == nil {
		codecCtx.Free()
		return nil, fmt.Errorf("failed to allocate frame")
	}

	softwareFrame.SetWidth(p.Width)
	softwareFrame.SetHeight(p.Height)
	softwareFrame.SetPixelFormat(astiav.PixelFormat(astiav.PixelFormatRgba))

	err = softwareFrame.AllocBuffer(0)
	if err != nil {
		return nil, fmt.Errorf("failed to allocate software buffer: %w", err)
	}

	hardwareFrame := astiav.AllocFrame()

	err = hardwareFrame.AllocHardwareBuffer(hwFramesCtx)
	if err != nil {
		return nil, fmt.Errorf("failed to allocate hardware buffer: %w", err)
	}

	packet := astiav.AllocPacket()
	if packet == nil {
		softwareFrame.Free()
		codecCtx.Free()
		return nil, fmt.Errorf("failed to allocate packet")
	}

	statsItemChan := make(chan StatsItem, 100)
	go StatsThread(statsItemChan)

	return &encoder{
		codec:          codec,
		codecCtx:       codecCtx,
		hwFramesCtx:    hwFramesCtx,
		frame:          softwareFrame,
		hwFrame:        hardwareFrame,
		packet:         packet,
		width:          p.Width,
		height:         p.Height,
		r:              r,
		nextIsKeyFrame: false,
		statsItemChan:  statsItemChan,
	}, nil
}

func (e *encoder) Controller() codec.EncoderController {
	return e
}

func (e *encoder) Read() ([]byte, func(), error) {
	e.mu.Lock()
	defer e.mu.Unlock()

	if e.closed {
		return nil, func() {}, io.EOF
	}

	img, release, err := e.r.Read()
	if err != nil {
		return nil, func() {}, err
	}
	defer release()

	if e.nextIsKeyFrame {
		e.frame.SetPictureType(astiav.PictureType(astiav.PictureTypeI))
		e.hwFrame.SetPictureType(astiav.PictureType(astiav.PictureTypeI))
		e.nextIsKeyFrame = false
	} else {
		e.frame.SetPictureType(astiav.PictureType(astiav.PictureTypeNone))
		e.hwFrame.SetPictureType(astiav.PictureType(astiav.PictureTypeNone))
	}

	err = e.frame.Data().FromImage(img)
	if err != nil {
		return nil, func() {}, fmt.Errorf("failed to copy image data: %w", err)
	}

	err = e.frame.TransferHardwareData(e.hwFrame)
	if err != nil {
		return nil, func() {}, err
	}

	// Send frame to encoder
	if err := e.codecCtx.SendFrame(e.hwFrame); err != nil {
		return nil, func() {}, fmt.Errorf("failed to send frame: %w", err)
	}

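	// With the zero-latency settings above, the encoder is expected to return
	// one packet per submitted frame. Retrying ReceivePacket without sending
	// more input cannot make progress, so EAGAIN and EOF are reported to the
	// caller instead of spinning while holding the mutex.
	if err = e.codecCtx.ReceivePacket(e.packet); err != nil {
		if errors.Is(err, astiav.ErrEof) {
			return nil, func() {}, io.EOF
		}
		return nil, func() {}, fmt.Errorf("failed to receive packet: %w", err)
	}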

	e.statsItemChan <- StatsItem{
		FrameSize: e.packet.Size(),
	}
	data := make([]byte, e.packet.Size())
	copy(data, e.packet.Data())
	e.packet.Unref()

	return data, func() {}, nil
}

// ForceKeyFrame forces the next frame to be encoded as a keyframe
func (e *encoder) ForceKeyFrame() error {
	e.mu.Lock()
	defer e.mu.Unlock()
	slog.Info("forcing key frame")
	e.nextIsKeyFrame = true
	return nil
}

// SetBitrate updates the encoder's bitrate
func (e *encoder) SetBitrate(bitrate int64) error {
	e.mu.Lock()
	defer e.mu.Unlock()

	e.codecCtx.SetBitRate(bitrate)

	return nil
}

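func (e *encoder) Close() error {
	e.mu.Lock()
	defer e.mu.Unlock()

	// Make Close idempotent and let Read return io.EOF afterwards.
	if e.closed {
		return nil
	}
	e.closed = true

	if e.packet != nil {
		e.packet.Free()
	}
	if e.frame != nil {
		e.frame.Free()
	}
	if e.hwFrame != nil {
		e.hwFrame.Free()
	}
	if e.codecCtx != nil {
		e.codecCtx.Free()
	}
	if e.statsItemChan != nil {
		close(e.statsItemChan)
	}
	return nil
}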

type StatsItem struct {
	FrameSize int
}

func StatsThread(frameSizeChan chan StatsItem) {
	// open file for writing
	f, err := os.Create("frame_size.csv")
	if err != nil {
		panic(err)
	}
	defer f.Close()
	w := bufio.NewWriter(f)
	// Deferred calls run in reverse order, so this flush happens before Close.
	defer w.Flush()
	w.WriteString("frame_size\n")
	index := 0
	// Ranging over the channel ends the goroutine once the encoder closes it,
	// instead of spinning on the zero values a closed channel yields.
	for statsItem := range frameSizeChan {
		if statsItem.FrameSize == 0 {
			// empty packets carry no information, so skip them
			continue
		}
		// slog.Info("frame size", "size", frameSize)
		_, err := w.WriteString(fmt.Sprintf("%d\n", statsItem.FrameSize))
		if err != nil {
			slog.Error("failed to write frame size to file", "error", err)
		}
		if index%270 == 0 {
			w.Flush()
		}
		index++
	}
}
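
A side note on the SDPFmtpLine used in NewRTPAV1Codec above: the level-idx parameter is the AV1 seq_level_idx, which maps to a level X.Y through idx = (X-2)*4 + Y. The sketch below decodes the value from the fmtp string so it can be compared against the resolution and frame rate actually being encoded. The helper names parseFmtp and av1LevelName are hypothetical, and whether the signaled level has anything to do with the decode failure reported here is not established in this thread.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseFmtp splits an SDP fmtp parameter string such as
// "level-idx=5;profile=0;tier=0" into a key/value map.
// Hypothetical helper, not part of pion.
func parseFmtp(line string) map[string]string {
	params := make(map[string]string)
	for _, part := range strings.Split(line, ";") {
		key, value, ok := strings.Cut(strings.TrimSpace(part), "=")
		if ok {
			params[strings.ToLower(key)] = value
		}
	}
	return params
}

// av1LevelName maps an AV1 seq_level_idx to its "X.Y" level name using
// X = 2 + idx/4 and Y = idx%4.
func av1LevelName(levelIdx int) string {
	return fmt.Sprintf("%d.%d", 2+levelIdx/4, levelIdx%4)
}

func main() {
	params := parseFmtp("level-idx=5;profile=0;tier=0")
	idx, err := strconv.Atoi(params["level-idx"])
	if err != nil {
		fmt.Println("no usable level-idx:", err)
		return
	}
	fmt.Printf("level-idx=%d is AV1 level %s\n", idx, av1LevelName(idx))
}
```

For the fmtp line in this issue, level-idx=5 corresponds to AV1 level 3.1.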

@3DRX
Author

3DRX commented Mar 8, 2025

I just found an example here that might be a good way to write an encoded video track to a file. I will have a look and perhaps record a file tomorrow. Again, huge thanks @JoeTurki!

JoeTurki added the "help wanted" (Extra attention is needed) label on Mar 8, 2025
@JoeTurki
Member

JoeTurki commented Mar 8, 2025

@3DRX This is most likely not related to Pion; it looks like an encoder settings issue. Unfortunately, I don't have access to an av1_nvenc encoder right now to test your code and find the correct settings. Also, if you disable Chrome hardware acceleration with nvenc on Windows, does it stop working? I think it will.

For starters, can you try PixelFormatYUV420P instead of PixelFormatRgba?

By media file, I mean any video file encoded with your encoder, in any container. Can you send me one?

@3DRX
Author

3DRX commented Mar 9, 2025

@JoeTurki I agree, this doesn't seem to be a pion bug, since the video works in some cases. Perhaps I need to search for related issues in the ffmpeg and Chromium communities.

I tried PixelFormatYUV420P instead of PixelFormatRgba, but it didn't make any difference (which is expected since, according to ffmpeg's -h output, av1_nvenc supports rgba).

Image

I also tried hevc_nvenc and h264_nvenc with the same code (only the codec name differs), and both work fine.

As for turning off hardware acceleration with nvenc on Windows and providing a video file, I'm going to try both on Monday.

@JoeTurki
Member

JoeTurki commented Mar 9, 2025

@3DRX Can you try something simple like the play-from-disk-renegotiation example, maybe with a command like this to generate the video?

ffmpeg -f lavfi -t 10 -i testsrc2=size=1920x1080:rate=30 -c:v av1_nvenc -g 30 -b:v 2M output.ivf

> I tried PixelFormatYUV420P instead of PixelFormatRgba but it didn't make any difference (should be expected since according to ffmpeg's -h output, av1_nvenc supports rgba)

I think Chromium playback is designed around YUV 420P; I'm not sure it will play nicely with RGBA.

I might have access to an av1_nvenc encoder next month. Then I'll be able to test your code!

@3DRX
Author

3DRX commented Mar 24, 2025

Hi, sorry for the late reply; other things came up unexpectedly and kept me from continuing work on this.
I just came across this thread https://groups.google.com/g/discuss-webrtc/c/pSq-QXvjYgQ/m/cMOyRGYJBQAJ?utm_medium=email&utm_source=footer&pli=1 in the Chromium WebRTC discussion group, and it looks like that's the problem in my case.
