Skip to content

Conversation

@res2k
Copy link
Contributor

@res2k res2k commented Jun 10, 2025

This adds ReSTIR light sampling, based on https://github.com/zyanidelab/Q2RTX.

It's essentially #475 but some additional flourishes:

I also tried if having separate shaders for ReSTIR vs RIS makes any difference - but it didn't seem to significantly impact performance, so I kept things as they were originally.

Visuals:

  • In places with noisy direct lighting, ReSTIR visible reduces the noise. (q2dm5, (-605.3, -310.2, 430.2) (0.999, -0.041, -0.026) is one such spot.)
  • However, with ReSTIR enabled I observed visible dark "smears" when previously occluded pixels are revealed. This can be seen when looking at "revealed" wall pixels when the view weapon is moved, or when turning the camera. (Perhaps some unfortunate interaction with the denoiser?) Does not occur with RIS.
  • Also notably, 232c47c combines static and dynamic lights into a single list. This has the potential to degrade visuals in certain circumstances - but practically I didn't notice any issues.

Performance:
I measured frametimes using the timedemos crusher, demo1, demo2 on a 5070 TI at 3800x2000 and a 3070 Laptop at 2560x1600. pt_num_bounces was set to 0.

5070 TI:

demo master branch frametime restir branch, pt_restir 1 percentage of master restir branch, pt_restir 0 percentage of master
crusher 9.551 ms 9.593 ms 100.4% 9.331 ms 97.7%
demo1 8.926 ms 9.065 ms 101.5% 8.954 ms 100.3%
demo2 9.844 ms 9.393 ms 95.4% 9.328 ms 94.8%

3070 Laptop:

demo master branch frametime restir branch, pt_restir 1 percentage of master restir branch, pt_restir 0 percentage of master
crusher 14.942 ms 15.179 ms 101.6% 14.830 ms 99.3%
demo1 14.174 ms 14.167 ms 99.9% 14.179 ms 100.0%
demo2 14.966 ms 15.043 ms 100.5% 15.049 ms 100.6%

Since we're dealing with frametimes, lower is better.

My thoughts:
Performance-wise, there's no issue (not much of a change compared to the current light sampling), and visually, it really helps in some previously noisy areas - so it would be a clear win if there wasn't the increase of "dark smearing" being observed. (Personally, I find the smearing more distracting than the noise issues, mainly because the smearing occurs constantly, while the current light sampling is decently tuned for the Q2 content and noise is not visible that often.) So it's a bit of a "so-so" right now.
But hopefully the smearing can be addressed, though I didn't look into that yet.

@res2k res2k mentioned this pull request Jun 10, 2025
@abalfoort
Copy link
Contributor

Thanks. This I can handle better.
Is there a map where the smearing issue is most visible. I haven't noticed it (built and tested on Debian 12).

@res2k
Copy link
Contributor Author

res2k commented Jun 11, 2025

Is there a map where the smearing issue is most visible. I haven't noticed it (built and tested on Debian 12).

Hmm. It kinda appears on every map. Though, curiously, certain sky-lit surfaces don't seem affected.

Here's a screengrab from q2dm5, near (-605.3, -310.2, 430.2) (0.999, -0.041, -0.026) which I mentioned earlier as a good spot to see improvements:
image

Note that this was made with pt_num_bounces 0. Enabling light bounces might possibly hide the issue somewhat.

@abalfoort
Copy link
Contributor

I see what you mean. It is even visible without setting pt_num_bounces.
I did not see that smearing in #475.

I think it's going to be difficult to track this one down.

@res2k
Copy link
Contributor Author

res2k commented Jun 12, 2025

I did not see that smearing in #475.

Hmm, I can notice a slight amount there. Though that might just the "unavoidable" minimum that occurs while not many samples have been accumulated yet.

Also, we have two generally working but, evidently, slightly differently behaving versions... maybe we can figure something out with the help of the differences.

@abalfoort
Copy link
Contributor

I haven't got much time the next couple of days. But for your convenience I've made a diff between the restir branch and this PR.
restir_477.zip

@abalfoort
Copy link
Contributor

My average C++ knowledge is, unfortunately, not sufficient to solve the smearing issue. I hope that @res2k finds the time to take a look at this.

@res2k
Copy link
Contributor Author

res2k commented Jun 26, 2025

Actually, that last commit: 5f630b3
makes a notable difference. These changes seem to be absent from #475.

@apanteleev
Copy link
Collaborator

Thanks for working on this, @res2k and @abalfoort!

I took a quick look, and I'm generally not opposed to these changes, but they need some polishing. Most importantly, the new lighting pass needs to play well with the denoiser, and currently it doesn't. Even in the starting location of base1, you can see black smearing when looking around, which is not there when pt_restir = 0, and that means the denoiser is not getting the right signals somewhere. Looking at the signal without the denoiser (flt_enable = 0), you can see that the signal becomes somewhat noisier when you enable ReSTIR, which doesn't sound right.

I should be able to help you with these issues, but for that I need to allocate some time and take a closer look.

@res2k
Copy link
Contributor Author

res2k commented Jun 28, 2025

Even in the starting location of base1, you can see black smearing when looking around,

Indeed. One thing I noticed, when using Nsight Graphics, is that the "reservoir" pixels (on PT_RESTIR_x) in the smear area seem to be "invalid", apparently for a relatively long time, and I guess that's why don't get good lighting for these.
Unfortunately, my understanding of ReSTIR is quite limited at this time, so I don't know why that may happen. (I might also be totally wrong.)
(NB: This observation is from q2dm5 as described above, I'm assuming the base1 you noticed is the same.)

Looking at the signal without the denoiser (flt_enable = 0), you can see that the signal becomes somewhat noisier when you enable ReSTIR, which doesn't sound right.

Hmm, notable differences in light selection for ReSTIR:

  • Looks at fewer lights (RESTIR_SAMPLING_M=4, vs MAX_BRUTEFORCE_SAMPLING=8)
  • Doesn't consider light stats

Could those differences be relevant?

I should be able to help you with these issues, but for that I need to allocate some time and take a closer look.

Thanks, your suggestions and input are appreciated!

@abalfoort
Copy link
Contributor

Just to verify if it makes any change at all I set RESTIR_SAMPLING_M=8 and saw far less smearing. I don't know what it does to performance, but playing a bit I did not notice much change.

@res2k
Copy link
Contributor Author

res2k commented Jul 10, 2025

Prodding around I noticed that disabling the use of the "previous reservoir" seemed to fix the smear (in exchange for more noise).
So I've pushed a commit that limits the reuse of the previous reservoir to only when it's similar in position and normal.
This seems to improve things/noticeably reduce the "smear".
Let me know what you think.

@abalfoort
Copy link
Contributor

I tried it (on Debian 12) and I think you really have to look hard to see any "smear" with this commit.
Still far less noise compared to regular STIR and I did not notice any performance hit.
I will check this with the MPs as well, but I don't expect any differences.

For me this PR is ready to merge.
Thanks for you work. Personally, I found it very educational.

@abalfoort
Copy link
Contributor

Holidays are over and I've been testing the MPs with this PR.
Unfortunately, the smearing is still very much present.

@res2k
Copy link
Contributor Author

res2k commented Aug 11, 2025

the MPs

Uh, not sure what "the MPs" means? Could you expand the acronym?

@abalfoort
Copy link
Contributor

Ah, sorry, MP=Mission Pack.
I tested with MP1, The Reckoning; MP2, Ground Zero, and Zaero.

JKSunny added a commit to JKSunny/EternalJK that referenced this pull request Oct 8, 2025
@JKSunny
Copy link

JKSunny commented Oct 8, 2025

Sorry for accidental cross-reference, I though I had removed it during rebase.
~feeling stupid

@abalfoort
Copy link
Contributor

Is there any progress on this change or is it not worth the effort?

@apanteleev
Copy link
Collaborator

No progress from my side.

@res2k
Copy link
Contributor Author

res2k commented Oct 18, 2025

No progress since my last changes to this, either.
But not because I would think this is a worthless endeavour; just that I happened to tinker with other stuff in my free time...

@abalfoort
Copy link
Contributor

Ah, well thanks for the update.

@abalfoort
Copy link
Contributor

I have been playing with these latest commits. The smearing was gone. Sometimes I had some FPS drops but that might be my older GPU (RTX 3060).

@abalfoort
Copy link
Contributor

@res2k How do you test the demos? I know how to use timedemo 1 and load demo demo1.dm2, but not how you did your tests in the OP.

@res2k
Copy link
Contributor Author

res2k commented Nov 9, 2025

@res2k How do you test the demos? I know how to use timedemo 1 and load demo demo1.dm2, but not how you did your tests in the OP.

  • timedemo in Q2RTX has a little secret: when it's argument is the number of runs, making it easy to run demos multiple times in successions to get multiple results.
  • After running a demo multiple times, I usually discard the highest and lowest FPS/smallest and largest runtime before averaging.
  • All that I have automated, with some extent, using a script (below) and a Google sheet /(for the averaging).

Benchmark script to run multiple demos a number of times:

#!/bin/sh

NUM_RUNS=5

while [[ $# -gt 0 ]]; do
  case $1 in
    -r|--runs)
      NUM_RUNS=$2
      shift # past argument
      shift # past value
      ;;
    -*|--*)
      echo "Unknown option $1"
      exit 1
      ;;
    *)
      break
      ;;
  esac
done

for demo in $*; do
    echo "$demo:"
    ./q2rtx.exe +timedemo $NUM_RUNS +set nextserver quit +demo $demo | grep fps
    echo
done

@abalfoort
Copy link
Contributor

abalfoort commented Nov 10, 2025

Thanks. I had to adapt the script a bit, but it helped a lot.
I only checked fps (demo1, demo2 - 5 runs).

pt_restir 0

Average demo1 48,5873973333333 fps
Average demo2 45,6974146666667 fps

pt_restir 1

Average demo1 48,506986 fps
Average demo2 45,9371706666667 fps

Hardly any difference.

@res2k
Copy link
Contributor Author

res2k commented Nov 11, 2025

Thanks. I had to adapt the script a bit, but it helped a lot.
I only checked fps (demo1, demo2 - 5 runs).

Hi, some notes:

  • While frames per second is something we're interested in as humans, it's not a great unit to compare performance. Notably, a difference doesn't really say much - a 10 fps is difference can have a wildly different meaning depending on whether you're starting at 20 fps or 80 fps. Better to look at frametimes (ie "(milli)seconds per frame"). See eg http://www.joshbarczak.com/blog/?p=1248 for some more explanation on the topic.
  • It's also a good idea to note down the used hardware & screen resolution, if only to give some context. Eg if you have a very modern and fast GPU and/or run at low resolution, "hardly any difference" has a different quality than when measuring on older hardware and/or at higher resolutions.

@abalfoort
Copy link
Contributor

So, it is preferred to calculate the milliseconds per frame, exclude the minimum and maximum values of these calculated values and finally calculate the average milliseconds per frame value for each demo. Correct?

@abalfoort
Copy link
Contributor

abalfoort commented Nov 12, 2025

Hardware info  
Operating System SolydXK 13 (Debian Trixie, stable)
KDE Plasma Version 6.3.6
KDE Frameworks Version 6.13.0
Qt Version 6.8.2
Kernel Version 6.12.48+deb13-amd64 (64-bit)
Graphics Platform Wayland
Processors 16 × AMD Ryzen 7 3700X 8-Core Processor
Memory 64 GiB of RAM (62,7 GiB usable)
Graphics Processor NVIDIA GeForce RTX 3060
Manufacturer Gigabyte Technology Co., Ltd.
Product Name B550 AORUS ELITE V2
Nvidia driver 580.105.08
Resolution 2560x1440 @179,96, full screen
demo master branch frametime restir-update branch, pt_restir 0, ms percentage of master restir-update branch, pt_restir 1, ms percentage of master
crusher 21,253 21,216 99,8 21,253 100,0
demo1 20,844 19,908 95,5 20,247 97,1
demo2 22,127 21,472 97,0 21,732 98,2

[EDIT]
This morning I realized that I ran the demos with my own q2config.cfg. So, I decided to run the test again but now with the default q2config.cfg. The performance was noticeably better.

demo master branch frametime restir-update branch, pt_restir 0, ms percentage of master restir-update branch, pt_restir 1, ms percentage of master
crusher 14,182 14,198 100,1 14,258 100,5
demo1 14,852 14,803 99,7 14,682 98,9
demo2 15,444 15,455 100,1 15,444 100,0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants