Hi, thanks for your wonderful sharing. From your code, in all of your learning-based algorithms the total reward is computed based on instance_done, which means the reward covers only that specific instance rather than all of the VNR requests.
Moreover, the reward in learn_singly(...) is always 0. I believe the long-term reward should be accumulated there, since that is the true long-term reward, yet it is always kept at 0.
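To make the distinction concrete, here is a minimal sketch of the two reward-accumulation schemes I am describing. Only instance_done refers to the actual code; the helper functions and reward values below are made up for illustration:

```python
def instance_level_returns(reward_streams):
    """One accumulated reward per VNR instance: the running total is
    reset whenever instance_done fires, so each instance is scored
    in isolation."""
    return [sum(stream) for stream in reward_streams]

def online_level_return(reward_streams):
    """A single long-term reward accumulated across all VNR requests."""
    return sum(sum(stream) for stream in reward_streams)

# Three arriving VNRs, each with its own per-step rewards (dummy values).
streams = [[0.0, 0.0, 1.0], [0.0, -1.0], [0.0, 0.0, 0.0, 1.0]]
print(instance_level_returns(streams))  # [1.0, -1.0, 1.0]  per instance
print(online_level_return(streams))     # 1.0               across all requests
```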
Please feel free to correct me if I am wrong, thanks.
Thank you for your feedback. In our library, we support two types of learning paradigms:
- Instance-level optimization, which focuses on finding the optimal solution for each individual instance as it arrives.
- Online-level optimization, which aims to learn a globally optimal policy that maximizes overall system performance metrics across all instances.
You can find the implementations of both paradigms, including the instance-level and online-level environments, in the rl_solver directory.
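To make the episode boundary concrete, here is a toy Gym-style sketch of how the two paradigms can differ. The class, its attributes, and the reward replay are illustrative assumptions, not the actual rl_solver environments:

```python
class ToyVNREnv:
    """Replays per-step rewards for a sequence of arriving VNR instances.

    This is a toy stand-in for illustration, not the library's API."""

    def __init__(self, reward_streams, online_level=False):
        self.streams = reward_streams      # one reward list per VNR instance
        self.online_level = online_level   # False -> instance-level episodes
        self.i = 0                         # index of the current instance
        self.t = 0                         # step within the current instance

    def step(self, action):
        reward = self.streams[self.i][self.t]
        self.t += 1
        instance_done = self.t == len(self.streams[self.i])
        if instance_done:
            self.i += 1
            self.t = 0
        all_done = self.i == len(self.streams)
        # Instance-level: each VNR is its own episode, so the training
        # return resets at instance_done. Online-level: the episode (and
        # therefore the return) spans the entire request sequence.
        done = (instance_done and all_done) if self.online_level else instance_done
        return None, reward, done, {}
```

In the instance-level setting, done fires at every instance_done, which is why the accumulated reward is per-instance; in the online-level setting it fires only after the final request, yielding the long-term return.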
As you correctly noted, most of the current implementations in our library follow the instance-level paradigm. This choice is based on our empirical analysis and insights: in network systems, the randomness of arriving service requests makes it difficult to learn a robust online-level policy that meets expectations; training tends to take more time and often does not deliver satisfactory performance. Conversely, the instance-level paradigm allows us to efficiently obtain a high-quality solving policy, leading to more reliable and efficient results in practice.