Expose grid.evolve to the C-API #344


Open · wants to merge 34 commits into master

Conversation

@Radonirinaunimi (Member) commented Apr 18, 2025

Using grid.evolve to evolve grid with APFEL++ evolution

In addition to performing the evolution on the fly and outputting theory predictions, APFEL++ also wants the possibility to dump the evolved grids to disk for future usage (aka FK tables). This effectively requires grid.evolve to be exposed through the C interface.

@vbertone, the idea is then for you to provide an operator which is a rank-4 tensor. To be more illustrative, let's consider a simple DIS case. The convolution, for a fixed $Q^2$, writes as:

$$\mathrm{FK}_a\left(x_i ; \mu_0^2\right)=\sum_{k, b, \ell} \alpha_{\mathrm{s}}^{n+k}\left(\mu^2\right) \mathrm{EKO}_{a, i}^{b, \ell} \sigma_b^{(k)}\left(x_\ell, \mu^2\right)$$

where:

  • $a,b$ represent respectively the flavor index of the FK table and grid
  • $i,\ell$ represent respectively the $x$-grid index of the FK table and grid

These form the dimensions of an EKO, i.e. EKO[a][i][b][l]. Notice that the $x$-grid coordinates of the grid and the FK table do not have to be the same, and they do not have to match what is actually stored inside the grid; the same is true for the PID bases.

The C++ function that consumes the operator from APFEL++ would therefore tentatively look as follows (note: this sketch is deprecated, see evolve-grid.cpp):

struct OperatorInfo {
   std::vector<int> pids_in;
   std::vector<double> x_in;
   std::vector<int> pids_out;
   std::vector<double> x_out;
   double q_grid;
   double q0_fk;
   pid_basis fk_pid_basis;
};

void evolve_grid(
   pineappl_grid* grid,
   OperatorInfo op_info,
   std::vector<double> operators, // flattened
   std::vector<double> xi = {1.0, 1.0, 1.0}
) {
   ...
   grid->evolve(slices, order_mask, xi, alphas_table);
}

and this is the function that'll dump the evolved grid. I think this is the best way to do it as opposed to what we discussed during the meeting.

So, to summarize, what APFEL++ needs to provide (for a given value of $Q^2$) is a rank-4 tensor representing the EKO. In addition, we need the $x$ and PID values (which can be in any basis) from which the EKO was computed; this way we don't rely on much information from the grids to construct the EKOs.

Does this make sense?

Extra-changes included in this PR (not fully related):

  • Introduced re-usable action to download/cache data in all of the workflows
  • Exposed a function to get the values of the subgrid nodes (needed if one wants to access the weights of the subgrids)
  • Removed references to objects which are immediately de-referenced by the compiler
  • Improved the get-subgrids.cpp examples

TODO:

  • Find the best way to download the LHCB_WP_7TEV_opt.pineappl.lz4 grid for the test which currently is available at:
wget --no-verbose --no-clobber https://data.nnpdf.science/pineappl/test-data/LHCB_WP_7TEV_opt.pineappl.lz4
  • Infer the order masking in the example from the Grid
  • Find the best way to download the EKO_LHCB_WP_7TEV.txt which contains the flattened evolution operators

@vbertone

Hi @Radonirinaunimi, yes, it does make sense.
So, on my side, it all boils down to providing the rank-4 tensor operator flattened on a std::vector<double>, which I can easily do.
The only question I have is how you need it to be flattened. I assume following the order in EKO[a][i][b][l], but better to make sure.
Thank you!

@Radonirinaunimi (Member Author)

Hi @vbertone, apologies for the late reply; I wanted to sort out the interface first before telling you how the order of the flattening should look.

However, I had not anticipated how difficult the interface would be, so this is not done yet. It might still take some time.

@vbertone

Hi @Radonirinaunimi, no problem at all. Take your time.
I'll kick in once you are done.

@Radonirinaunimi (Member Author)

> Hi @Radonirinaunimi, no problem at all. Take your time. I'll kick in once you are done.

Ok, this is not completely done (it needs some cleanup, fixing a memory leak in the example, etc.), but I think we can start to test and experiment with it.

So there is a primary example in evolve-grid.cpp. The operators should be flattened following the EKO[a][i][b][l] order as you mentioned above, i.e.:

std::size_t index = 0;
for a { for i { for b { for l { eko[index++] = ... } } } }

The b and l indices, which refer to the pids1 and x1 of the grid from which the EKO should be computed, need to be extracted from the grid, for example as follows:

// Get the values of the evolve info parameters. These contain, for example, the
// information on the `x`-grid and `PID` used to interpolate the Grid.
// NOTE: These are used to construct the Evolution Operator
std::vector<double> fac1(evinfo_shape[0]);
std::vector<double> frg1(evinfo_shape[1]);
std::vector<int> pids1(evinfo_shape[2]);
std::vector<double> x1(evinfo_shape[3]);
std::vector<double> ren1(evinfo_shape[4]);
pineappl_grid_evolve_info(grid, order_mask.data(), fac1.data(),
    frg1.data(), pids1.data(), x1.data(), ren1.data());

Please let me know if something is wrong/doesn't make sense.

@Radonirinaunimi (Member Author)

This is now doing what we intended to do.

@vbertone please test it and let me know. You can download the Grid for the test via:

wget --no-verbose --no-clobber https://data.nnpdf.science/pineappl/test-data/LHCB_WP_7TEV_opt.pineappl.lz4

@cschwan This is not super-clean yet and I would appreciate it if you could have a quick look (your feedback is always quite helpful).

@vbertone

Hi @Radonirinaunimi, thanks a lot, I will test it as soon as possible!

@vbertone

Hi @Radonirinaunimi, I just ran the code and it does work fine. Also, starting from here I should be able to follow the steps to eventually interface it to the evolution operator computed with APFEL++. I just have one question. Here is the output I get:

   Bin           Grid        FkTable        reldiff
  ----  -------------  -------------  -------------
     0   7.545911e+02   7.603251e+02   7.598823e-03
     1   6.902834e+02   6.949192e+02   6.715820e-03
     2   6.002520e+02   6.037697e+02   5.860343e-03
     3   4.855224e+02   4.878688e+02   4.832807e-03
     4   3.619546e+02   3.634639e+02   4.169948e-03
     5   2.458669e+02   2.467891e+02   3.750676e-03
     6   1.158685e+02   1.163001e+02   3.724532e-03
     7   2.751727e+01   2.763520e+01   4.285840e-03

So the differences between grid and FK-table predictions approach 1%. Since the grid being used is (I guess) a realistic one and the evolution is essentially the identity, this seems to be a purely numerical effect related, I suppose, to interpolation. The question is: is this accuracy satisfactory? And can it in principle be improved? Thank you.

@Radonirinaunimi (Member Author) commented Apr 29, 2025

Hey @vbertone and @felixhekhorn, there was a bug in the masking of the orders of the example (fixed in b2bf81b). Now the agreement is:

   Bin           Grid        FkTable        reldiff
  ----  -------------  -------------  -------------
     0   7.545911e+02   7.545911e+02  -3.762829e-08
     1   6.902834e+02   6.902834e+02  -3.537025e-08
     2   6.002520e+02   6.002520e+02  -3.293028e-08
     3   4.855224e+02   4.855223e+02  -3.037598e-08
     4   3.619546e+02   3.619545e+02  -2.782860e-08
     5   2.458669e+02   2.458669e+02  -2.526536e-08
     6   1.158685e+02   1.158685e+02  -2.182740e-08
     7   2.751727e+01   2.751727e+01  -1.732084e-08

@vbertone

Thank you, @Radonirinaunimi, this looks much better!

@vbertone

Hi @Radonirinaunimi, I'm having some difficulties understanding the behaviour of the code. Specifically, I would like to change the output PIDs, because evolution can generate channels that are not originally present. To this end, I made the following basic changes to evolve-grid.cpp:

diff --git a/examples/cpp/evolve-grid.cpp b/examples/cpp/evolve-grid.cpp
index 678f42e2..f951f4c4 100644
--- a/examples/cpp/evolve-grid.cpp
+++ b/examples/cpp/evolve-grid.cpp
@@ -118,11 +118,12 @@ int main() {
 
     // ------------------ Construct the Evolution Operator ------------------
     // The Evolution Operator is a vector with length `N_conv * N_Q2_slices * Σ product(OP shape)`
+    std::vector<int> pids2{-6, -5, -4, -3, -2, -1, 21, 1, 2, 3, 4, 5, 6};
     std::vector<double> op_slices;
-    std::size_t flat_len = x1.size() * x1.size() * pids1.size() * pids1.size();
+    std::size_t flat_len = x1.size() * x1.size() * pids1.size() * pids2.size();
     for (std::size_t _i = 0; _i != conv_types.size(); _i++) {
         for (std::size_t j = 0; j != fac1.size(); j++) {
-            std::vector<double> eko = generate_fake_ekos(pids1, x1, pids1, x1);
+            std::vector<double> eko = generate_fake_ekos(pids1, x1, pids2, x1);
             for (std::size_t k = 0; k != flat_len; k++) {
                 op_slices.push_back(eko[k]);
             }
@@ -137,11 +138,11 @@ int main() {
     }
 
     std::vector<double> xi = {1.0, 1.0, 1.0};
-    std::vector<std::size_t> tensor_shape = {pids1.size(), x1.size(), pids1.size(), x1.size()};
+    std::vector<std::size_t> tensor_shape = {pids1.size(), x1.size(), pids2.size(), x1.size()};
 
     pineappl_fk_table* fktable = pineappl_grid_evolve(grid, opinfo_slices.data(),
         max_orders.data(), op_slices.data(), x1.data(),
-        x1.data(), pids1.data(), pids1.data(),
+        x1.data(), pids1.data(), pids2.data(),
         tensor_shape.data(), xi.data(), ren1.data(), alphas_table.data());
 
     // ------------------ Compare Grid & FK after convolution ------------------

The purpose is to allow for all possible channels. However, I must be doing something wrong because when I run the code I get this error that I cannot interpret:

thread '<unnamed>' panicked at pineappl_capi/src/lib.rs:2223:23:
Evolving grid failed: General("operator information (13, 39, 6, 39) does not match the operator's dimensions: (6, 39, 13, 39)")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 5
Abort trap: 6

I also tried to play around with the order of variables here and there, but still get the same error (except that sometimes the two tensor shapes in the error message are swapped). Any idea of what I'm doing wrong? Thank you.

@Radonirinaunimi (Member Author)

Hi @vbertone, the error is due to a mismatch between the EKO tensor_shape that is passed and the actual shape of the operator computed on the Rust side. I can see that, as it is right now, this can be really confusing because it is not clear which {pids, x} goes in (from the grid) and which one goes out (FK table). Let me clean up the example a bit and implement the case you were trying, as that's more interesting.

@Radonirinaunimi (Member Author)

> Hi @Radonirinaunimi, thanks. Yes, I'm looking into the code now and I think that this is generally doable. However, I'm concerned by the fact that the callback function does not allow one to pass any theory settings (other than the convolution type), such as perturbative order, heavy-quark masses, alpha_s, etc., necessary to construct the evolution operator. Moreover, APFEL++ also needs other settings, such as the x and mu grid parameters. One could define these settings externally as global variables, but that does not look ideal to me.

Hi @vbertone! My bad; my thinking, at least for that first proof of concept, was to expose as few objects as possible by abstracting away all the APFEL++-dependent arguments, so that the interface stays portable. I should have at least looked carefully into the APFEL++ example for the crucial objects that have to be passed.

I believe that in 9efc9bc the exposed function now contains the minimal set of arguments needed to construct APFEL++ operators (?), in that all of the other parameters can be wrapped/constructed inside generate_fake_ekos (as usual, you can check the example evolve-grid-identity.cpp for the changes). There is just one minor problem: the lambda function as is needed both by pineappl_grid_evolve (through the alphas_table) and by the building of the DGLAP objects. In principle, as can be recomputed inside generate_fake_ekos, but that'd be doing the same computation twice. However, I don't see a better alternative yet, because avoiding it would involve passing yet another function as an argument to:

/// Type alias for the operator callback
type OperatorCallback = extern "C" fn(
    *const i32,  // pids_in
    *const f64,  // x_in
    *const i32,  // pids_out
    *const f64,  // x_out
    *mut f64,    // Evolution Operator data buffer
    *mut c_void, // Callable of PDF object
    ConvType,    // Convolution type
    f64,         // fac1
    usize,       // Length of pids_in
    usize,       // Length of x_in
    usize,       // Length of pids_out
    usize,       // Length of x_out
);

Please let me know what you think.

PS: @cschwan, in case you'd like to get access to the PineAPFEL repository for context, please just let us know and I am sure @enocera can add you.

@vbertone

Hi @cschwan and @Radonirinaunimi, ok, understood, thanks. I will try to use this new interface to construct the evolution operator with APFEL++. Specifically, if I understand correctly, the void* pdf_state can now be used to encapsulate the function that returns the evolution operator, right?

@vbertone

@Radonirinaunimi, concerning the question of alpha_s, I don't think that's an issue. The function that returns alpha_s is fast enough that the overhead of computing it twice is most probably negligible.

@Radonirinaunimi (Member Author)

> Hi @cschwan and @Radonirinaunimi, ok, understood, thanks. I will try to use this new interface to construct the evolution operator with APFEL++. Specifically, if I understand correctly, the void* pdf_state can now be used to encapsulate the function that returns the evolution operator, right?

So, assuming that as is recomputed inside the generate_fake_ekos callback, almost the entire computation of the evolution operator should be moved in there. It'd look something like this:

extern "C" void generate_fake_ekos(
  const int* _pids_in,
  const double* _x_in,
  const int* pids_out,
  const double* x_out,
  double* eko_buffer,
  void* pdf_state,
  pineappl_conv_type conv_type,
  double fac1,
  std::size_t pids_in_len,
  std::size_t x_in_len,
  std::size_t pids_out_len,
  std::size_t x_out_len
) {
  // Ignore unused variables
  (void) _pids_in;
  (void) _x_in;
  (void) pids_in_len;
  (void) x_in_len;

  // ------------------ Construct the Evolution Operator with APFEL++ ------------------
  // Accumulate thresholds and PIDs from the LHAPDF set
  std::vector<double> Thresholds;  //{0, 0, 0, 1.3, 4.75};
  std::vector<int> pids_in;        //{-5, -4, -3, -2, -1, 21, 1, 2, 3, 4, 5};
  const double mu0 = static_cast<LHAPDF::PDF*> (pdf_state)->qMin();
  for (auto const& v : static_cast<LHAPDF::PDF*> (pdf_state)->flavors()) {
    pids_in.push_back(v);
    if (v > 0 && v < 7) {
      Thresholds.push_back(static_cast<LHAPDF::PDF*> (pdf_state)->quarkThreshold(v));
    }
  }

  // Get perturbative order from LHAPDF set
  const int PerturbativeOrder = static_cast<LHAPDF::PDF*> (pdf_state)->qcdOrder();

  // Construct running strong coupling
  apfel::AlphaQCD a{static_cast<LHAPDF::PDF*> (pdf_state)->alphasQ(91.1876), 91.1876, Thresholds, PerturbativeOrder};
  const apfel::TabulateObject<double> Alphas{a, 200, 0.9, 13001, 3};
  const auto as = [&](double const& mu) -> double {
    return Alphas.Evaluate(mu);
  };

  // Construct x-space grid for APFEL++ (can this be determined dynamically?)
  // const apfel::Grid g{{apfel::SubGrid{100, 1e-5, 3}, apfel::SubGrid{100,
  // 1e-1, 3}, apfel::SubGrid{100, 6e-1, 3}, apfel::SubGrid{80, 8.5e-1, 5}}};
  const apfel::Grid g{
      {apfel::SubGrid{80, 1e-5, 3}, apfel::SubGrid{40, 1e-1, 3}}};

  // Get joint grid
  std::vector<double> x_in(g.GetJointGrid().nx() + 1);
  for (size_t i = 0; i < x_in.size(); i++)
    x_in[i] = g.GetJointGrid().GetGrid()[i];

  // The Evolution Operator is a vector with length
  // `N_conv * N_Q2_slices * Σ product(OP shape)`
  const std::size_t flat_len = pids_out_len * x_out_len * pids_in.size() * x_in.size();
  std::vector<std::size_t> shape = {pids_out_len, x_out_len, pids_in.size(), x_in.size()};

  // Zero operator
  const apfel::Operator ZeroOp{g, apfel::Null{}, apfel::eps5};

  // Construct DGLAP objects for the relevant evolution and
  // tabulate evolution operator
  std::unique_ptr<apfel::Dglap<apfel::Operator>> EvDists;
  if (conv_type == pineappl_conv_type::PINEAPPL_CONV_TYPE_UNPOL_PDF)
    EvDists = BuildDglap(InitializeDglapObjectsQCDPhys(g, Thresholds, true), mu0, PerturbativeOrder, as);
  else if (conv_type == pineappl_conv_type::PINEAPPL_CONV_TYPE_POL_PDF)
    EvDists = BuildDglap(InitializeDglapObjectsQCDpolPhys(g, Thresholds, true), mu0, PerturbativeOrder, as);
  else if (conv_type == pineappl_conv_type::PINEAPPL_CONV_TYPE_UNPOL_FF)
    EvDists = BuildDglap(InitializeDglapObjectsQCDTPhys(g, Thresholds, true), mu0, PerturbativeOrder, as);
  else
    throw std::runtime_error("Unknown convolution type");
  const apfel::TabulateObject<apfel::Set<apfel::Operator>> TabulatedOps{
      *EvDists, 100, 1, 13000, 3};

  // NOTE: the loop over the scale `fac1` is also removed
  // ...
}

with the main change being that the loops over conv_types (or different evolution operators) and fac1 are removed. Does this make sense, or am I still missing something?

@vbertone

Hi @Radonirinaunimi, yes, this does make sense and makes it much clearer. Thank you.
I will try to implement it like this in the PineAPFEL repository.

I still have a doubt though. In this example the evolution parameters are retrieved from the LHAPDF set, which in turn is passed to the function that returns the evolution operators through pdf_state. However, in the general case, we would not have an LHAPDF object and would need to explicitly pass the evolution parameters like thresholds, alpha_s reference value, and possibly also the grid parameters (which in the example above as well as in the previous implementation were hard coded). How would you suggest to proceed in that case?

@Radonirinaunimi (Member Author)

> I still have a doubt though. In this example the evolution parameters are retrieved from the LHAPDF set, which in turn is passed to the function that returns the evolution operators through pdf_state. However, in the general case, we would not have an LHAPDF object and would need to explicitly pass the evolution parameters like thresholds, alpha_s reference value, and possibly also the grid parameters (which in the example above as well as in the previous implementation were hard coded). How would you suggest to proceed in that case?

Hmm, I understand, you mean a generic interface. Let me think about it a bit more.

> However, in the general case, we would not have an LHAPDF object and would need to explicitly pass the evolution parameters like thresholds, alpha_s reference value, and possibly also the grid parameters

Could you perhaps provide an example of such a general case? In this way, the next iteration could be based upon that, and therefore also be general.

@vbertone

Hi @Radonirinaunimi, you mean an example of implementation? At the moment I don't have anything specific in mind. What I mean is: does the current signature of the function that returns the evolution operator accommodate a situation in which the pdf_state variable does not point to an LHAPDF object, but rather to something that contains the evolution parameters?

For example, one may define a structure which encapsulates the evolution parameters and pass an instance of that structure through pdf_state. Would that make sense?

@Radonirinaunimi (Member Author)

> Hi @Radonirinaunimi, you mean an example of implementation? At the moment I don't have anything specific in mind. What I mean is: does the current signature of the function that returns the evolution operator accommodate a situation in which the pdf_state variable does not point to an LHAPDF object, but rather to something that contains the evolution parameters?

Yes, as long as it is not a complicated object (such as a function, although function pointers might still work). See for example the patch below for evolve-grid-identity.cpp:

diff --git a/examples/cpp/evolve-grid-identity.cpp b/examples/cpp/evolve-grid-identity.cpp
index f68a094d..42624e81 100644
--- a/examples/cpp/evolve-grid-identity.cpp
+++ b/examples/cpp/evolve-grid-identity.cpp
@@ -12,6 +12,11 @@
 // NOTE: Uses the scale of the Grid as the starting scale such that we can use an IDENTITY EKO.
 double FAC0 = 6456.44;

+struct ApfelxxParams {
+    int x_nodes;
+    double as_ref;
+};
+
 std::vector<std::size_t> unravel_index(std::size_t flat_index, const std::vector<std::size_t>& shape) {
     std::size_t ndim = shape.size();
     std::vector<std::size_t> coords(ndim);
@@ -30,7 +35,7 @@ extern "C" void generate_fake_ekos(
     const int* pids_out,
     const double* x_out,
     double* eko_buffer,
-    void* pdf_state,
+    void* params,
     pineappl_conv_type conv_type,
     double fac1,
     std::size_t pids_in_len,
@@ -46,9 +51,10 @@ extern "C" void generate_fake_ekos(
     (void) conv_type;
     (void) fac1;

-    // Check to get the μ0 from the PDF
-    const double mu0_scale = static_cast<LHAPDF::PDF*> (pdf_state)->q2Min();
-    (void) mu0_scale; // Mark as unused variable
+    // Check external APFEL++ parameters
+    ApfelxxParams* op_params = static_cast<ApfelxxParams*>(params);
+    std::cout << op_params->as_ref << "\n";
+    std::cout << op_params->x_nodes << "\n";

     std::size_t flat_len = pids_in_len * x_in_len * pids_out_len * x_out_len;
     // NOTE: The EKO has to have as shape: (pids_in, x_in, pids_out, x_out)
@@ -175,6 +181,12 @@ int main() {
         alphas_table.push_back(alpha);
     }

+    // ApfelxxParams
+    ApfelxxParams* op_params = new ApfelxxParams;
+    op_params->x_nodes = 10;
+    op_params->as_ref = 0.118;
+    auto apfelxx_params = static_cast<void*>(op_params);
+
     std::vector<double> xi = {1.0, 1.0, 1.0};
     // NOTE: The EKO has to have as shape: (pids_in, x_in, pids_out, x_out)
     std::vector<std::size_t> tensor_shape = {pids_in.size(), x_in.size(), pids_out.size(), x_in.size()};
@@ -193,7 +205,7 @@ int main() {
     //     - `ren1`: values of the renormalization scales
     //     - `alphas_table`: values of alphas for each renormalization scales
     pineappl_fk_table* fktable = pineappl_grid_evolve(grid, opinfo_slices.data(),
-        generate_fake_ekos, max_orders.data(), pdf.get(), x_in.data(),
+        generate_fake_ekos, max_orders.data(), apfelxx_params, x_in.data(),
         x_in.data(), pids_in.data(), pids_out.data(),
         tensor_shape.data(), xi.data(), ren1.data(), alphas_table.data());

Would this be enough?

PS: I should rename this to params_state instead of pdf_state.

@vbertone

Hi @Radonirinaunimi, yes, that's precisely what I had in mind. I'll try that and let you know. Thank you.

@vbertone

Hi @Radonirinaunimi and @cschwan, I managed to modify the code in PineAPFEL to make it work with pineappl's new evolution interface.

Things seem to work fine. However, when computing the convolution between grids and evolution operators in the case of two convolution types, neither the memory footprint nor the performance seems to have improved significantly. I uploaded the corresponding code in the PineAPFEL repository. Please have a look; perhaps I'm misusing the code.

@Radonirinaunimi (Member Author)

Hi @vbertone, out of curiosity, how are you estimating the memory usage? As for the runtime, I did not expect the speed to improve much to be honest.

I am currently profiling the code with both the previous version and the current improvement, but it is taking quite some time 😬 (I am using valgrind). Preliminary results: with *-double-apfel.cpp the heap memory is only ~60 MB with the improved version:

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 50 7,308,743,163,287       57,211,888       56,026,682     1,185,206            0
 51 7,310,799,914,895       57,211,888       56,026,650     1,185,238            0

while previously it was of the order of GB (I am still checking the stack).

@vbertone

Hi @Radonirinaunimi, I just used top. If I run evolve-grid-double-apfel with a coarse grid (10 nodes), I also see a memory usage of around 60M:

PID    COMMAND      %CPU  TIME     #TH    #WQ   #PORT MEM    PURG   CMPRS  PGRP  PPID  STATE    BOOSTS            %CPU_ME %CPU_OTHRS UID       FAULTS   COW   MSGSENT    MSGRECV   SYSBSD    SYSMACH   CSW        PAGEIN
78584  evolve-grid- 100.3 00:13.49 1/1    0     11    65M+   0B     0B     78584 78566 running  *0[1]             0.00000 0.00000    362226154 6370+    99    39         12        9021      182+      1004+      230

But if I use a finer grid (~100 nodes) I find:

PID    COMMAND      %CPU  TIME     #TH    #WQ   #PORT MEM    PURG   CMPRS  PGRP  PPID  STATE    BOOSTS            %CPU_ME %CPU_OTHRS UID       FAULTS   COW   MSGSENT    MSGRECV   SYSBSD    SYSMACH   CSW        PAGEIN
78803  evolve-grid- 97.2  00:07.44 1/1    0     11    9357M  0B     6013M+ 78803 78566 running  *0[1]             0.00000 0.00000    362226154 737589+  99    27         12        12040     263       3326+      1536

So now the memory usage is around 10G.

Let me know if there is anything you would like to discuss.

@Radonirinaunimi (Member Author) commented May 27, 2025

Stupid question: how many nodes are used in the current remote version of evolve-grid-double-apfel.cpp? Or, more generally, where in the code is the number of nodes specified? That (the remote version) is the one I am using.

@vbertone commented May 27, 2025

The current version uses the coarse grid. The grid is controlled through these lines:

    // Construct x-space grid for APFEL++ (can this be determined dynamically?)
    //const apfel::Grid g{{apfel::SubGrid{100, 1e-5, 3}, apfel::SubGrid{100, 1e-1, 3}, apfel::SubGrid{100, 6e-1, 3}, apfel::SubGrid{80, 8.5e-1, 5}}};
    //const apfel::Grid g{{apfel::SubGrid{80, 1e-5, 3}, apfel::SubGrid{40, 1e-1, 3}}};
    const apfel::Grid g{{apfel::SubGrid{10, 1e-5, 3}}};

The first is perhaps excessively dense, the second is kind of fine, and the third (the uncommented one, which you are probably using) is not accurate enough.

@Radonirinaunimi (Member Author)

Hi @vbertone, sorry for the bit of delay. Using the dense grid, I now get benchmarks consistent with yours:

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 62 41,126,439,468,263    9,808,105,888    9,805,719,875     2,374,885       11,128
 63 41,130,202,556,748    9,808,105,968    9,805,719,875     2,374,901       11,192
 64 41,133,965,645,415    9,808,105,952    9,805,719,875     2,374,885       11,192
...

In hindsight, I'd say this is a fact of life. The optimization above was meant to avoid storing all of the evolution operators in memory at once, keeping only one at a time. Here, instead, the problem is really with a single operator (maybe this is what you meant to tell me from the beginning but I missed it).

I am not sure (and I don't think) there is really a way around this, to be honest (cc @cschwan), other than using a reasonable number of nodes (in our case, we always use 50, for which the accuracy is acceptable). Nevertheless, I'd say that if 100 nodes can still be run on a personal computer, that is not too bad.

@vbertone

Hi @Radonirinaunimi, ok, then that's what it is. Actually, I was wondering whether SIHP could be somehow optimised. If I remember correctly, you were mentioning that all partonic channels contribute to the cross section, which would make 13^3 (or 11^3) channels. If so, I'm pretty sure that not all of them are independent. Wouldn't that allow us to reduce the amount of computations to be done for the combination?

Also, I launched the combination with around 100 nodes yesterday evening and it's still running. Hopefully, I should get it done by today. But still, the computation time is of the order of one day, which is pretty heavy.

@Radonirinaunimi (Member Author)

> Hi @Radonirinaunimi, ok, then that's what it is. Actually, I was wondering whether SIHP could be somehow optimised. If I remember correctly, you were mentioning that all partonic channels contribute to the cross section, which would make 13^3 (or 11^3) channels. If so, I'm pretty sure that not all of them are independent. Wouldn't that allow us to reduce the amount of computations to be done for the combination?

The SIHP grids are already optimized so that non-contributing channels are removed. See here for example.

But on that note, I am actually wondering whether the memory and speed issue is related to the explicit double-precision representation of the grid (?). But I am just thinking out loud...

@vbertone

Hi @Radonirinaunimi, the code finally converged (~20 hours) and here is the output:

   Bin           Grid        FkTable        reldiff
  ----  -------------  -------------  -------------
     0   1.206410e+03   1.249609e+03   3.580811e-02
     1   5.589098e+02   5.754833e+02   2.965326e-02
     2   2.601829e+02   2.669309e+02   2.593575e-02
     3   1.244742e+02   1.274532e+02   2.393277e-02
     4   6.272873e+01   6.414157e+01   2.252287e-02
     5   2.543235e+01   2.598695e+01   2.180695e-02
Time elapsed: 78720.584506 seconds

The accuracy is not ideal but I suspect it has to do with the time-like evolution which does not precisely match that of the FF set that I'm using. But this is definitely encouraging.

@Radonirinaunimi (Member Author)

Hi @vbertone, that took quite some time 😅

As for the accuracy, that is indeed definitely because of a mismatch in the evolution, which is also what I recall from when I evolved these grids with pineko a few months back.

The question is then: with a more reasonable number of grid nodes (say 40 or 50), would the accuracy still be of the same order (without significant deterioration)? If so, then we can just run with the coarser grid.

@vbertone

Sure, I just launched a new run with a smaller grid with around 50 nodes. I'll let you know how long that takes and what the accuracy is.

Tomorrow, I will also produce an LHAPDF grid with the exact same evolution as APFEL++ for the FFs. This will help us gauge any inaccuracies.

@vbertone

Hi @Radonirinaunimi, the run with the coarser grid (~50 nodes) has come to an end and, surprisingly, it seems to have given better accuracy than the finer grid:

   Bin           Grid        FkTable        reldiff
  ----  -------------  -------------  -------------
     0   1.206410e+03   1.212607e+03   5.136834e-03
     1   5.589098e+02   5.562280e+02  -4.798380e-03
     2   2.601829e+02   2.573192e+02  -1.100624e-02
     3   1.244742e+02   1.226425e+02  -1.471564e-02
     4   6.272873e+01   6.166114e+01  -1.701924e-02
     5   2.543235e+01   2.496521e+01  -1.836787e-02
Time elapsed: 15534.511375 seconds

I wonder why. It might be accidental?
This time the computation time has gone down to around 4 hours, which is reasonable.

@cschwan
Contributor

cschwan commented May 29, 2025

[..] Actually, I was wondering whether SIHP could be somehow optimised. If I remember correctly, you were mentioning that all partonic channels contribute to the cross section, which would make 13^3 (or 11^3) channels. If so, I'm pretty sure that not all of them are independent. Wouldn't that allow us to reduce the amount of computations to be done for the combination?

Generally you want the number of channels to be as small as possible before the evolution. If you have the same hadronic initial states, symmetrizing the channels is a quick option, but judging from @Radonirinaunimi's comment that seems to be done already. I think I even found an algorithm that reduces the number of channels to the minimum possible. Also, make sure you do not evolve and perform a basis change at the same time, because the basis change usually blows up the operators a lot. Instead, perform the evolution first and only then change the basis.
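
As a toy illustration of that symmetrization (hypothetical data structures, not PineAPPL's actual channel API): for two identical hadronic initial states, channels (a, b) and (b, a) give the same contribution, so they can be merged by normalizing the pair order and accumulating their factors:

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// One channel entry: the two initial-state parton ids and a numerical factor.
struct ChannelEntry {
    int pid_a;
    int pid_b;
    double factor;
};

// Merge channels that differ only by swapping the two (identical) initial
// states: normalize each pid pair to (min, max) and sum the factors.
std::map<std::pair<int, int>, double>
symmetrize(const std::vector<ChannelEntry>& entries) {
    std::map<std::pair<int, int>, double> merged;
    for (const auto& e : entries) {
        std::pair<int, int> key = std::minmax(e.pid_a, e.pid_b);
        merged[key] += e.factor;
    }
    return merged;
}
```

With three initial states (as in SIHP) the same idea applies with sorted pid triples instead of pairs, which is where the 13^3 count can shrink considerably.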

On the technical side, I think we want to make sure that

  1. when evolving, 99% of the CPU cycles are actually spent inside Grid::evolve; if that's not the case, then we possibly need to improve the interface further.
  2. I don't think we ever benchmarked a grid with three convolutions, so we want to take a closer look into where in Grid::evolve the CPU cycles are spent.

@cschwan
Contributor

cschwan commented May 29, 2025

I wonder why. It might be accidental? This time the computation time has gone down to around 4 hours, which is reasonable.

Which orders are in your grid? If you have just LO + NLO and the NLO correction is very small, it could come from numeric cancellations. In any case, you can try evolving LO and the NLO correction separately and see how that changes the agreement.
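
Roughly, splitting LO from NLO amounts to evolving twice with complementary order masks. A sketch of how such a mask could be built (the `Order` layout here is an assumption of mine, not PineAPPL's actual type):

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One perturbative order of a grid: powers of alpha_s and alpha (the
// logarithmic exponents are left out of this sketch).
struct Order {
    std::uint8_t alphas;
    std::uint8_t alpha;
};

// Mask selecting only the orders with the lowest total coupling power,
// i.e. the LO contribution; its complement selects the NLO corrections.
std::vector<bool> lo_mask(const std::vector<Order>& orders) {
    unsigned min_power = ~0u;
    for (const auto& o : orders) {
        min_power = std::min<unsigned>(min_power, o.alphas + o.alpha);
    }
    std::vector<bool> mask;
    mask.reserve(orders.size());
    for (const auto& o : orders) {
        mask.push_back(o.alphas + o.alpha == min_power);
    }
    return mask;
}
```

Evolving once with this mask and once with its complement would give the LO and NLO FK tables separately, which can then be compared bin by bin.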

@vbertone

Generally you want the number of channels to be as small as possible before the evolution. If you have the same hadronic initial states symmetrizing the channels is a quick option, but looking at @Radonirinaunimi's comment that seems to be done already. Furthermore, I think I even found an algorithm to minimize the number of channels further to the minimum possible. Furthermore, make sure you do not evolve and perform a basis change at the same time, because the basis change usually blows up the operators a lot. Instead, perform the evolution first and only then change the basis.

Hi @cschwan, could you please be more specific on this point? What the code does at the moment is to perform the evolution in a specific basis internal to APFEL++ and then, at each required scale, the resulting evolution operator is rotated into the physical basis before feeding it to PineAPPL. However, evolution + rotation is pretty quick. Do you have something different in mind?

On the technical side, I think we want to make sure that

  1. when evolving, 99% of the CPU cycles are actually spent inside Grid::evolve; if that's not the case, then we possibly need to improve the interface further.
  2. I don't think we ever benchmarked a grid with three convolutions, so we want to take a closer look into where in Grid::evolve the CPU cycles are spent.

I'll leave @Radonirinaunimi to comment on these technical points.

@vbertone

Which orders are in your grid? If you have just LO + NLO and the NLO correction is very small, it could come from numeric cancellations. In any case, you can try evolving LO and the NLO correction separately and see how that changes the agreement.

I think I'm actually doing LO + NLO. Specifically, I'm using:

    const std::vector<uint8_t> max_orders = {3, 3};

Ok, I can try the two orders separately, but I'm not quite sure how to do that.

@cschwan
Contributor

cschwan commented May 29, 2025

Hi @cschwan, could you please be more specific on this point? What the code does at the moment is to perform the evolution in a specific basis internal to APFEL++ and then, at each required scale, the resulting evolution operator is rotated into the physical basis before feeding it to PineAPPL. However, evolution + rotation is pretty quick. Do you have something different in mind?

What I mean is: does the tensor you feed into pineappl_grid_evolve contain a rotation from the flavour basis (u, d, s, ... of your input grid) to an evolution basis (singlet, vector, ... in the final FK-table)? Or, in other words, are your FK-tables expressed in terms of an evolution basis? (After reading your answer more carefully, that seems to be the case.)

I explained the problem in a bit more detail here: NNPDF/pineko#183 (comment). From the numbers I reported there, I believe it's better to 1) evolve first without changing the basis, as the operators are smaller (22 GB vs. 45 GB), and then 2) only after evolving perform the rotation using Grid::rotate_pid_basis. Since the operators are smaller, I believe that should make the evolution faster and use less space.

@vbertone

Hi @cschwan, ok, I think I see what you mean. But just to make sure I'm not mistaken, what you suggest is to combine hard matrix elements and evolution operators in the evolution basis, and only after the convolution rotate it into the physical basis. Did I understand correctly?

If so, I can definitely provide the evolution operator in the evolution basis, but then I would not know how to tell PineAPPL to do the convolution in this basis. Perhaps you and/or @Radonirinaunimi can help me with that?

@Radonirinaunimi
Member Author

@cschwan, we don't perform a basis rotation, exactly for this reason. So the FK table is computed in the same basis as the grid's (which is always the physical/PDG basis). What @vbertone mentioned about the APFEL++ basis $\to$ physical basis rotation is something that is done internally within APFEL++. Hence this is not an issue.

On the technical side, I think we want to make sure that

  1. when evolving, 99% of the CPU cycles are actually spent inside Grid::evolve; if that's not the case, then we possibly need to improve the interface further.
  2. I don't think we ever benchmarked a grid with three convolutions, so we want to take a closer look into where in Grid::evolve the CPU cycles are spent.

I will have a closer look at this part, to which I hadn't paid much attention when doing the benchmark. Perhaps there's something we can learn from there.

@Radonirinaunimi
Member Author

Hi @cschwan, ok, I think I see what you mean. But just to make sure I'm not mistaken, what you suggest is to combine hard matrix elements and evolution operators in the evolution basis, and only after the convolution rotate it into the physical basis. Did I understand correctly?

If so, I can definitely provide the evolution operator in the evolution basis, but then I would not know how to tell PineAPPL to do the convolution in this basis. Perhaps you and/or @Radonirinaunimi can help me with that?

Sorry, I was only a couple of minutes late with my comment 🙈. Based on #344 (comment), this would not be needed.

@vbertone

Hi @cschwan, ok, I think I see what you mean. But just to make sure I'm not mistaken, what you suggest is to combine hard matrix elements and evolution operators in the evolution basis, and only after the convolution rotate it into the physical basis. Did I understand correctly?
If so, I can definitely provide the evolution operator in the evolution basis, but then I would not know how to tell PineAPPL to do the convolution in this basis. Perhaps you and/or @Radonirinaunimi can help me with that?

Sorry, I was only a couple of minutes late with my comment 🙈. Based on #344 (comment), this would not be needed.

My bad, it looks like I totally misunderstood what is going on under the hood 😄.
