Reduce memory footprint for sparse PCA transform (#5964)

Intron7 · web-flow · commit d4535d28e42a · 2024-07-28T16:27:41.000Z
The sparse PCA still densified `X` during the transform step, which largely defeats the purpose of a sparse PCA. However,

```
precomputed_mean_impact = self.mean_ @ self.components_.T
mean_impact = cp.ones((X.shape[0], 1)) @ precomputed_mean_impact.reshape(1, -1)
X_transformed = X.dot(self.components_.T) -mean_impact
```

is the same as

```
X = X - self.mean_
X_transformed = X.dot(self.components_.T)
```

because matrix multiplication distributes over subtraction: the mean's contribution can be projected once and subtracted after `X` has been projected, so `X` itself never has to be centered.

The new implementation is faster (mainly because we no longer have to rely on cupy's `to_array()`) and uses a lot less memory.

Authors:
- Severin Dicks (https://github.com/Intron7)
- Dante Gama Dessavre (https://github.com/dantegd)

Approvers:
- Dante Gama Dessavre (https://github.com/dantegd)

URL: #5964
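The equivalence claimed above can be checked on the CPU with a minimal sketch. It assumes NumPy and SciPy as stand-ins for CuPy and its sparse module, and the names `mean_` and `components_` are illustrative arrays here, not a fitted cuML model.

```python
import numpy as np
import scipy.sparse as sp

rng = np.random.default_rng(0)

# Hypothetical data: a sparse input and stand-ins for the fitted attributes.
X = sp.random(200, 50, density=0.05, format="csr", random_state=0)
mean_ = np.asarray(X.mean(axis=0)).ravel()       # shape (n_features,)
components_ = rng.standard_normal((5, 50))       # shape (n_components, n_features)

# Old approach: center first, which forces a dense (n_samples, n_features) array.
old = (X.toarray() - mean_) @ components_.T

# New approach: project the sparse X, then subtract the projected mean.
precomputed_mean_impact = mean_ @ components_.T  # shape (n_components,)
mean_impact = np.ones((X.shape[0], 1)) @ precomputed_mean_impact.reshape(1, -1)
new = X @ components_.T - mean_impact            # only (n_samples, n_components) is dense

same = np.allclose(old, new)
```

In this sketch the only dense intermediates of the new path have `n_components` columns rather than `n_features`, which is where the memory saving would come from when `n_components` is much smaller than `n_features`.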
diff --git a/python/cuml/cuml/decomposition/pca.pyx b/python/cuml/cuml/decomposition/pca.pyx
@@ -1,5 +1,5 @@
 #
-# Copyright (c) 2019-2023, NVIDIA CORPORATION.
+# Copyright (c) 2019-2024, NVIDIA CORPORATION.
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
@@ -632,8 +632,9 @@ class PCA(UniversalBase,
                 self.components_ *= cp.sqrt(self.n_samples_ - 1)
                 self.components_ /= self.singular_values_.reshape((-1, 1))
 
-            X = X - self.mean_
-            X_transformed = X.dot(self.components_.T)
+            precomputed_mean_impact = self.mean_ @ self.components_.T
+            mean_impact = cp.ones((X.shape[0], 1)) @ precomputed_mean_impact.reshape(1, -1)
+            X_transformed = X.dot(self.components_.T) -mean_impact
 
             if self.whiten:
                 self.components_ *= self.singular_values_.reshape((-1, 1))