Updating residuals #27
base: develop
Conversation
```python
y: np.ndarray,
weights: np.ndarray | None = None,
lambda_: float = 0.0,
cache: bool = True,
```
Can you help me understand how the `cache` is used? It seems like there are cases where we want to (not) keep track of certain things computed during the k-fold residuals process, but I'm not sure I understand why or why not. Plus, this looks like it's only used in the subclasses, so I'd suggest we either (a) remove it from here, or (b) add a method in this super-class, something like `def cache_things()` (bad method name), where the logic for this lives and can be shared by all subclasses 🤔 (see the sketch below)
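A minimal sketch of option (b), assuming a hypothetical base class and placeholder names (`LinearSolver`, `cache_things`, and `_cached_residuals` are illustrative, not the project's actual API):

```python
import numpy as np


class LinearSolver:
    def __init__(self, cache: bool = True):
        self.cache = cache
        self._cached_residuals: np.ndarray | None = None

    def cache_things(self, residuals: np.ndarray) -> None:
        # Shared caching logic: subclasses call this once per residuals
        # computation instead of re-implementing their own cache handling.
        if self.cache:
            self._cached_residuals = residuals
```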
""" | ||
Fits model | ||
""" | ||
raise NotImplementedError | ||
|
```diff
-def predict(self, x: np.ndarray) -> np.ndarray:
+def predict(self, x: np.ndarray, coefficients: np.ndarray | None = None) -> np.ndarray:
```
How would you feel about leaving this (public method) like:

```python
def predict(self, x: np.ndarray) -> np.ndarray:
    return self._predict(x, self.coefficients)
```

and adding:

```python
def _predict(self, x: np.ndarray, coefficients: np.ndarray) -> np.ndarray:
    return x @ coefficients
```

and then in `residuals()` you call `_predict(x_test, coefficients_k)`? The way it's written now invites users to pass in arbitrary coefficients, which might not be a good idea 🤔
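To illustrate the call pattern being suggested, here is a hedged sketch of how `residuals()` might use the private helper; the fold-generation and refitting helpers (`_folds`, `_fit_fold`) are hypothetical stand-ins, not the PR's actual code:

```python
def residuals(self, x: np.ndarray, y: np.ndarray, K: int) -> np.ndarray:
    out = []
    for train_idx, test_idx in self._folds(x, K):  # hypothetical helper
        # Re-estimate on all folds except the held-out one.
        coefficients_k = self._fit_fold(x[train_idx], y[train_idx])  # hypothetical helper
        out.append(y[test_idx] - self._predict(x[test_idx], coefficients_k))
    return np.concatenate(out)
```

This keeps the public `predict()` narrow (it always uses the fitted `self.coefficients`) while the k-fold machinery reuses the same prediction logic internally.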
```python
center: bool = True,
**kwargs
) -> np.ndarray:
    if K == x.shape[0]:
```
Is it possible that any other subclasses would benefit from this logic? 🤔
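For reference, the logic in question is presumably the OLS-only exact leave-one-out shortcut: when `K` equals the number of rows, the LOO residuals have a closed form via the hat matrix, so no refitting is needed. A standalone sketch of that standard algebra (not the project's actual code):

```python
import numpy as np


def ols_loo_residuals(x: np.ndarray, y: np.ndarray) -> np.ndarray:
    # Assumes 1-D y. Fit OLS once: beta = (X'X)^{-1} X'y
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    resid = y - x @ beta
    # Leverages h_ii: diagonal of the hat matrix X (X'X)^{-1} X'
    h = np.einsum("ij,ji->i", x, np.linalg.solve(x.T @ x, x.T))
    # Exact leave-one-out residual: e_i / (1 - h_ii)
    return resid / (1.0 - h)
```

Whether other subclasses (e.g. quantile regression) have a comparable shortcut is less clear, since the hat-matrix identity is specific to least squares.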
```diff
@@ -3,6 +3,7 @@
 import os
 import sys

+import numpy as np
```
I know you said you're struggling to add unit tests for all these changes, and I'm curious what you think is missing. These look good to me, although I'll keep thinking about it 🤔 🎉
```diff
@@ -23,7 +23,7 @@ def test_basic_median_1():
 preds = quantreg.predict(x)
 # you'd think it would be 8 instead of 7.5, but run quantreg in R to confirm
 # has to do with missing intercept
-np.testing.assert_array_equal(preds, [[7.5, 7.5, 7.5, 15]])
+np.testing.assert_array_equal(preds, [[7.5], [7.5], [7.5], [15]])
```
Why did this return type have to change? 🤔
Description
This PR changes how we compute the residuals for the linear solvers. For `LinearSolver` we add the ability to compute the k-fold residual as an estimate of the leave-one-out residual. The k-fold residual is computed by generating K folds of the data, re-estimating the model on all but one fold, and computing the residuals on the held-out fold; the residuals from all K folds are then concatenated together. If `K` is set to None, we just compute the training residual. For the OLS model we still allow the exact leave-one-out residual, when `K` is equal to the number of units. A sketch of the k-fold computation is below.
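A minimal sketch of the k-fold residual computation described above, using plain OLS as the model being refit; the function name and interface are illustrative, not the PR's actual code:

```python
import numpy as np


def kfold_residuals(x: np.ndarray, y: np.ndarray, K: int, seed: int = 0) -> np.ndarray:
    """Estimate leave-one-out residuals by concatenating K out-of-fold residuals.

    Assumes 1-D y; illustrative only.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(x.shape[0])
    out = []
    for fold in np.array_split(indices, K):
        train = np.setdiff1d(indices, fold)
        # Re-estimate the model on everything except this fold (plain OLS here).
        beta, *_ = np.linalg.lstsq(x[train], y[train], rcond=None)
        # Residuals on the held-out fold.
        out.append(y[fold] - x[fold] @ beta)
    return np.concatenate(out)
```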
Jira Ticket
This is needed for this ticket: https://arcpublishing.atlassian.net/browse/ELEX-4549
Test Steps
Unit tests have been added.