Inconsistent results on different computers #166
Description
Hi,
I have recently updated the commit I am working with to #160. The commit I used to work with before the update was 30abe96 (Mar 31, 2016).
After updating I noticed that in some cases the model fit by the Earth estimator can be different between two computers running the exact same script, even when both computers have identical virtual environments, as indicated by "pip freeze" generating the exact same output.
The trained estimator is consistent when the script is run repeatedly on the same computer. Also, the trained estimator can be consistent between two different computers, but this is not guaranteed. So far I have been unable to find any consistent difference between computers where the trained estimator ends up different, except perhaps for a difference between i7 and i5 CPUs. I have no idea whether this could be relevant in any way.
This inconsistency does not occur when using the 30abe96 commit.
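Since "pip freeze" only lists Python package versions, it would not show differences in the CPU or in the numerical libraries underneath NumPy. Below is a minimal sketch of a fuller report that could be run on both computers and compared. The helper name `environment_report` is hypothetical, and I do not know whether any of these fields actually explains the discrepancy.

```python
import platform
import sys


def environment_report():
    """Collect details that `pip freeze` does not show (hypothetical helper)."""
    report = {
        "python": sys.version,
        "platform": platform.platform(),
        "machine": platform.machine(),
        # May be an empty string on some systems.
        "processor": platform.processor(),
        "float_epsilon": sys.float_info.epsilon,
    }
    try:
        import numpy
    except ImportError:
        # NumPy is optional here so the sketch runs anywhere.
        report["numpy"] = None
    else:
        report["numpy"] = numpy.__version__
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print("{}: {}".format(key, value))
    # If NumPy is installed, numpy.show_config() additionally prints the
    # BLAS/LAPACK libraries it was built against.
```

Saving this output to a text file on each computer would make any difference in processor or platform easy to spot.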
I was able to replicate this by slightly modifying one of the examples from the py-earth documentation, as shown below. I have set the verbose output on, which shows that there are small differences between two runs of this code made on the same computer.
As you can see in the outputs from computers A and B below (both running Windows 8.1), the two outputs are largely similar but also differ in some significant ways. The numbers calculated in the forward pass are identical for the first five iterations, but begin to differ, usually slightly, from the 6th iteration onward. The backward pass has different numbers for all the terms, and the selected iterations differ: 24 on computer A vs. 21 on computer B.
The final model basis displayed below also shows differences. On a private dataset (which I cannot share), the forward pass was identical, and the differences only began showing on the backward pass.
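One general mechanism that can produce this kind of pattern is sketched below. It is only an illustration and not a diagnosis of py-earth. Floating-point addition is not associative, so two CPUs or library builds that sum the same numbers in a different order can get results that differ in the last bits. When candidates are nearly tied, as with the many equal `mse` values in the pruning tables below, such a difference can be enough to change which candidate is picked. The numbers and the `pick_best` helper are made up for the example.

```python
# The same three numbers summed in two different orders.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c
right = a + (b + c)
print(left == right)  # False
print(left - right)   # a difference in the last bits


def pick_best(scores):
    """Return the index of the smallest score (first index wins exact ties)."""
    return min(range(len(scores)), key=scores.__getitem__)


# Candidate 1 has the same score on both machines. Candidate 0 is the
# same quantity mathematically, but summed in a different order.
scores_machine_1 = [left, right]
scores_machine_2 = [right, right]
print(pick_best(scores_machine_1))  # 1
print(pick_best(scores_machine_2))  # 0
```

Once a different candidate is chosen at one step, every later step works from a different model, which would be consistent with small early differences growing into a different final basis.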
I'll be grateful for any idea on what might cause such an inconsistency in the trained estimator. Thank you for your help.
The modified example code:
import numpy
from pyearth import Earth
from matplotlib import pyplot

# Create some fake data
numpy.random.seed(0)
m = 1000
n = 10
X = 80*numpy.random.uniform(size=(m,n)) - 4*numpy.pi
y = X[:,3]*(numpy.abs(X[:,6] - numpy.pi) + 15*numpy.random.gamma(2,size=m)) + 2*(X[:,2] - 2) -1*(X[:,7] + 3)**2

# Fit an Earth model
model = Earth(verbose=1, max_terms=30, max_degree=1, allow_missing=False, penalty=3, endspan_alpha=None,
              endspan=None, minspan_alpha=None, minspan=None, thresh=0.0001, zero_tol=None, min_search_points=None,
              check_every=None, allow_linear=True, use_fast=True, fast_K=20, fast_h=1, smooth=None,
              enable_pruning=True)
model.fit(X, y)

# Print the model
print(model.trace())
print(model.summary())
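To rule out the generated inputs themselves as the source of the difference, the arrays could be fingerprinted on each computer before fitting. This is only a sketch: `fingerprint` is a hypothetical helper, and the demonstration uses packed floats in place of the NumPy arrays so that it runs with the standard library alone.

```python
import hashlib
import struct


def fingerprint(buffer):
    """Return the SHA-256 hex digest of a bytes-like object."""
    return hashlib.sha256(bytes(buffer)).hexdigest()


# Stand-in data: three doubles packed as little-endian bytes.
same_1 = struct.pack("<3d", 1.0, 2.0, 3.0)
same_2 = struct.pack("<3d", 1.0, 2.0, 3.0)
# The last value differs from 3.0 only in the final bit of the mantissa.
nearly = struct.pack("<3d", 1.0, 2.0, 3.0000000000000004)

print(fingerprint(same_1) == fingerprint(same_2))  # True
print(fingerprint(same_1) == fingerprint(nearly))  # False
```

In the script above, this would mean printing `fingerprint(X.tobytes())` and `fingerprint(y.tobytes())` on both computers. Matching digests would show that the inputs are bit-identical and that the difference arises during fitting.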
The output I receive on computer A:
Beginning forward pass
---------------------------------------------------------------------------
iter parent var knot mse terms gcv rsq grsq
---------------------------------------------------------------------------
0 - - - 4950518.324962 1 4960434.233 0.000 0.000
1 0 7 311 2737418.709755 3 2770565.758 0.447 0.441
2 0 3 -1 1022465.093864 4 1040071.158 0.793 0.790
3 0 6 320 754542.772780 6 775335.527 0.848 0.844
4 0 7 200 740470.232168 8 768647.106 0.850 0.845
5 0 7 554 729236.351357 10 764757.622 0.853 0.846
6 0 6 255 722135.459949 12 765126.178 0.854 0.846
7 0 8 -1 720170.334629 13 766986.419 0.855 0.845
8 0 8 405 717642.698760 15 772243.396 0.855 0.844
9 0 8 661 714963.572360 17 777403.874 0.856 0.843
10 0 2 -1 712990.074603 18 779315.897 0.856 0.843
11 0 2 514 710075.084704 20 784308.080 0.857 0.842
12 0 3 217 708491.421267 22 790848.617 0.857 0.841
13 0 3 434 705014.947214 24 795348.884 0.858 0.840
14 0 3 672 702076.013034 26 800513.337 0.858 0.839
15 0 3 604 700045.813445 28 806790.441 0.859 0.837
16 0 8 331 698728.157244 30 813986.866 0.859 0.836
17 0 8 981 695971.429421 32 819597.703 0.859 0.835
-------------------------------------------------------------------------
Stopping Condition 0: Reached maximum number of terms
Beginning pruning pass
------------------------------------------------------
iter bf terms mse gcv rsq grsq
------------------------------------------------------
0 - 32 697629.17 821549.912 0.859 0.834
1 7 31 696049.83 815260.480 0.859 0.836
2 19 30 696087.18 810910.243 0.859 0.837
3 20 29 696098.55 806564.867 0.859 0.837
4 27 28 696093.12 802235.036 0.859 0.838
5 24 27 696100.61 797954.747 0.859 0.839
6 15 26 696100.61 793700.130 0.859 0.840
7 9 25 696100.61 789479.451 0.859 0.841
8 10 24 696100.61 785292.349 0.859 0.842
9 13 23 696044.23 781075.201 0.859 0.843
10 30 22 696071.59 776985.068 0.859 0.843
11 28 21 696100.61 772928.979 0.859 0.844
12 23 20 696100.61 768872.680 0.859 0.845
13 16 19 696310.24 765078.566 0.859 0.846
14 21 18 697463.55 762345.015 0.859 0.846
15 26 17 698870.46 759905.290 0.859 0.847
16 1 16 701506.92 758810.593 0.858 0.847
17 18 15 704209.63 757788.296 0.858 0.847
18 17 14 706125.72 755924.315 0.857 0.848
19 14 13 710259.13 756430.911 0.857 0.848
20 12 12 711204.04 753543.977 0.856 0.848
21 3 11 716678.64 755451.425 0.855 0.848
22 22 10 717283.22 752222.248 0.855 0.848
23 29 9 722648.68 753983.483 0.854 0.848
24 31 8 724040.81 751592.497 0.854 0.848
25 4 7 730640.99 754594.851 0.852 0.848
26 5 6 749443.92 770096.162 0.849 0.845
27 8 5 770625.58 787863.245 0.844 0.841
28 2 4 924877.06 940802.731 0.813 0.810
29 11 3 1202850.60 1217415.760 0.757 0.755
30 25 2 2977555.15 2998507.979 0.399 0.396
31 6 1 4950518.32 4960434.233 0.000 0.000
--------------------------------------------------------
Selected iteration: 24
Forward Pass
---------------------------------------------------------------------------
iter parent var knot mse terms gcv rsq grsq
---------------------------------------------------------------------------
0 - - - 4950518.324962 1 4960434.233 0.000 0.000
1 0 7 311 2737418.709755 3 2770565.758 0.447 0.441
2 0 3 -1 1022465.093864 4 1040071.158 0.793 0.790
3 0 6 320 754542.772780 6 775335.527 0.848 0.844
4 0 7 200 740470.232168 8 768647.106 0.850 0.845
5 0 7 554 729236.351357 10 764757.622 0.853 0.846
6 0 6 255 722135.459949 12 765126.178 0.854 0.846
7 0 8 -1 720170.334629 13 766986.419 0.855 0.845
8 0 8 405 717642.698760 15 772243.396 0.855 0.844
9 0 8 661 714963.572360 17 777403.874 0.856 0.843
10 0 2 -1 712990.074603 18 779315.897 0.856 0.843
11 0 2 514 710075.084704 20 784308.080 0.857 0.842
12 0 3 217 708491.421267 22 790848.617 0.857 0.841
13 0 3 434 705014.947214 24 795348.884 0.858 0.840
14 0 3 672 702076.013034 26 800513.337 0.858 0.839
15 0 3 604 700045.813445 28 806790.441 0.859 0.837
16 0 8 331 698728.157244 30 813986.866 0.859 0.836
17 0 8 981 695971.429421 32 819597.703 0.859 0.835
---------------------------------------------------------------------------
Stopping Condition 0: Reached maximum number of terms
Pruning Pass
--------------------------------------------------------
iter bf terms mse gcv rsq grsq
--------------------------------------------------------
0 - 32 697629.17 821549.912 0.859 0.834
1 7 31 696049.83 815260.480 0.859 0.836
2 19 30 696087.18 810910.243 0.859 0.837
3 20 29 696098.55 806564.867 0.859 0.837
4 27 28 696093.12 802235.036 0.859 0.838
5 24 27 696100.61 797954.747 0.859 0.839
6 15 26 696100.61 793700.130 0.859 0.840
7 9 25 696100.61 789479.451 0.859 0.841
8 10 24 696100.61 785292.349 0.859 0.842
9 13 23 696044.23 781075.201 0.859 0.843
10 30 22 696071.59 776985.068 0.859 0.843
11 28 21 696100.61 772928.979 0.859 0.844
12 23 20 696100.61 768872.680 0.859 0.845
13 16 19 696310.24 765078.566 0.859 0.846
14 21 18 697463.55 762345.015 0.859 0.846
15 26 17 698870.46 759905.290 0.859 0.847
16 1 16 701506.92 758810.593 0.858 0.847
17 18 15 704209.63 757788.296 0.858 0.847
18 17 14 706125.72 755924.315 0.857 0.848
19 14 13 710259.13 756430.911 0.857 0.848
20 12 12 711204.04 753543.977 0.856 0.848
21 3 11 716678.64 755451.425 0.855 0.848
22 22 10 717283.22 752222.248 0.855 0.848
23 29 9 722648.68 753983.483 0.854 0.848
24 31 8 724040.81 751592.497 0.854 0.848
25 4 7 730640.99 754594.851 0.852 0.848
26 5 6 749443.92 770096.162 0.849 0.845
27 8 5 770625.58 787863.245 0.844 0.841
28 2 4 924877.06 940802.731 0.813 0.810
29 11 3 1202850.60 1217415.760 0.757 0.755
30 25 2 2977555.15 2998507.979 0.399 0.396
31 6 1 4950518.32 4960434.233 0.000 0.000
--------------------------------------------------------
Selected iteration: 24
Earth Model
-------------------------------------
Basis Function Pruned Coefficient
-------------------------------------
(Intercept) No 72582.3
h(x7-35.0812) Yes None
h(35.0812-x7) No -42.4157
x3 Yes None
h(x6-4.22723) No -1065.73
h(4.22723-x6) No 1113.05
h(x7-1.00788) No -87.8541
h(1.00788-x7) Yes None
h(x7-53.3666) No -77.5691
h(53.3666-x7) Yes None
h(x6-66.34) Yes None
h(66.34-x6) No -1094.65
x8 Yes None
h(x8-51.3231) Yes None
h(51.3231-x8) Yes None
h(x8+6.2673) Yes None
h(-6.2673-x8) Yes None
x2 Yes None
h(x2+9.81187) Yes None
h(-9.81187-x2) Yes None
h(x3-66.4284) Yes None
h(66.4284-x3) Yes None
h(x3-63.2394) Yes None
h(63.2394-x3) Yes None
h(x3-65.8573) Yes None
h(65.8573-x3) No -54.4066
h(x3-62.6728) Yes None
h(62.6728-x3) Yes None
h(x8+11.7519) Yes None
h(-11.7519-x8) Yes None
h(x8+10.5919) Yes None
h(-10.5919-x8) Yes None
-------------------------------------
MSE: 724040.8075, GCV: 751592.4974, RSQ: 0.8537, GRSQ: 0.8485
The output I receive on computer B:
Beginning forward pass
---------------------------------------------------------------------------
iter parent var knot mse terms gcv rsq grsq
---------------------------------------------------------------------------
0 - - - 4950518.324962 1 4960434.233 0.000 0.000
1 0 7 311 2737418.709755 3 2770565.758 0.447 0.441
2 0 3 -1 1022465.093864 4 1040071.158 0.793 0.790
3 0 6 320 754542.772780 6 775335.527 0.848 0.844
4 0 7 200 740470.232168 8 768647.106 0.850 0.845
5 0 7 554 729236.351357 10 764757.622 0.853 0.846
6 0 6 255 722316.228852 12 765317.709 0.854 0.846
7 0 8 -1 720316.871586 13 767142.482 0.854 0.845
8 0 8 405 717697.402167 15 772302.261 0.855 0.844
9 0 8 661 715006.680289 17 777450.747 0.856 0.843
10 0 2 -1 713035.003056 18 779365.005 0.856 0.843
11 0 2 514 710128.220766 20 784366.771 0.857 0.842
12 0 3 217 708664.271302 22 791041.559 0.857 0.841
13 0 3 434 705142.169448 24 795492.407 0.858 0.840
14 0 3 672 702191.404756 26 800644.908 0.858 0.839
15 0 3 604 700187.073599 28 806953.241 0.859 0.837
16 0 8 331 698867.906750 30 814149.668 0.859 0.836
17 0 8 981 696100.611195 32 819749.831 0.859 0.835
-------------------------------------------------------------------------
Stopping Condition 0: Reached maximum number of terms
Beginning pruning pass
------------------------------------------------------
iter bf terms mse gcv rsq grsq
------------------------------------------------------
0 - 32 696100.61 819749.831 0.859 0.835
1 16 31 696100.61 815319.957 0.859 0.836
2 29 30 696100.61 810925.893 0.859 0.837
3 9 29 696100.61 806567.256 0.859 0.837
4 2 28 696100.61 802243.666 0.859 0.838
5 20 27 696100.61 797954.747 0.859 0.839
6 4 26 696100.61 793700.130 0.859 0.840
7 17 25 696100.61 789479.451 0.859 0.841
8 23 24 696100.61 785292.349 0.859 0.842
9 27 23 696100.61 781138.470 0.859 0.843
10 3 22 696100.61 777017.461 0.859 0.843
11 14 21 696100.61 772928.979 0.859 0.844
12 30 20 696100.61 768872.680 0.859 0.845
13 15 19 696310.24 765078.566 0.859 0.846
14 18 18 697149.10 762001.321 0.859 0.846
15 21 17 698309.49 759295.329 0.859 0.847
16 26 16 699773.34 756935.402 0.859 0.847
17 13 15 703223.30 756726.922 0.858 0.847
18 7 14 706929.75 756785.047 0.857 0.847
19 12 13 711449.91 757699.103 0.856 0.847
20 31 12 712267.50 754670.755 0.856 0.848
21 28 11 714429.27 753080.363 0.856 0.848
22 19 10 719584.21 754635.325 0.855 0.848
23 22 9 724988.15 756424.389 0.854 0.848
24 24 8 725762.72 753379.932 0.853 0.848
25 10 7 732099.50 756101.172 0.852 0.848
26 8 6 742110.69 762560.854 0.850 0.846
27 5 5 758739.65 775711.443 0.847 0.844
28 6 4 879885.89 895036.848 0.822 0.820
29 11 3 1116514.74 1130034.475 0.774 0.772
30 25 2 2804385.57 2824119.811 0.434 0.431
31 1 1 4950518.32 4960434.233 0.000 0.000
--------------------------------------------------------
Selected iteration: 21
Forward Pass
---------------------------------------------------------------------------
iter parent var knot mse terms gcv rsq grsq
---------------------------------------------------------------------------
0 - - - 4950518.324962 1 4960434.233 0.000 0.000
1 0 7 311 2737418.709755 3 2770565.758 0.447 0.441
2 0 3 -1 1022465.093864 4 1040071.158 0.793 0.790
3 0 6 320 754542.772780 6 775335.527 0.848 0.844
4 0 7 200 740470.232168 8 768647.106 0.850 0.845
5 0 7 554 729236.351357 10 764757.622 0.853 0.846
6 0 6 255 722316.228852 12 765317.709 0.854 0.846
7 0 8 -1 720316.871586 13 767142.482 0.854 0.845
8 0 8 405 717697.402167 15 772302.261 0.855 0.844
9 0 8 661 715006.680289 17 777450.747 0.856 0.843
10 0 2 -1 713035.003056 18 779365.005 0.856 0.843
11 0 2 514 710128.220766 20 784366.771 0.857 0.842
12 0 3 217 708664.271302 22 791041.559 0.857 0.841
13 0 3 434 705142.169448 24 795492.407 0.858 0.840
14 0 3 672 702191.404756 26 800644.908 0.858 0.839
15 0 3 604 700187.073599 28 806953.241 0.859 0.837
16 0 8 331 698867.906750 30 814149.668 0.859 0.836
17 0 8 981 696100.611195 32 819749.831 0.859 0.835
---------------------------------------------------------------------------
Stopping Condition 0: Reached maximum number of terms
Pruning Pass
--------------------------------------------------------
iter bf terms mse gcv rsq grsq
--------------------------------------------------------
0 - 32 696100.61 819749.831 0.859 0.835
1 16 31 696100.61 815319.957 0.859 0.836
2 29 30 696100.61 810925.893 0.859 0.837
3 9 29 696100.61 806567.256 0.859 0.837
4 2 28 696100.61 802243.666 0.859 0.838
5 20 27 696100.61 797954.747 0.859 0.839
6 4 26 696100.61 793700.130 0.859 0.840
7 17 25 696100.61 789479.451 0.859 0.841
8 23 24 696100.61 785292.349 0.859 0.842
9 27 23 696100.61 781138.470 0.859 0.843
10 3 22 696100.61 777017.461 0.859 0.843
11 14 21 696100.61 772928.979 0.859 0.844
12 30 20 696100.61 768872.680 0.859 0.845
13 15 19 696310.24 765078.566 0.859 0.846
14 18 18 697149.10 762001.321 0.859 0.846
15 21 17 698309.49 759295.329 0.859 0.847
16 26 16 699773.34 756935.402 0.859 0.847
17 13 15 703223.30 756726.922 0.858 0.847
18 7 14 706929.75 756785.047 0.857 0.847
19 12 13 711449.91 757699.103 0.856 0.847
20 31 12 712267.50 754670.755 0.856 0.848
21 28 11 714429.27 753080.363 0.856 0.848
22 19 10 719584.21 754635.325 0.855 0.848
23 22 9 724988.15 756424.389 0.854 0.848
24 24 8 725762.72 753379.932 0.853 0.848
25 10 7 732099.50 756101.172 0.852 0.848
26 8 6 742110.69 762560.854 0.850 0.846
27 5 5 758739.65 775711.443 0.847 0.844
28 6 4 879885.89 895036.848 0.822 0.820
29 11 3 1116514.74 1130034.475 0.774 0.772
30 25 2 2804385.57 2824119.811 0.434 0.431
31 1 1 4950518.32 4960434.233 0.000 0.000
--------------------------------------------------------
Selected iteration: 21
Earth Model
-------------------------------------
Basis Function Pruned Coefficient
-------------------------------------
(Intercept) No 4696.33
h(x7-35.0812) No -62.1822
h(35.0812-x7) Yes None
x3 Yes None
h(x6-4.22723) Yes None
h(4.22723-x6) No 43.8207
h(x7-1.00788) No -36.4057
h(1.00788-x7) Yes None
h(x7-53.3666) No -56.807
h(53.3666-x7) Yes None
h(x6-66.34) No -976.209
h(66.34-x6) No -28.1518
x8 Yes None
h(x8-51.3231) Yes None
h(51.3231-x8) Yes None
h(x8+6.2673) Yes None
h(-6.2673-x8) Yes None
x2 Yes None
h(x2+9.81187) Yes None
h(-9.81187-x2) No -246.073
h(x3-66.4284) Yes None
h(66.4284-x3) Yes None
h(x3-63.2394) No 261.51
h(63.2394-x3) Yes None
h(x3-65.8573) No -1016.35
h(65.8573-x3) No -53.5466
h(x3-62.6728) Yes None
h(62.6728-x3) Yes None
h(x8+11.7519) Yes None
h(-11.7519-x8) Yes None
h(x8+10.5919) Yes None
h(-10.5919-x8) Yes None
-------------------------------------
MSE: 714429.2661, GCV: 753080.3626, RSQ: 0.8557, GRSQ: 0.8482
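The two outputs above can also be compared mechanically to locate the first diverging row. The sketch below assumes each computer's output has been saved as text. `first_difference` is a hypothetical helper, and the sample rows are copied from the forward-pass tables above.

```python
from itertools import zip_longest


def first_difference(text_a, text_b):
    """Return (line_number, line_a, line_b) for the first differing line,
    or None if the texts are identical. Line numbers start at 1."""
    lines_a = text_a.splitlines()
    lines_b = text_b.splitlines()
    pairs = zip_longest(lines_a, lines_b)
    for number, (line_a, line_b) in enumerate(pairs, start=1):
        if line_a != line_b:
            return number, line_a, line_b
    return None


# Iterations 5 and 6 of the forward pass on computer A and computer B.
trace_a = """\
5 0 7 554 729236.351357 10 764757.622 0.853 0.846
6 0 6 255 722135.459949 12 765126.178 0.854 0.846
"""
trace_b = """\
5 0 7 554 729236.351357 10 764757.622 0.853 0.846
6 0 6 255 722316.228852 12 765317.709 0.854 0.846
"""

print(first_difference(trace_a, trace_b))
```

For these two rows the helper reports line 2, the row for iteration 6, which matches where the forward passes start to differ.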