
Paper: Computational Resource Optimisation in Feature Selection under Class Imbalance Conditions #947

Open · wants to merge 118 commits into base: 2024

Conversation

@AmadiGabriel commented Jun 8, 2024

If you are creating this PR in order to submit a draft of your paper, please name your PR with Paper: <title>. An editor will then add a paper label and GitHub Actions will be run to check and build your paper.

See the project readme for more information.

Editor: Chris Calloway @cbcunc

Reviewers:

@ameyxd self-assigned this Jun 8, 2024
@ameyxd added the paper label Jun 8, 2024
@mepa changed the title from "paper: Computational Resource Optimisation in Feature Selection under Class Imbalance Conditions" to "Paper: Computational Resource Optimisation in Feature Selection under Class Imbalance Conditions" Jun 9, 2024

github-actions bot commented Jun 9, 2024

Curvenote Preview

Directory: papers/amadi_udu
Preview: 🔍 Inspect
Checks: 80 checks passed (4 optional)
Updated (UTC): Jul 5, 2024, 9:51 AM

@cbcunc (Member) commented Jun 20, 2024

Review reminder sent to @janeadams

@janeadams commented Jun 26, 2024

A succinct and interesting read on evaluating permutation feature importance (PFI) impacts on three different classification models (Random Forest, LightGBM, and SVM) with varying proportions of subsampled data featuring unbalanced classes. I have minor comments, but overall I think this is a great contribution.

  • The dual axes in the processing time figure were odd to me at first; it might be valuable to explain that SVM's poor performance relative to the other two methods is likely due to its poor parallelizability.
  • The "decrease in AUC" figures are confusing in that negative x-axis values must therefore indicate an increase in AUC (correct me if I am misunderstanding). This forces the reader to work through a "double negative makes a positive," which adds possibly unnecessary complexity to interpretation. I would recommend either 1) changing the axis/measure to simply "change in AUC" and/or 2) adding annotations directly onto the white space with an arrow indicating "poorer performance this direction" or similar; see the sketch after this list.

I particularly appreciated the pre-filtering step of using hierarchical clustering of features to account for potential collinearities. I also appreciated that the authors used multiple datasets and evaluated a range of sample proportions. This is a nice example of how many scientific computing Python libraries can come together into a single interesting experiment.
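For readers unfamiliar with that pre-filtering step, a minimal sketch of such a pipeline follows. This is not the authors' code: the synthetic dataset, clustering threshold, and model choice are assumptions for illustration, built on scipy.cluster.hierarchy and sklearn.inspection.permutation_importance:

```python
# Sketch: cluster features on Spearman correlation to handle collinearity,
# keep one representative per cluster, then score PFI by ROC AUC.
import numpy as np
from scipy.cluster import hierarchy
from scipy.spatial.distance import squareform
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data standing in for the paper's datasets
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

# Hierarchical clustering on the Spearman correlation matrix
corr = spearmanr(X).correlation
corr = (corr + corr.T) / 2      # enforce symmetry
np.fill_diagonal(corr, 1.0)
linkage = hierarchy.ward(squareform(1 - np.abs(corr)))
cluster_ids = hierarchy.fcluster(linkage, t=0.5, criterion="distance")

# Keep the first feature in each cluster as its representative
keep = [np.flatnonzero(cluster_ids == c)[0] for c in np.unique(cluster_ids)]

X_tr, X_te, y_tr, y_te = train_test_split(X[:, keep], y, stratify=y,
                                          random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Permutation importance = mean drop in AUC when each feature is shuffled
pfi = permutation_importance(model, X_te, y_te, scoring="roc_auc",
                             n_repeats=10, random_state=0)
print(dict(zip(keep, pfi.importances_mean)))
```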

@AmadiGabriel (Author) commented:

Thank you for the encouraging comments and observations on the paper, @janeadams. We are currently addressing the comments raised by @apaleyes and expect to respond to all outstanding observations early next week, updating the paper accordingly.

Labels: paper, ready-for-review