get_bugbug_labels no longer adds nobug type to regression training data #3396
Open
avinashselvam wants to merge 1 commit into mozilla:master from
Collaborator
@avinashselvam this is part of the request from #539. The other part is not to consider bugs with type "enhancement" or "task" as label 0 for the regression model.
Member
@avinashselvam are you still interested in working on this? If so, I will be glad to help.
#539
Modified get_bugbug_labels in defect.py to include only those data points that are labelled either regression or bug_no_regression in the training set.
Training the model without changes
72486 non-regression bugs
Cross Validation scores:
Accuracy: f0.9731263445549161 (+/- 0.0012810455820845609)
Precision: f0.9560802008310938 (+/- 0.006503421458310747)
Recall: f0.9316432362619518 (+/- 0.0042866900183067425)
Training the model after changes
71597 non-regression bugs (889 dropped)
Cross Validation scores:
Accuracy: f0.9739072259525028 (+/- 0.0019480324611321944)
Precision: f0.9561803892880535 (+/- 0.006928496874119621)
Recall: f0.9358629670750973 (+/- 0.0045683573571298)
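Minor improvement in accuracy and recall (recall goes from 0.9316 to 0.9359); precision is essentially unchanged (0.9561 to 0.9562). All three differences are smaller than the reported +/- spread.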
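To put the two runs side by side, here is a small sketch that computes the change in each score and checks it against the reported +/- spread. The numbers are copied from the cross-validation output above; the helper name compare_runs is hypothetical, and comparing against the larger of the two spreads is only a rough sanity check, not a statistical test.

```python
# Scores copied from the two cross-validation runs above: (mean, +/- spread).
before = {
    "accuracy": (0.9731263445549161, 0.0012810455820845609),
    "precision": (0.9560802008310938, 0.006503421458310747),
    "recall": (0.9316432362619518, 0.0042866900183067425),
}
after = {
    "accuracy": (0.9739072259525028, 0.0019480324611321944),
    "precision": (0.9561803892880535, 0.006928496874119621),
    "recall": (0.9358629670750973, 0.0045683573571298),
}


def compare_runs(before, after):
    """Return {metric: (delta, exceeds_spread)} for each metric.

    exceeds_spread is True when the absolute change is larger than the
    larger of the two reported spreads (an assumption made for this sketch).
    """
    result = {}
    for name, (mean_before, spread_before) in before.items():
        mean_after, spread_after = after[name]
        delta = mean_after - mean_before
        spread = max(spread_before, spread_after)
        result[name] = (delta, abs(delta) > spread)
    return result


comparison = compare_runs(before, after)
for name, (delta, exceeds) in comparison.items():
    print(f"{name}: delta={delta:+.4f}, exceeds spread: {exceeds}")
```

With these numbers every metric moves up, but none of the changes is larger than the reported spread.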
Should the categories task, enhancement, and feature also be removed from the training data for regression?
Please let me know if I have misunderstood the task.
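For reference, the filtering described in this PR can be sketched roughly as below. This is a minimal illustration and not the code from the commit: the function name build_regression_classes and the sample bug IDs are hypothetical, and it assumes each bug carries a single category string such as regression, bug_no_regression, or nobug.

```python
# Hypothetical sketch of the label filtering; not the actual defect.py code.
# Only these two categories are kept for the regression model.
REGRESSION_CLASS_BY_CATEGORY = {
    "regression": 1,         # positive class
    "bug_no_regression": 0,  # negative class
}


def build_regression_classes(categories):
    """Map bug id -> class, dropping bugs in any other category (e.g. nobug)."""
    return {
        bug_id: REGRESSION_CLASS_BY_CATEGORY[category]
        for bug_id, category in categories.items()
        if category in REGRESSION_CLASS_BY_CATEGORY
    }


# Made-up bug IDs, for illustration only.
sample = {101: "regression", 102: "bug_no_regression", 103: "nobug"}
classes = build_regression_classes(sample)
print(classes)
```

In this sketch the nobug entry is dropped instead of being counted as label 0, which is the behaviour the PR title describes.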