Enable more rigorous authorship detection by default #461
Comments
@yamidark what do you think? I remember I asked you to switch off this feature to make the analysis faster. Can we bring it back?
Yes, we can bring this back if we want. Currently, the 'git blame' command we use only ignores whitespace changes, we can track code movement/copied using the This could make the analysis more 'accurate', but would take a (much) larger amount of time depending on the 'level' of code movement we want to track (currently there are 3 levels for the |
Perhaps we can try different settings on the cs2103 dataset to see how much of a time difference it makes?
We can use a smaller set (e.g., 10 teams) and extrapolate.
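A minimal sketch of that extrapolation, assuming analysis time grows roughly linearly with the number of teams (an assumption, not something measured in this thread). The function name `extrapolate_runtime` and the numbers in the usage line are placeholders for illustration only.

```python
def extrapolate_runtime(sample_seconds, sample_teams, total_teams):
    """Linearly scale a runtime measured on a small sample to the full dataset.

    Assumes each team's repo costs about the same to analyze, which will
    not hold if a few repos are much larger than the rest.
    """
    if sample_teams <= 0:
        raise ValueError("sample_teams must be positive")
    return sample_seconds / sample_teams * total_teams


# Placeholder values: 150 s measured on 10 teams, scaled to a hypothetical 100 teams.
estimate = extrapolate_runtime(150, 10, 100)  # 1500.0 seconds
```

If repo sizes vary a lot between teams, scaling by total line count instead of team count would likely give a better estimate.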
Tested using these config files (config.zip). No move/copy detection: ~2min30s. Seems setting at least Will also try testing the performance on a much larger repo (Teammates).
Thanks for the update @yamidark
Tested with the same config file given: The largest amount of authorship line changes occurs from Also tested on TEAMMATES repo with |
I see.
Copy and move detection is done together ( A summary of what each 'level' of copy and move detection does can be found in my comment above; more details can be found in the git blame docs.
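As a rough illustration of what such levels could look like, here is a sketch that maps a detection level to `git blame` options. It assumes the options under discussion are the documented `-w` (ignore whitespace), `-M` (detect lines moved or copied within a file) and `-C` (also look in other files; it may be given up to three times to widen the search). The `LEVEL_FLAGS` scale and the `blame_command` helper are hypothetical names, not the project's actual code.

```python
# Hypothetical level scale; each step enables a more expensive git blame search.
LEVEL_FLAGS = {
    0: ["-w"],                          # ignore whitespace changes only
    1: ["-w", "-M"],                    # + moved/copied lines within the same file
    2: ["-w", "-M", "-C"],              # + other files modified in the same commit
    3: ["-w", "-M", "-C", "-C"],        # + files in the commit that created the file
    4: ["-w", "-M", "-C", "-C", "-C"],  # + files in any commit (slowest)
}


def blame_command(path, level=0):
    """Build the argument list for `git blame` at the given detection level."""
    if level not in LEVEL_FLAGS:
        raise ValueError(f"unknown detection level: {level}")
    # "--" separates options from the file path.
    return ["git", "blame", "--line-porcelain", *LEVEL_FLAGS[level], "--", path]
```

The returned list can be passed to `subprocess.run(..., cwd=repo_dir)` and timed per level to compare cost against accuracy.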
For our use case, it is reasonable not to credit the author for moving code, but it is also reasonable to credit the author for copied code, as the copy could be used for a different purpose.
Yes, I agree we should credit the original author for code that is moved, but copied code could be credited to the person who copies.
Yes, that sounds good to me.
Current: the last person who modified a line is identified as its author.
Suggested: use git features to ignore whitespace changes, code movement, etc. by default, but provide a way to switch to the faster (less accurate) method.
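One way the suggested default could be exposed is an opt-out command-line flag. This is only a sketch: the flag name `--fast-blame`, the helper names, and the choice of `-w -M -C` as the rigorous default are all assumptions for illustration, not the project's actual interface.

```python
import argparse


def build_parser():
    """Parser with a hypothetical opt-out switch for rigorous authorship detection."""
    parser = argparse.ArgumentParser(description="Authorship analysis (sketch)")
    parser.add_argument(
        "--fast-blame",
        action="store_true",
        help="skip whitespace/move/copy detection for a faster, less accurate run",
    )
    return parser


def blame_flags(fast):
    """Return extra `git blame` options: none in fast mode, rigorous otherwise."""
    # Plain `git blame` credits the last person who modified each line.
    return [] if fast else ["-w", "-M", "-C"]


args = build_parser().parse_args([])     # no flag given: rigorous by default
default_flags = blame_flags(args.fast_blame)
```

Making the slower mode the default and the faster mode opt-in matches the suggestion above: accuracy first, with speed available on request.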