Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#944] Implement authorship analysis #2030

Closed

Conversation

SkyBlaise99
Copy link
Contributor

Fixes #944

Proposed commit message

A line is credited to the author who last modified it.

Another author might have written the line initially and the current
author only modified it slightly. In such a case, the current author
gets credited for work that is not entirely done by them.

Lets analyze how similar a line is to its ancestor line (previous
version of the line) and give partial or full credit to the author
based on the analysis. This feature is turned off by default as
it takes quite long to run.

Other information

There is a small change as compared to the initial idea. I have set the default value for isFullCredit to true in LineInfo.java. When the feature is turned off, the default behavior should be that the last author is given full credit.

Sample preview:

image

@SkyBlaise99 SkyBlaise99 marked this pull request as ready for review August 24, 2023 08:29
@chan-j-d chan-j-d requested a review from a team August 24, 2023 17:24
@damithc
Copy link
Collaborator

damithc commented Sep 4, 2023

@SkyBlaise99 How is the performance affected? For example, for https://github.com/nus-cs2103-AY2223S2/tp-dashboard
You can compare locally and also on GitHub Actions environment, which has limitations on minutes and memory.

getDeletedLineWithHighestSimilarityInDiff returning null
@SkyBlaise99
Copy link
Contributor Author

SkyBlaise99 commented Sep 5, 2023

Here's a comparison:

Local GitHub Actions
Normal 14min 4min
With Authorship Analysis > 2h 1h 30min

I'm not sure why but somehow running locally is slower.

P.S. currently looking into solutions to lower the timing, but that will probably be in another pr.

@damithc
Copy link
Collaborator

damithc commented Sep 7, 2023

Thanks for the performance data, @SkyBlaise99
Given this huge disparity of performance, we might not merge this to the master branch yet, until we are sure there is a solution that has acceptable performance tradeoffs. In the meantime, we can merge this to a separate feature branch and you can this code in a separate long-lived feature branch and you can PR against that.

@reposense/active-developers can you guys set up such a long-lived feature branch and merge this code to that branch?

@damithc
Copy link
Collaborator

damithc commented Sep 7, 2023

Others, you are welcome and encouraged to give feedback on this work, but there is no need for a thorough code review until we are ready to merge this work to the master branch.

@chan-j-d
Copy link
Contributor

chan-j-d commented Sep 8, 2023

Hi @SkyBlaise99, I've created a branch on the repo for you to work with at https://github.com/reposense/RepoSense/tree/944-analyze-authorship. Do let me know if you need any additional support for it

@SkyBlaise99
Copy link
Contributor Author

@damithc I explored around with caching and the performance it now around 1h. I'll continue to try and see if it can be further reduced.

@chan-j-d thanks. So for any future works, I'll just push to the branch instead? Then this PR can be closed first?

@ckcherry23
Copy link
Member

@SkyBlaise99 For any future work, I think the recommended workflow is to create PRs with 944-analyze-authorship as the target branch instead of the master branch. The team may also give some preliminary feedback for each PR but we don't need a thorough code review to merge to this new branch.

@SkyBlaise99
Copy link
Contributor Author

@ckcherry23 noted with thanks. Then in that case should I close this PR since it is already on 944-analyze-authorship branch?

@gok99
Copy link
Contributor

gok99 commented Oct 1, 2023

@SkyBlaise99 you could mark it as draft - it would be a long-term PR that you can merge into, and mark back as open when ready

@chan-j-d
Copy link
Contributor

chan-j-d commented Oct 5, 2023

@SkyBlaise99 Yep, closing this PR sounds like a good idea. If you need some comments from us, could PR to the other branch instead and ping us.

@SkyBlaise99
Copy link
Contributor Author

Okay, then closing this PR and further improvements will be made towards this branch instead.

@SkyBlaise99 SkyBlaise99 closed this Oct 7, 2023
@github-actions
Copy link
Contributor

github-actions bot commented Oct 7, 2023

The following links are for previewing this pull request:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve code authorship attribution
5 participants