Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#1546] RepoCloner: fix local repos relative pathing and spaces #1562

Merged
merged 39 commits into from
Aug 22, 2021

Conversation

chan-j-d
Copy link
Contributor

@chan-j-d chan-j-d commented Jul 10, 2021

Fixes #1546

For relative paths, the issues lies in RepoCloner's various process spawning methods. While the local repository's location is correctly verified in the RepoLocation class, the subsequent processes spawned have their root path changed resulting in --repo arguments with relative pathing to fail. This PR addresses this by adjusting the root path back to the user directory and the output directory is changed to account for this. Related methods with this issue have also been updated.

For paths containing spaces, the issue comes from the reconstruction of the Path objects into a string to be executed as a command. While local repos with paths containing spaces (when inside quotation marks) are accepted initially when executing the jar file, the program will fail to clone them. This is fixed by adding quotation marks to the various git clone commands that accept paths.

GitClone::cloneBare's output path has been altered to match GitClone::cloneBareAsync's logic more closely. At present, cloneBare's actual output folder is relative to the location specified in the config file. It is difficult to specify an exact output directory for a repo to be cloned to. This is necessary for the test case that I have written.

Due to how commands are reconstructed as strings to be ran on CMD in windows, there is a particular issue with repo paths with trailing backslashes that arises from placing paths in quotation marks. Taken from https://ss64.com/nt/syntax-esc.html:

The \ escape can cause problems with quoted directory paths that contain a trailing backslash because the closing quote " at the end of the line will be escaped \".

As such, there is a need to remove trailing backslashes for paths on windows computers.

Commit message:

RepoCloner: fix local repos relative pathing and spaces

At present, local repositories specified by relative pathing or paths
with spaces are not cloned properly even though their location is
verified.

Relative pathing issue is caused by processes being spawned at a
different root path. Issue relating to paths with spaces are caused by
RepoLocation objects converted back into strings to be used in a git
command, thus no longer wrapped in quotation marks. 

Let's fix these issues with local repos pathing.

@chan-j-d chan-j-d force-pushed the 1546-relative-paths-dont-work branch from 8ca899e to a36025a Compare July 11, 2021 15:05
@chan-j-d
Copy link
Contributor Author

Hi @dcshzj, I would like advice regarding the following issues.

  1. codecov failure 1: The private methods that create the commands in GitClone class are marked as not covered in tests but I am not entirely sure if they require tests. I am not sure how to proceed for this.
  2. codecov failure 2: The main public-facing method in the RepoClone class which is the cloneBare method only has 1 test case which does not really cover its functionality (which further contributes to the codecov %). I also don't think its appropriate to include the new test cases in this PR as they are only tangentially related.
  3. In issue Generating reports for local repos: relative paths don't work #1546, it is stated that trailing backslashes is an issue. After some testing, I've realised that windows allows for a trailing \ whereas in Ubuntu 18.04, having a trailing backslash indicates the command is not complete, regardless on whether or not it is wrapped by quotation marks or not. As can be seen in the images below
    Ubuntu 18.04
    ubuntu
    Windows Powershell
    windows
    So the only way I see this becoming an issue is when a local path with a trailing backslash is used in a config file. In which case I think the only solution is to manually check if the last character of a path given is a backslash and remove it. The solution seems slightly crude so I would like your input.

@chan-j-d
Copy link
Contributor Author

chan-j-d commented Jul 11, 2021

Also related is a suggestion to create some kind of GitCommandBuilder class similar to InputBuilder in the test code which has methods like addPath(String path) which automatically wraps the path with quotation marks. Here, we can also add the change of removing trailing backslashes so that the command can be ran on linux machines. Right now, manually adding quotes seems like a very crude way to achieve it and future contributors might forget to add them when creating new commands. But this would require replacing all existing git command strings and might quickly become obsolete if we move away from CommandRunner as in issue #442.

@dcshzj
Copy link
Member

dcshzj commented Jul 11, 2021

  1. codecov failure 1: The private methods that create the commands in GitClone class are marked as not covered in tests but I am not entirely sure if they require tests. I am not sure how to proceed for this.

Looks okay to me. Your PR just happens to touch files that were not previously covered by tests, so it should be fine. We can go as far as testing the string outputs, but that should be in a separate PR.

  1. codecov failure 2: The main public-facing method in the RepoClone class which is the cloneBare method only has 1 test case which does not really cover its functionality (which further contributes to the codecov %). I also don't think its appropriate to include the new test cases in this PR as they are only tangentially related.

For this, the methods mainly pertain to shallow cloning and the tests should be covered in the system tests under testSinceBeginningDateRangeWithShallowCloning and test30DaysFromUntilDateWithShallowCloning. However, it looks like a regression from #1478 as the two tests skip cloning entirely. This is a bug we have to fix in another pull request.

So the only way I see this becoming an issue is when a local path with a trailing backslash is used in a config file. In which case I think the only solution is to manually check if the last character of a path given is a backslash and remove it. The solution seems slightly crude so I would like your input.

It does seem like the only way to me, so it should be fine.

Also related is a suggestion to create some kind of GitCommandBuilder class similar to InputBuilder in the test code which has methods like addPath(String path) which automatically wraps the path with quotation marks. Here, we can also add the change of removing trailing backslashes so that the command can be ran on linux machines. Right now, manually adding quotes seems like a very crude way to achieve it and future contributors might forget to add them when creating new commands. But this would require replacing all existing git command strings and might quickly become obsolete if we move away from CommandRunner as in issue #442.

I doubt issue #442 would be resolved so soon, so it is fine to proceed with the assumption that we are keeping to the current way of forming Git commands. However, I am not sure about the utility of having a GitCommandBuilder, as currently we are only solving the issue of adding quotation marks around paths and removing backslashes. There are many Git commands that we use, and each of them have many more possible options, which may cause a GitCommandBuilder file to get too big quickly.

An ideal case is that we would have tests to catch such issues, but even without it, the many addQuote or addPath function usages should be sufficient to remind either contributors or reviewers the need to handle such issues when dealing with paths.

Nonetheless, I appreciate the suggestion! It is great to hear new ideas and discuss them, and the project benefits from such fresh perspectives.

@chan-j-d
Copy link
Contributor Author

@dcshzj Ready for review.

Added an additional paragraph in the OP "Due to how..." to explain the changes in addressing a peculiar issue faced with paths with trailing backslashes. Besides that, I have been unable to replicate the issue with the trailing backslash on the master branch (the issue mentioned above is due to adding quotation marks to handle spaces in paths, something which is not present in the master branch) so it has been left out of the commit message unless there is an update.

@chan-j-d chan-j-d marked this pull request as ready for review July 15, 2021 15:34
@dcshzj dcshzj requested a review from a team July 16, 2021 01:15
@chan-j-d
Copy link
Contributor Author

Requesting a review for this PR

Copy link
Contributor

@Tejas2805 Tejas2805 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! One question before I approve.

@@ -22,4 +31,17 @@ public void repoCloner_emptyRepo_failsGracefully() throws Exception {

Assert.assertNull(clonedRepoLocation);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any test for nonEmptyRepo but an invalid path for any file or folder? Or will all repos fall only in these two categories -> empty and nonEmpty with everything right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not too sure of this. I can see some issues where a linux repo might cause some path issues when cloned into a windows computer. But otherwise, I am not sure how this can occur since the list of files is obtained with a git command which should also be there if it was cloned successfully.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Tejas2805 Any updates regarding this?

@dcshzj dcshzj requested a review from a team August 5, 2021 15:20
@chan-j-d
Copy link
Contributor Author

chan-j-d commented Aug 6, 2021

Link below provides a more accurate file diff as merging with current master seems to have caused merged changes to show up in the file diff.
master...chan-j-d:1546-relative-paths-dont-work

@chan-j-d
Copy link
Contributor Author

@fzdy1914 Requesting a review as it has gotten the initial approval.

@dcshzj dcshzj changed the title [#1546] Generating reports for local repos: relative paths don't work [#1546] RepoCloner: fix local repos relative pathing and spaces Aug 22, 2021
@dcshzj dcshzj merged commit 2c05624 into reposense:master Aug 22, 2021
@chan-j-d chan-j-d deleted the 1546-relative-paths-dont-work branch March 23, 2022 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generating reports for local repos: relative paths don't work
4 participants