
feat: implement two-person reviewed check #439

Closed
wants to merge 9 commits into from

Conversation

Yao-Wen-Chang
Contributor

  • Modify the design logic in two-person-reviewed-check.py:
  1. Fetch the numbers of all PRs; a target branch can be specified for the check.
  2. Check that a merge time exists; this step keeps only the numbers of the merged and closed PRs.
  3. Query each PR and count its number of reviewers.
  • Add a parameter to defaults.ini to specify the number of reviewers required (1 reviewer is required based on the SLSA document; please check out the link in the comment).
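The steps above can be sketched roughly as follows. This is a hypothetical, standard-library-only sketch against the GitHub REST API; the real check lives in two-person-reviewed-check.py and uses Macaron's own API client, and names such as `check_merged_prs` are made up for illustration. The `min_reviewers` argument mirrors the proposed defaults.ini parameter.

```python
import json
import urllib.request

API = "https://api.github.com"


def get_json(url: str, token: str):
    """Minimal GitHub REST helper (Macaron uses its own client instead)."""
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def meets_review_requirement(reviews, author, min_reviewers=1):
    """Step 3: count distinct reviewers other than the PR author."""
    reviewers = {r["user"]["login"] for r in reviews}
    reviewers.discard(author)  # the uploader does not count as a reviewer
    return len(reviewers) >= min_reviewers


def check_merged_prs(owner, repo, token, base="main", min_reviewers=1):
    """Steps 1-2: fetch closed PRs targeting `base` and keep only merged ones."""
    prs = get_json(f"{API}/repos/{owner}/{repo}/pulls?state=closed&base={base}", token)
    results = {}
    for pr in prs:
        if not pr.get("merged_at"):  # closed but never merged -> skip
            continue
        reviews = get_json(
            f"{API}/repos/{owner}/{repo}/pulls/{pr['number']}/reviews", token
        )
        results[pr["number"]] = meets_review_requirement(
            reviews, pr["user"]["login"], min_reviewers
        )
    return results
```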

@oracle-contributor-agreement oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Aug 27, 2023
@Yao-Wen-Chang
Contributor Author

Yao-Wen-Chang commented Aug 27, 2023

The following is the requirement from SLSA.

two-person review requirement
In the link above, the author mentions:
The following combinations are acceptable:

  • Uploader and reviewer are two different trusted persons.
  • Two different reviewers are trusted persons.

Does this mean only one authenticated reviewer is acceptable, since the Uploader is the contributor rather than the maintainer?

@tromai
Member

tromai commented Aug 29, 2023

@Yao-Wen-Chang There is no need to create another PR if the pipeline is failing. We have set up the pipeline so that it automatically runs again when you make changes to the PR (push a new commit, rebase + force-push, etc.). So if the pipeline is failing, please create a fix commit and push. You could also rebase and force-push if you want to keep the commit list tidy (this only applies while the PR is still in draft mode; after you have marked a PR as ready for review, new commits shouldn't be squashed with previous ones, so that it's easier for us to review your changes).
Could you please close all duplicated PRs that you will not continue working on, leaving only the one you are currently working on? If this PR is not ready yet, feel free to mark it as a Draft PR (see instructions here).
If you have any other questions, please feel free to ask here (if it's related to the PR) or create a GitHub issue (if it's not entirely related to this PR).
Thanks for your contribution!

@Yao-Wen-Chang Yao-Wen-Chang marked this pull request as draft August 30, 2023 08:31
@Yao-Wen-Chang
Contributor Author

Thanks @tromai! I have closed the duplicated one and converted this one to a draft.

@@ -23,37 +20,21 @@
logger: logging.Logger = logging.getLogger(__name__)


class TwoPersonReviewedTable(CheckFacts, ORMBase):
Member

Why did you remove the table? This table is necessary for users to write policies and validate the check results using the policy engine.

Contributor Author

All right, I will restore it.
I misunderstood what this table is used for; I didn't realize it stores the results.

Member

It is used to persist the results in a database, which is used by the policy engine separately.

Contributor Author

I decided to remove it since I also checked vcs_check.py and there is no table defined there. Does this mean the results from the VCS check will not be consumed by the policy engine?

Member

That's correct. We should add a table for the VCS check too (not added yet because we haven't had a use case so far). But in your case we need to make the check result available to the policy engine.

Contributor Author

Error: FAILED integration test (line 37); Error: FAILED integration test (line 44).
Some other similar errors result from the analysis function.

Member

In Macaron 0.3.0 we changed the data model and the DB constraints have changed. You should make sure to remove any existing macaron.db file from your output directory to avoid this error.
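For example, assuming the default output directory is named `output` (adjust the path if you configured a different one):

```shell
# Delete the stale database so Macaron recreates it with the new schema.
rm -f output/macaron.db
```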

Contributor Author

Hi there. The check is implemented using multi-threading to tackle the I/O-bound issue. Since the check requires scanning through numerous PRs, GitHub's API rate-limit mechanism forces the system to sleep for a long time.
Due to this issue, the integration test gets canceled.
Also, the large number of API requests causes unstable results: the expected results might not be generated, which makes the integration test fail.
Would it be possible to handle the rate limit and fetch the data in a stable way?

Member

Given this issue, I guess that we are trying to check too many PRs, which results in non-deterministic results. Is there a way to limit the range of the opened/closed PRs so that we don't have to perform too many queries?

The check is implemented using multi-threading to tackle the I/O-bound issue.

I don't think it's possible to implement the checks using multi-threading, as we haven't considered multi-threading inside checks for our current infrastructure.

In addition, I also have some questions on the logic of your proposed implementation:

  1. Fetching all PRs number, and it is able to specify target branch — does this mean that we will check all PRs that exist inside a repository?
  2. Checking commit time is existed, and this step will preserve the merge and closed PRs — what do you mean by commit time here, and what is the meaning of preserving the merged and closed PRs?

Would it be possible to handle the rate limit and fetch the data in a stable way?

Given the nature of the GitHub API, the current solution is to check whether we have exceeded the GitHub API limit and wait until the rate limit is lifted (implementation here). However, this implementation is not ideal either.
I guess that for your check we might need to think of a better heuristic to query only a subset of PRs, to reduce the number of queries. (cc @behnazh-w)
In terms of user experience, taking too much time on a single check wouldn't be ideal after all.

Contributor Author

@Yao-Wen-Chang Yao-Wen-Chang Sep 15, 2023

Given this issue, I guess that we are trying to check too many PRs, which results in non-deterministic results. Is there a way to limit the range of the opened/closed PRs so that we don't have to perform too many queries?

I will try moving from REST to GraphQL, since it allows fetching exactly the content we need. Currently, I have set the state parameter to 'closed' only. I am also wondering whether we should check only the critical or complex changes.
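With GraphQL, a single query can return merged PRs together with their reviews, replacing one REST round-trip per PR. A hypothetical sketch of such a query (field names follow GitHub's GraphQL schema; the page size and review limit are illustrative choices):

```python
# POST this to https://api.github.com/graphql with an Authorization header,
# paginating with the pageInfo cursor until hasNextPage is false.
PR_REVIEWS_QUERY = """
query($owner: String!, $name: String!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    pullRequests(states: MERGED, first: 100, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes {
        number
        author { login }
        reviews(first: 10) {
          nodes { author { login } state }
        }
      }
    }
  }
}
"""
```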

I don't think it's possible to implement the checks using multi-threading as we haven't considered multi-threading inside checks for our current infrastructure.

Alright, I'll modify my design.

  1. Fetching all PRs number, and it is able to specify target branch — does this mean that we will check all PRs that exist inside a repository?

Yep, and the scope has now been scaled down to only the merged PRs.

  1. Checking commit time is existed, and this step will preserve the merge and closed PRs — what do you mean by commit time here, and what is the meaning of preserving the merged and closed PRs?

I think commit time should be the merged_at keyword, which ensures the PR has been merged. My approach is to request the closed PR information and then check whether the merged_at keyword exists; a closed PR does not guarantee that it has been merged.
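That filter can be expressed as a small helper (hypothetical name; GitHub sets merged_at to a timestamp when a PR is merged and leaves it null when the PR is closed without merging):

```python
def merged_only(closed_prs: list[dict]) -> list[dict]:
    """Keep only the closed PRs that were actually merged."""
    return [pr for pr in closed_prs if pr.get("merged_at")]
```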

In terms of user experience, taking too much time on a single check wouldn't be ideal after all.

I will figure out and test different approaches, then update this thread with the performance of each approach.

@Yao-Wen-Chang Yao-Wen-Chang marked this pull request as ready for review September 18, 2023 23:53
@Yao-Wen-Chang
Contributor Author

Yao-Wen-Chang commented Sep 29, 2023

PR closed:
The source branch has been changed from staging to 420-two-person-reviewed-check.
See the new PR #492.
