Skip to content

Latest commit

 

History

History
424 lines (228 loc) · 16.5 KB

review-checklist.md

File metadata and controls

424 lines (228 loc) · 16.5 KB

Common PR Review Questions

People often ask "how do I get better at reviewing PRs?" and I'm not sure how to answer that other than "review a lot of PRs" and "look at a lot of PR reviews that others have done."

"Bad" Reviews

A "bad" review would be something like:

  • I checked out the PR and ran all the automated tests. ACK.

  • Not sure what this does, but it's trying to change wallet behavior. NACK.

  • ACK (no explanation)

  • NACK (no explanation)

Great Reviews

A "good" review doesn't necessarily find a bug, or contain a ton of comments, or picks with a fine-toothed comb. I think it just needs to have enough information so that maintainer or author can determine how much weight to put on it. For example, any monkey can create 10 github accounts and sybil att-ACK PR reviews.

Other examples:

You can leave a helpful review without being an expert:

Lots more examples below as well.

What is this?

I personally think a good review is one where I've asked myself a lot of pointed questions about the PR and been satisfied with the answers to them. Sometimes, the same question applies to many PRs, so I've made an effort to write them down here.

This is a list of questions I might ask myself while reviewing a Bitcoin Core PR. It is not a list of all the questions I would ask while reviewing a PR. In fact, I don't think any list of 100 questions could be sufficient for any specific PR.

I do NOT recommend treating this as a checklist with which you review PRs.

I think it's a good idea to create your own list of common PR review questions, and perhaps use these examples as inspiration for generating those questions.

On the organization of these questions:

This list is approximately organized by the order in which I'd ask those questions. For me, the nature of the question usually depends on what stage the PR is at and the extent of my own review. Naturally, I start with conceptual questions, then approach-related questions, and then implementation questions. Generally, I personally think it's useless to leave C++ syntax-related comments on a draft PR, and would feel rude going back to "does this make sense" after the author has addressed 20+ of my code organization suggestions.

Conceptual

Before considering anything implementation-related:

  • What type of PR is this? New feature, bug fix, performance, refactor, documentation, testing, etc. Does it contain more than one of those categories?

  • Does the PR take into account other work in the project (eg does it conflict unnecessarily with other PRs, or go in a different direction from other people’s work, etc).

  • Could this PR be re-organized to make review easier?

  • Could this PR be split up?

  • Could the PR-splitting leave us in a weird intermediary state? bitcoin/bitcoin#23443 (review), bitcoin/bitcoin#23075 (review)

  • Are all commits atomic? If a commit fails on its own, it's harder to revert and breaks git bisect. bitcoin/bitcoin#23075 (review)

  • Could this just be a scripted-diff?

Motivation

Is this PR sufficiently motivated?

Does the PR add a new feature?

  • Is the feature useful for a significant number of users, or very few?

  • Would this feature only benefit an extremely advanced user who should probably write a custom patch instead?

  • Has anyone actually requested this? examples: bitcoin/bitcoin#22871 (comment)

  • Are there other features that would be enabled by this PR?

  • If this PR is an improvement, how is it demonstrated?

    • Is there a bench or simulation?

    • Has it been tested within the context of the full network? How does it interact with other protocol features (e.g. compact block relay) or proposed improvements?

    • Did you verify the results yourself? bitcoin/bitcoin#23157 (review)

    • Did you evaluate the method used to gather measurements?

Does the PR solve a problem? Does it fix a bug?

Downsides

What are the downsides of this PR, conceptually?

  • What are the maintenance costs of this PR?

    • Does it require someone to update it at every release?

    • Is this PR introducing a bunch of hard-to-understand new code?

    • Is this PR introducing any new dependencies?

  • Is it incompatible with another existing/proposed improvement?

  • Is this compatible with all of the {operating systems, architectures, platforms, dependency versions} we want to support? bitcoin/bitcoin#23585 (comment)

For users

  • Does this introduce a footgun?

  • Should this be the new default for users?

  • Should this be optional/configurable? Should it be opt in or opt out?

  • Are the changes backwards-compatible?

Approach

This PR is a good idea. That doesn't mean it should be merged.

Security, Privacy, DoS

  • What threat model? Does this change the way we handle something from a p2p node, or can this code path only be hit by a (privileged) RPC/GUI/REST client?

  • Could a bug here cause inflation or a deviation from consensus?

  • Are we introducing a new social engineering security risk? E.g. a new command/file that scammers can use to easily steal from a wallet

Are we introducing new DoS vectors?

Privacy

  • Could this make it easier to deanonymize transaction origin?

  • Could an attacker take advantage of this behavior to more easily analyze network topology? https://github.com/bitcoin/bitcoin/commit/acd6135b43941fa51d52f5fcdb2ce944280ad01e)

  • Are we creating a behavior that can be used to fingerprint this node?

  • Is privacy specifically worsened by this PR, or does it have a neutral effect within the context of an existing privacy leak?

Security By Component

  • p2p

    • Could this make it easier for a peer to eclipse us/ Monopolize our addrman?

    • Does this change eviction logic in a way that may allow attackers to trigger us to disconnect from an honest peer?

    • Is it possible for an attacker to censor somebody’s transaction by taking advantage of some behavior we have here?

    • Are we opening a new pinning vector?

    • Could it cause us to add transactions to our mempool that aren’t incentive-compatible, thereby causing it to deviate from miners’ mempools?

    • Does this leak information about when a transaction enters/leaves our mempool?

  • Wallet

    • Could a bug here cause someone to lose money?

    • Could this link transactions, UTXOs, addresses, etc. together?

Implementation

At first glance, this PR looks like a good improvement and the approach seems to make sense. Here is the Big danger now; it can have some nasty bugs in it. Don't "no behavior changes," sometimes a bug is hiding in a refactor: bitcoin/bitcoin#21160 (comment).

Common Bitcoin-Specific Bugs and Attacks

  • Are we using the txid when you should be using wtxid, and vice versa?

  • Are we using virtual bytes when we should be using bytes, and vice versa?

  • If this uses virtual bytes, is it compatible with something else that uses weight units, and vice versa?

  • Are we using fee when we should be using feerate, and vice versa?

  • What happens if there is a reorg?

  • For a wallet transaction or UTXO, should a transaction be treated differently if it is {confirmed, unconfirmed}? What if there is a conflicting transaction in the {mempool, chain, wallet}?

  • Is this code safe when the transaction’s txid != wtxid?

  • Can this have any impact on block propagation speed?

  • Does this break the guix build?

  • What impacts might this have on LN transactions or other L2 projects? bitcoin/bitcoin#22871 (review)

Performance

  • Are space and time complexity acceptable for the use case?

  • Are the data structures appropriate for this purpose?

  • Does it make use of the tools we have for parallelization?

  • If adding optional functionality, how does it impact the performance of a user’s node if they don’t opt in to it? E.g. USDT

Code Architecture

(this is a big topic, and isn’t unique to Bitcoin)

Thread safety, concurrency

  • Is it thread-safe?

  • Would it be thread-safe in the future if/when x and y were to be multi-threaded?

C++

  • Are we making copies of variables when we should be passing a reference? bitcoin/bitcoin#22677 (comment)

  • Are we casting variables unnecessarily? Should we be using auto instead?

  • Const things should be const

  • Free your memory, or just use smart pointers

  • Could this implementation easily be swapped for an STL algorithm or container?

  • Are you using boost when you should be using stdlib?

Examples:

Other

Testing

Are there enough tests?

  • Is there good test coverage? Is there any code that isn’t tested? example bitcoin/bitcoin#23443 (comment)

  • Are there normal-usage or edge cases that aren’t tested?

  • Is the type of test appropriate? (unit, fuzz, functional, bench)

  • Is it actually testing what it should be testing?

    • If you’re asserting that something doesn’t happen, is it because of a delay? e.g. bitcoin/bitcoin#21327 (comment)

    • If you’re asserting that something doesn’t happen, is it because it’s handled asynchronously?

    • Are you sure that you’re testing this code in isolation or is it buried under 999 other things that could be causing some behavior?

    • If you mutate the implementation, does the test fail?

Are the tests well-written?

  • Is it using the functional test framework to test a very low-level detail?

  • Are the tests unacceptably slow? bitcoin/bitcoin#23288

  • Do the tests rely on timing that may be delayed when they are run in parallel with other tests? Are the tests racy? bitcoin/bitcoin#11740 (comment)

  • Are the tests themselves maintainable? bitcoin/bitcoin#21280 (comment)

  • In functional tests: does the test have a wallet dependency when it’s not testing wallet behavior?

  • In functional tests: is it thread-safe?

  • Are the tests repeating stuff when they should be using utils?

  • Is the test adding helper code that should be added to test utils?

  • Is manual testing required and/or appropriate? Are any methods suggested?

Documentation

  • Is this well-documented enough?

    • Is the code easy to follow? Are the comments misleading, incorrect, obvious?

    • Are release notes needed?

    • Are potential footguns and user errors plastered with warnings? bitcoin/bitcoin#23201 (comment)

    • Are future TODOs documented well enough so that maintenance doesn’t require the author to personally remember to do something? E.g. deprecations

    • Are newly introduced functions, members, classes, etc. accompanied with a doxygen header comment?

  • If a future developer traced the history back to a particular commit, would they be able to understand why a change was made? Does the description match what the code does (will be in the merge commit log)? If not, it should be updated. bitcoin/bitcoin#22009 (comment)

  • If a future developer were to change this, might they make a mistake if they don't understand it fully? Can this be guarded - e.g. with static assertions, sanity checks, etc?

  • Does this implementation break any pre-existing API expectations? e.g. we might now call a function with different arguments that breaks its preconditions.

Meta

  • Who might need to be notified about this change?

  • Does this PR add new dependencies or increase usage of a dependency we’re trying to get rid of?