
Add pallet-verifier #2234

Merged (4 commits) Mar 5, 2024
Conversation

@davidsemakula (Contributor) commented Feb 26, 2024

Project Abstract

This application is a response to the RFP for Static Analysis for Runtime Pallets.

pallet-verifier is a tool for static analysis of FRAME runtime pallets that are used for building Substrate-based blockchains.

pallet-verifier aims to detect common security vulnerabilities found in FRAME pallets using static program analysis techniques like data-flow analysis, abstract interpretation and symbolic execution.
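To make the abstract concrete, here is a hypothetical toy example (the function name and scenario are mine, not taken from pallet-verifier or the proposal) of the kind of reachable-panic bug such static analysis aims to surface: a division whose divisor is caller-controlled panics when the divisor is zero, while the guarded variant surfaces the problem as an error instead.

```rust
// Hypothetical example (not from pallet-verifier): a reachable panic
// of the kind static analysis of pallets aims to detect.
// `total / shares` panics when `shares == 0`, so the guarded version
// rejects that input explicitly and returns an error instead.
fn payout_per_share(total: u64, shares: u64) -> Result<u64, &'static str> {
    if shares == 0 {
        return Err("shares must be non-zero");
    }
    Ok(total / shares)
}

fn main() {
    assert_eq!(payout_per_share(100, 4), Ok(25));
    assert!(payout_per_share(100, 0).is_err());
    println!("ok");
}
```

In a pallet, an equivalent unguarded division on extrinsic input would be a denial-of-service vector, since any caller could trigger the panic.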

Grant level

  • Level 1: Up to $10,000, 2 approvals
  • Level 2: Up to $30,000, 3 approvals
  • Level 3: Unlimited, 5 approvals (for >$100k: Web3 Foundation Council approval)

Application Checklist

  • The application template has been copied and aptly renamed (project_name.md).
  • I have read the application guidelines.
  • Payment details have been provided (Polkadot AssetHub (DOT, USDC & USDT) address in the application and bank details via email, if applicable).
  • I am aware that, in order to receive a grant, I (and the entity I represent) have to successfully complete a KYC/KYB check.
  • The software delivered for this grant will be released under an open-source license specified in the application.
  • The initial PR contains only one commit (squash and force-push if needed).
  • The grant will only be announced once the first milestone has been accepted (see the announcement guidelines).
  • I prefer the discussion of this application to take place in a private Element/Matrix channel. My username is: @_______:matrix.org (change the homeserver if you use a different one)

@github-actions github-actions bot added the admin-review This application requires a review from an admin. label Feb 26, 2024
@davidsemakula davidsemakula force-pushed the pallet-verifier branch 2 times, most recently from aa57ea2 to a042d81 Compare February 27, 2024 09:07
@bhargavbh (Contributor) left a comment

Thanks for showing interest in the RFP. The proposal has sufficient technical depth; I appreciate it. I have a few comments/questions:

  • You rightly mention some of MIRAI's limitations when generics and macros are used. What is your approach to tackling this? Are you looking at modifying (simplifying) the complexity at the MIR level? Could you please elaborate?

  • My personal experience with MIRAI for arithmetic overflows/underflows was that it produced a lot of false positives, specifically because of "possible incomplete analysis of call because of a nested call to a function without a MIR body". Have you looked into this issue? Static analysers inherently generate false positives because they over-approximate the program, and in general this is a very hard problem. I would also argue that a false-positive rate of 50-60% is still decent (as a developer, I would be very happy to find one real bug among the 2-3 warnings that pallet-verifier generates). However, it gets tricky if the rate is higher. Can you include a small benchmark to quantify the accuracy of the tool?

@davidsemakula (Contributor, Author) commented Mar 1, 2024

> Thanks for showing interest in the RFP. The proposal has sufficient technical depth; I appreciate it.

Hi @bhargavbh, thanks for taking the time to review, I appreciate the feedback 🙂

> You rightly mention some of MIRAI's limitations when generics and macros are used. What is your approach to tackling this? Are you looking at modifying (simplifying) the complexity at the MIR level? Could you please elaborate?

The short answer is yes; I describe the approach in detail in the "Finding tractable entry points" and "Generating tractable entry points automatically" subsections.
In general, the whole "Identifying (and generating) tractable entry points" section is essentially dedicated to this problem.
The first 3 subsections (i.e. "The generics challenge", "Too many macro-generated entry points", and "The TestExternalities and SubstrateHostFunctions challenge") outline the problems/challenges (as you noted here about generics and macros), while the last 2 subsections (i.e. "Finding tractable entry points" and "Generating tractable entry points automatically" - the ones linked above) detail the solution.
LMK if I should clarify anything from those subsections. 🙂

> My personal experience with MIRAI for arithmetic overflows/underflows was that it produced a lot of false positives, specifically because of "possible incomplete analysis of call because of a nested call to a function without a MIR body". Have you looked into this issue? [...] Can you include a small benchmark to quantify the accuracy of the tool?

Yes; in general, MIRAI emits a lot of warnings from the standard library and system calls when run at the paranoid diagnostic level. This is mostly because the paranoid level is maximally strict: it warns about "any issue that may be an error" by essentially disabling some of MIRAI's "normal" mechanisms for reducing false positives.
One of these mechanisms is "suppressing any diagnostics that arise within the standard library or third party code [... i.e. diagnostics that are [not] tied to source locations within the [current] crate]".

Something I've noticed is that most false positives only show up at the paranoid level, precisely because this "suppression mechanism" is disabled. At the same time, due to some complexity arising from indirection and macro call-site information in our target domain, we most likely have to use the paranoid level to get MIRAI to emit/buffer our desired warnings, because they "seem" (to MIRAI) to arise from third-party code (i.e. frame_support), so MIRAI suppresses them at lower diagnostic levels (i.e. --diag=default|verify|library).

So I've detailed a hybrid approach in the "Using MIRAI as a library" section that essentially allows us to access MIRAI's diagnostics buffer directly and decide which diagnostics to "suppress" or "promote" based on our domain knowledge and the reported primary and related locations for each diagnostic (the aforementioned section has some more details 🙂).

Beyond that, another lever at our disposal, specifically for the "function without a MIR body" issue, is to add "summaries/foreign contracts" for these missing "foreign functions" (e.g. standard library functions that aren't inlined, and system calls).
This part of MIRAI is not well documented, but it's not hard to figure out how to add these from the existing examples and related utilities, which show that these "summaries/foreign contracts" can be defined outside MIRAI. This gives us more flexibility without having to modify MIRAI itself.

Having said that, with the approach of bypassing TestExternalities and HostFunctions described in detail in the proposal (see links to relevant sections in the answer to the first question), combined with the ability to "suppress" and "promote" warnings as described above, I don't expect that we'll have to pull this lever that much.

Finally, when it comes to panic and arithmetic overflow/underflow detection itself, MIRAI seems quite accurate for simple examples/functions, likely because Rust's type system and the relative simplicity of these checks in MIR make these analyses quite tractable compared to other languages.
So if we can effectively suppress unrelated warnings from "foreign functions" with missing MIR bodies (e.g. standard library functions that aren't inlined, and system calls), or provide "summaries/foreign contracts" for the most problematic ones, we should achieve relatively high accuracy.
I'd personally expect us to do a lot better than 60% TBH 🙂

It's a bit late over here, so I'll think about the methodology for benchmarking the accuracy of the tool and share that tomorrow.
In the meantime, LMK if these responses and the details in the linked sections make sense. Thanks!
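As a concrete illustration of the overflow discussion above (a hedged sketch; the function names are mine and nothing here is MIRAI output or pallet-verifier code), these are the arithmetic idioms in play: plain `+`, which panics on overflow in debug builds and wraps in release builds, versus the `saturating_*`/`checked_*` variants that remove the panic path an analyser would otherwise have to reason about.

```rust
// Illustrative sketch (names are hypothetical, not from the proposal):
// the overflow pattern a static analyser must reason about, and two
// idioms that eliminate the panic/wrap path entirely.
fn unchecked_increment(counter: u64, step: u64) -> u64 {
    // Debug builds panic on overflow; release builds wrap. An analyser
    // flags this unless it can prove the sum stays within u64::MAX.
    counter + step
}

fn saturating_increment(counter: u64, step: u64) -> u64 {
    // Clamps at u64::MAX instead of panicking or wrapping.
    counter.saturating_add(step)
}

fn checked_increment(counter: u64, step: u64) -> Option<u64> {
    // Surfaces overflow as None so the caller must handle it.
    counter.checked_add(step)
}

fn main() {
    assert_eq!(unchecked_increment(1, 2), 3);
    assert_eq!(saturating_increment(u64::MAX, 1), u64::MAX);
    assert_eq!(checked_increment(u64::MAX, 1), None);
    println!("ok");
}
```

A warning on the first function is a true positive only when the operands are actually unbounded; proving bounds (or rewriting to the latter two forms) is what separates accurate analysis from noise.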

@bhargavbh (Contributor) commented:

Thanks, @davidsemakula. I like your approach of adding summaries/post-conditions (contracts) to frequently occurring, problematic external calls to address the false-positive rate.
My only suggestion would be adding a benchmark (it need not be comprehensive) as part of the deliverables, just to showcase the performance and accuracy of the tool on a few production pallets. I'm happy to approve the proposal after this change.

@davidsemakula (Contributor, Author) commented Mar 3, 2024

Thanks @bhargavbh ... I added the benchmark to the deliverables 🙂

@bhargavbh (Contributor) left a comment

The proposal has sufficient technical depth. The main challenges, and potential approaches to tackling them, have been described. A developer-productivity tool like this is needed in the ecosystem. The suggested changes have been incorporated. Happy to approve.

@davidsemakula (Contributor, Author) commented Mar 4, 2024

@bhargavbh Awesome! Thanks for the swift review/approval, and for the positive feedback and recommended improvements 🙂
cc @Noc2

@Noc2 (Collaborator) commented Mar 4, 2024

I will share it again with the rest of the team.

@Noc2 Noc2 added the ready for review The project is ready to be reviewed by the committee members. label Mar 4, 2024
@semuelle semuelle self-requested a review March 4, 2024 22:52
@keeganquigley (Contributor) left a comment

Thanks for the review, @bhargavbh. Based on this, and also your previous work, @davidsemakula, I'm happy to approve.

@takahser (Collaborator) left a comment

LGTM.

@semuelle (Member) commented Mar 5, 2024

Hey @davidsemakula, are you aware that grants are now 30% in DOT? Just asking because you only provided one payment address.

In the meantime, could you fill out the KYC form (assuming you are applying as an individual, otherwise please use this form)?

@davidsemakula (Contributor, Author) commented:
Hi @semuelle

> are you aware that grants are now 30% in DOT? Just asking because you only provided one payment address.

Yes, I'm aware of the 30% split, but I assumed the same address could be used for both (DOT & USDT); LMK if that's not the case.

> In the meantime, could you fill out the KYC form

I've completed the KYC form (for individuals). Thanks! 🙂

@keeganquigley keeganquigley removed the admin-review This application requires a review from an admin. label Mar 5, 2024
@Noc2 Noc2 merged commit 68ef847 into w3f:master Mar 5, 2024
10 of 12 checks passed
github-actions bot commented Mar 5, 2024

Congratulations and welcome to the Web3 Foundation Grants Program! Please refer to our Milestone Delivery repository for instructions on how to submit milestones and invoices, our FAQ for frequently asked questions and the support section of our README for more ways to find answers to your questions.

Before you start, take a moment to read through our announcement guidelines for all communications related to the grant or make them known to the right person in your organisation. In particular, please don't announce the grant publicly before at least the first milestone of your project has been approved. At that point or shortly before, you can get in touch with us at [email protected] and we'll be happy to collaborate on an announcement about the work you’re doing.

Lastly, please remember to let us know in case you run into any delays or deviate from the deliverables in your application. You can either leave a comment here or directly request to amend your application via PR. We wish you luck with your project! 🚀

@davidsemakula davidsemakula deleted the pallet-verifier branch March 5, 2024 20:41
@davidsemakula (Contributor, Author) commented:

Minor update: Should be able to wrap this up in the next 1-2 weeks!

@keeganquigley (Contributor) commented:

Hi @davidsemakula, I don't believe we ever received any deliveries. Are you still planning to deliver these milestones?

@davidsemakula (Contributor, Author) commented:

Hi @keeganquigley
I'll make the deliveries next week; current progress is at https://github.com/davidsemakula/pallet-verifier
Mostly docs and tests (and maybe a few std contracts) left. 🙂

@keeganquigley (Contributor) commented:

Hi @davidsemakula just checking in.

@davidsemakula (Contributor, Author) commented:

Hi @keeganquigley
I just opened a PR for the delivery of both milestones 🙂
