Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

analyze: null pointer support #816

Open
spernsteiner opened this issue Feb 7, 2023 · 3 comments
Open

analyze: null pointer support #816

spernsteiner opened this issue Feb 7, 2023 · 3 comments

Comments

@spernsteiner
Copy link
Collaborator

We should add support to c2rust-analyze for tracking (possibly) null pointers.

First, we should add a new pointer permission: PermissionSet::NON_NULL. A pointer with this permission is known to never be null at run time. NON_NULL is set (or not) when the pointer is created, and it flows forward along dataflow edges. The results of Rvalue::Ref and Rvalue::AddressOf should have the NON_NULL permission set; ptr::null(), ptr::null_mut(), and casting the constant 0 to a pointer type do not set the permission; and casting other integers to pointer types is (still) unsupported and therefore panics.

The permission is NON_NULL rather than NULL because we allow dropping permissions in any assignment. This means removing a permission from a pointer's PermissionSet must allow the pointer to take on more values, not restrict it to fewer values.

The reason to continue rejecting nonzero constant integer-to-pointer casts like 3 as *const u8 is to avoid losing information and potentially changing program behavior. The result of 3 as *const u8 can't be a value of type &u8 because it's not a valid pointer, and it can't be None because that would lose the fact that the value is 3 and not 0. We don't want 0 as *const u8 == 3 as *const u8 to be rewritten into None == None and wrongly evaluate to true.

Second, we'll need to update the type_desc and rewrite::ty modules to recognize the lack of NON_NULL and use Option<_> when needed.

Third, rewrite::expr will need to insert Some(_), None, and _.unwrap() as needed. Casts from NON_NULL to nullable pointers will need to wrap the pointer expression in Some(_). Operations that produce null pointers should be replaced with None. And dereferences of nullable pointers should call _.unwrap() first.

Some operations may need special handling. I think offset should require a non-null pointer (meaning rewrite::expr should unwrap() the input if it's nullable, like it does for deref), since it's UB to call it on invalid pointers. This means offset can unconditionally return NON_NULL regardless of its input. And is_null can be converted to is_some() if the pointer is nullable or replaced with literal true if it's NON_NULL. (Converting if p.is_null() { ... } to if let Some(p) = p { ... } would be nice to have, but isn't essential and would probably be a bit tricky.

It might be safe for us to define malloc and related functions to return NON_NULL, which would let us eliminate a lot of Options, because the Rust translations of those functions (Box::new/Vec::new) abort on OOM instead of returning null.

@kkysen
Copy link
Contributor

kkysen commented Feb 7, 2023

It might be safe for us to define malloc and related functions to return NON_NULL, which would let us eliminate a lot of Options, because the Rust translations of those functions (Box::new/Vec::new) abort on OOM instead of returning null.

I'm not sure about this. C malloc can pretty clearly return NULL, and there's a lot of code that's going to check for this.

I think there's two cases here:

  1. If the C code checks for NULL and never had UB, then we should use Box::try_new (unstable but we use other unstable things).

  2. If the C code doesn't check for NULL and just uses the pointer, then it has UB, so we can reinterpret that as panic on OOM. In that case, we're free to use Box::new and panic in the UB/OOM case, which is allowed due to the UB.

@kkysen
Copy link
Contributor

kkysen commented Feb 7, 2023

Also, similar to offset, I think we can also use UB in other cases. Whenever a pointer is dereferenced or a function/method called on/with it that requires it to be valid (e.x. offset), then it must be NON_NULL. Then "casts" from non-NON_NULL to NON_NULL will be .unwrap()s.

@spernsteiner
Copy link
Collaborator Author

It would be legal under our definition of correctness (in which we either preserve behavior exactly or introduce a panic) to assert after every malloc call that the result is non-null. This would be more in line with idiomatic Rust code, and I think intelligently handling OOM is rare even in C. So I think inserting panics here is justifiable and would help our analysis produce better results (fewer unnecessary Options).

If the C code doesn't check for NULL and just uses the pointer

Whenever a pointer is dereferenced or a function/method called on/with it that requires it to be valid (e.x. offset), then it must be NON_NULL.

I think these are covered by the discussion on the NON_NULL PR (#818)

kkysen added a commit that referenced this issue Apr 26, 2023
)

- Fixes the first step of #816.

This adds `PermissionSet::NON_NULL` and precise docs explaining its
meaning and intended usage, but doesn't contain any actual
implementation or use of `NON_NULL`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants