Mutation-based generator of arbitrary executable documents #880

Draft
wants to merge 6 commits into
base: main

SimonSapin
Contributor

For a given valid schema, use a byte string typically provided by cargo-fuzz as a source of entropy to deterministically generate a valid executable document:

pub fn arbitrary_valid_executable_document(
    schema: &Valid<Schema>,
    arbitrary_bytes: &[u8],
) -> Valid<executable::ExecutableDocument>

Instead of using the arbitrary crate, parts of it are forked for reasons documented at the top of the new entropy.rs file.

At a high level, it creates a minimal operation then keeps adding selections to it until entropy is exhausted. As a result, providing a longer byte string tends to create larger documents. Hopefully this helps fuzzers better explore the space of possible documents, compared to apollo-smith which uses code like u.int_in_range(1..=5) to decide how many items to generate regardless of remaining entropy.
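To illustrate the "keep adding selections until entropy is exhausted" idea, here is a minimal sketch. It is not the code in this PR: `Entropy`, `generate_selections`, and the flat list of field names are hypothetical stand-ins for the real schema-driven generator, and each choice consumes exactly one byte for simplicity.

```rust
/// Hypothetical entropy source: a cursor over fuzzer-provided bytes.
struct Entropy<'a> {
    bytes: &'a [u8],
}

impl<'a> Entropy<'a> {
    fn new(bytes: &'a [u8]) -> Self {
        Entropy { bytes }
    }

    /// True once every input byte has been consumed.
    fn is_empty(&self) -> bool {
        self.bytes.is_empty()
    }

    /// Consume one byte, or return 0 if none remain.
    fn next_byte(&mut self) -> u8 {
        match self.bytes.split_first() {
            Some((&first, rest)) => {
                self.bytes = rest;
                first
            }
            None => 0,
        }
    }

    /// Pick an index in `0..len`, consuming one byte. `len` must be non-zero.
    fn index(&mut self, len: usize) -> usize {
        usize::from(self.next_byte()) % len
    }
}

/// Start from a minimal selection set and add one aliased field per loop
/// iteration until the entropy runs out. `fields` must be non-empty.
fn generate_selections(fields: &[&str], bytes: &[u8]) -> Vec<String> {
    let mut entropy = Entropy::new(bytes);
    // The "minimal operation": a single selection that is always valid.
    let mut selections = vec![String::from("__typename")];
    while !entropy.is_empty() {
        let field = fields[entropy.index(fields.len())];
        // Every selection gets a unique alias, so field merging stays trivial.
        let alias = format!("a{}", selections.len());
        selections.push(format!("{alias}: {field}"));
    }
    selections
}
```

In this sketch the output size is driven only by the input length: an empty byte string yields the minimal selection set, and each extra byte adds one selection. The same bytes always produce the same output, which is what lets a fuzzer reproduce and minimize a failing case.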

With this approach I believe we could enumerate all possible valid documents up to a certain "size" with bounded amounts of entropy. We’re not quite there yet for at least a few reasons:

  • All generated field selections get a unique alias, so field merging is trivial. We could easily re-use response keys some of the time, but maintaining "fields can merge" validity is tricky.
  • Every time the generator decides to create a named fragment spread selection, it creates a new named fragment definition. If we re-use fragments we need to be careful not to introduce cycles, but this may accidentally be solved already: for borrow-checking reasons we remove fragments from the map while modifying them, and this may be exactly the subset that would introduce a cycle if reused.
  • String values are generated like names (limited character set, and typically short). I think this is ok for the purpose of testing the query planner, as string values in executable documents are typically not meaningful for planning purposes. (It’s a different story for generating subgraph schemas with federation directives.)
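On the fragment re-use point: if it turns out that removing fragments from the map is not sufficient, one conservative alternative would be an explicit reachability check before adding a spread. The sketch below is only an illustration of that check, not what this PR does. `FragmentSpreads`, `reaches`, and `can_spread` are hypothetical names, and each fragment is reduced to the list of fragment names it spreads directly.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified model: each named fragment maps to the
/// names of the fragments it spreads directly.
type FragmentSpreads = BTreeMap<String, Vec<String>>;

/// Returns true if `to` is reachable from `from` by following spreads
/// (a fragment is considered to reach itself).
fn reaches(spreads: &FragmentSpreads, from: &str, to: &str) -> bool {
    let mut stack = vec![from];
    let mut seen = BTreeSet::new();
    while let Some(name) = stack.pop() {
        if name == to {
            return true;
        }
        // Only expand each fragment once, so existing sharing is cheap.
        if seen.insert(name) {
            if let Some(children) = spreads.get(name) {
                stack.extend(children.iter().map(String::as_str));
            }
        }
    }
    false
}

/// Spreading `target` inside `host` keeps the graph acyclic only if
/// `target` cannot already reach `host`.
fn can_spread(spreads: &FragmentSpreads, host: &str, target: &str) -> bool {
    !reaches(spreads, target, host)
}
```

With fragments `A` spreading `B` and `B` spreading `C`, this allows adding a spread of `C` inside `A`, but rejects a spread of `A` inside `C` and any fragment spreading itself.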

@SimonSapin
Contributor Author

This PR is a draft because I’m unsure whether this generator should live inside apollo-rs. I think eventually yes, but at first we’ll likely want to tweak it a lot, and going through a crates.io release for every change would be a significant obstacle.
