Proposal: script to pretty-print YAML cassettes #580

@gward

I've been a happy user of VCR.py at a couple of jobs for several years. One little annoyance is that YAML cassettes written by VCR.py are not as readable as they could be. Here's an example from our unit tests for a small API client library:

  request:
    [...omitted...]
  response:
    body:
      string: '{"created_at":"2021-03-29T14:17:52.494Z","userprincipalname":null,"trusted_idp_id":null,"manager_ad_id":null,"department":null,"email":"[email protected]","locked_un
til":null,"username":null,"comment":null,"password_changed_at":null,"group_id":null,"invitation_sent_at":null,"state":1,"title":null,"custom_attributes":{"fn_test_field":null,"fnperms
":"","customer_ids":null,"perms":null,"ns_contact_id":null},"company":null,"directory_id":null,"firstname":"Joe","lastname":"Slow","status":7,"role_ids":[],"activated_at":null,"member_of":null,"phone":null,"updated_at":"2021-03-29T14:17:52.494Z","distinguished_name":null,"external_id":null,"invalid_login_attempts":0,"last_login":null,"samaccountname":null,"preferred_locale_code":null,"manager_user_id":null,"id":128762714}'
    headers:
      cache-control:
      - no-cache
      content-length:
      - '776'
      content-type:
      - application/json; charset=utf-8
      date:
      - Mon, 29 Mar 2021 14:17:52 GMT
      status:
      - 201 Created
      [...more response headers...]

Good news: this accurately captures the request/response cycle, just as it's supposed to. VCR.py is working as advertised.

Bad news: the JSON response is a bit hard to read, and harder to modify. Sometimes it's useful to manually tweak a response to test an edge case, or because an API has added a new feature and it's too much trouble to capture new responses. Editing a compact multiline blob of JSON is annoying, and the resulting diff is useless.

This example is far from the worst. More complex/nested data structures are really hard to understand, which is a shame, because VCR cassettes are a great way to informally document APIs. ("Ohh, that's what the response to POST /user looks like!") When dealing with older/nastier APIs, it's common to see JSON wrapped in JSON, or a mix of XML and JSON responses. I've also had to deal with APIs that use multipart form requests (e.g. for file uploads), and the resulting cassettes are really hard to read.

And gzip'ed responses are really annoying: there's nothing wrong with an API that returns a compressed response body, but trying to understand that in a VCR.py cassette is impossible.
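To make the gzip case concrete, here is a minimal sketch of how a compressed body could be turned back into readable text. It is not taken from the script itself: the function name `decompress_gzip_body` is hypothetical, and it assumes the response has already been loaded into a plain dict shaped like the cassette above, with the compressed body stored as `bytes` and the decompressed payload being UTF-8 text.

```python
import gzip


def decompress_gzip_body(response):
    """Replace a gzip-compressed body with its decoded text (hypothetical helper)."""
    headers = response["headers"]
    # Header names may be recorded in any case, so look the key up case-insensitively.
    encoding_key = next((k for k in headers if k.lower() == "content-encoding"), None)
    body = response["body"]["string"]
    if encoding_key is None or "gzip" not in headers[encoding_key]:
        return response  # not gzip-encoded, nothing to do
    if not isinstance(body, bytes):
        return response  # already text, assume it was decoded earlier
    # Assumption: the decompressed payload is UTF-8 text.
    response["body"]["string"] = gzip.decompress(body).decode("utf-8")
    # The body is no longer compressed, so the header would now be misleading.
    del headers[encoding_key]
    return response


gzip_response = {
    "body": {"string": gzip.compress(b'{"id": 1}')},
    "headers": {"content-encoding": ["gzip"], "content-type": ["application/json"]},
}
decompress_gzip_body(gzip_response)
print(gzip_response["body"]["string"])
```

Dropping the `content-encoding` header is a design choice: it keeps the cassette self-consistent, at the cost of no longer matching what the server actually sent.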

So I wrote a hacky little script to pretty-print VCR.py cassettes. For example, if I run it on the above cassette, it notices that the response is JSON, and formats it accordingly:

  response:
    body:
      string: '{
      "created_at": "2021-03-29T14:17:52.494Z",
      "userprincipalname": null,
      "trusted_idp_id": null,
      "manager_ad_id": null,
      "department": null,
      "email": "[email protected]",
      [...more fields...]
    }'
    headers:
      cache-control:
      - no-cache
      content-length:
      - '776'
      content-type:
      - application/json; charset=utf-8

Good news: the JSON is much more readable and edit-friendly. This is still valid YAML that VCR.py happily accepts.
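The core of the idea can be sketched in a few lines. This is an illustration, not the script itself: `pretty_print_json_body` is a hypothetical name, and it works on a plain dict shaped like the `response` section above (for example, one obtained by loading the cassette with a YAML library).

```python
import json


def pretty_print_json_body(response, indent=2):
    """Re-indent a JSON response body in place; leave anything else untouched."""
    headers = {k.lower(): v for k, v in response.get("headers", {}).items()}
    content_type = " ".join(headers.get("content-type", []))
    body = response.get("body", {}).get("string")
    if "json" not in content_type.lower() or not isinstance(body, str):
        return response  # only touch text bodies that claim to be JSON
    try:
        parsed = json.loads(body)
    except ValueError:
        return response  # does not parse as JSON, keep it exactly as recorded
    response["body"]["string"] = json.dumps(parsed, indent=indent, ensure_ascii=False)
    return response


response = {
    "body": {"string": '{"firstname":"Joe","lastname":"Slow","role_ids":[]}'},
    "headers": {"content-type": ["application/json; charset=utf-8"]},
}
pretty_print_json_body(response)
print(response["body"]["string"])
```

Falling back to the untouched body when the content type or the parse does not cooperate means the worst case is a cassette that is no more readable than before, rather than a corrupted one.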

Bad news: the content-length header is a lie. More broadly, this is no longer a byte-precise capture of the response.
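One possible way to deal with the stale header is to recompute it from the reformatted body. The sketch below is only an illustration under that assumption (the name `fix_content_length` is hypothetical), and it does not make the cassette byte-precise again; it only makes the header agree with the body that is now stored.

```python
def fix_content_length(response):
    """Set content-length to the byte length of the current body (hypothetical helper)."""
    body = response["body"]["string"]
    # Content-Length counts bytes, not characters, so encode text bodies first.
    raw = body.encode("utf-8") if isinstance(body, str) else body
    for name in response["headers"]:
        if name.lower() == "content-length":
            response["headers"][name] = [str(len(raw))]
    return response


edited_response = {
    "body": {"string": '{\n  "id": 1\n}'},
    "headers": {"content-length": ["8"], "content-type": ["application/json"]},
}
fix_content_length(edited_response)
print(edited_response["headers"]["content-length"])
```

Whether to rewrite the header or leave the recorded value alone is a judgment call: rewriting keeps the cassette internally consistent, while leaving it preserves what the server originally reported.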

Anyways: I want to open-source this script. I think the best place for it is in VCR's own repo, maybe in a contrib/ directory. If that is agreeable to you, I'll open a PR. If you're not interested, please let me know and I'll create a tiny little project just for this script.
