Skip to content

Conversation

andreypopp
Copy link
Collaborator

@andreypopp andreypopp commented Dec 5, 2024

Closes #36.

@andreypopp andreypopp requested a review from jchavarri December 5, 2024 08:36
Copy link
Member

@anmonteiro anmonteiro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ppx_deriving_json_runtime is still reading Json.Decode.error which seems to be gone?

@jchavarri jchavarri mentioned this pull request Dec 14, 2024
Copy link
Member

@jchavarri jchavarri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great @andreypopp !

I fixed the tests (don't ask how 🙈 ) in #46. For now I use the deprecated APIs (also helpful to see how users will have to adapt their code). Once they are removed we could update the tests to use the new APIs because they're quite exhaustive.

As can be seen in the PR, some of the JSON error messages were more detailed before. We should bring these details back somehow, do you think it'll be too hard with the PPX approach?

src/Json.mli Outdated
module Of_json : sig
(** Provides a set of low level combinator primitives to decode
Js.Json.t data structures A decoder combinator will return the
decoded value if successful, or raise a [DecodeError of string] if
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not DecodeError anymore.

(** JSON can be encoded as JSON, trivially. *)

(** The type of a error which occurs during decoding JSON values. *)
type of_json_error = Json_error of string | Unexpected_variant of string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unexpected_variant currently always sends "unexpected variant" in the string payload. Should this payload be removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've wanted to include the tag name there.

* main:
  update setup-ocaml
@jchavarri
Copy link
Member

I think #46 is ready:

  • updates error messages so they show the value received in the JSON, not only the failure
  • improve errors in some cases like objects with new information about which key is failing to be parsed
  • remove duplication in the browser ppx runtime code, reusing functions from Json module
  • remove DecodeError exn altogether to make sure that any code relying on it fails to compile, as it should use now Oj_json_error exn
  • fix ci tests
  • silence deprecation alerts in examples

Re: the last point, maybe we should update the examples code to use the new functions, but we can probably do that once they're removed completely?

If you think #46 is good to merge here, feel free to do so. The only things missing would be changelog entry and we should probably test some existing package or project that uses this library to make sure everything works as expected (melange-atdgen-codec-runtime is a good candidate).

@andreypopp
Copy link
Collaborator Author

@jchavarri thanks!

I've merged your PR and then removed ppx_deriving_runtime libs completely, instead melange-json and a (new) melange-json-native is used as runtime libs instead.

@jchavarri
Copy link
Member

Cool! I started taking a look at upgrading melange atdgen codec runtime, but it seems that library blended DecodeErrors from this lib with its own DecodeErrors instances 😕 so having a hard time fixing the tests. Will see if I can take another look this week and report back with more details about how the migration looks like from the user perspective.

@jchavarri
Copy link
Member

here's a branch of melange-atdgen-codec-runtime that builds with latest main: ahrefs/melange-atdgen-codec-runtime#53. hopefully not many more changes are needed to make it work with this branch.

@lessp
Copy link

lessp commented Feb 4, 2025

Any blockers for this to go in? 👼

@andreypopp
Copy link
Collaborator Author

andreypopp commented Feb 4, 2025

Needs a rebase on top of latest changes in main.

@jchavarri
Copy link
Member

The last commit merged main into this branch, had to update a few things to accomodate the changes in #47 into this branch.

Tried to use @EmileTrotignon's new error system as much as possible. One small change in error msgs can be seen in Json_decode_test:

  • Before: Expected array of length 2, got array of length 1
  • Now: expected tuple as array of length 2 but got [4]

I think new format is better as it shows the actual value, but curious what you think.

After this, I'll check the melange-atdgen-runtime-codec branch to use the latest commit. A couple more things:

  • Noticed there are no decoding tests for result types
  • I keep high hopes that we can get rid of Jest soon 😄 and use cram tests instead. Not sure why, but the errors that Jest reports now are useless (see example shown below), so one has to add console.logs in the generated js file to understand what's going on.
  ● tuple2 › too small

    MelangeError: Json__Errors.Of_json_error/8

      795 |                         return Jest.pass;
      796 |                       }
    > 797 |                       throw new Caml_js_exceptions.MelangeError(exn.MEL_EXN_ID, exn);
          |                             ^
      798 |                     }
      799 |                     throw new Caml_js_exceptions.MelangeError(exn.MEL_EXN_ID, exn);
      800 |                   }

      at new MelangeError (src/__tests__/test/node_modules/melange.js/caml_js_exceptions.js:28:21)
      at src/__tests__/test/src/__tests__/Json_decode_test.js:797:29
      at Object._1 (src/__tests__/test/node_modules/melange.js/curry.js:31:12)
      at Object.<anonymous> (src/__tests__/test/node_modules/melange-jest.jest/jest.js:216:24)

@jchavarri
Copy link
Member

I updated ahrefs/melange-atdgen-codec-runtime#53 with this branch and everything seems to be ok.

Added a changelog entry and also a workaround to improve the tests, until we get rid of jest. From what I can see the only things missing are:

  • examples and tests are still using old Decode/Encode modules, but I guess this is fine until they get removed altogether
  • there was a comment about Unexpected_variant, not sure if we want to do something about it

otherwise, it looks good to mark as ready for review.

@@ wrap_exn (fun () ->
let (_ : int) = int (Encode.int inf) in
fail "should throw")
|> toEqual "expected an integer but got inf");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EmileTrotignon do you know how the error handling can print the OCaml expression name, rather than the JavaScript runtime value?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The javascript runtime value likely comes from Console.log, you need a custom printer to do something else, but I have a feeling this wasnt your question. Can you clarify ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured it out. Turns out this code returns the string inf:

let infinity = [%raw "Infinity"]

let () = Js.log(Stdlib.string_of_float(infinity))

@jchavarri jchavarri marked this pull request as ready for review February 20, 2025 08:03
@anmonteiro
Copy link
Member

I just merged #55 and now I don't know if that was such a good idea to merge before this one. There are a few merge conflicts because of that 😬

@EmileTrotignon
Copy link
Collaborator

Hopefully its just because files were moved around

@EmileTrotignon
Copy link
Collaborator

I'll try and solve the conflicts

* main:
  Add a feature to allow any json in a variant (#55)
@jchavarri
Copy link
Member

got main branch in, but the addition of include_subdirs seems to bork the runtest alias, see 6affb41.

@jchavarri
Copy link
Member

One thing i'd like to bring up again before merging is the split of JSON error types in two variants: Json_error and Unexpected_variant.

I don't see a lot of advantages of this split, and the big disadvantage is that it complicates the error type and downstream users code, because they have to either match on both values, or potentially miss errors on either of both.

Can we unify them back and remove the inner variant type for simplicity? Isn't "unexpected variant" also a "JSON error"?

@andreypopp
Copy link
Collaborator Author

I don't see a lot of advantages of this split, and the big disadvantage is that it complicates the error type and downstream users code, because they have to either match on both values, or potentially miss errors on either of both.

This is because of polymorphic variants scenario where one type extends another type:

type base = [`Base of int] [@@deriving json]
type ext = [base | `Ext of string] [@@deriving json]

How it works now is base_of_json does like this:

let base_of_json json =
  let tag = ... in
  match tag with
  | "Base" -> ...
  | tag -> raise (Of_json_error (Unexpected_variant tag))

then ext_of_json tries base_of_json first:

let ext_of_json json =
  match base_of_json json with
  | exception Of_json_error (Unexpected_variant _) -> (* do own matching here *)
  | v -> v

If we don't have a separate case for this, then ext_of_json has to catch any JSON error and do some extra work even for unrelated cases like ["Base", "not_an_int"] where the decoding problem is not related to the variant tag at all, but somewhere deeper in the payload. Maybe it's not a big deal, then we can indeed remove the variant for errors.

@jchavarri
Copy link
Member

@andreypopp Thanks! I think the reason you mention makes sense, I hadn't realized there was this optimization in place.

Do we need to use the same exn for both "user-originated errors" and "library-originated errors"? I took a stab at using two different exns in #59. It uses a specific exn for the variant case, which I assumed is something that users will never have to care about. It seems to simplify both the implementation and the ergonomics for users, but I'm not sure i'm missing st, please let me know what you think.

@andreypopp
Copy link
Collaborator Author

I took a stab at using two different exns in #59. It uses a specific exn for the variant case, which I assumed is something that users will never have to care about. It seems to simplify both the implementation and the ergonomics for users, but I'm not sure i'm missing st, please let me know what you think.

re ergonomic: in case of two exceptions users will have to care about these two when recovering from decoding errors, so it's still a breaking change

@jchavarri
Copy link
Member

Hm right. I was thinking that the unexpected variant exception would not "escape" the case we were discussing for polyvars extension, but it will do for regular polyvars that are not extended.

As this optimization only kicks in when there are decode operations that fail, and we can expect variants to not have crazy depth (reasonable to expect < 10 levels) would you be ok sidestepping it? I.e. use same exception everywhere? If so, I can update the branch with the changes so we can merge this (Ahrefs build is finally passing, using the latest commit).

@andreypopp
Copy link
Collaborator Author

As this optimization only kicks in when there are decode operations that fail, and we can expect variants to not have crazy depth (reasonable to expect < 10 levels) would you be ok sidestepping it?

To be honest, I'd not make this assumption: this might hold for the code we are writing but in general it might be arbitrary payloads.

I.e. use same exception everywhere?

But this is what's done in this PR no? Same exception but two cases of error within the exception.

Can you remind me what's the downside of the current approach in this PR?

The only downsides I'm aware of are:

  1. User need to match on two cases of error if they intend to handle specific cases. But I'd expect they usually just log the exception or convert it to string with the helper function we provide.

  2. If user write of_json for polyvariants by hand then they'd need to raise Of_json_error Unexpected_variant in case of tag mismatch, this is so the extension mechanism works and there will be a fallback implemented.

If so, I can update the branch with the changes so we can merge this (Ahrefs build is finally passing, using the latest commit).

Just note that there needs to be a special care always to re-raise the original exception (from the base case).

@jchavarri
Copy link
Member

To be honest, I'd not make this assumption: this might hold for the code we are writing but in general it might be arbitrary payloads.

Let's imagine there are cases out there with 10+ nested levels of types. We still agree that the optimization only kicks in for cases when there is an error in the payload? In other words, for the regular functioning of the library (parsing correct values) the optimization has no impact?

But this is what's done in this PR no? Same exception but two cases of error within the exception.

I meant one exception with just a string payload (e.g. Of_json_error of string).

  1. User need to match on two cases of error if they intend to handle specific cases. But I'd expect they usually just log the exception or convert it to string with the helper function we provide.

This is the main downside I was considering. For e.g. Ahrefs monorepo, there are ~20 instances of Of_json_error matches. After going through them, I realized in many cases I was only considering the Json_error case, and had forgotten about Unexpected_variant. This had already happened to me before, so I started wondering what were the upsides of having two errors, and restarted this conversation.

2. If user write of_json for polyvariants by hand then they'd need to raise Of_json_error Unexpected_variant in case of tag mismatch, this is so the extension mechanism works and there will be a fallback implemented.

I hadn't thought of this but I guess it's another downside.

Just note that there needs to be a special care always to re-raise the original exception (from the base case).

I updated the branch with a couple of changes:

  • In af274c1 added some tests that show how malformed extended variants would error
  • In 430253e the specific Unexpected_variant exn was removed, and the same exn is used everywhere

The changes in the tests included in the last commit show how the errors for the extended variant case were improved (unrelated to having going one way or another, but I guess some side benefit of this whole conversation).

@andreypopp
Copy link
Collaborator Author

I realized in many cases I was only considering the Json_error case, and had forgotten about Unexpected_variant.

What do you mean forgotten? Compiler should warn if you don't match on all cases, no? Anyway, as I've said, I don't think we should ever match on a specific error in user code — just convert to string and log.

@jchavarri
Copy link
Member

I was doing | Melange_json.Of_json_error(Json_error(error)) when pattern matching.

@jchavarri
Copy link
Member

I don't think we should ever match on a specific error in user code

This was code that existed already. We have to think about migration paths too.

@andreypopp
Copy link
Collaborator Author

This was code that existed already. We have to think about migration paths too.

Yeah, so the migration path is simple:

  • BEFORE: you matched on exception and got a string
  • NOW: you match on exception and get an error which you convert to string

Anyway, I don't feel strongly about keeping the two error cases, just instinctively feel that such backtracking in deserialisation code could be dangerous, though I can't think of a case where it will blow up at the moment.

@jchavarri
Copy link
Member

I added back the unexpected variant error in c0e1abf. I also fixed a case with missing errors, when decoding arrays where we were matching only on Json_error but not Unexpected_variant.

I had to add helper functions to make sure the unexpected variants carry relevant error information and not just "unexpected variant" strings.

Please take a look, if everything looks good, feel free to merge that PR into this one.

@jchavarri
Copy link
Member

We will have to remember to always match on Of_json_error err and use of_json_error_to_string. Maybe worth adding to CHANGES.md.

@jchavarri
Copy link
Member

Tests are actually failing in the branch... Looking.

@jchavarri
Copy link
Member

Should be fixed.

@jchavarri
Copy link
Member

jchavarri commented Mar 11, 2025

Summary of what the PR is doing now:

  • add tests that exercise errors for extended variants
  • make sure Unexpected_variant includes "expected..." strings with details about the expected JSON
  • add helper functions of_json_unexpected_variant and of_json_msg_unexpected_variant
  • some refactor on ppx to reuse more error handling code

@jchavarri
Copy link
Member

Confirmed with @andreypopp that we can merge this. I will prepare an opam release afterwards.

@jchavarri jchavarri merged commit ca89fd9 into main Mar 11, 2025
7 checks passed
@jchavarri jchavarri deleted the unify branch March 11, 2025 10:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unify Json and Ppx_deriving_runtime
5 participants