Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Comment storage and formatting options #13

Open
wants to merge 64 commits into
base: master
Choose a base branch
from

Conversation

PersonTheCat
Copy link

@PersonTheCat PersonTheCat commented Jan 1, 2019

I was working on another project using Hjson-java and realized it was missing a few things I needed. Namely, the ability to retain comments (my project uses them very liberally), as well as a few formatting options that I felt would make certain kinds of data a little bit more concise. So, I did just that. Here are some of the pros and cons that would be associated with merging this branch:

New Features

  • Comments will be retained when parsed, and can also be added easily in-code.
  • There are three supported comment positions:
    • BOL: Beginning-of-line or before-line comments. These are placed on the line above any key or value. In the context of a root JsonObject, these will serve as the file's header.
    • EOL: End-of-line comments. These are placed after any value on the same line. In the context of a root JsonObject, these will serve as the file's footer and thus will be placed on a separate line.
    • INTERIOR: Essentially any other non-inline comment that isn't directly associated with a value. These might be used in an array or object with no fields or values.
  • Additionally, there are three supported comment styles:
    • HASH: Any text following a hash (#) symbol until the end of the current line. These do support multiple lines.
    • LINE: A C style line comment, indicated by two forward slashes (//). Like hash comments, these last until the end of the current line, but can technically be repeated on subsequent lines, with no issues.
    • BLOCK: These will encapsulate any text following a (/*) and preceding a (*/) and naturally can span multiple lines. Although users are free to include additional asterisks (*) on subsequent lines, these will not be placed when formatted by hjson-java.
  • Comments are stored inside of their respective JsonValue objects and ultimately must be added as such. Users can add comments in-code by calling JsonValue#setComment or any of its variants, as well as JsonObject#set, JsonObject#add, JsonObject#setComment or any of their variants, as well as the equivalent methods in JsonArray.
  • More information is saved about how configurations are formatted when parsing Hjson files, including:
    • Whether an array or object is "compact," i.e. whether the entire container is placed on a single line.
    • Whether multiple values in an array or object are placed per-line. If this number is inconsistent, the mean value will be taken and used, instead.
  • Users can apply a variety of new formatting options when using HjsonOptions and HjsonWriter.
    • indent: Indicates the number of spaces to be placed on each new line per-level.
    • commentIndent: Indicates the number of spaces to be placed before each comment that takes up its own line or lines.
    • nlBraces: Indicates whether braces and brackets should be placed on subsequent lines vs. the same line, as in the K&R or Java syntax.
    • allowCompact: Indicates whether values in an object or array should be allowed to remain compact (see above) when serialized.
    • allowMultiVal: Indicates whether multiple values in an object or array are allowed to be placed on a single line.
  • Values in a JsonObject can be sorted alphabetically by key by calling JsonObject#sort.
  • Setters inside of HjsonObjects now return this to enable chaining.
  • Removed unused imports.

Potential Cons

  • JsonObject#sort relies on List#sort, which requires a lambda expression. I did not see these used anywhere else in the code, so I suspect that this may increase the minimum Java requirements to version 8. This can of course be removed, however.
  • Comments and other new settings inside of JsonObject and JsonArray are not technically made immutable when JsonObject#unmodifiableObject or equivalent function is called. I'm not sure if this is a memory safety issue or more of just a measure intended to prevent the actual data contained by the objects from being changed. If this is an issue, please let me know how you would solve it, so I can make changes.
  • I was not able to figure out why the final character inside of footers was not getting written by HjsonWriter. My workaround was to include an additional space at the end of the string when calling HjsonWriter#reset. I'm not very happy with this solution.
  • I did not get HjsonWriter#readBetweenVals (replacement for skipWhiteSpace()) to ignore the final new line character at the end of some comments. As a result, I wound up calling String#trim on the value that gets returned. This is technically not ideal and, as such, I'm not very happy with that either.
  • Potentially others. You may want to review.

See it in Action

You can view this test input file to see almost everything I listed in-action. You can also view this output file which was generated by parsing a JsonObject from a file, adding a few new keys and comments, and finally serializing the new object to the disk.
And of course, if you do test this and discover any issues or notice anything right away that could be improved, please let me know and I may be able to work on it sometime soon.

-PersonTheCat

Edit: I see that the Travis CI build has failed due to an issue with using a lambda expression for JsonObject#sort. I'll wait to hear from someone else about whether this needs to be changed. Again, it could of course just be removed. See above.

@dqsully
Copy link
Member

dqsully commented Jan 2, 2019

This is really cool, thanks for putting so much work into this PR.

I think if possible we should keep the Java minimum requirements the same, especially if it's just one line of code that needs to change. I haven't written Java in several years so I can't help you with Java-specific features very much. Also, before I merge this code, I would like those bugs you identified to be fixed. I'll take a look and suggest some changes in the code when I get time later.

And finally, I think that we should try to keep the features standardized with the other implementations when possible. So the HjsonOptions with the HjsonWriter should use the same option names as hjson-js when applicable. And then if there is a way to preserve whitespace as well as comments that would be nice too. Maybe like just adding another comment type NEWLINE. hjson-js stores the comments and whitespace as raw text, but I like your idea with the addComment and etc. functions.

@PersonTheCat
Copy link
Author

PersonTheCat commented Jan 27, 2019

Sure thing. Sorry for taking so long. I had just started training at a new job and wound up busy for a while. I've made the following changes:

  • Refactored "nlBraces" to "bracesSameLine," which correctly has the opposite effect.
  • Renamed "compact" to "condensed."
  • Renamed "indent" and "commentIndent" to "space" and "commentSpace," adding overloaded functions for setting spacing with both integers and strings, matching hjson-js.
  • Updated Javadocs.
  • Avoided some potential issues when reading multiline comments with inconsistent indentation.
  • Replaced the anonymous class that was causing issues before with a standard class implementing Comparator<Member>.
  • Stopped using List#sort, which did not exist before JDK 1.8.
  • Edit: Fixed double quotes (") from getting placed when multiline strings are followed by comments.

The other issues I was mentioning are not really bugs, so much as minor inefficiencies that I would have preferred to avoid, but still have not had the time to really figure out.

I also see that a series of unit tests regarding Stringify have failed. As far as I can tell, all of these tests have failed due to intended changes that do not actually break functionality, so much as just change the formatting and retain comments.

Moving forward, I could look into retaining whitespace between fields, but it may be a while before I'm able to start working on that due to work and other projects. Please let me know how you would like to proceed or any other thoughts you have.

@dqsully
Copy link
Member

dqsully commented Feb 9, 2019

Well congrats on the new job first off! Work's been busy for me too, but I've finally found some good time to catch up on all my open source stuff. I just pushed a couple commits - one to allow users to configure if commenting is enabled or not, in part to fix some of the tests on Travis CI. The other just disables bracesSameLine by default for more consistency.

Also, it looks like a few bugs were introduced to do with object key/value commenting, array flattening, and a weird divide by zero case. I'll take a look at these later.

@PersonTheCat
Copy link
Author

Sure thing. Sounds great! I may be able to look into some of these issues this weekend, as well. Not sure where the divide by zero case could be occurring, though. And thanks!

@PersonTheCat
Copy link
Author

PersonTheCat commented Feb 25, 2019

I may have inadvertently fixed the divide by zero issue. I'm still not sure what caused it, but I no longer see it in the unit tests on Travis CI. There is still one ParseException, reporting Found a punctuator character ',' when expecting a quoteless string. I'll need to look at the actual tests to see if I can figure out what's going on. Everything else, however, is really just formatting differences. I'll try to work on a unifying a few of those in the near future.

Edit: I have fixed the ParseException. Had no idea that commas before values on new lines was a part of the spec. ¯\(ツ)/¯ At this point, there are three main holdups:

  • ML strings are printing an extra space before the new line.
  • Arrays that are parsed as condensed remain that way, and thus produce inconsistencies with the original tests that I've neglected to handle this whole time.
  • Containers now have additional spaces before and after all elements.

I will be changing the original tests to account for these differences. Let me know if you have any objections, of course, and I'll continue to work around them.

Edit 2: I have finished updating all formatting changes to the tests, all of which are passing now for jdk8. The jdk7 tests are reporting some pre-build type issue which appears to be unrelated to any changes I made. I'll try to take a look at that soon.

Edit 3: Finished some code refactoring now that things are starting to look finished again on my end. Also added a new unit test specifically for comment support. It seems like the only hold up now is with the JDK7 error on Travis-CI, which as far as I can tell, seems to be related to support being phased for older JDK versions. I will most likely wait to hear from you before proceeding.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants