Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for range format #212

Open
ericdallo opened this issue Mar 16, 2021 · 14 comments
Open

Support for range format #212

ericdallo opened this issue Mar 16, 2021 · 14 comments

Comments

@ericdallo
Copy link

Hello, on clojure-lsp, we use cljfmt as our format feature and it works well for most cases, thank you for your work!

There is an issue though related to performance with huge buffers. The LSP protocol allows two kind of format requests: format the whole buffer and format a ranged code. clojure-lsp implement both but the range format seems to have some performance issues with large buffers since we don't correctly format the range code but the whole form like defn, def etc.

Here is how we format ATM, the reformat-node is not enough for us as if I select only the bar function of the code below, the spacings will not be correct:

(my-func (foo a
                      b
                      c))

(cljfmt/reformat-node foo-node)

(my-func (foo a
    b
    c))

This happens because cljfmt doesn't know about the previous code before the foo node.

My suggestion is to add another function on cljfmt that accept a line and column to format, just like we call to the full buffer format:

(cljfmt.core/reformat-string text line column)

WDYT?

@weavejester
Copy link
Owner

I don't understand how your proposal is supposed to work. Can you explain further?

@ericdallo
Copy link
Author

ericdallo commented Mar 16, 2021

Sorry for not be specific, I think my suggestion is missing the end range. You can see above an example where | means the start and end range selected by the editor:

(my-func (|foo a
                            b
               c|))

clojure-lsp would call:

(cljfmt.core/reformat-string full-text 1 12 3 16)

would result formatting only the selected range:

(my-func (foo a
              b
              c))

Does that makes sense?

@weavejester
Copy link
Owner

Why is the b so far right after formatting? Even if you're just formatting a section, shouldn't it still be in line?

@ericdallo
Copy link
Author

hahaha, yes, sorry, GitHub indentation is weird, fixed!

@weavejester
Copy link
Owner

If implemented, this should be a separate function, e.g. reformat-region. I also don't know how you'd format a region that is missing ending brackets or can't be parsed in isolation.

Have you considered pulling out just the current top level form, adding some unique markers to mark the start and end of the region, formatting and then snipping out the relevent area using the markers?

@ericdallo
Copy link
Author

That's how we do ATM, we format the whole top level form, but if we format only the parent form of the selected range, it will ignore the spacing

@weavejester
Copy link
Owner

but if we format only the parent form of the selected range, it will ignore the spacing

I don't understand what this means. Can you provide an example?

@ericdallo
Copy link
Author

Sure, so, as I said it here, If I format the code with the selected range between the |s:

(a
 (b
 |(c
          (+ 2 3)|)))

I'd expect to be formatted as:

(a
 (b
  (c
   (+ 2 3))))

But when we call (cljfmt/reformat-form c-node {}), we get:

(a
 (b
  (c
 (+ 2 3))))

You can see that the spacing before c was ignored because cljfmt only knows about the c-form, so the spaces that it adds is based as if c was a top level form.

That's why I suggested having a new function on cljfmt that we pass the whole text, the range and cljfmt take care of formatting only the range specified.

@weavejester
Copy link
Owner

Okay. How do you envision this new reformat-region function to work? How are you going to take into account regions that cut off at syntax boundaries? If you're not going to look at the top level form, how are you going to get the indentation?

I'm also curious as to why formatting the top level form doesn't give you the performance you need. How large are the forms that you're having problems with? Do you have examples?

@ericdallo
Copy link
Author

For the regions that cut off at syntax boundaries, cljfmt could just get the parent form of the start line/col maybe?
If we send the whole text, cljfmt would know the indentation getting the parent/top form to know the indentation, right?

The initial issue use simple forms, but I could not repro that, I can repro that on forms like this though, If I try to slurp/barf a parent, it'll call later the rangeFormatting of clojure-lsp, and for this case takes about 6s to format this top level form!

@ericdallo
Copy link
Author

ericdallo commented Mar 17, 2021

So, I just found that this could be a cljfmt/reformat-node issue, if I tell cljfmt to format this whole file, it takes about 1096ms, I think is expected as this buffer is kind of big.
But If a rangeFormat only this code, it takes 6539ms.

I didn't make a benchmark, those are numbers from clojure-lsp server <-> editor client request time, so probably it's adding some ms to that, even so, 6s seems too much to a reformat-node, am I wrong?

@weavejester
Copy link
Owner

Yeah, that's taking way too long. It should be a few milliseconds at most. There must be something that's causing it issues.

@snoe
Copy link
Contributor

snoe commented Mar 17, 2021

assuming there's no problem with reformat-node, for the proposal, I think it would help to introduce a function like reformat-enclosing-form instead of range based which probably wouldn't work well with rewrite-clj.

It could jump to a cursor position and call z/up if on a token node to get to an enclosing list, map, or vector node. Then also look at the parent form to apply indent rules and determine how much the enclosing form should be indented

clj-rewrite should be able to then reformat the enclosing node followed by pushing each subsequent line some number of spaces (or tabs) to the right according to the rules.

Right now, clients have to operate on a top level form because they don't know the indent rules for nested forms. Operating on the top level form of, say, big edn files can be pretty painful for clients that are autoindenting.

@weavejester
Copy link
Owner

That's fine, but the first step should be fixing the performance issue, as extra functions may be unnecessary if the existing functions are made to be performant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants