Experimental line based editing #2

Open: wants to merge 5 commits into base: main

AidanTilgner

So I think with LLMs, one way to potentially increase their accuracy is to give them more reference points to work with. The main meat of this update is:

  1. add line numbers to the readFile output so that the LLM can reference them when making edits
  2. refactor the edit methods, so that there are now dedicated editFileBySubstring and editFileByLines tools for the LLM to choose between
  3. add and refactor tests to fit the new shape
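To make the first two points concrete, here is a minimal sketch of what line-numbered reading and line-based editing could look like. It is not the code from this PR: the Python names `read_file_with_line_numbers` and `edit_file_by_lines`, the `N: ` prefix format, and the 1-based inclusive range are all assumptions chosen for illustration, and the functions work on strings rather than real files.

```python
def read_file_with_line_numbers(text: str) -> str:
    """Prefix every line with its 1-based number (assumed format: 'N: content')."""
    lines = text.splitlines()
    width = len(str(len(lines)))  # pad numbers so the content stays aligned
    return "\n".join(
        f"{number:>{width}}: {line}" for number, line in enumerate(lines, start=1)
    )


def edit_file_by_lines(text: str, start: int, end: int, replacement: str) -> str:
    """Replace lines start..end (1-based, inclusive) with the replacement text."""
    lines = text.splitlines()
    if not (1 <= start <= end <= len(lines)):
        raise ValueError(f"line range {start}-{end} is outside 1-{len(lines)}")
    lines[start - 1:end] = replacement.splitlines()
    # Note: this sketch does not preserve a trailing newline.
    return "\n".join(lines)


source = "def greet():\n    print('hi')\n    return None"

# What the model would see when it reads the file.
numbered = read_file_with_line_numbers(source)

# The model then refers to line 2 by number instead of quoting its text.
edited = edit_file_by_lines(source, 2, 2, "    print('hello')")
```

The range check matters in practice: a number that falls outside the file is rejected instead of silently editing the wrong place.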

I also think that this may be more experimental. I'm slightly worried that having the two different edit methods might confuse an LLM, but it could also just provide options. I think that editing by lines could potentially offer benefits for accuracy in larger blocks of changes, though. It's sort of inspired by ed, which has been referred to as:

the most user-hostile editor ever created

But I'm hoping it might be more LLM-user-friendly.

Would be interested in feedback and thoughts, so far the project has been really fun to use and poke around with, so I'm looking to contribute more if that's welcome :)

@AidanTilgner
Author

On this note, I have tested out the line-based editing a couple of times on live projects, and it seems to be working well.

@cjbest
Member

cjbest commented Aug 1, 2024

Thank you for submitting a PR!!!!!!

This is very interesting! When I first built it, I started with line numbers because that seemed like the most natural way, but I found the model was constantly messing them up by guessing an only-approximately-correct number, which is why I came up with the weird substring stuff to begin with.

But that was back with 4-turbo, maybe things have gotten better.
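For comparison, a substring-based edit can be sketched roughly as below. This is an assumption about the general idea, not the project's actual matching logic: `edit_file_by_substring` is a hypothetical Python name, and the "exactly one match" rule is an illustrative choice. The point is that a slightly wrong anchor fails loudly, whereas a slightly wrong line number still edits something.

```python
def edit_file_by_substring(text: str, old: str, new: str) -> str:
    """Replace `old` with `new`, but only if `old` occurs exactly once."""
    if not old:
        raise ValueError("the anchor substring must not be empty")
    count = text.count(old)
    if count != 1:
        # A missing or ambiguous anchor is an error, not a silent misplaced edit.
        raise ValueError(f"expected exactly one match for the anchor, found {count}")
    return text.replace(old, new, 1)


config = "a = 1\nb = 2\na = 1\n"

# "b = 2" is unique, so the edit goes through.
updated = edit_file_by_substring(config, "b = 2", "b = 3")

# "a = 1" appears twice, so the edit is refused.
try:
    edit_file_by_substring(config, "a = 1", "a = 9")
    ambiguous_rejected = False
except ValueError:
    ambiguous_rejected = True
```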

In order to merge this I think I would want to either wait until I can do some testing with it myself to confirm (which I can do but realistically might take a while) OR have some sort of test. This would be really cool actually, if we had basically UC integration tests that run with a specific dialog, like "go edit this file to say blah" and then check if it does it right. (The challenges were meant to be like this but they are too confusing)
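A rough sketch of such a dialog-driven test might look like this. Everything here is hypothetical: `run_edit_scenario` and the `Agent` callable are made-up Python names, and `fake_agent` is a stand-in that writes the file directly, where a real test would drive UC with the instruction and let it pick its own edit tool.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical interface: an agent receives an instruction and a workspace directory.
Agent = Callable[[str, Path], None]


def run_edit_scenario(
    agent: Agent, instruction: str, filename: str, initial: str, expected: str
) -> bool:
    """Seed a file, hand the instruction to the agent, then compare the result."""
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp)
        target = workspace / filename
        target.write_text(initial, encoding="utf-8")
        agent(instruction, workspace)
        return target.read_text(encoding="utf-8") == expected


def fake_agent(instruction: str, workspace: Path) -> None:
    """Stand-in for the real tool: ignores the instruction and writes the answer."""
    (workspace / "greeting.txt").write_text("blah\n", encoding="utf-8")


passed = run_edit_scenario(
    fake_agent,
    "Go edit greeting.txt to say blah",
    "greeting.txt",
    initial="hello\n",
    expected="blah\n",
)
```

Because the check only looks at the final file contents, the same scenario could be run once with line-based editing enabled and once with substring editing to compare how often each gets it right.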

Anyway thank you again for your interest and the PR! Are you using UC? I'd love to know.

@AidanTilgner
Author

I actually used UC a couple of times for a project today, and it performed surprisingly well. I think it does a great job at not just being a chatbot, but working really well at performing functions when guided correctly. I was actually really impressed by how well it works and I compliment you on the design! My plan is to keep testing it on other codebases and getting to know it better.

You mentioned that you tried this initially. I'm wondering, did the readFile tool also show the line numbers to the LLM? I could see the LLM being bad at getting line numbers right if it's just guessing based on text (it can't even count letters right lol). I would think that having the line numbers in the readFile output allows the model to reference specific lines more accurately.

I am very curious whether this actually improves performance, because it definitely wouldn't be very helpful if it didn't lol. The challenges seem like a good idea, but I can see them being a little on the confusing side. I think ideally an integration test would use a dialogical approach. Do you think the answer might be an improvement on challenges, or a new type of test entirely?

I might need to poke around with challenges more to understand the issue though, I can definitely find a way to test it and get back to you on that.

2 participants