Experimental line based editing #2

AidanTilgner · 2024-07-25T17:03:56Z

So I think with LLMs, one way to potentially increase their accuracy is to give them more reference points to work with. The main meat of this update is:

add line numbers to readFile so that it can reference them when making edits
refactor the edit methods, so that there is now a dedicated editFileBySubstring and editFileByLines tool for the LLM to decide between
added and refactored tests to fit new shape
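To make the idea concrete, here is a minimal Python sketch of the two pieces: numbering lines on read, and replacing an inclusive line range on edit. It is not the code in this PR; `number_lines` and `edit_by_lines` are hypothetical names, and the `N: ` prefix format is an assumption.

```python
def number_lines(text: str) -> str:
    """Prefix each line with its 1-based number so a model can cite it."""
    lines = text.splitlines()
    width = len(str(len(lines)))  # pad so the numbers line up in long files
    return "\n".join(
        f"{i:>{width}}: {line}" for i, line in enumerate(lines, start=1)
    )


def edit_by_lines(text: str, start: int, end: int, replacement: str) -> str:
    """Replace lines start..end (1-based, inclusive) with `replacement`.

    Note: a trailing newline on `text` is not preserved in this sketch.
    """
    lines = text.splitlines()
    if not (1 <= start <= end <= len(lines)):
        raise ValueError(f"line range {start}-{end} is outside 1-{len(lines)}")
    lines[start - 1:end] = replacement.splitlines()
    return "\n".join(lines)
```

Rejecting out-of-range numbers means a badly wrong guess from the model fails loudly instead of editing the wrong region, though it cannot catch a guess that is off by only a line or two.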

I also think that this may be more experimental. I'm slightly worried that having the two different edit methods might confuse an LLM, but it could also just provide options. I think that editing by lines could potentially offer benefits for accuracy in larger blocks of changes, though. It's sort of inspired by ed, which has been referred to as:

the most user-hostile editor ever created

But I'm hoping it might be more LLM-user-friendly.

Would be interested in feedback and thoughts, so far the project has been really fun to use and poke around with, so I'm looking to contribute more if that's welcome :)

AidanTilgner · 2024-07-31T19:54:17Z

On this note, I have tested out the line-based editing a couple of times on live projects, and it seems to be working well.

cjbest · 2024-08-01T00:16:14Z

Thank you for submitting a PR!!!!!!

This is very interesting! When I first built it, I started with line numbers because that seemed like the most natural way, but I found the model was constantly messing them up by guessing an only-approximately-correct number, which is why I came up with the weird substring stuff to begin with.

But that was back with 4-turbo, maybe things have gotten better.
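For comparison, a substring-based edit can be sketched roughly like this in Python. This is an illustration rather than UC's actual implementation: `edit_by_substring` is a hypothetical name, and requiring exactly one match is an assumption about how such a tool might guard against ambiguous edits.

```python
def edit_by_substring(text: str, old: str, new: str) -> str:
    """Replace `old` with `new`, but only if `old` occurs exactly once."""
    count = text.count(old)
    if count != 1:
        # Zero matches means the model misquoted the file; more than one
        # means the edit is ambiguous. Either way, refuse to guess.
        raise ValueError(f"expected exactly one match, found {count}")
    return text.replace(old, new, 1)
```

The trade-off is that the model has to reproduce the target text exactly, whereas a line number can be slightly wrong and still apply.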

In order to merge this I think I would want to either wait until I can do some testing with it myself to confirm (which I can do but realistically might take a while) OR have some sort of test. This would be really cool actually, if we had basically UC integration tests that run with a specific dialog, like "go edit this file to say blah" and then check if it does it right. (The challenges were meant to be like this but they are too confusing.)
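One possible shape for such a dialog-driven test, sketched in Python under stated assumptions: `run_edit_case` and `fake_agent` are hypothetical names, and the agent is modelled as a plain callable taking a prompt and a file path. A real test would swap `fake_agent` for whatever drives the actual tool-calling loop.

```python
import tempfile
from pathlib import Path
from typing import Callable

# Hypothetical agent interface: receives the prompt and the file to edit.
Agent = Callable[[str, Path], None]


def run_edit_case(agent: Agent, initial: str, prompt: str, expected: str) -> bool:
    """Write `initial` to a temp file, let the agent act, compare the result."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "target.txt"
        target.write_text(initial)
        agent(prompt, target)
        return target.read_text() == expected


def fake_agent(prompt: str, path: Path) -> None:
    """Stand-in for the real model loop; it performs one fixed edit."""
    path.write_text(path.read_text().replace("hello", "blah"))
```

Because each case is just initial content, a prompt, and expected content, the same cases could be run against both edit tools to compare how often each one lands the change.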

Anyway thank you again for your interest and the PR! Are you using UC? I'd love to know.

AidanTilgner · 2024-08-01T00:42:41Z

I actually used UC a couple of times for a project today, and it performed surprisingly well. I think it does a great job at not just being a chatbot, but working really well at performing functions when guided correctly. I was actually really impressed by how well it works and I compliment you on the design! My plan is to keep testing it on other codebases and getting to know it better.

You mentioned that you tried this initially. I'm wondering, did the readFile tool also include the line numbers in what it returned to the LLM? I could see the LLM being bad at getting line numbers right if it's just guessing based on text (it can't even count letters right lol). I would think that having the line numbers in the readFile output allows the model to reference specific line numbers more accurately.

I am very curious whether this actually improves performance, because it definitely wouldn't be very helpful if it didn't lol. The challenges seem like a good idea, but I can see them being a little on the confusing side. I think ideally an integration test would use a dialogical approach. Do you think the answer might be an improvement on challenges, or a new type of test entirely?

I might need to poke around with challenges more to understand the issue though, I can definitely find a way to test it and get back to you on that.

AidanTilgner added 5 commits July 24, 2024 23:40

feat: started adding line-based editing

e886025

chore: remove log

53e44c2

feat: better prompt

6284c7c

feat: better prompting..again

52f6c97

fix: removed file I didn't mean to add

46417d9

AidanTilgner mentioned this pull request Aug 26, 2024

Integration tests or benchmarks? #3

Open
