run locally #9

ShasTheMass · 2024-11-06T19:39:29Z

Hello, great work here. But I wonder if we could make this run completely locally, e.g. with an Ollama-based model? Has anyone tried this? Are the small models (the ones that fit on a PC or Mac with, say, 16 GB of memory) good enough to understand screenshots?

Hope to hear back from you @deedy

kediaharshit9 · 2024-11-08T07:05:48Z

One of the key enablers of computer control is the LM looking at the image and predicting the action with proper coordinates.
This feature is surprisingly accurate on the new claude-3.5-sonnet model.
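
For anyone who wants to try that step locally, here is a rough sketch of sending a screenshot to an Ollama vision model and pulling click coordinates out of the reply. It is not what this repo does. The model name `llava`, the prompt wording, the JSON reply format, and the helper names `build_payload`, `parse_action`, and `ask_local_model` are all assumptions for illustration. The only real API used is Ollama's local `/api/generate` endpoint with base64-encoded `images`.

```python
import base64
import json
import re
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "llava"  # assumption: any vision-capable model you have pulled locally

# Assumption: a made-up prompt and reply format, not the one used by this project.
PROMPT = (
    "You are controlling a computer. Look at the screenshot and reply with "
    'only JSON like {"action": "click", "x": 123, "y": 456}.'
)


def build_payload(image_bytes, prompt=PROMPT, model=MODEL):
    """Build the request body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one JSON response instead of a stream
    }


def parse_action(reply):
    """Extract the first flat JSON object with numeric x and y, else None."""
    match = re.search(r"\{.*?\}", reply, re.DOTALL)
    if match is None:
        return None
    try:
        action = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not all(isinstance(action.get(k), (int, float)) for k in ("x", "y")):
        return None
    return action


def ask_local_model(screenshot_path):
    """Send a screenshot to the local model and return the parsed action."""
    with open(screenshot_path, "rb") as f:
        payload = build_payload(f.read())
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read())["response"]
    return parse_action(reply)
```

Calling `ask_local_model("screenshot.png")` needs a running Ollama server with the model pulled. Small models may well return no usable coordinates, which is why `parse_action` falls back to `None` rather than guessing.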

I'm not very confident that the smaller VLMs can do that accurately (as of today). Hopefully someone can create a fine-tuning dataset so that smaller/quantized models can perform this step accurately (though that might in turn affect their reasoning capabilities).

Wishing for a truly local future soon, cheers!
