@pravaco

Prava

Open sourced work:

API for computer use (prava-sdk, ca-server)

Models: a grounding + planning harness for OSWorld. Privately, I used Codex plus extensive planning to push the benchmark score as high as possible, but the OSWorld scores were brittle.

Research:

a) Edited the vision encoder so that it is shown the latest "compiled" screenshot instead of re-encoding the whole screen every step. This worked but was slightly slower.

b) Re-implemented Microsoft's GUI-Actor, which trains a small attention head to predict the target patch/region instead of doing coordinate grounding.
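To make item (a) concrete, here is a minimal sketch of one way to find which screen patches changed between two screenshots, so that only those need to go through the encoder again. It is not the actual encoder edit: the `Frame` type, `changedPatches`, `updateEmbeddings`, the grayscale pixel layout, and the exact-match comparison are all assumptions made for illustration.

```typescript
// Hypothetical sketch: detect which fixed-size patches changed between frames.
// Assumption: frames are grayscale, row-major, one byte per pixel.
interface Frame {
  width: number;
  height: number;
  pixels: Uint8Array; // length = width * height
}

// True if any pixel inside patch (pr, pc) differs between the two frames.
function patchDiffers(prev: Frame, next: Frame, pr: number, pc: number, size: number): boolean {
  const yEnd = Math.min((pr + 1) * size, next.height);
  const xEnd = Math.min((pc + 1) * size, next.width);
  for (let y = pr * size; y < yEnd; y++) {
    for (let x = pc * size; x < xEnd; x++) {
      const i = y * next.width + x;
      if (prev.pixels[i] !== next.pixels[i]) return true;
    }
  }
  return false;
}

// Returns row-major patch-grid indices of every patch that changed.
function changedPatches(prev: Frame, next: Frame, patchSize: number): number[] {
  if (prev.width !== next.width || prev.height !== next.height) {
    throw new Error("frames must have identical dimensions");
  }
  const cols = Math.ceil(next.width / patchSize);
  const rows = Math.ceil(next.height / patchSize);
  const changed: number[] = [];
  for (let pr = 0; pr < rows; pr++) {
    for (let pc = 0; pc < cols; pc++) {
      if (patchDiffers(prev, next, pr, pc, patchSize)) changed.push(pr * cols + pc);
    }
  }
  return changed;
}

// Reuses cached embeddings for unchanged patches and calls `encode`
// (a stand-in for the real vision encoder) only for the changed ones.
function updateEmbeddings(
  cache: number[][],
  changed: number[],
  encode: (patchIndex: number) => number[],
): number[][] {
  const out = cache.slice();
  for (const idx of changed) out[idx] = encode(idx);
  return out;
}
```

The diffing pass itself costs time on every step, which is one plausible reason a scheme like this can end up no faster than a full re-encode.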


API: CUA Control API docs

Control API library implementation

Made the Control API compatible with existing Claude CUA/OpenAI setups by reverse-engineering their function calling.
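As a rough illustration of that compatibility layer, the sketch below normalizes two vendor-style tool-call payloads into a single internal action. Everything here is an assumption: `ControlAction`, `fromClaudeStyle`, and `fromOpenAIStyle` are hypothetical names, and the input shapes are simplified stand-ins rather than either vendor's full schema or the real Control API types.

```typescript
// Hypothetical internal action type; not taken from prava-sdk or ca-server.
type ControlAction =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "screenshot" };

// Assumed, simplified shape of a Claude-style computer-use tool input.
type ClaudeStyleInput =
  | { action: "left_click"; coordinate: [number, number] }
  | { action: "type"; text: string }
  | { action: "screenshot" };

// Assumed, simplified shape of an OpenAI-style computer action.
type OpenAIStyleAction =
  | { type: "click"; x: number; y: number }
  | { type: "type"; text: string }
  | { type: "screenshot" };

// Maps the assumed Claude-style input onto the internal action.
function fromClaudeStyle(input: ClaudeStyleInput): ControlAction {
  switch (input.action) {
    case "left_click":
      return { kind: "click", x: input.coordinate[0], y: input.coordinate[1] };
    case "type":
      return { kind: "type", text: input.text };
    case "screenshot":
      return { kind: "screenshot" };
  }
}

// Maps the assumed OpenAI-style action onto the same internal action.
function fromOpenAIStyle(action: OpenAIStyleAction): ControlAction {
  switch (action.type) {
    case "click":
      return { kind: "click", x: action.x, y: action.y };
    case "type":
      return { kind: "type", text: action.text };
    case "screenshot":
      return { kind: "screenshot" };
  }
}
```

With both formats reduced to one action type, the code that actually drives the mouse and keyboard only has to handle a single shape.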

Models: prava-fc doesn't include the latest models, which use Codex (for strong tool calling), Qwen3 for good grounding, and Codex again to write mini bash scripts that enhance computer control.

Research: Differential Vision Encoding for CUA

Coordinate-Free Visual Grounding
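A minimal sketch of the coordinate-free idea: instead of having the model emit x/y numbers as text, take one score per image patch (as an attention head might produce), pick the highest-scoring patch, and click its center. This is not GUI-Actor's implementation or the re-implementation mentioned above; `PatchGrid`, `softmax`, `selectPatch`, and the square-patch layout are assumptions for illustration.

```typescript
// Hypothetical sketch: turn per-patch scores into a click target.
interface PatchGrid {
  cols: number;
  rows: number;
  patchSize: number; // pixels per patch side (assumed square patches)
}

interface PatchTarget {
  patchIndex: number; // row-major index into the patch grid
  x: number;          // pixel x of the patch center
  y: number;          // pixel y of the patch center
  confidence: number; // softmax probability of the chosen patch
}

// Numerically stable softmax over raw scores.
function softmax(scores: number[]): number[] {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Picks the most probable patch and returns its center in pixel space.
function selectPatch(scores: number[], grid: PatchGrid): PatchTarget {
  if (scores.length === 0 || scores.length !== grid.cols * grid.rows) {
    throw new Error("expected exactly one score per patch");
  }
  const probs = softmax(scores);
  let best = 0;
  let bestProb = -1;
  probs.forEach((p, i) => {
    if (p > bestProb) {
      best = i;
      bestProb = p;
    }
  });
  const row = Math.floor(best / grid.cols);
  const col = best % grid.cols;
  return {
    patchIndex: best,
    x: (col + 0.5) * grid.patchSize,
    y: (row + 0.5) * grid.patchSize,
    confidence: bestProb,
  };
}
```

Taking the argmax is the simplest choice; a region-level variant could instead aggregate neighboring high-probability patches before choosing a click point.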

Popular repositories

  1. cuaview-releases Public

    Native Claude computer use app for Mac

    Metal 61 3

  2. archon.app Public

    Archon is a small bar that sits at the bottom of your Mac/Windows screen where you can type what you want your computer to do in natural language.

    3

  3. .github Public

  4. prava-sdk Public

    API for Labor

    TypeScript

  5. ca-server Public

    CUA API Server

    TypeScript

