Major Refactor: Restructure backend to depend on LibBS (#22)
* Running script in decompiler stubs

* More changes

* Working abstraction tested in IDA. Still need an `askKey` api

* Fixed the Binja side code

* Add some headless Ghidra code

* Ghidra, IDA, Binja working. Ghidra still needs a way to start UI without terminal

* Update the README, and GHIDRA WORKS!!!
mahaloz authored Dec 4, 2023
1 parent a7708e1 commit acd43d0
Showing 25 changed files with 666 additions and 1,300 deletions.
44 changes: 44 additions & 0 deletions .gitignore
@@ -0,0 +1,44 @@
*.pyc
.idea/
*.egg-info/
testing/
*.o
*.so
*.a
.gdb_history
*.i64
*.idb
*.id0
*.id1
*.id2
*.nam
*.til
*.swp
*.dll
*.obj
*.lib
*.exp
*.pdb
*.ilk
angr/tests/*.png
screenlog.0
angr/tests/screenlog.0
angr/screenlog.0
.idea
*.egg-info
/build
/tags
MANIFEST
dist
.eggs
.vscode/
*.db
.DS_Store
.pytest_cache/
binsync/decompilers/ghidra/client/build/
binsync/decompilers/ghidra/client/dist/
binsync/decompilers/ghidra/client/bin/
binsync/decompilers/ghidra/client/.gradle/
binsync/decompilers/ghidra/client/.classpath
binsync/decompilers/ghidra/client/.project
binsync/decompilers/ghidra/client/Ghidra/
93 changes: 55 additions & 38 deletions README.md
@@ -1,44 +1,73 @@
# DAILA
Decompiler Artificial Intelligence Language Assistant - Built on OpenAI.
Utilize OpenAI to improve your decompilation experience in most modern decompilers.
The Decompiler Artificial Intelligence Language Assistant (DAILA) is a unified interface for AI systems to be used in decompilers.
Using DAILA, you can utilize various AI systems, like local and remote LLMs, all in the same scripting and GUI interfaces.

![](./assets/ida_daila.png)
<img src="./assets/ida_daila.png" style="width: 50%;" alt="DAILA context menu"/>

DAILA's main purpose is to provide a unified interface for AI systems to be used in decompilers.
To accomplish this, DAILA provides a lifted interface, relying on the BinSync library [LibBS](https://github.com/binsync/libbs) to abstract away the decompiler.
**All decompilers supported in LibBS are supported in DAILA, which currently includes IDA, Ghidra, Binja, and angr-management.**

## Installation
Clone down this repo and pip install and use the daila installer:
Install our library backend through pip and our decompiler plugin through our installer:
```bash
pip3 install -e . && dailalib --install
pip3 install dailalib && daila --install
```

Depending on your decompiler, this will attempt to copy the script files into your decompiler and install
the DAILA core to your current Python. If you are using Binja or IDA, make sure your Python is the same
as the one you are using in your decompiler.

If you are using Ghidra, you may be required to enable the `$USER_HOME/ghidra_scripts` as a valid
scripts path.
### Ghidra Extras
You need to do a few extra steps to get Ghidra working.
Next, enable the DAILA plugin:
1. Start Ghidra and open a binary
2. Go to the `Windows > Script Manager` menu
3. Search for `daila` and enable the script

If your decompiler does not have access to the `OPENAI_API_KEY`, then you must use the decompiler option from
DAILA to set the API key. A popup will appear for you to enter your key.
You must have `python3` in your path for the Ghidra version to work. We quite literally call it from inside Python 2.
You may also need to enable the `$USER_HOME/ghidra_scripts` as a valid scripts path in Ghidra.

### Manual Install
### Manual Install (if above fails)
If the above fails, you will need to manually install.
To manually install, first `pip3 install -e .` on the repo, then copy the python file for your decompiler in your
decompilers plugins/scripts folder.
To manually install, first `pip3 install dailalib` on the repo, then copy the [daila_plugin.py](./dailalib/daila_plugin.py) file to your decompiler's plugin directory.

### Ghidra Gotchas
You must have `python3` in your path for the Ghidra version to work. We quite literally call it from inside Python 2.

## Usage
In your decompiler you can access the DAILA options in one of two ways:
1. If you are not in Ghidra, you can right-click a function and go to `Plugins` or directly use the `DAILA ...` menu.
2. If you are in Ghidra, use `Tools->DAILA` then use the operation selector
DAILA is designed to be used in two ways:
1. As a decompiler plugin with a GUI
2. As a scripting library in your decompiler

### Decompiler GUI
With the exception of Ghidra (see below), when you start your decompiler you will have a new context menu
which you can access when you right-click anywhere in a function:

<img src="./assets/ida_show_menu_daila.png" style="width: 50%;" alt="DAILA context menu"/>

If you are using Ghidra, go to `Tools->DAILA->Start DAILA Backend` to start the backend server.
After you've done this, you can use the context menu as shown above.

### Scripting
You can use DAILA in your own scripts by importing the `dailalib` package.
Here is an example using the OpenAI API:
```python
from dailalib import OpenAIAPI
from libbs.api import DecompilerInterface

deci = DecompilerInterface.discover_interface()
ai_api = OpenAIAPI(decompiler_interface=deci)
for function in deci.functions:
    summary = ai_api.summarize_function(function)
```

All operations that DAILA can perform can be found from the DAILA context menu, which in some decompilers may just be
the menu described above.

![](./assets/ida_show_menu_daila.png)
## Supported AI Backends
### OpenAI
DAILA supports the OpenAI API. To use the OpenAI API, you must have an OpenAI API key.
If your decompiler does not have access to the `OPENAI_API_KEY` environment variable, then you must use the decompiler option from
DAILA to set the API key.
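
As a rough illustration of the environment check described above, a script could test for `OPENAI_API_KEY` before falling back to DAILA's key prompt. This is a minimal sketch using only the standard library; the helper name `find_openai_key` is hypothetical and is not part of DAILA.

```python
import os
from typing import Mapping, Optional


def find_openai_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the OpenAI API key from the environment, or None if unset or blank.

    `find_openai_key` is a hypothetical helper for illustration only.
    """
    # Default to the real process environment; a mapping can be passed for testing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    return key or None
```

If this returns `None` inside your decompiler's Python, the decompiler likely did not inherit the variable, and the `Update API Key` menu option is the way to set the key.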

Comments will appear in the function header with the response or an error message.
Currently, DAILA supports the following prompts:
- Summarize a function
- Rename variables
- Rename function
- Identify the source of a function

## Supported Decompilers
- IDA
@@ -48,16 +77,4 @@ Comments will appear in the function header with the response or an error messag
![](./assets/binja_daila.png)

- Ghidra
![](./assets/ghidra_daila.png)

## Features
### Function Identification
We use ChatGPT to attempt to:
1. Identify which open-source project this decompilation could be a result of
2. Find a link to that said source if it exists

### Function Summarization
Summarizes in human-readable text what this function does

### Vulnerability Detection
Attempts to find and describe the vulnerability in the function
![](./assets/ghidra_daila.png)
32 changes: 31 additions & 1 deletion dailalib/__init__.py
@@ -1 +1,31 @@
__version__ = "1.3.0"
__version__ = "2.0.0"

from .api import AIAPI, OpenAIAPI
from libbs.api import DecompilerInterface


def create_plugin(*args, **kwargs):

    ai_api = OpenAIAPI(delay_init=True)
    # create context menus for prompts
    gui_ctx_menu_actions = {
        f"DAILA/{prompt_name}": (prompt.desc, getattr(ai_api, prompt_name))
        for prompt_name, prompt in ai_api.prompts_by_name.items()
    }
    # create context menus for others
    gui_ctx_menu_actions["DAILA/Update API Key"] = ("Update API Key", ai_api.ask_api_key)

    # create decompiler interface
    force_decompiler = kwargs.pop("force_decompiler", None)
    deci = DecompilerInterface.discover_interface(
        force_decompiler=force_decompiler,
        # decompiler-creation args
        plugin_name="DAILA",
        init_plugin=True,
        gui_ctx_menu_actions=gui_ctx_menu_actions,
        ui_init_args=args,
        ui_init_kwargs=kwargs
    )
    ai_api.init_decompiler_interface(decompiler_interface=deci)

    return deci.gui_plugin
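
The context-menu mapping built in `create_plugin` can be tried out without a decompiler. The sketch below mirrors the dict comprehension using stand-in classes; `FakePrompt`, `FakeAIAPI`, and `build_menu_actions` are hypothetical names for illustration and do not exist in DAILA or LibBS.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class FakePrompt:
    # Stand-in for DAILA's Prompt; only the `desc` field is modeled here.
    desc: str


class FakeAIAPI:
    """Hypothetical stand-in for OpenAIAPI exposing two prompt methods."""

    def __init__(self):
        self.prompts_by_name = {
            "summarize_function": FakePrompt("Summarize a function"),
            "rename_function": FakePrompt("Rename function"),
        }

    def summarize_function(self, *args, **kwargs):
        return "summary"

    def rename_function(self, *args, **kwargs):
        return "new_name"


def build_menu_actions(ai_api) -> Dict[str, Tuple[str, Callable]]:
    # Same shape as in create_plugin: "DAILA/<name>" -> (description, bound method).
    return {
        f"DAILA/{name}": (prompt.desc, getattr(ai_api, name))
        for name, prompt in ai_api.prompts_by_name.items()
    }


actions = build_menu_actions(FakeAIAPI())
```

Because each value holds a bound method looked up with `getattr`, every prompt name must match a method of the same name on the API object, otherwise the comprehension raises `AttributeError`.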
26 changes: 13 additions & 13 deletions dailalib/__main__.py
@@ -1,38 +1,38 @@
import argparse

from .installer import DAILAInstaller
from .controller_server import DAILAServer
import dailalib


def main():
parser = argparse.ArgumentParser(
description="""
The DAILA Command Line Util.
The DAILA CLI is used to install, run, and host the DAILA plugin.
""",
epilog="""
Examples:
daila --install
daila install
"""
)
parser.add_argument(
"--install", action="store_true", help="""
Install the DAILA core to supported decompilers as plugins. This option will start an interactive
prompt asking for install paths for all supported decompilers. Each install path is optional and
will be skipped if no path is provided during install.
"""
"-i", "--install", action="store_true", help="Install DAILA into your decompilers"
)
parser.add_argument(
"-server", action="store_true", help="""
Starts the DAILA Server for use with Ghidra
"""
"-s", "--server", help="Run a headless server for DAILA", choices=["ghidra"]
)
parser.add_argument(
"-v", "--version", action="version", version=f"DAILA {dailalib.__version__}"
)
args = parser.parse_args()

if args.install:
DAILAInstaller().install()
elif args.server:
if args.server != "ghidra":
raise NotImplementedError("Only Ghidra is supported for now")

if args.server:
DAILAServer(use_py2_exceptions=True).start_xmlrpc_server()
from dailalib import create_plugin
create_plugin(force_decompiler="ghidra")


if __name__ == "__main__":
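
To see how the new `--install` and `--server` options parse, here is a minimal standalone sketch that reproduces only the two arguments from the diff. `build_parser` is a hypothetical helper; the real CLI also wires in the installer and the plugin.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Reduced copy of the arguments shown in the diff, for experimentation only.
    parser = argparse.ArgumentParser(prog="daila")
    parser.add_argument("-i", "--install", action="store_true")
    parser.add_argument("-s", "--server", choices=["ghidra"])
    return parser


parser = build_parser()
install_args = parser.parse_args(["--install"])
server_args = parser.parse_args(["--server", "ghidra"])
```

Since `choices=["ghidra"]` makes argparse reject any other value with a usage error, the later `args.server != "ghidra"` check acts as a second guard for the case where more choices are added.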
2 changes: 2 additions & 0 deletions dailalib/api/__init__.py
@@ -0,0 +1,2 @@
from .ai_api import AIAPI
from .openai import OpenAIAPI, Prompt
99 changes: 99 additions & 0 deletions dailalib/api/ai_api.py
@@ -0,0 +1,99 @@
from typing import Dict, Optional
from functools import wraps

from libbs.api import DecompilerInterface


class AIAPI:
    def __init__(
        self,
        decompiler_interface: Optional[DecompilerInterface] = None,
        decompiler_name: Optional[str] = None,
        use_decompiler: bool = True,
        delay_init: bool = False,
        # size in bytes
        min_func_size: int = 0x10,
        max_func_size: int = 0xffff,
        model=None,
    ):
        # useful for initing after the creation of a decompiler interface
        self._dec_interface: Optional[DecompilerInterface] = None
        self._dec_name = None
        if not delay_init:
            self.init_decompiler_interface(decompiler_interface, decompiler_name, use_decompiler)

        self._min_func_size = min_func_size
        self._max_func_size = max_func_size
        self.model = model or self.__class__.__name__

    def init_decompiler_interface(
        self,
        decompiler_interface: Optional[DecompilerInterface] = None,
        decompiler_name: Optional[str] = None,
        use_decompiler: bool = True
    ):
        self._dec_interface: DecompilerInterface = DecompilerInterface.discover_interface(force_decompiler=decompiler_name) \
            if use_decompiler and decompiler_interface is None else decompiler_interface
        self._dec_name = decompiler_name if decompiler_interface is None else decompiler_interface.name
        if self._dec_interface is None and not self._dec_name:
            raise ValueError("You must either provide a decompiler name or a decompiler interface.")

    def info(self, msg):
        if self._dec_interface is not None:
            self._dec_interface.info(msg)

    def debug(self, msg):
        if self._dec_interface is not None:
            self._dec_interface.debug(msg)

    def warning(self, msg):
        if self._dec_interface is not None:
            self._dec_interface.warning(msg)

    def error(self, msg):
        if self._dec_interface is not None:
            self._dec_interface.error(msg)

    @property
    def has_decompiler_gui(self):
        return self._dec_interface is not None and not self._dec_interface.headless

    @staticmethod
    def requires_function(f):
        """
        A wrapper function to make sure an API call has decompilation text to operate on and possibly a Function
        object. There are really two modes any API call operates in:
        1. Without Decompiler Backend: requires provided dec text
        2. With Decompiler Backend:
            2a. With UI: Function will be collected from the UI if not provided
            2b. Without UI: requires a Function
        The Function collected from the UI is the one the user is currently looking at.
        """
        @wraps(f)
        def _requires_function(*args, ai_api: "AIAPI" = None, **kwargs):
            function = kwargs.pop("function", None)
            dec_text = kwargs.pop("dec_text", None)
            use_dec = kwargs.pop("use_dec", True)

            if not dec_text and not use_dec:
                raise ValueError("You must provide decompile text if you are not using a dec backend")

            # two mode constructions: with decompiler and without
            # with decompiler backend
            if use_dec:
                if not ai_api.has_decompiler_gui and function is None:
                    raise ValueError("You must provide a Function when using this with a decompiler")

                # we must have a UI if we have no func
                if function is None:
                    function = ai_api._dec_interface.active_context()

                # get new text with the function that is present
                if dec_text is None:
                    dec_text = ai_api._dec_interface.decompile(function.addr)

            return f(*args, function=function, dec_text=dec_text, use_dec=use_dec, **kwargs)

        return _requires_function
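
The keyword-popping pattern used by `requires_function` can be studied in isolation. The following is a simplified, hypothetical analogue (`requires_text` and `count_chars` are invented for this sketch) that keeps the "text or backend" check but replaces the decompiler call with a stub string.

```python
from functools import wraps


def requires_text(f):
    """Hypothetical, simplified analogue of AIAPI.requires_function."""
    @wraps(f)
    def wrapper(*args, **kwargs):
        dec_text = kwargs.pop("dec_text", None)
        use_dec = kwargs.pop("use_dec", True)

        # Without a backend there is nothing to fall back on, so text is mandatory.
        if not dec_text and not use_dec:
            raise ValueError("decompilation text is required without a decompiler backend")

        if dec_text is None:
            # Stand-in for asking a real decompiler for the function's text.
            dec_text = "/* decompiled by stub */"

        return f(*args, dec_text=dec_text, use_dec=use_dec, **kwargs)

    return wrapper


@requires_text
def count_chars(dec_text=None, use_dec=True):
    # Toy operation: the wrapper guarantees dec_text is a string here.
    return len(dec_text)
```

Calling `count_chars(dec_text="int main()", use_dec=False)` works on the supplied text, while `count_chars(use_dec=False)` raises `ValueError`, which matches the first mode listed in the docstring above.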

2 changes: 2 additions & 0 deletions dailalib/api/openai/__init__.py
@@ -0,0 +1,2 @@
from .openai_api import OpenAIAPI
from .prompts import Prompt