Merge pull request #7 from quarkslab/doc/tuto-bionic
Add the tutorial on Bionic
DarkaMaul authored Dec 1, 2022
2 parents b63502a + 096d086 commit 322b783
Showing 11 changed files with 94,134 additions and 12 deletions.
3 changes: 2 additions & 1 deletion bindings/python/quokka/analysis/arch.py
@@ -19,10 +19,11 @@
import capstone

from quokka.types import List, RegType
from types import ModuleType


def make_enums(
capstone_module, items: List[str], blacklist: List[str], flags_enums: List[str]
capstone_module: ModuleType, items: List[str], blacklist: List[str], flags_enums: List[str]
) -> List:
"""Make enums from capstone module
2 changes: 1 addition & 1 deletion bindings/python/quokka/data.py
@@ -192,7 +192,7 @@ def get_data(self, address: AddressT) -> Data:
A Data
Raises:
ValueError if no data is found
ValueError: if no data is found
"""

# We have to iterate over every data because they are not sorted by offset
48 changes: 38 additions & 10 deletions docs/roadmap.md
@@ -1,23 +1,51 @@
# Roadmap

A lot of things are missing from `Quokka`.
`Quokka` is not perfect and some features could be improved.
The list below is not a **Roadmap** _per se_ but more like a wishlist.

In no particular order :
Do not hesitate to open Issues for requesting features.

* export and use types information
* expose operands data in a comprehensive way
* performances improvements of the references

## Export Information

## TODO List
* [ ] Types Information
> For the moment, Quokka does not export types information. This feature would be super useful for various analyses.
* [ ] Stack Variable
> IDA defines stack variables in the function. Exporting them could be valuable for some workflows
* [ ] Decompiler
> Hex-Rays generates a pseudo C-code from binary code. Exporting it as well could also be nice
* [ ] Operands Data
> While the operands are exported, it is hard to understand them outside IDA without having the disassembler
> documentation. Exporting information on them could be interesting.
The list below contains some tasks that should be tackled
## Refactor

* [ ] Rewrite the Reference Manager
> The `Reference Manager` is hard to understand, maintain, and use, and it has performance issues. It should be
> rewritten without losing any functionality.
* [ ] Remove the interface for Function Chunks
> A Function Chunk is an IDA abstraction for function parts. However, it is meaningless to expose them in the user
> interface because users do not care about them.

* [ ] Use `weakref` for Program
* [ ] Use the CI for tests (since the plugin works with an IDA Free version, it should be possible to write end-to-end tests)
* [ ] Quokka for Ghidra / Binary Ninja (maybe a bit ambitious)
> Most items in `Quokka` hold a back-reference to `Program`. Using `weakref` for these references would allow the
> garbage collector to reclaim parts of the program once they are no longer needed.
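
The `weakref` item above can be illustrated with a minimal sketch. The `Program` and `Item` classes below are hypothetical stand-ins, not the actual Quokka classes: they only show how a child object can reach its parent through `weakref.ref` without keeping the parent alive.

```python
import gc
import weakref


class Program:
    """Hypothetical stand-in for the top-level container."""

    def __init__(self) -> None:
        self.items = []

    def add_item(self, name: str) -> "Item":
        item = Item(self, name)
        self.items.append(item)
        return item


class Item:
    """Hypothetical child object that needs to reach its parent."""

    def __init__(self, program: Program, name: str) -> None:
        # Store a weak reference so the child does not keep the parent alive
        self._program = weakref.ref(program)
        self.name = name

    @property
    def program(self) -> Program:
        program = self._program()
        if program is None:
            raise ReferenceError("The parent Program has been released")
        return program


program = Program()
item = program.add_item("sub_1000")
assert item.program is program

# Dropping the last strong reference releases the parent,
# even though `item` is still alive
del program
gc.collect()
```

With a strong back-reference, `item` alone would keep the whole `Program` in memory. The trade-off is that every access through the weak reference has to handle the case where the parent is already gone.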
## Disassemblers

* [ ] Quokka for Ghidra / Binary Ninja
> While IDA works nicely, some researchers have moved to other disassemblers. Having an export working for Binary Ninja
> and Ghidra could help Quokka adoption!
## Misc

* [ ] Support [Fat binaries](https://en.wikipedia.org/wiki/Fat_binary)
> IDA supports disassembling Fat Binaries but Quokka will only export the first one. One nice feature would be to
> select which one to export
* [ ] Verify the support for unknown architectures
> Quokka should export any binary but it has been barely tested with other architectures.
## Documentation

* [ ] Document Export Modes
> Quokka has three export modes, but only one is properly documented (NORMAL)
3 changes: 3 additions & 0 deletions docs/tutorials/bionic/SUMMARY.md
@@ -0,0 +1,3 @@
* [Introduction](introduction.md)
* [Strategy](strategy.md)
* [Final](final.md)
76 changes: 76 additions & 0 deletions docs/tutorials/bionic/code/script.py
@@ -0,0 +1,76 @@
# Copyright 2022 Quarkslab
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


"""BIONIC User ID extractor

This snippet uses Quokka to extract the user ID mapping from a Bionic LibC

Usage:
    python ./script <bionic_path>

Author:
    Written by dm (Alexis Challande) in 2022.
"""

import sys

import quokka
from quokka import Data
from quokka.types import AddressT, DataType


def print_usertable(bionic: quokka.Program):
    """Extract the user table with a bionic libc"""

    # Step1 : Find the function
    getpwuid = bionic.get_function("getpwuid", approximative=False)

    # Step 2: find the data ref
    user_table: Data = getpwuid.data_references[1]

    # Step 3: Read the first entry
    users = []

    first_user = bionic.executable.read_string(user_table.value)
    first_id = bionic.get_data(user_table.address + 0x4).value
    users.append((first_user, first_id))

    # Read other entries
    def read_userid(prog: quokka.Program, address: AddressT) -> int:
        return prog.executable.read_data(
            prog.addresser.file(address), DataType.DOUBLE_WORD
        )

    # Gather all components together
    start = user_table.address + 0x8
    while True:
        data: Data = bionic.get_data(start)
        if data.code_references:
            break

        user_name = bionic.executable.read_string(data.value)
        user_id = read_userid(bionic, data.address + 0x4)

        print(f"New user {user_name} with ID {user_id}")
        users.append((user_name, user_id))

        start += 0x8

    # Print the user table
    for user_name, user_id in users:
        print(f"{user_name=} : {user_id=}")


if __name__ == "__main__":
    program: quokka.Program = quokka.Program.from_binary(sys.argv[1])
    print_usertable(program)
8 changes: 8 additions & 0 deletions docs/tutorials/bionic/final.md
@@ -0,0 +1,8 @@
# Final words

The final script looks like this:

```python
--8<-- "docs/tutorials/bionic/code/script.py"
```
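
The script walks the table in steps of `0x8` bytes, reading a name pointer at offset `0x0` and a double-word ID at offset `0x4`. The sketch below shows that layout on synthetic bytes, without Quokka. It assumes little-endian 32-bit fields. The `parse_entries` helper and the sample values are made up for illustration and are not taken from the real library.

```python
import struct
from typing import List, Tuple

# Assumed entry layout: 32-bit name pointer followed by a 32-bit user ID
ENTRY = struct.Struct("<II")


def parse_entries(blob: bytes, count: int) -> List[Tuple[int, int]]:
    """Decode `count` consecutive (name_pointer, user_id) pairs."""
    return [ENTRY.unpack_from(blob, index * ENTRY.size) for index in range(count)]


# Synthetic table with two made-up entries
table = ENTRY.pack(0x1000, 0) + ENTRY.pack(0x1005, 1000)
entries = parse_entries(table, 2)

for name_pointer, user_id in entries:
    print(f"name at {name_pointer:#x}, id {user_id}")
```

In the real script, the name pointer is resolved with `read_string`, and the loop stops at the first data that has code references instead of using a fixed count.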

34 changes: 34 additions & 0 deletions docs/tutorials/bionic/introduction.md
@@ -0,0 +1,34 @@
# Bionic in Android

In this tutorial, we will learn how to extract the user mapping in the `Android` libc: `Bionic`.

## Context

Android, the mobile operating system, uses a custom `libc`: `bionic`. A few notable changes exist from the classic
implementation of the libc found on most desktop Linux systems. One of them is that the user table is embedded within
the binary.

## Objective

Automatically extract the **user mapping** from the binary[^1].

## Requirements

* A working Quokka Installation
* The [bionic library](https://raw.githubusercontent.com/quarkslab/quokka/main/docs/tutorials/bionic/samples/libc.so) (`sha256sum: 5975c8366fce5e47ccdf80f5d01f3e4521fee3b1dcf719243f4e4236d9699443`)
* An [export](https://raw.githubusercontent.com/quarkslab/quokka/main/docs/tutorials/bionic/samples/libc.quokka) of the bionic library

## Check requirements

```python
import quokka
bionic = quokka.Program("libc.quokka", "libc.so")
assert bionic is not None
```
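
Since the requirements list a SHA-256 digest for the library, the download can also be checked before loading it. This is a minimal sketch using only the standard library. The helper names `sha256_of` and `is_expected_sample` are hypothetical, and the `libc.so` path in the usage note assumes the file was saved in the current directory.

```python
import hashlib
from pathlib import Path

# Digest listed in the requirements for the bionic sample
EXPECTED_SHA256 = "5975c8366fce5e47ccdf80f5d01f3e4521fee3b1dcf719243f4e4236d9699443"


def sha256_of(path: Path) -> str:
    """Return the hexadecimal SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_sample(path: Path) -> bool:
    """Tell whether the file matches the digest given in the requirements."""
    return sha256_of(path) == EXPECTED_SHA256
```

For example, `is_expected_sample(Path("libc.so"))` should return `True` for an intact download of the sample.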

## Final words

Once you are set, we can advance to the next steps.

[^1]:
This exercise is based on an idea from Robin David in his IDA scripting training.