From b06c0985b40de12e5a394eb6afcb9551fe79915d Mon Sep 17 00:00:00 2001 From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com> Date: Thu, 24 Oct 2024 16:49:59 +0000 Subject: [PATCH] Deployed cd27f01 with MkDocs version: 1.6.1 --- .nojekyll | 0 404.html | 1 + CNAME | 1 + applications/overview/index.html | 1 + .../program-reconstruction/index.html | 1 + applied-research/code-similarity/index.html | 1 + applied-research/overview/index.html | 1 + applied-research/symbol-recovery/index.html | 7 + .../vulnerability-discovery/index.html | 1 + assets/images/favicon.png | Bin 0 -> 1870 bytes assets/javascripts/bundle.83f73b43.min.js | 16 + assets/javascripts/bundle.83f73b43.min.js.map | 7 + assets/javascripts/lunr/min/lunr.ar.min.js | 1 + assets/javascripts/lunr/min/lunr.da.min.js | 18 + assets/javascripts/lunr/min/lunr.de.min.js | 18 + assets/javascripts/lunr/min/lunr.du.min.js | 18 + assets/javascripts/lunr/min/lunr.el.min.js | 1 + assets/javascripts/lunr/min/lunr.es.min.js | 18 + assets/javascripts/lunr/min/lunr.fi.min.js | 18 + assets/javascripts/lunr/min/lunr.fr.min.js | 18 + assets/javascripts/lunr/min/lunr.he.min.js | 1 + assets/javascripts/lunr/min/lunr.hi.min.js | 1 + assets/javascripts/lunr/min/lunr.hu.min.js | 18 + assets/javascripts/lunr/min/lunr.hy.min.js | 1 + assets/javascripts/lunr/min/lunr.it.min.js | 18 + assets/javascripts/lunr/min/lunr.ja.min.js | 1 + assets/javascripts/lunr/min/lunr.jp.min.js | 1 + assets/javascripts/lunr/min/lunr.kn.min.js | 1 + assets/javascripts/lunr/min/lunr.ko.min.js | 1 + assets/javascripts/lunr/min/lunr.multi.min.js | 1 + assets/javascripts/lunr/min/lunr.nl.min.js | 18 + assets/javascripts/lunr/min/lunr.no.min.js | 18 + assets/javascripts/lunr/min/lunr.pt.min.js | 18 + assets/javascripts/lunr/min/lunr.ro.min.js | 18 + assets/javascripts/lunr/min/lunr.ru.min.js | 18 + assets/javascripts/lunr/min/lunr.sa.min.js | 1 + .../lunr/min/lunr.stemmer.support.min.js | 1 + 
assets/javascripts/lunr/min/lunr.sv.min.js | 18 + assets/javascripts/lunr/min/lunr.ta.min.js | 1 + assets/javascripts/lunr/min/lunr.te.min.js | 1 + assets/javascripts/lunr/min/lunr.th.min.js | 1 + assets/javascripts/lunr/min/lunr.tr.min.js | 18 + assets/javascripts/lunr/min/lunr.vi.min.js | 1 + assets/javascripts/lunr/min/lunr.zh.min.js | 1 + assets/javascripts/lunr/tinyseg.js | 206 + assets/javascripts/lunr/wordcut.js | 6708 +++++++++++++++++ .../workers/search.6ce7567c.min.js | 42 + .../workers/search.6ce7567c.min.js.map | 7 + assets/stylesheets/main.0253249f.min.css | 1 + assets/stylesheets/main.0253249f.min.css.map | 1 + assets/stylesheets/palette.06af60db.min.css | 1 + .../stylesheets/palette.06af60db.min.css.map | 1 + contributing/index.html | 3 + decompilers/decompiler/angr/index.html | 1 + .../decompiler/binary_ninja/index.html | 1 + decompilers/decompiler/ghidra/index.html | 1 + decompilers/decompiler/ida_pro/index.html | 1 + decompilers/directory/index.html | 1 + decompilers/history/index.html | 1 + decompilers/tools/index.html | 1 + .../cfg-recovery/disassembly/index.html | 1 + .../cfg-recovery/function-recovery/index.html | 1 + .../cfg-recovery/jump-resolving/index.html | 2 + fundamentals/cfg-recovery/lifting/index.html | 52 + fundamentals/cfg-recovery/overview/index.html | 25 + fundamentals/evaluation/index.html | 1 + fundamentals/neural-decompilation/index.html | 1 + fundamentals/overview/index.html | 1 + fundamentals/structuring/gotoless/index.html | 1 + fundamentals/structuring/overview/index.html | 33 + .../structuring/schema-based/index.html | 1 + fundamentals/type-recovery/index.html | 41 + index.html | 1 + misc/blogs/index.html | 1 + misc/talks/index.html | 1 + search/search_index.json | 1 + sitemap.xml | 119 + sitemap.xml.gz | Bin 0 -> 465 bytes static/img/dcc_schema.png | Bin 0 -> 90167 bytes static/img/dec-pipeline.svg | 4 + static/img/disass_ex.svg | 3 + static/img/logo.png | Bin 0 -> 448701 bytes static/img/typing.svg | 4 + 83 files 
changed, 7595 insertions(+) create mode 100644 .nojekyll create mode 100644 404.html create mode 100644 CNAME create mode 100644 applications/overview/index.html create mode 100644 applications/program-reconstruction/index.html create mode 100644 applied-research/code-similarity/index.html create mode 100644 applied-research/overview/index.html create mode 100644 applied-research/symbol-recovery/index.html create mode 100644 applied-research/vulnerability-discovery/index.html create mode 100644 assets/images/favicon.png create mode 100644 assets/javascripts/bundle.83f73b43.min.js create mode 100644 assets/javascripts/bundle.83f73b43.min.js.map create mode 100644 assets/javascripts/lunr/min/lunr.ar.min.js create mode 100644 assets/javascripts/lunr/min/lunr.da.min.js create mode 100644 assets/javascripts/lunr/min/lunr.de.min.js create mode 100644 assets/javascripts/lunr/min/lunr.du.min.js create mode 100644 assets/javascripts/lunr/min/lunr.el.min.js create mode 100644 assets/javascripts/lunr/min/lunr.es.min.js create mode 100644 assets/javascripts/lunr/min/lunr.fi.min.js create mode 100644 assets/javascripts/lunr/min/lunr.fr.min.js create mode 100644 assets/javascripts/lunr/min/lunr.he.min.js create mode 100644 assets/javascripts/lunr/min/lunr.hi.min.js create mode 100644 assets/javascripts/lunr/min/lunr.hu.min.js create mode 100644 assets/javascripts/lunr/min/lunr.hy.min.js create mode 100644 assets/javascripts/lunr/min/lunr.it.min.js create mode 100644 assets/javascripts/lunr/min/lunr.ja.min.js create mode 100644 assets/javascripts/lunr/min/lunr.jp.min.js create mode 100644 assets/javascripts/lunr/min/lunr.kn.min.js create mode 100644 assets/javascripts/lunr/min/lunr.ko.min.js create mode 100644 assets/javascripts/lunr/min/lunr.multi.min.js create mode 100644 assets/javascripts/lunr/min/lunr.nl.min.js create mode 100644 assets/javascripts/lunr/min/lunr.no.min.js create mode 100644 assets/javascripts/lunr/min/lunr.pt.min.js create mode 100644 
assets/javascripts/lunr/min/lunr.ro.min.js create mode 100644 assets/javascripts/lunr/min/lunr.ru.min.js create mode 100644 assets/javascripts/lunr/min/lunr.sa.min.js create mode 100644 assets/javascripts/lunr/min/lunr.stemmer.support.min.js create mode 100644 assets/javascripts/lunr/min/lunr.sv.min.js create mode 100644 assets/javascripts/lunr/min/lunr.ta.min.js create mode 100644 assets/javascripts/lunr/min/lunr.te.min.js create mode 100644 assets/javascripts/lunr/min/lunr.th.min.js create mode 100644 assets/javascripts/lunr/min/lunr.tr.min.js create mode 100644 assets/javascripts/lunr/min/lunr.vi.min.js create mode 100644 assets/javascripts/lunr/min/lunr.zh.min.js create mode 100644 assets/javascripts/lunr/tinyseg.js create mode 100644 assets/javascripts/lunr/wordcut.js create mode 100644 assets/javascripts/workers/search.6ce7567c.min.js create mode 100644 assets/javascripts/workers/search.6ce7567c.min.js.map create mode 100644 assets/stylesheets/main.0253249f.min.css create mode 100644 assets/stylesheets/main.0253249f.min.css.map create mode 100644 assets/stylesheets/palette.06af60db.min.css create mode 100644 assets/stylesheets/palette.06af60db.min.css.map create mode 100644 contributing/index.html create mode 100644 decompilers/decompiler/angr/index.html create mode 100644 decompilers/decompiler/binary_ninja/index.html create mode 100644 decompilers/decompiler/ghidra/index.html create mode 100644 decompilers/decompiler/ida_pro/index.html create mode 100644 decompilers/directory/index.html create mode 100644 decompilers/history/index.html create mode 100644 decompilers/tools/index.html create mode 100644 fundamentals/cfg-recovery/disassembly/index.html create mode 100644 fundamentals/cfg-recovery/function-recovery/index.html create mode 100644 fundamentals/cfg-recovery/jump-resolving/index.html create mode 100644 fundamentals/cfg-recovery/lifting/index.html create mode 100644 fundamentals/cfg-recovery/overview/index.html create mode 100644 
fundamentals/evaluation/index.html create mode 100644 fundamentals/neural-decompilation/index.html create mode 100644 fundamentals/overview/index.html create mode 100644 fundamentals/structuring/gotoless/index.html create mode 100644 fundamentals/structuring/overview/index.html create mode 100644 fundamentals/structuring/schema-based/index.html create mode 100644 fundamentals/type-recovery/index.html create mode 100644 index.html create mode 100644 misc/blogs/index.html create mode 100644 misc/talks/index.html create mode 100644 search/search_index.json create mode 100644 sitemap.xml create mode 100644 sitemap.xml.gz create mode 100644 static/img/dcc_schema.png create mode 100644 static/img/dec-pipeline.svg create mode 100644 static/img/disass_ex.svg create mode 100644 static/img/logo.png create mode 100644 static/img/typing.svg diff --git a/.nojekyll b/.nojekyll new file mode 100644 index 0000000..e69de29 diff --git a/404.html b/404.html new file mode 100644 index 0000000..6607c1d --- /dev/null +++ b/404.html @@ -0,0 +1 @@ +
Across the internet, there are many ways people have used decompilers in the wild. In this section, you can find a collection of some of those use cases.
As an example, some decompilation uses include:
For links to full decompilers, see the Decompilers section.
When source code is unavailable for a compiled program, users may want to recover the source code, so they can make edits to it and recompile it. In the video game scene, this can be useful for modding. This can also be useful for porting a program to a new platform.
Reverse engineers who also love playing video games often reverse their favorite games. In some cases, they go as far as attempting to recompile the entire project. To do so, they first need to recover compilable code. Practitioners typically use a decompiler to get pseudo-C, then modify it by hand until it compiles.
The end goal of these projects is to recover program source that recompiles to a byte-match of the original binary. Most projects report an estimated percent completion of the recompilation.
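To make the idea of a byte-match and a completion percentage concrete, here is a minimal sketch in C. The helper name `byte_match_percent` and the position-by-position metric are assumptions made for illustration; individual projects define their own progress metrics, often per function rather than over one whole buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Percentage of positions at which two equally sized byte buffers agree.
 * Hypothetical helper for illustration only: it assumes both buffers hold
 * `len` bytes and that a plain position-by-position comparison is wanted. */
double byte_match_percent(const unsigned char *orig,
                          const unsigned char *rebuilt,
                          size_t len)
{
    assert(len == 0 || (orig != NULL && rebuilt != NULL));

    if (len == 0) {
        return 100.0; /* two empty buffers match trivially */
    }

    size_t same = 0;
    for (size_t i = 0; i < len; i++) {
        if (orig[i] == rebuilt[i]) {
            same++;
        }
    }
    return 100.0 * (double)same / (double)len;
}
```

A result of exactly 100.0 corresponds to a byte-match; anything lower means at least one byte of the rebuilt code differs from the original.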
Many of the games in this list were collected from GitHub projects, individuals, or popular blog posts 1.
In cases such as malware identification, the ability to estimate code similarity among binaries is critical [1]. Research in this area generally looks at ways to improve the reliability of similarity detection among binaries.
Little work uses decompilation directly for code similarity; general binary-analysis work on similarity, however, is plentiful. These works are included here since they often touch on or improve fundamental components of decompilation.
The most direct research in this area has used Ghidra decompilation to identify inlined functions [2].
Many works have advanced binary-based code similarity without explicitly using decompilation [1, 3, 4, 5, 6]. Most of them improved code similarity techniques indirectly, by improving them for a specific use case. These use cases have included malware identification [1], duplicated bug hunting [3, 4], and code reuse [5].
Recent work has suggested that machine learning has made significant strides in this area [6].
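As a point of reference for what "similarity" can mean in practice, the sketch below computes a Jaccard score between two functions, each represented as a set of numeric feature IDs. This is a generic baseline and not the method of any work cited here; the feature representation (for example, hashed instruction n-grams) and the name `jaccard_similarity` are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Jaccard similarity: size of the intersection divided by size of the union.
 * Assumes each input is a strictly increasing array of feature IDs, so the
 * intersection can be counted with a single merge-style pass. */
double jaccard_similarity(const uint32_t *a, size_t na,
                          const uint32_t *b, size_t nb)
{
    assert(na == 0 || a != NULL);
    assert(nb == 0 || b != NULL);

    if (na == 0 && nb == 0) {
        return 1.0; /* two empty feature sets are treated as identical */
    }

    size_t i = 0, j = 0, common = 0;
    while (i < na && j < nb) {
        if (a[i] == b[j]) {
            common++;
            i++;
            j++;
        } else if (a[i] < b[j]) {
            i++;
        } else {
            j++;
        }
    }
    /* |A union B| = |A| + |B| - |A intersect B| */
    return (double)common / (double)(na + nb - common);
}
```

A score of 1.0 means the two feature sets are identical and 0.0 means they share nothing. Real systems differ mainly in how features are extracted and how robust they are to compiler and architecture changes.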
[1] Hu, Xin, Tzi-cker Chiueh, and Kang G. Shin. "Large-scale malware indexing using function-call graphs." Proceedings of the 16th ACM conference on Computer and communications security. 2009.
[2] Ahmed, Toufique, Premkumar Devanbu, and Anand Ashok Sawant. "Finding Inlined Functions in Optimized Binaries." arXiv preprint arXiv:2103.05221 (2021).
[3] Feng, Qian, et al. "Scalable graph-based bug search for firmware images." Proceedings of the 2016 ACM SIGSAC conference on computer and communications security. 2016.
[4] Eschweiler, Sebastian, Khaled Yakdan, and Elmar Gerhards-Padilla. "discovRE: Efficient cross-architecture identification of bugs in binary code." NDSS. Vol. 52. 2016.
[5] Mirzaei, Omid, et al. "Scrutinizer: Detecting code reuse in malware via decompilation and machine learning." Detection of Intrusions and Malware, and Vulnerability Assessment: 18th International Conference, DIMVA 2021, Virtual Event, July 14–16, 2021, Proceedings 18. Springer International Publishing, 2021.
[6] Marcelli, Andrea, et al. "How machine learning is solving the binary function similarity problem." 31st USENIX Security Symposium (USENIX Security 22). 2022.
Decompiler research that does not neatly fit into one of the fundamental areas is defined here as applied research. Research in this area contributes to a specific use-case of decompilation that may not necessarily improve base decompilation.
As an example, most researchers would agree that variable name prediction in stripped binaries is an important research area [1]. However, as it stands, variable name prediction does not improve any fundamental research area (except neural decompilation). As such, we consider it an applied research area, with the target being human-comprehensible decompilation.
This section is ever-growing as new research areas are explored in decompilation. Currently, the following areas exist:
Some research areas don't have enough work to define a label for them. The following works are listed here:
[1] Pal, Kuntal Kumar, et al. "'Len or index or count, anything but v1': Predicting Variable Names in Decompilation Output with Transfer Learning." 2024 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society, 2024.
[2] Reiter, Pemma, et al. "Automatically mitigating vulnerabilities in x86 binary programs via partially recompilable decompilation." arXiv preprint arXiv:2202.12336 (2022).
[3] Verbeek, Freek, Pierre Olivier, and Binoy Ravindran. "Sound C Code Decompilation for a subset of x86-64 Binaries." Software Engineering and Formal Methods: 18th International Conference, SEFM 2020, Amsterdam, The Netherlands, September 14–18, 2020, Proceedings 18. Springer International Publishing, 2020.
[4] Schulte, Eric, et al. "Evolving exact decompilation." Workshop on Binary Analysis Research (BAR). 2018.
[5] Fokin, Alexander, et al. "SmartDec: approaching C++ decompilation." 2011 18th Working Conference on Reverse Engineering. IEEE, 2011.
[6] Wu, Ruoyu, et al. "DnD: A Cross-Architecture deep neural network decompiler." 31st USENIX Security Symposium (USENIX Security 22). 2022.
[7] Liu, Zhibo, et al. "Decompiling x86 deep neural network executables." 32nd USENIX Security Symposium (USENIX Security 23). 2023.
A symbol, in the context of binaries, is a name associated with an object. In most cases, a symbol is either a function name or a variable name. Having the original symbols helps reverse engineers understand the purpose of an object more quickly.
Below is a snippet of a C program:
int mode;
char* name;
long long timezone;
After the program is compiled and the binary is stripped (a common developer practice), a decompiler will produce something like:
int v1;
char* v2;
long long v3;
Even assuming the types are recovered perfectly, which is itself hard, it is still difficult to understand what these variables do.
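The same effect is easier to see on a whole function. The sketch below shows one routine twice: once with its original names and once as a decompiler might print it for a stripped binary. The second version is hypothetical output written by hand; placeholder names such as `sub_401000`, `a1`, and `v1` are assumptions, and real tools use their own naming schemes.

```c
#include <assert.h>
#include <stddef.h>

/* Original source: the names carry the intent. */
size_t count_matches(const char *text, size_t length, char target)
{
    assert(text != NULL || length == 0);

    size_t count = 0;
    for (size_t index = 0; index < length; index++) {
        if (text[index] == target) {
            count++;
        }
    }
    return count;
}

/* Hand-written stand-in for decompiler output on the stripped binary.
 * The logic is identical, but every name is a placeholder. */
size_t sub_401000(const char *a1, size_t a2, char a3)
{
    size_t v1 = 0;
    for (size_t v2 = 0; v2 < a2; v2++) {
        if (a1[v2] == a3) {
            v1++;
        }
    }
    return v1;
}
```

Both functions behave identically, yet only the first tells the reader what it is for. Symbol recovery aims to predict names like `count`, `index`, and `count_matches` from the second form.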
Research in this area has been concerned with the recovery of both variable names [1, 2, 4, 5, 6, 7] and function names [3, 5]. Approaches have varied between using neural networks [2, 3, 6, 7], machine translation [4], probabilistic methods [5], and BERT-based language models [1]. In many cases, the bottleneck of this work has been dataset generation [1].
[1] Pal, Kuntal Kumar, et al. "'Len or index or count, anything but v1': Predicting Variable Names in Decompilation Output with Transfer Learning." 2024 IEEE Symposium on Security and Privacy (SP). IEEE Computer Society, 2024.
[2] Dramko, Luke, et al. "DIRE and its data: Neural decompiled variable renamings with respect to software class." ACM Transactions on Software Engineering and Methodology 32.2 (2023): 1-34.
[3] Artuso, Fiorella, et al. "Function naming in stripped binaries using neural networks." arXiv preprint arXiv:1912.07946 (2019).
[4] Jaffe, Alan, et al. "Meaningful variable names for decompiled code: A machine translation approach." Proceedings of the 26th Conference on Program Comprehension. 2018.
[5] He, Jingxuan, et al. "Debin: Predicting debug information in stripped binaries." Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2018.
[6] Chen, Qibin, et al. "Augmenting decompiler output with learned variable names and types." 31st USENIX Security Symposium (USENIX Security 22). 2022.
In many uses of decompilation, humans or machines aim to determine whether a program is safe. To do so, they attempt the opposite: finding vulnerabilities in the program. Some decompilers, and their associated research, have been tuned to be better at this task [1]. Other work has evaluated decompilers by how well source-level analysis tools perform on their output [2].
Most research in this area has focused on static analysis [1, 2, 3] and symbolic execution [4] applied to decompilation. Because these techniques have mostly been researched on source code, decompilation is what allows them to be applied to binaries.
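To illustrate why C-like decompiler output opens the door to source-level checks, here is a deliberately simple sketch: a textual scan that counts calls to a named function, such as `strcpy`, in decompiled code. It is far cruder than the analyzers used in the cited work and is not taken from any of them; the function name `count_calls` and the matching rule are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True if `c` can appear inside a C identifier. */
static int is_ident_char(char c)
{
    return c == '_' || isalnum((unsigned char)c);
}

/* Count textual calls to `callee` in `code`: the name must start at an
 * identifier boundary and be followed immediately by '('. This is a
 * grep-style heuristic, not a parser; it ignores comments, strings,
 * and calls written with whitespace before the parenthesis. */
size_t count_calls(const char *code, const char *callee)
{
    assert(code != NULL && callee != NULL && callee[0] != '\0');

    size_t name_len = strlen(callee);
    size_t hits = 0;
    for (const char *p = strstr(code, callee); p != NULL;
         p = strstr(p + 1, callee)) {
        int at_boundary = (p == code) || !is_ident_char(p[-1]);
        if (at_boundary && p[name_len] == '(') {
            hits++;
        }
    }
    return hits;
}
```

Run over decompiler output, `count_calls(code, "strcpy")` gives a rough list of places worth a manual look. Static analyzers and symbolic execution engines replace this textual heuristic with real program analysis.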
[1] Botacin, Marcus, et al. "Revenge is a dish served cold: Debug-oriented malware decompilation and reassembly." Proceedings of the 3rd Reversing and Offensive-oriented Trends Symposium. 2019.
[2] Mantovani, Alessandro, et al. "The Convergence of Source Code and Binary Vulnerability Discovery--A Case Study." Proceedings of the 2022 ACM on Asia Conference on Computer and Communications Security. 2022.
[3] Park, Jihee, et al. "Static Analysis of JNI Programs via Binary Decompilation." IEEE Transactions on Software Engineering (2023).
[4] Han, HyungSeok, et al. "QueryX: Symbolic Query on Decompiled Code for Finding Bugs in COTS Binaries." 2023 IEEE Symposium on Security and Privacy (SP). IEEE, 2023.