Commit fde4c4a

minor improvements in docs
Signed-off-by: Víctor Mayoral Vilches <[email protected]>
1 parent f5f3c42 commit fde4c4a

5 files changed: +53 -25 lines changed

docs/benchmarking/attack_defense.md

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ In rigorous Attack & Defense CTF evaluations, **`alias1` consistently outperform
 <th style="text-align:center;"><b>Best Performance in Agent vs Agent A&D</b></th>
 </tr>
 <tr>
-<td align="center"><img src="../assets/images/stackplot.png" alt="A&D Performance Stack Plot" /></td>
+<td align="center"><img src="/assets/images/stackplot.png" alt="A&D Performance Stack Plot" /></td>
 </tr>
 </table>

docs/benchmarking/jeopardy_ctfs.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ Jeopardy-style Capture The Flag (CTF) challenges evaluate AI agents on independe
 <th style="text-align:center;"><b>Model Performance in Jeopardy CTFs Base Benchmark</b></th>
 </tr>
 <tr>
-<td align="center"><img src="../assets/images/base_1col.png" alt="Base Benchmark Results" /></td>
+<td align="center"><img src="/assets/images/base_1col.png" alt="Base Benchmark Results" /></td>
 </tr>
 </table>

docs/benchmarking/overview.md

Lines changed: 4 additions & 4 deletions
@@ -37,16 +37,16 @@ AutoPenBench │
 <th style="text-align:center;"><b>Model Performance in Jeopardy CTFs</b></th>
 </tr>
 <tr>
-<td align="center"><img src="../assets/images/stackplot.png" alt="A&D Performance" width="100%" /></td>
-<td align="center"><img src="../assets/images/base_1col.png" alt="Jeopardy CTF Performance" width="100%" /></td>
+<td align="center"><img src="/assets/images/stackplot.png" alt="A&D Performance" width="100%" /></td>
+<td align="center"><img src="/assets/images/base_1col.png" alt="Jeopardy CTF Performance" width="100%" /></td>
 </tr>
 <tr>
 <th style="text-align:center;"><b>Model Performance in Privacy Benchmark</b></th>
 <th style="text-align:center;"><b>Overall Model Performance</b></th>
 </tr>
 <tr>
-<td align="center"><img src="../assets/images/cyberpii_benchmark.png" alt="Privacy Benchmark" width="100%" /></td>
-<td align="center"><img src="../assets/images/caibench_spider.png" alt="Overall Performance" width="100%" /></td>
+<td align="center"><img src="/assets/images/cyberpii_benchmark.png" alt="Privacy Benchmark" width="100%" /></td>
+<td align="center"><img src="/assets/images/caibench_spider.png" alt="Overall Performance" width="100%" /></td>
 </tr>
 </table>

docs/benchmarking/privacy_benchmarks.md

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ Privacy benchmarks assess AI models' ability to handle sensitive information app
 <th style="text-align:center;"><b>Model Performance in CyberPII Privacy Benchmark</b></th>
 </tr>
 <tr>
-<td align="center"><img src="../assets/images/cyberpii_benchmark.png" alt="CyberPII Benchmark Results" /></td>
+<td align="center"><img src="/assets/images/cyberpii_benchmark.png" alt="CyberPII Benchmark Results" /></td>
 </tr>
 </table>

docs/index.md

Lines changed: 46 additions & 18 deletions
@@ -168,27 +168,55 @@ Cybersecurity AI is a critical field, yet many groups are misguidedly pursuing i
 If you want to cite our work, please use the following:

 ```bibtex
-@misc{mayoralvilches2025caiopenbugbountyready,
-title={CAI: An Open, Bug Bounty-Ready Cybersecurity AI},
-author={Víctor Mayoral-Vilches and Luis Javier Navarrete-Lozano and María Sanz-Gómez and Lidia Salas Espejo and Martiño Crespo-Álvarez and Francisco Oca-Gonzalez and Francesco Balassone and Alfonso Glera-Picón and Unai Ayucar-Carbajo and Jon Ander Ruiz-Alcalde and Stefan Rass and Martin Pinzger and Endika Gil-Uriarte},
-year={2025},
-eprint={2504.06017},
-archivePrefix={arXiv},
-primaryClass={cs.CR},
-url={https://arxiv.org/abs/2504.06017},
+@article{mayoral2025cai,
+title={CAI: An Open, Bug Bounty-Ready Cybersecurity AI},
+author={Mayoral-Vilches, V{\'\i}ctor and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a and Espejo, Lidia Salas and Crespo-{\'A}lvarez, Marti{\~n}o and Oca-Gonzalez, Francisco and Balassone, Francesco and Glera-Pic{\'o}n, Alfonso and Ayucar-Carbajo, Unai and Ruiz-Alcalde, Jon Ander and Rass, Stefan and Pinzger, Martin and Gil-Uriarte, Endika},
+journal={arXiv preprint arXiv:2504.06017},
+year={2025}
 }
-```

-```bibtex
-@misc{mayoralvilches2025cybersecurityaidangerousgap,
-title={Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy},
-author={Víctor Mayoral-Vilches},
-year={2025},
-eprint={2506.23592},
-archivePrefix={arXiv},
-primaryClass={cs.CR},
-url={https://arxiv.org/abs/2506.23592},
+@article{mayoral2025automation,
+title={Cybersecurity AI: The Dangerous Gap Between Automation and Autonomy},
+author={Mayoral-Vilches, V{\'\i}ctor},
+journal={arXiv preprint arXiv:2506.23592},
+year={2025}
+}
+
+@article{mayoral2025fluency,
+title={CAI Fluency: A Framework for Cybersecurity AI Fluency},
+author={Mayoral-Vilches, V{\'\i}ctor and Wachter, Jasmin and Chavez, Crist{\'o}bal RJ and Schachner, Cathrin and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a},
+journal={arXiv preprint arXiv:2508.13588},
+year={2025}
+}
+
+@article{mayoral2025hacking,
+title={Cybersecurity AI: Hacking the AI Hackers via Prompt Injection},
+author={Mayoral-Vilches, V{\'\i}ctor and Rynning, Per Mannermaa},
+journal={arXiv preprint arXiv:2508.21669},
+year={2025}
 }
+
+@article{mayoral2025humanoid,
+title={Cybersecurity AI: Humanoid Robots as Attack Vectors},
+author={Mayoral-Vilches, V{\'\i}ctor},
+journal={arXiv preprint arXiv:2509.14139},
+year={2025}
+}
+
+@article{balassone2025evaluation,
+title={Cybersecurity AI: Evaluating Agentic Cybersecurity in Attack/Defense CTFs},
+author={Balassone, Francesco and Mayoral-Vilches, V{\'\i}ctor and Rass, Stefan and Pinzger, Martin and Perrone, Gaetano and Romano, Simon Pietro and Schartner, Peter},
+journal={arXiv preprint arXiv:2510.17521},
+year={2025}
+}
+
+@article{mayoral2025caibench,
+title={CAIBench: A Meta-Benchmark for Evaluating Cybersecurity AI Agents},
+author={Mayoral-Vilches, V{\'\i}ctor and Balassone, Francesco and Navarrete-Lozano, Luis Javier and Sanz-G{\'o}mez, Mar{\'\i}a and Crespo-{\'A}lvarez, Marti{\~n}o and Rass, Stefan and Pinzger, Martin},
+journal={arXiv preprint arXiv:2510.24317},
+year={2025}
+}
+
 ```

 ## Acknowledgements
