<!DOCTYPE html>
<html class="writer-html5" lang="en">
<head>
<meta charset="utf-8" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Metric Module — easyjailbreak 0.1.0 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=92fd9be5" />
<link rel="stylesheet" type="text/css" href="_static/css/theme.css?v=19f00094" />
<!--[if lt IE 9]>
<script src="_static/js/html5shiv.min.js"></script>
<![endif]-->
<script src="_static/jquery.js?v=5d32c60e"></script>
<script src="_static/_sphinx_javascript_frameworks_compat.js?v=2cd50e6c"></script>
<script data-url_root="./" id="documentation_options" src="_static/documentation_options.js?v=2389946f"></script>
<script src="_static/doctools.js?v=888ff710"></script>
<script src="_static/sphinx_highlight.js?v=4825356b"></script>
<script src="_static/js/theme.js"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Evaluator Module" href="evaluator.html" />
<link rel="prev" title="Datasets Module" href="datasets.html" />
</head>
<body class="wy-body-for-nav">
<div class="wy-grid-for-nav">
<nav data-toggle="wy-nav-shift" class="wy-nav-side">
<div class="wy-side-scroll">
<div class="wy-side-nav-search" >
<a href="index.html" class="icon icon-home">
easyjailbreak
</a>
<div role="search">
<form id="rtd-search-form" class="wy-form" action="search.html" method="get">
<input type="text" name="q" placeholder="Search docs" aria-label="Search docs" />
<input type="hidden" name="check_keywords" value="yes" />
<input type="hidden" name="area" value="default" />
</form>
</div>
</div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu">
<p class="caption" role="heading"><span class="caption-text">Contents:</span></p>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="attacker.html">Attacker Module</a></li>
<li class="toctree-l1"><a class="reference internal" href="constraint.html">Constraint Module</a></li>
<li class="toctree-l1"><a class="reference internal" href="datasets.html">Datasets Module</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Metric Module</a><ul>
<li class="toctree-l2"><a class="reference internal" href="#metric-asr">metric_ASR</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#metrics-on-attacksuccessrate">Metrics on AttackSuccessRate</a></li>
</ul>
</li>
<li class="toctree-l2"><a class="reference internal" href="#metric-perplexit">metric_perplexit</a><ul>
<li class="toctree-l3"><a class="reference internal" href="#perplexity-metric">Perplexity Metric:</a></li>
</ul>
</li>
</ul>
</li>
<li class="toctree-l1"><a class="reference internal" href="evaluator.html">Evaluator Module</a></li>
<li class="toctree-l1"><a class="reference internal" href="Seed.html">Seed Module</a></li>
<li class="toctree-l1"><a class="reference internal" href="Selector.html">Selector Module</a></li>
</ul>
</div>
</div>
</nav>
<section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" >
<i data-toggle="wy-nav-top" class="fa fa-bars"></i>
<a href="index.html">easyjailbreak</a>
</nav>
<div class="wy-nav-content">
<div class="rst-content">
<div role="navigation" aria-label="Page navigation">
<ul class="wy-breadcrumbs">
<li><a href="index.html" class="icon icon-home" aria-label="Home"></a></li>
<li class="breadcrumb-item active">Metric Module</li>
<li class="wy-breadcrumbs-aside">
<a href="_sources/metrics.rst.txt" rel="nofollow"> View page source</a>
</li>
</ul>
<hr/>
</div>
<div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
<div itemprop="articleBody">
<section id="metric-module">
<h1>Metric Module<a class="headerlink" href="#metric-module" title="Permalink to this heading"></a></h1>
<p>This part of the documentation introduces the metrics in easyjailbreak.metrics.Metric, which score attack results to compute the final evaluation result.</p>
<section id="metric-asr">
<h2>metric_ASR<a class="headerlink" href="#metric-asr" title="Permalink to this heading"></a></h2>
<section id="metrics-on-attacksuccessrate">
<h3>Metrics on AttackSuccessRate<a class="headerlink" href="#metrics-on-attacksuccessrate" title="Permalink to this heading"></a></h3>
<p>This module contains the implementation of the AttackSuccessRate metric,
which is designed to evaluate the effectiveness of jailbreak attacks in a dataset.
It calculates the number of successful and failed attacks, and computes the overall
attack success rate.</p>
</section>
<dl class="py class">
<dt class="sig sig-object py">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">easyjailbreak.metrics.Metric.metric_ASR.</span></span><span class="sig-name descname"><span class="pre">AttackSuccessRate</span></span></dt>
<dd><p>A metric to evaluate the success rate of jailbreak attacks.
It calculates the number of successful and failed attacks within a dataset,
and determines the overall attack success rate.</p>
<dl class="py method">
<dt class="sig sig-object py">
<span class="sig-name descname"><span class="pre">calculate</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">dataset</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">JailbreakDataset</span></span></em><span class="sig-paren">)</span></dt>
<dd><p>Calculate the attack success rate from the given dataset.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><p><strong>dataset</strong> (<em>~JailbreakDataset</em>) – The dataset containing jailbreak attack results.</p>
</dd>
<dt class="field-even">Return dict<span class="colon">:</span></dt>
<dd class="field-even"><p>A dictionary containing the number of successful attacks, failed attacks,
and the attack success rate.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
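<p>The calculation described above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the function name <code class="docutils literal notranslate"><span class="pre">attack_success_rate</span></code>, the boolean input list, and the dictionary keys are all hypothetical, and it assumes each attack outcome has already been reduced to a success flag.</p>

```python
from typing import Dict, Iterable, Union


def attack_success_rate(outcomes: Iterable[bool]) -> Dict[str, Union[int, float]]:
    """Count successful and failed attacks and compute their ratio.

    `outcomes` is a hypothetical stand-in for per-instance results:
    True means the jailbreak attack succeeded, False means it failed.
    """
    flags = [bool(outcome) for outcome in outcomes]
    successful = sum(flags)
    failed = len(flags) - successful
    # An empty dataset has no defined rate; this sketch reports 0.0.
    rate = successful / len(flags) if flags else 0.0
    # Key names are illustrative, not the library's actual keys.
    return {"successful": successful, "failed": failed, "asr": rate}
```

<p>For example, three successes out of four attempts give a success rate of 0.75 under this sketch.</p>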
</section>
<section id="metric-perplexit">
<h2>metric_perplexit<a class="headerlink" href="#metric-perplexit" title="Permalink to this heading"></a></h2>
<section id="perplexity-metric">
<h3>Perplexity Metric:<a class="headerlink" href="#perplexity-metric" title="Permalink to this heading"></a></h3>
<p>Class for calculating perplexity over a JailbreakDataset.</p>
</section>
<dl class="py class">
<dt class="sig sig-object py">
<em class="property"><span class="pre">class</span><span class="w"> </span></em><span class="sig-prename descclassname"><span class="pre">easyjailbreak.metrics.Metric.metric_perplexit.</span></span><span class="sig-name descname"><span class="pre">Perplexity</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">model</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">WhiteBoxModelBase</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">max_length</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">512</span></span></em>, <em class="sig-param"><span class="n"><span class="pre">stride</span></span><span class="o"><span class="pre">=</span></span><span class="default_value"><span class="pre">512</span></span></em><span class="sig-paren">)</span></dt>
<dd><dl class="py method">
<dt class="sig sig-object py">
<span class="sig-name descname"><span class="pre">calculate</span></span><span class="sig-paren">(</span><em class="sig-param"><span class="n"><span class="pre">dataset</span></span><span class="p"><span class="pre">:</span></span><span class="w"> </span><span class="n"><span class="pre">JailbreakDataset</span></span></em><span class="sig-paren">)</span></dt>
<dd><p>Calculates the average perplexity of the final prompts generated by the attacker, using a
pre-trained small GPT-2 model.</p>
<dl class="field-list simple">
<dt class="field-odd">Parameters<span class="colon">:</span></dt>
<dd class="field-odd"><p><strong>dataset</strong> (<code class="docutils literal notranslate"><span class="pre">JailbreakDataset</span></code>) – The dataset whose instances contain the attack results.</p>
</dd>
</dl>
</dd></dl>
</dd></dl>
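<p>As a rough illustration of what a perplexity score measures, the sketch below computes perplexity from per-token log-probabilities and shows one common way <code class="docutils literal notranslate"><span class="pre">max_length</span></code> and <code class="docutils literal notranslate"><span class="pre">stride</span></code> are used to split a long prompt into scoring windows. It assumes these parameters carry the usual sliding-window meaning for causal language models, which this page does not confirm. The helper names are hypothetical, and no model is loaded: the log-probabilities are taken as given.</p>

```python
import math
from typing import Iterator, Sequence, Tuple


def sliding_windows(seq_len: int, max_length: int = 512,
                    stride: int = 512) -> Iterator[Tuple[int, int, int]]:
    """Yield (begin, end, n_scored) windows over a token sequence.

    Assumed semantics: each window covers at most `max_length` tokens,
    windows start `stride` tokens apart, and only the tokens not already
    scored by the previous window count toward the loss.
    """
    if stride not in range(1, max_length + 1):
        raise ValueError("stride must be between 1 and max_length")
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        yield begin, end, end - prev_end
        prev_end = end
        if end == seq_len:
            break


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean natural-log probability."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))
```

<p>With the defaults shown in the signature (both 512), the windows do not overlap. A smaller stride gives each scored token more left context at the cost of more forward passes.</p>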
</section>
</section>
</div>
</div>
<footer><div class="rst-footer-buttons" role="navigation" aria-label="Footer">
<a href="datasets.html" class="btn btn-neutral float-left" title="Datasets Module" accesskey="p" rel="prev"><span class="fa fa-arrow-circle-left" aria-hidden="true"></span> Previous</a>
<a href="evaluator.html" class="btn btn-neutral float-right" title="Evaluator Module" accesskey="n" rel="next">Next <span class="fa fa-arrow-circle-right" aria-hidden="true"></span></a>
</div>
<hr/>
<div role="contentinfo">
<p>© Copyright 2024, zwk.</p>
</div>
Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a
<a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a>
provided by <a href="https://readthedocs.org">Read the Docs</a>.
</footer>
</div>
</div>
</section>
</div>
<script>
jQuery(function () {
SphinxRtdTheme.Navigation.enable(true);
});
</script>
</body>
</html>