Add changes for 5b66b38
actions-user committed Jun 13, 2024
1 parent fbf050e commit 8ae994d
Showing 8 changed files with 90 additions and 7 deletions.
16 changes: 16 additions & 0 deletions Reference/sr3_options.7.html
@@ -1478,6 +1478,22 @@ <h4>retry_ttl &lt;duration&gt; (default: same as expire)<a class="headerlink" hr
a file before it is aged out of the queue. Default is two days. If a file has not
been transferred after two days of attempts, it is discarded.</p>
</section>
<section id="runstatethreshold-cpuslow-count-default-0">
<h4>runStateThreshold_cpuSlow &lt;count&gt; (default: 0)<a class="headerlink" href="#runstatethreshold-cpuslow-count-default-0" title="Link to this heading"></a></h4>
<p>The <em>runStateThreshold_cpuSlow</em> setting sets the minimum processing rate expected of a flow
that is handling messages. If the number of messages processed per CPU second drops below this threshold,
the flow is identified as “cpuSlow” (shown as cpuS on the <em>sr3 status</em> display).
The test applies only if a flow is actually transferring messages.
The rate is visible only in <em>sr3 --full status</em>.</p>
<p>This may indicate that the routing is inordinately expensive or the transfers inordinately slow.
Examples that could contribute to this:</p>
<ul class="simple">
<li><p>one hundred regular expressions must be evaluated per message received. Regular expressions, when accumulated, can get expensive.</p></li>
<li><p>a complex plugin that does heavy transformations on data en route.</p></li>
<li><p>repeating an operation for each message, when doing it once per batch would do.</p></li>
</ul>
<p>It defaults to inactive, but may be set to identify transient issues.</p>
</section>
<section id="runstatethreshold-hung-interval-default-450">
<h4>runStateThreshold_hung &lt;interval&gt; (default: 450)<a class="headerlink" href="#runstatethreshold-hung-interval-default-450" title="Link to this heading"></a></h4>
<p>The runStateThreshold_hung (formerly: <strong>sanity_log_dead</strong>) option sets how long to consider too long before restarting
9 changes: 5 additions & 4 deletions _modules/sarracenia/config.html
@@ -192,7 +192,6 @@ <h1>Source code for sarracenia.config</h1><div class="highlight"><pre>
<span class="s1">&#39;dry_run&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s1">&#39;filename&#39;</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span>
<span class="s1">&#39;flowMain&#39;</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_idle&#39;</span><span class="p">:</span> <span class="mi">900</span><span class="p">,</span>
<span class="s1">&#39;inflight&#39;</span><span class="p">:</span> <span class="kc">None</span><span class="p">,</span>
<span class="s1">&#39;inline&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s1">&#39;inlineOnly&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
@@ -201,7 +200,6 @@ <h1>Source code for sarracenia.config</h1><div class="highlight"><pre>
<span class="s1">&#39;logMetrics&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s1">&#39;logStdout&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s1">&#39;metrics_writeInterval&#39;</span><span class="p">:</span> <span class="mi">5</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_lag&#39;</span><span class="p">:</span> <span class="mi">30</span><span class="p">,</span>
<span class="s1">&#39;nodupe_driver&#39;</span><span class="p">:</span> <span class="s1">&#39;disk&#39;</span><span class="p">,</span>
<span class="s1">&#39;nodupe_ttl&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
<span class="s1">&#39;overwrite&#39;</span><span class="p">:</span> <span class="kc">True</span><span class="p">,</span>
@@ -219,8 +217,11 @@ <h1>Source code for sarracenia.config</h1><div class="highlight"><pre>
<span class="s1">&#39;report&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s1">&#39;retryEmptyBeforeExit&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s1">&#39;retry_refilter&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_retry&#39;</span><span class="p">:</span> <span class="mi">1000</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_cpuSlow&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_hung&#39;</span><span class="p">:</span> <span class="mi">450</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_idle&#39;</span><span class="p">:</span> <span class="mi">900</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_lag&#39;</span><span class="p">:</span> <span class="mi">30</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_retry&#39;</span><span class="p">:</span> <span class="mi">1000</span><span class="p">,</span>
<span class="s1">&#39;runStateThreshold_slow&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
<span class="s1">&#39;sourceFromExchange&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
<span class="s1">&#39;sourceFromMessage&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span>
@@ -236,7 +237,7 @@ <h1>Source code for sarracenia.config</h1><div class="highlight"><pre>
<span class="n">count_options</span> <span class="o">=</span> <span class="p">[</span>
<span class="s1">&#39;batch&#39;</span><span class="p">,</span> <span class="s1">&#39;count&#39;</span><span class="p">,</span> <span class="s1">&#39;exchangeSplit&#39;</span><span class="p">,</span> <span class="s1">&#39;instances&#39;</span><span class="p">,</span> <span class="s1">&#39;logRotateCount&#39;</span><span class="p">,</span> <span class="s1">&#39;no&#39;</span><span class="p">,</span>
<span class="s1">&#39;post_exchangeSplit&#39;</span><span class="p">,</span> <span class="s1">&#39;prefetch&#39;</span><span class="p">,</span> <span class="s1">&#39;messageCountMax&#39;</span><span class="p">,</span> <span class="s1">&#39;messageRateMax&#39;</span><span class="p">,</span>
<span class="s1">&#39;messageRateMin&#39;</span><span class="p">,</span> <span class="s1">&#39;runStateThreshold_reject&#39;</span><span class="p">,</span> <span class="s1">&#39;runStateThreshold_retry&#39;</span><span class="p">,</span> <span class="s1">&#39;runStateThreshold_slow&#39;</span>
<span class="s1">&#39;messageRateMin&#39;</span><span class="p">,</span> <span class="s1">&#39;runStateThreshold_cpuSlow&#39;</span><span class="p">,</span> <span class="s1">&#39;runStateThreshold_reject&#39;</span><span class="p">,</span> <span class="s1">&#39;runStateThreshold_retry&#39;</span><span class="p">,</span> <span class="s1">&#39;runStateThreshold_slow&#39;</span><span class="p">,</span>
<span class="p">]</span>


13 changes: 11 additions & 2 deletions _modules/sarracenia/flow.html
@@ -311,7 +311,8 @@ <h1>Source code for sarracenia.flow</h1><div class="highlight"><pre>

<span class="bp">self</span><span class="o">.</span><span class="n">new_metrics</span> <span class="o">=</span> <span class="p">{</span> <span class="s1">&#39;flow&#39;</span><span class="p">:</span> <span class="p">{</span> <span class="s1">&#39;stop_requested&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span> <span class="s1">&#39;last_housekeeping&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
<span class="s1">&#39;transferConnected&#39;</span><span class="p">:</span> <span class="kc">False</span><span class="p">,</span> <span class="s1">&#39;transferConnectStart&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">&#39;transferConnectTime&#39;</span><span class="p">:</span><span class="mi">0</span><span class="p">,</span>
<span class="s1">&#39;transferRxBytes&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">&#39;transferTxBytes&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">&#39;transferRxFiles&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">&#39;transferTxFiles&#39;</span><span class="p">:</span> <span class="mi">0</span> <span class="p">}</span> <span class="p">}</span>
<span class="s1">&#39;transferRxBytes&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">&#39;transferTxBytes&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">&#39;transferRxFiles&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">&#39;transferTxFiles&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
<span class="s1">&#39;last_housekeeping_cpuTime&#39;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="s1">&#39;cpuTime&#39;</span> <span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="p">}</span> <span class="p">}</span>

<span class="c1"># carry over some metrics... that don&#39;t reset.</span>
<span class="k">if</span> <span class="nb">hasattr</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span><span class="s1">&#39;metrics&#39;</span><span class="p">):</span>
@@ -458,7 +459,9 @@ <h1>Source code for sarracenia.flow</h1><div class="highlight"><pre>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="n">module_name</span><span class="p">]</span> <span class="o">=</span> <span class="n">p</span><span class="p">()</span>
<span class="k">except</span> <span class="ne">Exception</span> <span class="k">as</span> <span class="n">ex</span><span class="p">:</span>
<span class="n">logger</span><span class="o">.</span><span class="n">error</span><span class="p">(</span> <span class="sa">f</span><span class="s1">&#39;flowCallback plugin </span><span class="si">{</span><span class="n">p</span><span class="si">}</span><span class="s1">/metricsReport crashed: </span><span class="si">{</span><span class="n">ex</span><span class="si">}</span><span class="s1">&#39;</span> <span class="p">)</span>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span> <span class="s2">&quot;details:&quot;</span><span class="p">,</span> <span class="n">exc_info</span><span class="o">=</span><span class="kc">True</span> <span class="p">)</span></div>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span> <span class="s2">&quot;details:&quot;</span><span class="p">,</span> <span class="n">exc_info</span><span class="o">=</span><span class="kc">True</span> <span class="p">)</span>
<span class="n">ost</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">times</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;cpuTime&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">ost</span><span class="o">.</span><span class="n">user</span><span class="o">+</span><span class="n">ost</span><span class="o">.</span><span class="n">system</span><span class="o">-</span><span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;last_housekeeping_cpuTime&#39;</span><span class="p">]</span></div>


<span class="k">def</span> <span class="nf">_runCallbackPoll</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
@@ -510,6 +513,9 @@ <h1>Source code for sarracenia.flow</h1><div class="highlight"><pre>
<span class="bp">self</span><span class="o">.</span><span class="n">runCallbacksTime</span><span class="p">(</span><span class="s1">&#39;on_housekeeping&#39;</span><span class="p">)</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metricsFlowReset</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;last_housekeeping&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">now</span>
<span class="n">ost</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">times</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;last_housekeeping_cpuTime&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">ost</span><span class="o">.</span><span class="n">user</span><span class="o">+</span><span class="n">ost</span><span class="o">.</span><span class="n">system</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;cpuTime&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">ost</span><span class="o">.</span><span class="n">user</span><span class="o">+</span><span class="n">ost</span><span class="o">.</span><span class="n">system</span>

<span class="n">next_housekeeping</span> <span class="o">=</span> <span class="n">now</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">o</span><span class="o">.</span><span class="n">housekeeping</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;next_housekeeping&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">next_housekeeping</span>
@@ -636,6 +642,8 @@ <h1>Source code for sarracenia.flow</h1><div class="highlight"><pre>
<span class="n">current_sleep</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">o</span><span class="o">.</span><span class="n">sleep</span>
<span class="n">last_time</span> <span class="o">=</span> <span class="n">start_time</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;last_housekeeping&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">start_time</span>
<span class="n">ost</span><span class="o">=</span><span class="n">os</span><span class="o">.</span><span class="n">times</span><span class="p">()</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;last_housekeeping_cpuTime&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">ost</span><span class="o">.</span><span class="n">user</span><span class="o">+</span><span class="n">ost</span><span class="o">.</span><span class="n">system</span>

<span class="k">if</span> <span class="bp">self</span><span class="o">.</span><span class="n">o</span><span class="o">.</span><span class="n">logLevel</span> <span class="o">==</span> <span class="s1">&#39;debug&#39;</span><span class="p">:</span>
<span class="n">logger</span><span class="o">.</span><span class="n">debug</span><span class="p">(</span><span class="s2">&quot;options:&quot;</span><span class="p">)</span>
@@ -707,6 +715,7 @@ <h1>Source code for sarracenia.flow</h1><div class="highlight"><pre>
<span class="n">elapsed</span> <span class="o">=</span> <span class="n">now</span> <span class="o">-</span> <span class="n">last_time</span>

<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;msgRate&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">current_rate</span>
<span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;msgRateCpu&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">total_messages</span> <span class="o">/</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;cpuTime&#39;</span><span class="p">]</span><span class="o">+</span><span class="bp">self</span><span class="o">.</span><span class="n">metrics</span><span class="p">[</span><span class="s1">&#39;flow&#39;</span><span class="p">][</span><span class="s1">&#39;last_housekeeping_cpuTime&#39;</span><span class="p">]</span> <span class="p">)</span>

<span class="k">if</span> <span class="p">(</span><span class="n">last_gather_len</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">o</span><span class="o">.</span><span class="n">sleep</span> <span class="o">&lt;</span> <span class="mi">0</span><span class="p">):</span>
<span class="k">if</span> <span class="p">(</span><span class="bp">self</span><span class="o">.</span><span class="n">o</span><span class="o">.</span><span class="n">retryEmptyBeforeExit</span> <span class="ow">and</span> <span class="s2">&quot;retry&quot;</span> <span class="ow">in</span> <span class="bp">self</span><span class="o">.</span><span class="n">metrics</span>
21 changes: 21 additions & 0 deletions _sources/Reference/sr3_options.7.rst.txt
@@ -1591,6 +1591,27 @@ a file before it is aged out of the queue. Default is two days. If a file ha
been transferred after two days of attempts, it is discarded.


runStateThreshold_cpuSlow <count> (default: 0)
----------------------------------------------

The *runStateThreshold_cpuSlow* setting sets the minimum processing rate expected of a flow
that is handling messages. If the number of messages processed per CPU second drops below this threshold,
the flow is identified as "cpuSlow" (shown as cpuS on the *sr3 status* display).
The test applies only if a flow is actually transferring messages.
The rate is visible only in *sr3 --full status*.

This may indicate that the routing is inordinately expensive or the transfers inordinately slow.
Examples that could contribute to this:

* one hundred regular expressions must be evaluated per message received. Regular expressions, when accumulated, can get expensive.

* a complex plugin that does heavy transformations on data en route.

* repeating an operation for each message, when doing it once per batch would do.


It defaults to inactive, but may be set to identify transient issues.
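
The threshold test described in this section can be illustrated with a small standalone sketch. This is not the sarracenia implementation: the function names ``cpu_seconds``, ``messages_per_cpu_second`` and ``is_cpu_slow`` are hypothetical, and treating a threshold of 0 as "test disabled" is an assumption based on the default of 0 being described as inactive. The only real API relied on is ``os.times()`` from the Python standard library.

```python
import os


def cpu_seconds():
    """User plus system CPU time consumed so far by this process."""
    t = os.times()
    return t.user + t.system


def messages_per_cpu_second(message_count, cpu_start, cpu_now):
    """Messages handled per CPU second over an interval.

    Returns None when no CPU time has been consumed, so the caller
    never divides by zero.
    """
    cpu_used = cpu_now - cpu_start
    if cpu_used <= 0:
        return None
    return message_count / cpu_used


def is_cpu_slow(rate, threshold, message_count):
    """Hypothetical version of the cpuSlow test.

    Assumptions: a threshold of 0 disables the test, and a flow that
    moved no messages in the interval is never flagged.
    """
    if threshold <= 0 or message_count == 0 or rate is None:
        return False
    return rate < threshold


# Example: 100 messages handled using 2 CPU seconds is 50 messages per CPU second.
rate = messages_per_cpu_second(100, cpu_start=1.0, cpu_now=3.0)
flagged = is_cpu_slow(rate, threshold=100, message_count=100)
```

With these numbers the flow would be flagged, because 50 messages per CPU second is below a threshold of 100. With the default threshold of 0 it would not be flagged.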

runStateThreshold_hung <interval> (default: 450)
------------------------------------------------
