Clarify how fence_kdump works (#415)
* Clarify how fence_kdump works

bsc#1228931
jsc#DOCTEAM-537

* Apply suggestions from code review

Co-authored-by: Roger Zhou <[email protected]>

* Small edits

---------

Co-authored-by: Roger Zhou <[email protected]>
tahliar and zzhou1 committed Sep 24, 2024
1 parent f1cefff commit 97c5027
Showing 1 changed file with 36 additions and 29 deletions.
65 changes: 36 additions & 29 deletions xml/ha_fencing.xml
@@ -446,56 +446,62 @@ hostlist</screen>
<para>Kdump belongs to the <xref linkend="sec-ha-fencing-special"
xrefstyle="select:title"/> and is in fact the opposite of a fencing device.
The plug-in checks if a Kernel dump is in progress on a node. If so, it
returns true, and acts <emphasis>as if</emphasis> the node has been fenced.
returns true and acts <emphasis>as if</emphasis> the node has been fenced,
because the node will reboot after the Kdump is complete.
If not, it returns a failure and the next fencing device is triggered.
</para>
<para>
The Kdump plug-in must be used in concert with another, real &stonith;
device, for example, <literal>external/ipmi</literal>. For the fencing
mechanism to work properly, you must specify that Kdump is checked before
a real &stonith; device is triggered. Use <command>crm configure
fencing_topology</command> to specify the order of the fencing devices as
The Kdump plug-in must be used together with another, real &stonith;
device, for example, <literal>external/ipmi</literal>. It does
<emphasis>not</emphasis> work with SBD as the &stonith; device. For the fencing
mechanism to work properly, you must specify the order of the fencing devices
so that Kdump is checked before a real &stonith; device is triggered, as
shown in the following procedure.
</para>
<procedure>
<step>
<para>
Use the <literal>stonith:fence_kdump</literal> resource agent (provided
by the package <package>fence-agents</package>)
to monitor all nodes with the Kdump function enabled. Find a
configuration example for the resource below:
Use the <literal>stonith:fence_kdump</literal> fence agent.
A configuration example is shown below. For more information,
see <command>crm ra info stonith:fence_kdump</command>.
</para>
<screen>&prompt.root;<command>crm configure</command>
&prompt.crm.conf;<command>primitive st-kdump stonith:fence_kdump \
params nodename="&node1; "\ </command><co xml:id="co-ha-fenc-kdump-nodename"/>
<command>pcmk_host_check="static-list" \
params nodename="&node1;" \ </command><co xml:id="co-ha-fence-kdump-nodename"/>
<command>pcmk_host_list="&node1;" \
pcmk_host_check="static-list" \
pcmk_reboot_action="off" \
pcmk_monitor_action="metadata" \
pcmk_reboot_retries="1" \
timeout="60"</command>
timeout="60"</command><co xml:id="co-ha-fence-kdump-timeout"/>
&prompt.crm.conf;<command>commit</command></screen>
<calloutlist>
<callout arearefs="co-ha-fenc-kdump-nodename">
<callout arearefs="co-ha-fence-kdump-nodename">
<para>
Name of the node to be monitored. If you need to monitor more than one
node, configure more &stonith; resources. To prevent a specific node
from using a fencing device, add location constraints.
Name of the node to listen for a message from <literal>fence_kdump_send</literal>.
Configure more &stonith; resources for other nodes if needed.
</para>
</callout>
<callout arearefs="co-ha-fence-kdump-timeout">
<para>
Defines how long to wait for a message from <literal>fence_kdump_send</literal>.
If a message is received, then a Kdump is in progress and the fencing mechanism
considers the node to be fenced. If no message is received, <literal>fence_kdump</literal>
times out, which indicates that the fence operation failed. The next &stonith; device
in the <literal>fencing_topology</literal> eventually fences the node.
</para>
</callout>
</calloutlist>
<para>
The fencing action starts after the timeout of the resource.
</para>
</step>
<step>
<para>
In <filename>/etc/sysconfig/kdump</filename> on each node, configure
<literal>KDUMP_POSTSCRIPT</literal> to send a notification to all nodes
when the Kdump process is finished. For example:
On each node, configure <literal>fence_kdump_send</literal> to send a message to
all nodes when the Kdump process is finished. In <filename>/etc/sysconfig/kdump</filename>,
edit the <literal>KDUMP_POSTSCRIPT</literal> line. For example:
</para>
<screen>KDUMP_POSTSCRIPT="/usr/lib/fence_kdump_send -i <replaceable>INTERVAL</replaceable> -p <replaceable>PORT</replaceable> -c 1 &node1; &node2; &node3;"</screen>
<screen>KDUMP_POSTSCRIPT="/usr/lib/fence_kdump_send -i 10 -p 7410 -c 1 <replaceable>NODELIST</replaceable>"</screen>
<para>
The node that does a Kdump restarts automatically after Kdump is
finished.
Replace <replaceable>NODELIST</replaceable> with the host names of all the cluster nodes.
</para>
</step>
<step>
@@ -514,15 +520,16 @@ hostlist</screen>
</step>
<step>
<para>
To achieve that Kdump is checked before triggering a real fencing
To have Kdump checked before triggering a real fencing
mechanism (like <literal>external/ipmi</literal>),
use a configuration similar to the following:</para>
<screen>&prompt.crm.conf;<command>fencing_topology \
&node1;: kdump-node1 ipmi-node1 \
&node2;: kdump-node2 ipmi-node2</command></screen>
&node2;: kdump-node2 ipmi-node2</command>
&prompt.crm.conf;<command>commit</command></screen>
<para>For more details on <option>fencing_topology</option>:
</para>
<screen>&prompt.root;<command>crm configure help fencing_topology</command></screen>
<screen>&prompt.crm.conf;<command>help fencing_topology</command></screen>
</step>
</procedure>
</example>
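
Illustration (not part of the commit): the timeout callout and the fencing_topology step above describe a two-stage decision. The first device waits a bounded time for a message from the crashing node. If none arrives, the next device is tried. The Python below is a minimal sketch of that logic under stated assumptions, not the actual fence_kdump implementation. The function names wait_for_kdump_message and fence_with_topology are hypothetical, and the sketch accepts any UDP datagram from the expected address, whereas the real agent is understood to validate the message contents as well.

```python
import socket
import time


def wait_for_kdump_message(sock, expected_ip, timeout):
    """Return True if a datagram from expected_ip arrives on sock within
    timeout seconds, otherwise False.

    Hypothetical helper: it only checks the sender address and does not
    parse or validate the payload.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # timed out: report the fence attempt as failed
        sock.settimeout(remaining)
        try:
            _data, (src_ip, _src_port) = sock.recvfrom(1024)
        except socket.timeout:
            return False
        if src_ip == expected_ip:
            return True  # a dump is in progress: treat the node as fenced
        # Datagram from some other host: ignore it and keep waiting.


def fence_with_topology(devices):
    """Try each (name, attempt) pair in order and return the name of the
    first device whose attempt() succeeds, or None if all of them fail."""
    for name, attempt in devices:
        if attempt():
            return name
    return None
```

With a topology of kdump-node1 followed by ipmi-node1, a timeout in the first stage makes fence_with_topology fall through to the second device, which mirrors the ordering configured with fencing_topology in the diff.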