
Erase transient attributes in attribute manager instead of controller #3801

Open
wants to merge 10 commits into main

Conversation

kgaillot (Contributor)

No description provided.

Previously, when the attribute manager purged a node, it would purge the
node's transient attributes only from memory, assuming the controller
would purge them from the CIB. Now, the writer will purge them from the
CIB as well.

This happens by setting the values to NULL rather than dropping them
from memory. If there is a writer, it will also erase the node's entire
transient attributes section. If there is no writer, the next elected
writer will erase each value as part of its normal election-win
write-out. In either case, all nodes will drop the NULL values from
memory after getting the notification that the erasure succeeded.
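
In sketch form, the approach described above looks something like this (a minimal illustration using GLib hash tables, as attrd does; the struct and helper names are made up for this example and are not the actual attrd data structures):

```c
#include <glib.h>
#include <string.h>

/* Illustrative stand-in for an attribute value entry (not the real
 * attrd structure) */
typedef struct {
    char *nodename;   /* node this value belongs to */
    char *current;    /* NULL means "awaiting erasure from the CIB" */
} value_entry_t;

/* Step 1: when purging a node, blank its values instead of removing
 * them, so they stay in memory until the CIB erasure is confirmed */
static void
mark_for_erasure(gpointer key, gpointer data, gpointer user_data)
{
    value_entry_t *v = data;
    const char *leaving = user_data;

    if (strcmp(v->nodename, leaving) == 0) {
        g_free(v->current);
        v->current = NULL;  /* the writer will erase this from the CIB */
    }
}

/* Step 2: once notified that the CIB erasure succeeded, every attribute
 * manager drops the NULL values from memory */
static gboolean
is_erased(gpointer key, gpointer data, gpointer user_data)
{
    const value_entry_t *v = data;
    const char *node = user_data;

    return (v->current == NULL) && (strcmp(v->nodename, node) == 0);
}

static void
purge_node_values(GHashTable *values, const char *node, gboolean erased)
{
    if (erased) {
        g_hash_table_foreach_remove(values, is_erased, (gpointer) node);
    } else {
        g_hash_table_foreach(values, mark_for_erasure, (gpointer) node);
    }
}
```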

This fixes a variety of timing issues when multiple nodes, including the
attribute writer, are shutting down. If the writer leaves before some
other node, the DC wipes that other node's attributes from the CIB when
that node leaves the controller process group (or all other nodes do so
if the DC itself is the leaving node). If a new writer (possibly even the
node itself) is elected before the node's attribute manager leaves the
cluster layer, it writes the attributes back to the CIB. Once the other
node leaves the cluster layer, all attribute managers remove its
attributes from memory, but the attributes are now "stuck" in the CIB.

As of this commit, the controller still erases the attributes from the CIB when
the node leaves the controller process group, which is redundant but doesn't
cause any new problems and will be corrected in an upcoming commit.

Note: This will cause an insignificant regression if backported to
Pacemaker 2. The Pacemaker 2 controller purges attributes from the CIB
for leaving DCs only if they are at version 1.1.13 or later, because
earlier DCs will otherwise get fenced after a clean shutdown. Since the
attribute manager doesn't know the DC or its version, the attributes
would now always be wiped, so old leaving DCs will get fenced. The
fencing would occur only in the highly unlikely situation of a rolling
upgrade from Pacemaker 2-supported versions 1.1.11 or 1.1.12, and the
upgrade would still succeed without any negative impact on resources.

Fixes T138
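
For reference, the version gate mentioned above boils down to a comparison like this (compare_version() is a real Pacemaker utility with strcmp()-like semantics; the surrounding function is only a sketch of the Pacemaker 2 controller's check):

```c
#include <stdbool.h>

/* Real Pacemaker utility; returns negative, zero, or positive like
 * strcmp() */
extern int compare_version(const char *version1, const char *version2);

static bool
purge_leaving_dc_attrs(const char *dc_version)
{
    /* DCs older than 1.1.13 would get fenced after a clean shutdown if
     * their transient attributes were purged, so skip those */
    return (dc_version != NULL)
           && (compare_version(dc_version, "1.1.13") >= 0);
}
```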
This hasn't been needed since attrd_cib_updated_cb() started ignoring changes
from trusted sources (!cib__client_triggers_refresh()). It's especially useless
now that the attribute manager handles clearing of leaving nodes' attributes
from the CIB.
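
The guard in question has roughly this shape (a sketch: the real attrd_cib_updated_cb() takes different arguments, cib__client_triggers_refresh()'s signature is assumed, and the client-name helper is hypothetical):

```c
#include <stdbool.h>
#include <libxml/tree.h>

/* Real Pacemaker internal; signature assumed for this sketch */
extern bool cib__client_triggers_refresh(const char *name);

/* Hypothetical helper standing in for extracting the client name from
 * the CIB diff notification */
extern const char *client_name_from_notification(xmlNode *msg);

static void
attrd_cib_updated_cb(const char *event, xmlNode *msg)
{
    if (!cib__client_triggers_refresh(client_name_from_notification(msg))) {
        /* Change came from a trusted source (another Pacemaker daemon),
         * so there is nothing to resync */
        return;
    }

    /* ... otherwise, schedule a full write-out of all attributes ... */
}
```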
Now that the attribute manager will erase transient attributes from the CIB
when purging a node, we don't need to do that separately in the controller.
…from cache

Nothing uses the new capability yet
With recent changes, the attribute manager now handles purging a node's
transient attributes when the node leaves the cluster, so the controller
purge is redundant.

This does alter the timing somewhat, since the controller's purge
occurred when the node left the controller process group, while the
attribute manager's purge occurs when it leaves the cluster, but that
shouldn't make a significant difference.

This fixes a problem when a node's controller crashes and is respawned
while fencing is disabled. Previously, another node's controller would
remove that node's transient attributes from the CIB, but they would
remain in the attribute managers' memory. Now, the attributes are
correctly retained in the CIB in this situation.

Fixes T137
Fixes T139
It now boils down to a bool for whether we want only unlocked resources
... to controld_delete_node_history(), and
controld_node_state_deletion_strings() to
controld_node_history_deletion_strings(),
since they only delete history now
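
A hypothetical sketch of the simplified API those commits describe (the function names come from the commit messages above; the parameter lists are guesses for illustration):

```c
#include <stdbool.h>

/* Formerly took a section enum; that now reduces to one flag saying
 * whether to delete history only for unlocked resources */
void controld_delete_node_history(const char *uname, bool only_unlocked);

/* Renamed from controld_node_state_deletion_strings(), since it only
 * builds strings describing history deletion now */
void controld_node_history_deletion_strings(const char *uname,
                                            bool only_unlocked,
                                            char **xpath, char **desc);
```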
@kgaillot (Contributor, Author)

@clumens @nrwahl2 @gao-yan , this should be the complete fix for T137/T138/T139, fully backportable to 2.1. Regression tests and a 100-run lab were clean, but it still needs some specific testing.

* combination; keep the value if it returns 0, drop
* the value and keep checking the attribute's other
* values if it returns 1, or drop the value and stop checking
* the attribute's further values if it returns 2
Contributor

I think it'd be worth making these constants or #defines.
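
One possible shape for that suggestion (illustrative names, not necessarily what was merged):

```c
/* Named return codes for the per-value check callback described in the
 * excerpt above */
enum value_check_result {
    value_check_keep      = 0,  /* keep the value and keep checking */
    value_check_drop      = 1,  /* drop the value, keep checking the
                                 * attribute's other values */
    value_check_drop_stop = 2,  /* drop the value and stop checking the
                                 * attribute's further values */
};
```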

@kgaillot (Contributor, Author)

Some simple rolling upgrade tests look good except for one minor issue. In a mixed-version cluster, a deleted attribute for a leaving node will show up in queries with an empty value rather than not showing up at all. That implies the NULL values aren't being dropped from memory as desired in drop_removed_values() when the node leaves. The (older) DC should still erase the transient attributes in that case (which appears to be working), and the CIB notification for that should still be received (which may not be happening) and handled correctly, so I'm not sure what's going on. The notification and handling are working when a single attribute is dropped.

Probably worth testing mixed-version clusters some more, with some log tracing. But an empty value is treated the same as a nonexistent one for all purposes other than those queries, so it shouldn't be a dealbreaker. (Rules based on attribute existence use what's in the CIB rather than the attribute manager and so are unaffected.)

@clumens, it would be worthwhile to test your remote attribute preservation code too.

@clumens (Contributor) commented Jan 15, 2025

Note to my later self: this is referring to 09058c9.
