cells: ignore empty core domain uris propagated by zk
Motivation:
When ZooKeeper updates the core domain info, dCache first kills the existing cell
tunnels and only then reads and parses the new value. If the new value is an empty
string (for whatever reason), parsing fails and no new connection is
established. The corresponding error in the log:

18 Nov 2024 08:45:00 (c-dcache-head-xxx03_messageDomain-AAYmVA1LtnA-AAYmVA16phA) [dcache-head-xxx03_messageDomain,9.2.21,CORE] Error while reading from tunnel: java.net.SocketExceptio>
18 Nov 2024 08:45:43 (c-dcache-head-xxx03_messageDomain-AAYnKxn40fA) [] Uncaught exception in thread TunnelConnector-dcache-head-xxx03_messageDomain
java.lang.NullPointerException: null
        at java.base/java.net.Socket.<init>(Socket.java:448)
        at java.base/java.net.Socket.<init>(Socket.java:264)
        at java.base/javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
        at dmg.cells.network.LocationManagerConnector.connect(LocationManagerConnector.java:64)
        at dmg.cells.network.LocationManagerConnector.run(LocationManagerConnector.java:94)
        at dmg.cells.nucleus.CellNucleus.lambda$wrapLoggingContext$2(CellNucleus.java:725)
        at java.base/java.lang.Thread.run(Thread.java:829)

Modification:
Before killing the existing tunnel, check that ZK didn't propagate empty
data.
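
The ordering described above can be sketched in isolation as follows. This is a minimal illustration, not dCache code: the class `TunnelRegistry`, its methods `onUpdated` and `current`, and the plain map that stands in for the real connectors are all hypothetical names chosen for the example. The only point it makes is that the emptiness check runs before the existing entry is torn down.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the connector bookkeeping; not part of dCache.
class TunnelRegistry {

    // domain name -> payload string of the currently "running" tunnel
    private final Map<String, String> connectors = new HashMap<>();

    // Same idea as hasNoData(): null or zero-length payloads carry nothing to parse.
    static boolean hasNoData(byte[] payload) {
        return payload == null || payload.length == 0;
    }

    // Returns true if the update was applied, false if it was ignored.
    boolean onUpdated(String domain, byte[] payload) {
        if (hasNoData(payload)) {
            // Check first: the existing tunnel is left untouched.
            return false;
        }
        // Only now drop the old entry and install the new one.
        connectors.remove(domain);
        connectors.put(domain, new String(payload, StandardCharsets.UTF_8));
        return true;
    }

    String current(String domain) {
        return connectors.get(domain);
    }
}
```

In this sketch an empty or null payload leaves the previous entry in place, while any non-empty payload replaces it, whether or not it is a well-formed URI. That matches the caveat in the note under Result.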

Result:
More robust cell communication

NOTE: non-empty but invalid data is still accepted!

Fixes: #7696
Acked-by: Lea Morschel
Target: master, 10.2, 10.1, 10.0, 9.2
Require-book: no
Require-notes: yes
(cherry picked from commit 30829c9)
Signed-off-by: Tigran Mkrtchyan <[email protected]>
kofemann authored and mksahakyan committed Nov 21, 2024
1 parent e5d1986 commit 34eade2
Showing 1 changed file with 16 additions and 2 deletions.
@@ -225,7 +225,7 @@ public CoreDomainInfo(byte[] bytes) {
                 }
             } catch (IOException ie) {
                 throw new IllegalArgumentException(
-                      "Failed deserializing LocationManager Cores as uri: {}", ie.getCause());
+                      "Failed deserializing LocationManager Cores as uri", ie);
             }
         }

@@ -513,6 +513,10 @@ public void close() {
         public void reset(Mode mode, State state) {
         }
 
+        private boolean hasNoData(ChildData data) {
+            return data == null || data.getData() == null || data.getData().length == 0;
+        }
+
         public void update(PathChildrenCacheEvent event) {
             LOGGER.info("{}", event);
             String cell;
@@ -525,12 +529,22 @@ public void update(PathChildrenCacheEvent event) {
                     }
                     break;
                 case CHILD_UPDATED:
-                    cell = connectors.remove(ZKPaths.getNodeFromPath(event.getData().getPath()));
+                    if (hasNoData(event.getData())) {
+                        LOGGER.warn("Ignoring empty data on UPDATED for {}", event.getData().getPath());
+                        break;
+                    }
+                    cell = connectors.remove(
+                          ZKPaths.getNodeFromPath(event.getData().getPath()));
                     if (cell != null) {
                         killConnector(cell);
                     }
                     // fall through
                 case CHILD_ADDED:
+                    if (hasNoData(event.getData())) {
+                        LOGGER.warn("Ignoring empty data on ADDED for {}", event.getData().getPath());
+                        break;
+                    }
+
                     //Log if the Core Domain Information received is incompatible with previous
                     CoreDomainInfo info = infoFromZKEvent(event);
                     String domain = ZKPaths.getNodeFromPath(event.getData().getPath());
