Commit 34eade2

kofemann authored and mksahakyan committed
cells: ignore empty core domain uris propagated by zk
Motivation:
When ZooKeeper updates the core domain info, dCache first kills the existing cell tunnels and only afterwards reads and parses the new value. If the new value is an empty string (for whatever reason), parsing fails and no new connection is established.

The corresponding error in the log:

    18 Nov 2024 08:45:00 (c-dcache-head-xxx03_messageDomain-AAYmVA1LtnA-AAYmVA16phA) [dcache-head-xxx03_messageDomain,9.2.21,CORE] Error while reading from tunnel: java.net.SocketExceptio>
    18 Nov 2024 08:45:43 (c-dcache-head-xxx03_messageDomain-AAYnKxn40fA) [] Uncaught exception in thread TunnelConnector-dcache-head-xxx03_messageDomain
    java.lang.NullPointerException: null
        at java.base/java.net.Socket.<init>(Socket.java:448)
        at java.base/java.net.Socket.<init>(Socket.java:264)
        at java.base/javax.net.DefaultSocketFactory.createSocket(SocketFactory.java:277)
        at dmg.cells.network.LocationManagerConnector.connect(LocationManagerConnector.java:64)
        at dmg.cells.network.LocationManagerConnector.run(LocationManagerConnector.java:94)
        at dmg.cells.nucleus.CellNucleus.lambda$wrapLoggingContext$2(CellNucleus.java:725)
        at java.base/java.lang.Thread.run(Thread.java:829)

Modification:
Before killing an existing tunnel, check that ZK didn't propagate empty data.

Result:
More robust cell communication.

NOTE: non-empty but invalid data is still accepted!

Fixes: #7696
Acked-by: Lea Morschel
Target: master, 10.2, 10.1, 10.0, 9.2
Require-book: no
Require-notes: yes
(cherry picked from commit 30829c9)
Signed-off-by: Tigran Mkrtchyan <[email protected]>
1 parent e5d1986 commit 34eade2

1 file changed, +16 -2 lines changed

modules/cells/src/main/java/dmg/cells/services/LocationManager.java

Lines changed: 16 additions & 2 deletions
@@ -225,7 +225,7 @@ public CoreDomainInfo(byte[] bytes) {
                 }
             } catch (IOException ie) {
                 throw new IllegalArgumentException(
-                      "Failed deserializing LocationManager Cores as uri: {}", ie.getCause());
+                      "Failed deserializing LocationManager Cores as uri", ie);
             }
         }
 
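The first hunk changes which throwable is attached to the `IllegalArgumentException`. A minimal sketch of why that matters, assuming the caught `IOException` has no nested cause (the class and method names `CauseChainingSketch`, `wrapOld` and `wrapNew` are hypothetical and not part of dCache): `ie.getCause()` is then `null`, so the old form discards the original exception, and the `{}` in the message is never substituted because exception messages are not SLF4J format strings.

```java
import java.io.IOException;

class CauseChainingSketch {
    // Mirrors the old call: passes ie.getCause(), which is null when the
    // IOException was created without a nested cause.
    static IllegalArgumentException wrapOld(IOException ie) {
        return new IllegalArgumentException(
              "Failed deserializing LocationManager Cores as uri: {}", ie.getCause());
    }

    // Mirrors the new call: the IOException itself becomes the cause,
    // so its message and stack trace survive.
    static IllegalArgumentException wrapNew(IOException ie) {
        return new IllegalArgumentException(
              "Failed deserializing LocationManager Cores as uri", ie);
    }

    public static void main(String[] args) {
        IOException ie = new IOException("unexpected end of stream");
        System.out.println(wrapOld(ie).getCause()); // null
        System.out.println(wrapNew(ie).getCause()); // java.io.IOException: unexpected end of stream
    }
}
```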

@@ -513,6 +513,10 @@ public void close() {
         public void reset(Mode mode, State state) {
         }
 
+        private boolean hasNoData(ChildData data) {
+            return data == null || data.getData() == null || data.getData().length == 0;
+        }
+
         public void update(PathChildrenCacheEvent event) {
             LOGGER.info("{}", event);
             String cell;
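The `hasNoData` predicate can be exercised in isolation. The sketch below is only an illustration: `FakeChildData` is a hypothetical stand-in for Curator's `ChildData` that models nothing but `getData()`. It also shows the limitation called out in the commit message, namely that any non-empty payload passes the check, even when it is not a usable URI.

```java
import java.nio.charset.StandardCharsets;

class HasNoDataSketch {
    // Hypothetical stand-in for Curator's ChildData; only getData() is modelled.
    static final class FakeChildData {
        private final byte[] data;
        FakeChildData(byte[] data) { this.data = data; }
        byte[] getData() { return data; }
    }

    // Same three conditions as the helper added in the commit.
    static boolean hasNoData(FakeChildData data) {
        return data == null || data.getData() == null || data.getData().length == 0;
    }

    public static void main(String[] args) {
        System.out.println(hasNoData(null));                               // true
        System.out.println(hasNoData(new FakeChildData(null)));            // true
        System.out.println(hasNoData(new FakeChildData(new byte[0])));     // true
        // A single space is non-empty, so it is NOT filtered out.
        byte[] blank = " ".getBytes(StandardCharsets.UTF_8);
        System.out.println(hasNoData(new FakeChildData(blank)));           // false
    }
}
```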
@@ -525,12 +529,22 @@ public void update(PathChildrenCacheEvent event) {
                     }
                     break;
                 case CHILD_UPDATED:
-                    cell = connectors.remove(ZKPaths.getNodeFromPath(event.getData().getPath()));
+                    if (hasNoData(event.getData())) {
+                        LOGGER.warn("Ignoring empty data on UPDATED for {}", event.getData().getPath());
+                        break;
+                    }
+                    cell = connectors.remove(
+                          ZKPaths.getNodeFromPath(event.getData().getPath()));
                     if (cell != null) {
                         killConnector(cell);
                     }
                     // fall through
                 case CHILD_ADDED:
+                    if (hasNoData(event.getData())) {
+                        LOGGER.warn("Ignoring empty data on ADDED for {}", event.getData().getPath());
+                        break;
+                    }
+
                     //Log if the Core Domain Information received is incompatible with previous
                     CoreDomainInfo info = infoFromZKEvent(event);
                     String domain = ZKPaths.getNodeFromPath(event.getData().getPath());
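The control flow of this hunk, guard first and only then remove and kill, can be sketched without ZooKeeper or Curator. Everything in the example is a simplified assumption made for illustration: `UpdateGuardSketch`, its `Type` enum, the `connectors` map holding plain URI strings, and the `killed` list that records what `killConnector` would have been called with.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UpdateGuardSketch {
    enum Type { CHILD_ADDED, CHILD_UPDATED }

    // Hypothetical stand-ins: domain -> URI of the current connector,
    // plus a record of every connector that would have been killed.
    final Map<String, String> connectors = new HashMap<>();
    final List<String> killed = new ArrayList<>();

    static boolean hasNoData(byte[] data) {
        return data == null || data.length == 0;
    }

    void update(Type type, String domain, byte[] data) {
        String old;
        switch (type) {
            case CHILD_UPDATED:
                if (hasNoData(data)) {
                    break; // leave the existing tunnel untouched
                }
                old = connectors.remove(domain);
                if (old != null) {
                    killed.add(old);
                }
                // fall through
            case CHILD_ADDED:
                if (hasNoData(data)) {
                    break;
                }
                connectors.put(domain, new String(data, StandardCharsets.UTF_8));
                break;
        }
    }

    public static void main(String[] args) {
        UpdateGuardSketch s = new UpdateGuardSketch();
        s.update(Type.CHILD_ADDED, "core1", "tls://host:1".getBytes(StandardCharsets.UTF_8));
        s.update(Type.CHILD_UPDATED, "core1", new byte[0]); // empty update is ignored
        System.out.println(s.killed);                  // []
        System.out.println(s.connectors.get("core1")); // tls://host:1
        s.update(Type.CHILD_UPDATED, "core1", "tls://host:2".getBytes(StandardCharsets.UTF_8));
        System.out.println(s.killed);                  // [tls://host:1]
        System.out.println(s.connectors.get("core1")); // tls://host:2
    }
}
```

Placing the check before `connectors.remove(...)` is the point of the change: with the previous ordering the old connector was already gone by the time the empty payload was noticed.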
