Erik's mods to EWD998 from class on 2022-07-12 #7

xkrogen · 2022-07-13T16:00:19Z

Primarily closely read the paper's description of the algorithm and made changes to match the spec to the text description. One of the biggest changes was that nodes get marked as tainted when they RECEIVE a message, not when they SEND a message. Whoops! There were also minor changes necessary to match the token's initial state (I think we previously marked it tainted at the start if the initiator is tainted, but that's not right, the token always starts untained and the initiator's taint status gets checked when the token makes its way back around).

I also had to make small adjustments to the ATD spec, most notably to change the WF criteria to only check for Terminate, instead of Next. Not sure if that's "cheating", but I think it makes sense based on my understanding: as with EWD998, we don't want to force nodes to send/rcv messages indefinitely, we just want to ensure that we detect termination.

I validated this up to 3 nodes, and am trying on 4, though it has been running for multiple hours and generated 10M+ states without finishing...

jasperjiaguo · 2022-07-13T16:25:51Z

EWD998.tla

-    \* /\ WF_vars(Next)
-    /\ \A i \in Node: WF_vars(PassToken(i) \/ InitiateToken(i))
-
+    /\ (\A i \in Node: WF_vars(PassToken(i) \/ InitiateToken(i) \/ Terminate(i))) 


I'm also wondering if we should enforce Terminate here, as our termination algorithm should ideally be safe and live even for a system that never terminates? But realistically a valid computation task should terminate.

I'm also thinking if WF termination might be too strong as it might ask the state chain to terminate sooner?

Primarily closely read the paper's description of the algorithm and made changes to match the spec to the text description. One of the biggest changes was that nodes get marked as tainted when they RECEIVE a message, not when they SEND a message. Whoops! There were also minor changes necessary to match the token's initial state (I think we previously marked it tainted at the start if the initiator is tainted, but that's not right, the token always starts untained and the initiator's taint status gets checked when the token makes its way back around). I also had to make small adjustments to the ATD spec, most notably to change the WF criteria to only check for Terminate, instead of Next. Not sure if that's "cheating", but I think it makes sense based on my understanding: as with EWD998, we don't want to force nodes to send/rcv messages indefinitely, we just want to ensure that we detect termination. I validated this up to 4 nodes. Note that it took about 2-3 hours to validate on a 2019 MBP for 4 nodes, generate 10M states and exploring over 350M. Below is the output for the following config file: ``` SPECIFICATION Spec INVARIANT TypeOK CONSTANT N = 4 CONSTRAINT Constraint CONSTANT Init <- MCInit ALIAS Alias INVARIANT IInv PROPERTY ATDSpec ``` ``` /Library/Java/JavaVirtualMachines/jdk11.0.13.8-msft.jdk/Contents/Home/bin/java -XX:+UseParallelGC -cp /Users/ekrogen/.vscode/extensions/alygin.vscode-tlaplus-nightly-2022.5.2815/tools/tla2tools.jar -Dtlc2.TLC.ide=vscode tlc2.TLC EWD998.tla -tool -modelcheck -config EWD998.cfg -deadlock -noTE TLC2 Version 2.18 of Day Month 20?? (rev: 6ee6e9f) Running breadth-first search Model-Checking with fp 57 and seed 3965354632801637035 with 1 worker on 16 cores with 7282MB heap and 64MB offheap memory [pid: 78154] (Mac OS X 12.4 x86_64, Microsoft 11.0.13 x86_64, MSBDiskFPSet, DiskStateQueue). Starting SANY... Parsing file /Users/ekrogen/dev/tlaplus-workshop/EWD998.tla ... Starting... (2022-07-12 18:06:50) Implied-temporal checking--satisfiability problem has 1 branches. .... Finished computing initial states: 16 distinct states generated at 2022-07-12 18:06:50. Checking temporal properties for the current state space with 2834 total distinct states at (2022-07-12 18:06:53) Finished checking temporal properties in 00s at 2022-07-12 18:06:53 ... Progress(5) at 2022-07-12 18:06:53: 90,980 states generated (90,980 s/min), 9,131 distinct states found (9,131 ds/min), 6,296 states left on queue. Checking temporal properties for the current state space with 96397 total distinct states at (2022-07-12 18:07:53) Finished checking temporal properties in 01s at 2022-07-12 18:07:55 .... Progress(26) at 2022-07-13 09:01:07: 346,168,712 states generated (4,791,239 s/min), 10,338,601 distinct states found (105,372 ds/min), 163,790 states left on queue. Progress(27) at 2022-07-13 09:02:07: 351,116,470 states generated (4,947,758 s/min), 10,434,714 distinct states found (96,113 ds/min), 110,216 states left on queue. Progress(29) at 2022-07-13 09:03:07: 355,834,008 states generated (4,717,538 s/min), 10,532,678 distinct states found (97,964 ds/min), 65,050 states left on queue. Checkpointing of run states/22-07-12-18-06-49.957 Checkpointing completed at (2022-07-13 09:04:08) Progress(32) at 2022-07-13 09:04:08: 360,659,672 states generated (4,825,664 s/min), 10,629,235 distinct states found (96,557 ds/min), 17,243 states left on queue. Progress(49) at 2022-07-13 09:04:25: 362,113,272 states generated, 10,656,918 distinct states found, 0 states left on queue. Checking temporal properties for the complete state space with 10656918 total distinct states at (2022-07-13 09:04:25) Finished checking temporal properties in 02min 36s at 2022-07-13 09:07:01 Model checking completed. No error has been found. Estimates of the probability that TLC did not check all reachable states because two distinct states had the same fingerprint: calculated (optimistic): val = 2.0E-4 based on the actual fingerprints: val = 6.6E-9 Progress(49) at 2022-07-13 09:07:02: 362,113,272 states generated (402,253 s/min), 10,656,918 distinct states found (11,838 ds/min), 0 states left on queue. 362113272 states generated, 10656918 distinct states found, 0 states left on queue. The depth of the complete state graph search is 49. The average outdegree of the complete state graph is 1 (minimum is 0, the maximum 18 and the 95th percentile is 3). Finished in 54013024ms at (2022-07-13 09:07:02) ```

…Termination action and apply fairness on that

jasperjiaguo reviewed Jul 13, 2022

View reviewed changes

xkrogen force-pushed the 072022-ekrogen-mods branch from a70049b to 4a8fa47 Compare July 13, 2022 17:47

Get rid of fairness on Terminate, instead break out a separate Detect…

5f9ca59

…Termination action and apply fairness on that

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Erik's mods to EWD998 from class on 2022-07-12 #7

Erik's mods to EWD998 from class on 2022-07-12 #7

xkrogen commented Jul 13, 2022

jasperjiaguo Jul 13, 2022

jasperjiaguo Jul 13, 2022

Erik's mods to EWD998 from class on 2022-07-12 #7

Are you sure you want to change the base?

Erik's mods to EWD998 from class on 2022-07-12 #7

Conversation

xkrogen commented Jul 13, 2022

jasperjiaguo Jul 13, 2022

Choose a reason for hiding this comment

jasperjiaguo Jul 13, 2022

Choose a reason for hiding this comment