Open
Description
Given three doozerd nodes listening on ports 8046, 8047, and 8048 (as in the fire drill), when the node on 8048 is killed and restarted, one of the other nodes is evicted from the cluster once the timeout window has passed.
This script will consistently produce these results for me: https://gist.github.com/bernerdschaefer/5714419
Hitting the web UI after the cluster state has stabilized shows something like this:
    /
      ctl/
        cal/
          0 (5) D4HVNXRRANR4YRGQ
          1 (532)
          2 (477) 73O2WLRB3DVRPG5V
        node/
          4MMJNJ76M5IDQSBQ/
            applied (573) 572
          73O2WLRB3DVRPG5V/
            addr (273) 127.0.0.1:8048
            applied (574) 573
            hostname (276) precise64
            version (279) 0.8+53+g985ed10
            writable (538) true
          D4HVNXRRANR4YRGQ/
            addr (2) 127.0.0.1:8046
            applied (575) 574
            hostname (3) precise64
            version (4) 0.9.0-alpha
            writable (57) true
        ns/
          test/
            4MMJNJ76M5IDQSBQ (123) 127.0.0.1:8047
            6D32P3ZOQDJIMVEV (131) 127.0.0.1:8048
            73O2WLRB3DVRPG5V (533) 127.0.0.1:8048
            D4HVNXRRANR4YRGQ (6) 127.0.0.1:8046
        err (541) rev mismatch
        name (1) test
In this case, the node on 8047 has been (partially) evicted from the cluster: it has been removed from /ctl/cal, and everything except "applied" has been removed from its node info.
At this point, node 8047 is still running but produces no messages in the log. If node 8047 is killed and restarted, the cluster returns to normal operation.
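To make the symptom easier to spot in other dumps, here is a minimal Python sketch that applies the same reading to the state shown above. It is not part of doozerd or its client library: the `cal`, `node_info`, and `ns` structures and the `partially_evicted` helper are hypothetical names, and the data is transcribed by hand from the web UI output rather than fetched from a live cluster.

```python
# Hypothetical, hand-transcribed snapshot of the dump in this issue.
# None of these names come from doozerd itself.

# /ctl/cal slots 0..2; an empty string marks a slot with no node ID.
cal = ["D4HVNXRRANR4YRGQ", "", "73O2WLRB3DVRPG5V"]

# /ctl/node/<id>/: the set of files present for each node ID.
node_info = {
    "4MMJNJ76M5IDQSBQ": {"applied"},
    "73O2WLRB3DVRPG5V": {"addr", "applied", "hostname", "version", "writable"},
    "D4HVNXRRANR4YRGQ": {"addr", "applied", "hostname", "version", "writable"},
}

# /ctl/ns/test/: node ID -> registered address.
ns = {
    "4MMJNJ76M5IDQSBQ": "127.0.0.1:8047",
    "6D32P3ZOQDJIMVEV": "127.0.0.1:8048",
    "73O2WLRB3DVRPG5V": "127.0.0.1:8048",
    "D4HVNXRRANR4YRGQ": "127.0.0.1:8046",
}


def partially_evicted(cal, node_info, ns):
    """Return (id, address) pairs for nodes that still have a node entry
    but hold no cal slot and no longer have an "addr" file."""
    active = {slot for slot in cal if slot}
    return sorted(
        (node_id, ns.get(node_id))
        for node_id, fields in node_info.items()
        if node_id not in active and "addr" not in fields
    )


if __name__ == "__main__":
    print(partially_evicted(cal, node_info, ns))
    # → [('4MMJNJ76M5IDQSBQ', '127.0.0.1:8047')]
```

The check only encodes the two observations made in this report (no cal slot, no "addr" in the node info); whether those two conditions are sufficient to identify an evicted node in general is an assumption.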