Node bounce causes healthy node to be dropped #64

@bernerdschaefer

Given 3 doozerd nodes listening on 8046, 8047, and 8048 (as in the fire drill), when 8048 is killed and restarted, after the timeout window one of the other nodes is evicted from the cluster.

This script will consistently produce these results for me: https://gist.github.com/bernerdschaefer/5714419

Hitting the web UI after the cluster state has stabilized shows something like this:

/
    ctl/
        cal/
            0     (5)       D4HVNXRRANR4YRGQ
            1   (532)       
            2   (477)       73O2WLRB3DVRPG5V
        node/
            4MMJNJ76M5IDQSBQ/
                applied     (573)       572
            73O2WLRB3DVRPG5V/
                addr        (273)       127.0.0.1:8048
                applied     (574)       573
                hostname    (276)       precise64
                version     (279)       0.8+53+g985ed10
                writable    (538)       true
            D4HVNXRRANR4YRGQ/
                addr          (2)       127.0.0.1:8046
                applied     (575)       574
                hostname      (3)       precise64
                version       (4)       0.9.0-alpha
                writable     (57)       true
        ns/
            test/
                4MMJNJ76M5IDQSBQ    (123)       127.0.0.1:8047
                6D32P3ZOQDJIMVEV    (131)       127.0.0.1:8048
                73O2WLRB3DVRPG5V    (533)       127.0.0.1:8048
                D4HVNXRRANR4YRGQ      (6)       127.0.0.1:8046
        err     (541)       rev mismatch
        name      (1)       test

In this case, node 8047 (4MMJNJ76M5IDQSBQ) has been partially evicted from the cluster: it has been removed from /ctl/cal, and everything except "applied" has been removed from its node info.

At this point, node 8047 is still running but produces no messages in the log. If node 8047 is killed and restarted, the cluster returns to normal operation.
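
The partially evicted state described above can be checked mechanically. The following is only a minimal sketch: it assumes the /ctl/cal and /ctl/node entries from the dump have been transcribed by hand into Python dicts, and the function names `partially_evicted` and `empty_slots` are made up for illustration. It does not use any doozer client API.

```python
# Hand-transcribed snapshot of the web UI dump above (assumption: copied
# manually, not read through a doozer client).
cal = {
    0: "D4HVNXRRANR4YRGQ",
    1: "",                      # slot 1 is empty after the bounce
    2: "73O2WLRB3DVRPG5V",
}

nodes = {
    "4MMJNJ76M5IDQSBQ": {"applied": "572"},
    "73O2WLRB3DVRPG5V": {
        "addr": "127.0.0.1:8048",
        "applied": "573",
        "hostname": "precise64",
        "version": "0.8+53+g985ed10",
        "writable": "true",
    },
    "D4HVNXRRANR4YRGQ": {
        "addr": "127.0.0.1:8046",
        "applied": "574",
        "hostname": "precise64",
        "version": "0.9.0-alpha",
        "writable": "true",
    },
}


def partially_evicted(cal, nodes):
    """Return node IDs that hold no cal slot and have lost their addr entry."""
    active = {node_id for node_id in cal.values() if node_id}
    return sorted(
        node_id
        for node_id, info in nodes.items()
        if node_id not in active and "addr" not in info
    )


def empty_slots(cal):
    """Return the cal slot numbers that have no node assigned."""
    return sorted(slot for slot, node_id in cal.items() if not node_id)


if __name__ == "__main__":
    print("partially evicted:", partially_evicted(cal, nodes))
    print("empty cal slots:", empty_slots(cal))
```

For the snapshot above, this flags 4MMJNJ76M5IDQSBQ (the node on 8047) and reports slot 1 as empty, which matches the observation in this report.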
