Document how to simulate a power loss failure #667

Open
@lni


In a power loss failure, all written data that has not been fsynced to persistent storage is lost. The chaos testing framework should be able to test whether MatrixCube behaves correctly in the presence of such power loss failures.

As discussed offline, such a power loss failure can be simulated as follows. First, cut network communication to isolate the local node from the outside world, so that it can no longer affect any other node. Then set the ignore fsync flag of the vfs (by calling fs.SetIgnoreSyncs(true)), so that subsequent fsync() operations no longer persist anything to the underlying storage device. A random amount of wait time (i.e. a sleep) can be inserted at this point to accumulate some un-fsynced writes. After cube is stopped, call fs.ResetToSyncedState() to discard all written contents that were not fsynced, followed by a call to fs.SetIgnoreSyncs(false) to reset the ignore fsync flag.

Provide demo code to show how this works.
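
As a starting point, the sequence described above might look roughly like the sketch below. It is not MatrixCube code: memFS and node are hypothetical stand-ins written only for this illustration, modelling just the two vfs calls named in this issue (SetIgnoreSyncs and ResetToSyncedState) plus toy Write, Sync and Read helpers. The random sleep is replaced by a callback that performs the doomed writes, so the outcome is deterministic.

```go
package main

import "fmt"

// memFS is a hypothetical, minimal stand-in for the vfs. It only models
// the behaviour discussed in this issue and is not the real implementation.
type memFS struct {
	pending     map[string][]byte // everything written so far
	synced      map[string][]byte // contents that would survive a power loss
	ignoreSyncs bool
}

func newMemFS() *memFS {
	return &memFS{pending: map[string][]byte{}, synced: map[string][]byte{}}
}

// Write appends data to the named file without making it durable.
func (fs *memFS) Write(name string, data []byte) {
	fs.pending[name] = append(fs.pending[name], data...)
}

// Sync makes the current contents durable unless syncs are being ignored.
func (fs *memFS) Sync(name string) {
	if fs.ignoreSyncs {
		return
	}
	fs.synced[name] = append([]byte(nil), fs.pending[name]...)
}

func (fs *memFS) SetIgnoreSyncs(ignore bool) { fs.ignoreSyncs = ignore }

// ResetToSyncedState discards every write that was not synced.
func (fs *memFS) ResetToSyncedState() {
	fs.pending = map[string][]byte{}
	for name, data := range fs.synced {
		fs.pending[name] = append([]byte(nil), data...)
	}
}

func (fs *memFS) Read(name string) string { return string(fs.pending[name]) }

// node is a hypothetical stand-in for a running cube instance.
type node struct {
	fs        *memFS
	connected bool
	running   bool
}

// simulatePowerLoss follows the steps from the issue text in order.
// doomedWrites stands in for the random sleep during which writes pile up.
func simulatePowerLoss(n *node, doomedWrites func()) {
	n.connected = false        // 1. cut network communication
	n.fs.SetIgnoreSyncs(true)  // 2. fsync no longer persists anything
	doomedWrites()             // 3. accumulate un-fsynced writes
	n.running = false          // 4. stop cube
	n.fs.ResetToSyncedState()  // 5. drop everything that was not fsynced
	n.fs.SetIgnoreSyncs(false) // 6. restore normal fsync behaviour
}

func main() {
	fs := newMemFS()
	n := &node{fs: fs, connected: true, running: true}

	fs.Write("wal", []byte("a"))
	fs.Sync("wal") // durable

	simulatePowerLoss(n, func() {
		fs.Write("wal", []byte("b"))
		fs.Sync("wal") // ignored, so "b" is lost
	})

	fmt.Println(fs.Read("wal")) // prints "a"
}
```

In a real chaos test, the callback would presumably be replaced by a random sleep while the workload keeps running, and the node would be restarted on the same fs afterwards to check that it recovers from the synced state only.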
