Create formalized policy on default format version changes #709

Open
@alecgrieser

As a bit of background, each FDBRecordStore has a "format version" that controls certain aspects of its on-disk behavior. Some format version changes are relatively benign: for example, if we add a field to the store header, then a new format version is introduced to protect against writing data to that store header field and then having older versions of the code ignore it. (Which, in some circumstances, may lead to them doing the wrong thing.) Other changes are relatively major. For example, one of the format versions changes whether each record's "version" is stored in a separate subspace or next to the record. Consequently, some of the versions can be upgraded to essentially for free, and some of them require migrating existing data (which has some cost associated with it).

Importantly, each FDBRecordStore object at build time is given a format version as determined by the setFormatVersion method on the builder. If none is set, then the default is used. Then when the record store is opened, it will check the format version of the store already on disk. It will then choose the maximum of whatever the current format version is on disk and the format version that was used to configure the record store and upgrade the record store to that version if necessary. (This means it never downgrades record stores, and it upgrades record stores when it sees them.)
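The open-time rule in the previous paragraph can be sketched as a small pure function. This is a hypothetical model for illustration only: the class and method names below are made up and are not the Record Layer's actual implementation.

```java
// Hypothetical model of the open-time rule described above.
// None of these names come from the Record Layer itself.
final class FormatVersionRule {
    /** Version a store is at after being opened: the max of on-disk and configured. */
    static int versionAfterOpen(int onDiskVersion, int configuredVersion) {
        return Math.max(onDiskVersion, configuredVersion);
    }

    /** True if opening the store would rewrite it at a newer format version. */
    static boolean upgradesOnOpen(int onDiskVersion, int configuredVersion) {
        return configuredVersion > onDiskVersion;
    }

    public static void main(String[] args) {
        // Configured version is newer than the store: the store is upgraded.
        System.out.println(versionAfterOpen(4, 6));
        // Configured version is older than the store: the store is never downgraded.
        System.out.println(versionAfterOpen(6, 4));
    }
}
```

The second call is the "never downgrades" case: an instance configured with an older version still ends up operating on the store at its newer on-disk version.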

At the moment, our default format version is pinned to the maximum supported version:

public static final int DEFAULT_FORMAT_VERSION = MAX_SUPPORTED_FORMAT_VERSION;

This means that a user who relies on the default format version can run into problems when upgrading from one version of the Record Layer to the next. If they have multiple instances and upgrade in a rolling fashion (which is fairly standard), the deploy can go wrong if:

  1. One of the upgraded instances reads a store and upgrades it to the default/newest format version.
  2. Another instance that has not yet been upgraded opens that store and sees it is at a format version that is newer than its MAX_SUPPORTED_FORMAT_VERSION.
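The two steps above can be replayed in a toy simulation. Everything here is hypothetical: the `Instance` class, the version numbers 5 and 6, and the choice to model "sees a version newer than it supports" as an `IllegalStateException` are assumptions made for the sketch, not a description of the library's actual failure mode.

```java
// Toy simulation of the rolling-deploy hazard. All names and numbers are illustrative.
final class RollingDeployDemo {
    static final class Instance {
        final int maxSupported; // newest format version this build understands
        final int configured;   // format version passed to the builder (or the default)

        Instance(int maxSupported, int configured) {
            this.maxSupported = maxSupported;
            this.configured = configured;
        }

        /** Opens a store and returns the format version it is at afterwards. */
        int open(int onDiskVersion) {
            if (onDiskVersion > maxSupported) {
                // Assumption: an unreadable store is modeled as an exception.
                throw new IllegalStateException("store is at format version " + onDiskVersion
                        + " but this instance supports at most " + maxSupported);
            }
            return Math.max(onDiskVersion, configured);
        }
    }

    public static void main(String[] args) {
        Instance oldInstance = new Instance(5, 5); // not yet upgraded
        Instance newInstance = new Instance(6, 6); // default == max supported
        int onDisk = 5;

        onDisk = newInstance.open(onDisk);         // step 1: store is upgraded
        try {
            oldInstance.open(onDisk);              // step 2: old build cannot read it
        } catch (IllegalStateException e) {
            System.out.println("old instance failed: " + e.getMessage());
        }
    }
}
```

If the new instance were instead constructed as `new Instance(6, 5)`, that is, with the format version pinned by the user, the store would stay at version 5 and the old instance could still open it.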

This is bad, and we should probably adjust our defaults so that it isn't the default behavior. There is a way around it, which is for the user to set the format version used by each store, but that is something a user has to be proactive about.

There are a few ways we could get around this. I think there are roughly two ideas that I've seen when talking to people about this:

  • Start requiring users to set a format version. In other words, if they haven't called .setFormatVersion explicitly by the time .build is called, then the FDBRecordStore should fail to build. This forces users to think about what they are doing and to explicitly choose when to upgrade. The downside is that this might encourage users to refrain from upgrading even as we add new features, and it also increases the learning curve for new users, as they might not know what a good format version for them should be. (And possibly the worst case would be for them to set it to the MAX_SUPPORTED_FORMAT_VERSION constant, and then we're in the same boat as now.) The upside is that it encourages users to do the "right thing" from the beginning, and there may be ways to let users who are new or don't care still get a sensible default: we could continue to have a DEFAULT_FORMAT_VERSION constant that we have users use, but make certain guarantees about "how long we will wait before upgrading" it (more on that below).
  • Add a policy that states that we only update the default format version after it has been around for n minor versions. If n is two, then this means that users can avoid this problem by never skipping a minor version, which isn't unheard of in distributed databases. (In this case, let's say we introduce a new format version in version x.y, then we make it the default in x.(y+2). When upgrading from x.y to x.(y+1), all stores stay at the old version, but after the upgrade, they can all read the new format version. Then when the upgrade from x.(y+1) to x.(y+2) happens, then the stores will start getting upgraded.) The downside here is that as some of the format version changes are "expensive" (or more expensive than the usual path), a user might deploy something and then be surprised when the upgrade itself starts triggering extra work. The upside is that it means users only have to follow a simple rule that they may be accustomed to anyway, and it doesn't require any work on their end that touches their code. In theory, the first option could be "combined" with this in that we set the DEFAULT_FORMAT_VERSION constant using this rule, but require the user set some format version, with the default an option available for those who want to (eyes wide open) choose to let us do it.
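To make the two options concrete, here is a rough sketch of how a lagged default could be computed, plus a check that rejects an unset format version. The release history (format versions 4, 5, and 6 introduced in minor releases 0, 10, and 12) is invented for the example, and every name in the class is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of the "default lags by n minor versions" policy. Illustrative only.
final class DefaultVersionPolicy {
    /** Hypothetical history: format version -> minor release that introduced it. */
    static final Map<Integer, Integer> INTRODUCED_IN = new TreeMap<>();
    static {
        INTRODUCED_IN.put(4, 0);
        INTRODUCED_IN.put(5, 10);
        INTRODUCED_IN.put(6, 12);
    }

    /** Newest format version introduced in or before the given minor release, or -1. */
    static int newestIntroducedBy(int minor) {
        int newest = -1;
        for (Map.Entry<Integer, Integer> e : INTRODUCED_IN.entrySet()) {
            if (e.getValue() <= minor) {
                newest = Math.max(newest, e.getKey());
            }
        }
        return newest;
    }

    /** Newest format version that release x.minor can read. */
    static int maxSupported(int minor) {
        return newestIntroducedBy(minor);
    }

    /** Second option: the default only includes versions that are at least `lag` releases old. */
    static int defaultVersion(int minor, int lag) {
        return newestIntroducedBy(minor - lag);
    }

    /** First option: refuse to build a store when no format version was set. */
    static int requireExplicit(Integer configured) {
        if (configured == null) {
            throw new IllegalStateException("format version must be set explicitly");
        }
        return configured;
    }

    public static void main(String[] args) {
        int lag = 2;
        for (int minor = 12; minor <= 14; minor++) {
            System.out.println("x." + minor + ": default=" + defaultVersion(minor, lag)
                    + ", maxSupported=" + maxSupported(minor));
        }
    }
}
```

With a lag of 2 in this made-up history, format version 6 (introduced in x.12) becomes the default only in x.14, by which point x.13 instances can already read it. Combining the options, as suggested above, would amount to users passing the result of `defaultVersion` to the builder deliberately rather than getting it implicitly.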

There are also some variations:

  • A different default format version for new stores. This doesn't solve the "how does one avoid writing format versions that other instances can't read" problem, but it might solve the "how does one avoid accidentally upgrading a store" problem. Existing stores could be configured to not upgrade, which is safe, though possibly non-optimal, while new stores (where upgrading is free, as they are not really upgrading from anything) would be created at the newer version. It's possible we'd need to combine, say, the second option with this one, but we'd be less concerned about accidentally causing people to upgrade things on a new deployment.
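A minimal sketch of this variation, assuming the store opener can tell whether a store already exists and takes two separate settings. The parameter names `upgradeTarget` and `newStoreVersion` are invented for the example.

```java
// Sketch of a separate default for newly created stores. Names are hypothetical.
final class NewStoreDefaultRule {
    /**
     * Returns the format version a store is at after being opened or created.
     * Existing stores only move up to upgradeTarget; new stores start at newStoreVersion.
     */
    static int versionAfterOpen(boolean storeExists, int onDiskVersion,
                                int upgradeTarget, int newStoreVersion) {
        if (!storeExists) {
            return newStoreVersion; // nothing to migrate, so the newer format is free
        }
        return Math.max(onDiskVersion, upgradeTarget);
    }

    public static void main(String[] args) {
        // Existing store at 4, upgrades held back at 4, new stores created at 6.
        System.out.println(versionAfterOpen(true, 4, 4, 6));
        System.out.println(versionAfterOpen(false, 0, 4, 6));
    }
}
```

Note that a new store created at the newer version is still unreadable by instances that do not support that version, which is why this variation does not address the rolling-deploy problem on its own.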

There seems to be some amount of consensus that, for now, we should stop updating the default format version to avoid those problems, but we kind of need to pick something going forward. Opinions welcome!
