DOC-12489 magma default storage engine #3813
base: release/8.0
* If you have deployment scripts that create buckets without specifying the storage engine, those scripts create Magma buckets with 128 vBuckets instead of Couchstore buckets after the upgrade. | ||
This may affect your deployment if you depend on buckets using the Couchstore storage engine. | ||
* You cannot use XDCR to replicate Magma buckets using 128 vBuckets with pre-8.0 clusters. | ||
XDCR in pre-8.0 clusters only supports replication between buckets that contain the same number of vBuckets. |
This comment is for lines 95 and 96.
I think that this level of detail is good for What's New, but the technical detail is:
You can use XDCR to replicate from a 128-vBucket bucket on 8.0 to a pre-8.0 cluster, because 8.0 XDCR supports creating a replication between buckets with different numbers of vBuckets. In XDCR, the source creates the replications. But I think that "You cannot use XDCR to replicate Magma buckets using 128 vBuckets with pre-8.0 clusters" is fine for What's New, since most people would want to replicate from an earlier version to the later version or bi-directionally, so it gets the most important point across.
|
||
Another concern is that versions of Couchbase Server earlier than 8.0 do not support XDCR replication between buckets with different numbers of vBuckets. | ||
Therefore, you cannot replicate between a bucket you create with the new default backend setting and buckets on an earlier server version. | ||
To able to replicate with a bucket on an earlier version of Couchbase Server, explicitly set the new bucket's storage backend to Couchstore or to Magma with 1024 vBuckets during creation. |
Typo -- missing a word -- "To able to" should be "To be able to".
These behavior changes could cause issues if you rely on the prior behavior, especially if you use deployment scripts. | ||
If you have deployment scripts that create buckets, review them to determine if you need to make changes. | ||
|
||
For example, suppose your deployment script does not specify the storage backend when it creates a bucket that you intend to use with the xref:views/views-mapreduce-intro.adoc[] feature. |
Since we link to the MapReduce Views page here, could that page carry the same note as the Views intro page, saying that Views are deprecated and will be removed in a future version?
Link to Views intro page -- https://docs.couchbase.com/server/current/learn/views/views-intro.html
|
||
[abstract] | ||
{description} | ||
It is important to understand which backend storage is best suited to your requirements. | ||
These storage engines organize the data both on disk ad in memory. |
Typo: organize the data both on disk ad in memory
ad should be and
|
||
Magma can work with very low amounts of memory for large datasets: a minimum memory-to-data ratio of 1% is required. | ||
For example, if a node is holding 5{nbsp}TB of data, Magma can be used with only 64{nbsp}GB RAM. | ||
Magma using 1024 vBuckets has a minimum memory quota of 1{nbsp}GiB per node. |
I liked the "Which Storage Engine Should You Use?", but since that section is further down in the page, I'm thinking that this may be a good place to just note here:
If you can allocate at least 1 GiB memory per node to your bucket, you should choose the 1024 vBucket option for Magma as you will get better performance at scale.
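To make the 1% memory-to-data ratio in the diff above concrete, here is a minimal sketch of the arithmetic. The function names are hypothetical, and the sketch only models the ratio quoted in the diff (decimal TB/GB units assumed); it does not model per-node minimum quotas.

```python
# Decimal units, matching the TB/GB figures quoted in the diff.
GB = 10**9
TB = 10**12


def min_magma_memory(data_bytes: int, ratio_percent: int = 1) -> int:
    """Smallest memory size in bytes that satisfies the memory-to-data ratio."""
    # Integer arithmetic avoids floating-point rounding on large byte counts.
    return data_bytes * ratio_percent // 100


def meets_ratio(memory_bytes: int, data_bytes: int, ratio_percent: int = 1) -> bool:
    """True if the given memory satisfies the ratio for the given data size."""
    return memory_bytes >= min_magma_memory(data_bytes, ratio_percent)


required = min_magma_memory(5 * TB)
print(required // GB)                # 50
print(meets_ratio(64 * GB, 5 * TB))  # True
```

Under these assumptions, 5 TB of data needs at least 50 GB of memory, so the 64 GB figure in the example clears the 1% minimum with some headroom.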
@@ -1,5 +1,5 @@ | |||
= vBuckets | |||
:description: pass:q[_vBuckets_ are virtual buckets that help distribute data effectively across a cluster, and support replication across multiple nodes.] | |||
:description: pass:q[vBuckets are virtual buckets that break bucket data into smaller pieces to make distributing data the cluster and replicating data across multiple nodes easier.] |
Typo?
... that break bucket data into smaller pieces to make distributing data the cluster and replicating data across multiple nodes easier.
... to make distributing data in the cluster ... ?
|
||
[#vbucket_to_node_mapping] | ||
image::buckets-memory-and-storage/vbucketToNodeMapping.png[,820,align=left] | ||
|
||
Thus, an authorized client attempting to access data performs a hash operation on the appropriate key, and thereby calculates the number of the vBucket that owns the key. | ||
The client then examines the vBucket map to determine the server-node on which the vBucket resides; and finally performs its operation directly on that server-node. | ||
When accessing data via keys, a client hashes the key to calculate which the vBucket contains the key. |
Typo: ... which the vBucket contains the key.
... which vBucket contains the key.
The client then examines the vBucket map to determine the server-node on which the vBucket resides; and finally performs its operation directly on that server-node. | ||
When accessing data via keys, a client hashes the key to calculate which the vBucket contains the key. | ||
It checks the vBucket map it got from the Cluster Manager to find the node containing the active vBucket. | ||
The then client directly connects to the node to read or modify the data. |
Typo: The then client
Then the client
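The key-to-node lookup described in this hunk can be sketched as follows. This is an illustration only: the CRC32-based hash is an assumption about how client libraries map keys to vBuckets, and the `vbucket_map` list (index = vBucket number, value = node holding the active vBucket) and node names are hypothetical stand-ins for the map a client gets from the Cluster Manager.

```python
import zlib


def vbucket_for_key(key: bytes, num_vbuckets: int) -> int:
    """Hash a document key to a vBucket number (assumed CRC32-based scheme)."""
    digest = (zlib.crc32(key) >> 16) & 0x7FFF
    return digest % num_vbuckets


def node_for_key(key: bytes, vbucket_map: list) -> str:
    """Look up the node that holds the active vBucket for a key."""
    vbucket = vbucket_for_key(key, len(vbucket_map))
    return vbucket_map[vbucket]


# Hypothetical map: 128 active vBuckets spread over three nodes.
vbucket_map = [f"node{i % 3}.example.com" for i in range(128)]

key = b"airline_10"
print(vbucket_for_key(key, len(vbucket_map)), node_for_key(key, vbucket_map))
```

Because the hash depends only on the key and the vBucket count, every client computes the same vBucket for a given key; only the map changes when vBuckets move between nodes.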
|
||
Buckets organize their documents into xref:learn:data/scopes-and-collections.adoc[Scopes and Collections]. | ||
Scopes and collections do not affect the way in which keys are allocated to vBuckets. | ||
However, each vBucket is aware of the scope and collection containing of each of its keys. |
Typo: ... containing of each of its keys.
...containing each of its keys.
=== Active and Replica vBuckets | ||
|
||
The vBuckets that Couchbase Server uses to access and store data in a bucket are called active vBuckets. | ||
If you enable replication for a bucket, each replica uses another set vBuckets, called replica vBuckets. |
Typo:
If you enable replication for a bucket, each replica uses another set __ vBuckets, called replica vBuckets.
If you enable replicas for a bucket, each replica uses another set of vBuckets, called replica vBuckets.
The active vBucket and its replicas are always on different nodes in the cluster to protect against data loss from node failovers. | ||
|
||
For example, suppose you have a Magma bucket configured with 1024 vBuckets and two replicas on Linux. | ||
Then, Couchbase Server has have a total of 3072 vBuckets distributed across the cluster for the bucket. |
Typo: Then, Couchbase Server has have a total ...
Then, Couchbase Server has a total ...
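The total in this example follows from one set of active vBuckets plus one additional set per replica. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
def total_vbuckets(active_vbuckets: int, replicas: int) -> int:
    """Active vBuckets plus one full set of replica vBuckets per replica."""
    return active_vbuckets * (1 + replicas)


print(total_vbuckets(1024, 2))  # 3072
print(total_vbuckets(128, 2))   # 384
```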
@@ -0,0 +1,3 @@ | |||
NOTE: Versions of Couchbase Server before 8.0 do not support XDCR replication between buckets with different numbers of vBuckets. | |||
They also do not support Magma buckets with 128 vBuckets. | |||
Due to both these limitations, you cannot replicate a Magma bucket with 128 vBuckets on an 8.0 or later cluster to a bucket on a pre-8.0 cluster. |
Due to both these limitations, you cannot replicate a bucket from a pre-8.0 cluster to a Magma bucket with 128 vBuckets on an 8.0 or later cluster. While, technically, you can replicate a Magma bucket with 128 vBuckets on an 8.0 or later cluster to a bucket on a pre-8.0 cluster, you should avoid such scenarios since bi-directional replication will not be possible.
cc @nelio2k (Neil, please comment if the above is not correct. I think that it's better to go into a bit more detail than to just tell customers that they cannot do XDCR between 8.0+ Magma 128 vBucket buckets and pre-8.0 cluster buckets -- since it's possible that customers may see these behaviors for themselves and get confused.)
Sure. Sounds good
Nice!
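The rule discussed in this thread can be written down as a small check. This is a sketch of the behavior as described in the comments above (the source cluster creates the replication, and only 8.0 or later sources can replicate between different vBucket counts), not a statement of how XDCR is implemented. The function names and version tuples are hypothetical.

```python
def xdcr_supported(source_version: tuple, source_vbuckets: int, target_vbuckets: int) -> bool:
    """One-directional check: can the source replicate to the target?"""
    if source_vbuckets == target_vbuckets:
        return True
    # Per the review discussion: mismatched vBucket counts need an 8.0+ source.
    return source_version >= (8, 0)


def bidirectional_supported(version_a: tuple, vbuckets_a: int,
                            version_b: tuple, vbuckets_b: int) -> bool:
    """Both directions must be possible for bi-directional replication."""
    return (xdcr_supported(version_a, vbuckets_a, vbuckets_b)
            and xdcr_supported(version_b, vbuckets_b, vbuckets_a))


# 8.0 Magma bucket with 128 vBuckets and a pre-8.0 bucket with 1024 vBuckets.
print(xdcr_supported((8, 0), 128, 1024))                    # True
print(xdcr_supported((7, 6), 1024, 128))                    # False
print(bidirectional_supported((8, 0), 128, (7, 6), 1024))   # False
```

The one-way case from 8.0 to a pre-8.0 cluster passes the check, which matches the comment that it is technically possible but best avoided because the reverse direction fails.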
|
||
|
||
[#xdcr-migration] | ||
== XDCR Storage Backend Migration |
This section (XDCR Storage Backend Migration) looks good to me, but I would like Neil Huang, the XDCR eng team manager, to take a look as well.
cc @nelio2k
LGTM
|
||
+ | ||
This change in the default storage engine does not affect existing buckets. | ||
You can still create buckets that use the Couchstore storage engine or the Magma st5orage engine with 1024 vBuckets by explicitly specifying them when you create the bucket. |
Typo: st5orage instead of storage
@@ -46,6 +46,7 @@ curl -X POST -u <administrator>:<password> | |||
-d bucketType=[ couchbase | ephemeral | memcached ] |
bucketType = memcached won't exist in 8.0
(I am guessing that these will go away when the documentation changes for the removal of memcached bucket types are merged.)
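For deployment scripts affected by the new default, a hedged sketch of building a bucket-creation form body that names the storage engine explicitly instead of relying on the default. Only `bucketType` appears in the hunk above; the other parameter names (`name`, `ramQuota`, `storageBackend`, `numVBuckets`) are assumptions about the REST API and should be checked against the bucket-creation reference. The helper itself is hypothetical and makes no network call.

```python
from urllib.parse import urlencode


def bucket_create_form(name, ram_quota_mb, storage_backend=None, num_vbuckets=None):
    """Build a URL-encoded form body for a bucket-creation request.

    Parameter names other than bucketType are assumed, not taken from the diff.
    """
    params = {"name": name, "bucketType": "couchbase", "ramQuota": ram_quota_mb}
    if storage_backend is not None:
        if storage_backend not in ("couchstore", "magma"):
            raise ValueError("storage_backend must be 'couchstore' or 'magma'")
        params["storageBackend"] = storage_backend  # assumed parameter name
    if num_vbuckets is not None:
        if num_vbuckets not in (128, 1024):
            raise ValueError("num_vbuckets must be 128 or 1024")
        params["numVBuckets"] = num_vbuckets  # assumed parameter name
    return urlencode(params)


# Explicitly request Couchstore rather than accepting the new default.
print(bucket_create_form("travel", 1024, storage_backend="couchstore"))
# name=travel&bucketType=couchbase&ramQuota=1024&storageBackend=couchstore
```

Leaving `storage_backend` unset omits the field entirely, which is the case where a script silently picks up whatever default the server version applies.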
This PR covers the impact of MB-62777 making Magma with 128 vBuckets the default storage backend.
It also tackles some of the work for DOC-12778 (Data Settings guidance for reader/writer threads change to 'disk i/o optimized' needs to be revised) because it was in the same area of the doc being updated anyhow. The other areas of that ticket are being handled in the DOC-12485 (prevent bucket from running out of space) PR.
The following pages were updated for this PR (links lead to the preview site. Here's the username/password for the preview)