This changeset adds basic time-series support.
Because the verifier depends on `_id`, this verifies buckets rather than logical measurements. This means the migration must also copy time-series collections via buckets, since a logical replication would not preserve artifacts like bucket `_id`, and possibly not even the grouping of measurements.
Because of the bucket-level verification, details in mismatch reports are not very useful for time-series because they reference bucket-level fields that the logical API doesn’t expose.
This works with per-shard verification (i.e., it can verify with or without a view). It _does not_ currently support namespace filtering.
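To illustrate the `_id` dependency described above, here is a minimal Python sketch of an `_id`-keyed comparison. It is not the verifier's actual code: the `digest` and `compare_by_id` helpers, the simplified bucket shape, and the `b1`/`b9` identifiers are all hypothetical, chosen only to show why a raw bucket copy matches while a logical copy with regenerated bucket `_id`s does not.

```python
import hashlib
import json


def digest(doc):
    """Hash a document's canonical JSON form (a stand-in for hashed comparison)."""
    payload = json.dumps(doc, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def compare_by_id(source_docs, destination_docs):
    """Pair documents by `_id` and report what fails to line up.

    Assumes each `_id` is hashable and sortable (plain strings here).
    """
    src = {doc["_id"]: digest(doc) for doc in source_docs}
    dst = {doc["_id"]: digest(doc) for doc in destination_docs}
    shared = src.keys() & dst.keys()
    return {
        "missing_on_destination": sorted(src.keys() - dst.keys()),
        "extra_on_destination": sorted(dst.keys() - src.keys()),
        "content_mismatch": sorted(i for i in shared if src[i] != dst[i]),
    }


# Hypothetical, heavily simplified buckets; real buckets carry more fields.
source = [{"_id": "b1", "meta": "sensorA", "data": {"temp": [20, 21]}}]

# Raw bucket copy: the bucket `_id` is preserved.
raw_copy = [{"_id": "b1", "meta": "sensorA", "data": {"temp": [20, 21]}}]

# Logical copy: same measurements, but the destination made its own bucket.
logical_copy = [{"_id": "b9", "meta": "sensorA", "data": {"temp": [20, 21]}}]

raw_report = compare_by_id(source, raw_copy)
logical_report = compare_by_id(source, logical_copy)
```

In this sketch `raw_report` contains no discrepancies, while `logical_report` flags `b1` as missing and `b9` as extra even though the measurements are identical. The reported identifiers are bucket-level, which is why such details are of little use to someone who only sees the logical measurements.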
README.md (10 additions & 2 deletions)

@@ -357,8 +357,16 @@ Additionally, because the amount of data sent to migration-verifier doesn’t ac
 
 - If the server’s memory usage rises after generation 0, try reducing `recheckMaxSizeMB`. This will shrink the queries that the verifier sends, which in turn should reduce the server’s memory usage. (The number of actual queries sent will rise, of course.)
 
+## Time-Series Collections
+
+Because the verifier compares documents by `_id`, it cannot compare logical time-series measurements (i.e., the data that users actually insert). Instead it compares the server’s internal time-series “buckets”. Unfortunately, this makes mismatch details essentially useless with time-series since they will be details about time-series buckets, which users generally don’t see.
+
+It also requires that migrations replicate the raw buckets rather than the logical measurements. This is because a logical migration would cause `_id` mismatches between source & destination buckets. A user application wouldn’t care (since it never sees the buckets’ `_id`s), but verification does.
+
+NB: Given bucket documents’ size, hashed document comparison can be especially useful with time-series.
+
 # Limitations
 
-- The verifier’s iterative process can handle data changes while it is running, until you hit the writesOff endpoint. However, it cannot handle DDL commands. If the verifier receives a DDL change stream event, the verification will fail.
+- The verifier’s iterative process can handle data changes while it is running, until you hit the writesOff endpoint. However, it cannot handle DDL commands. If the verifier receives a DDL change stream event from the source, the verification will fail permanently.
 
-- The verifier crashes if it tries to compare time-series collections. The error will include a phrase like “Collection has nil UUID (most probably is a view)” and also mention “timeseries”.
+- The verifier cannot verify time-series collections under namespace filtering.