-
Notifications
You must be signed in to change notification settings - Fork 462
prevent offline all system tables #5433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|| tableName.equals(AccumuloTable.ROOT.tableName())) { | ||
throw new AccumuloException("Cannot set table to offline state"); | ||
} | ||
NOT_BUILTIN_TABLE.validate(tableName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, this would have thrown an AccumuloException
. Now it throws an IllegalArgumentException
. This is user-facing client side behavior. I think IllegalArgumentException
is okay here, but I would be curious to know what the 2.1 behavior did before merging this to determine if it's an acceptable change in behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but I would be curious to know what the 2.1 behavior did before merging this to determine if it's an acceptable change in behavior.
Could add an IT in 2.1 that attempts to take a system table offline and see what the exception is. Then run the same IT w/ this PR and see what the exception is. See what comes across thrift in both cases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IllegalArgumentException
is the same exception and message a user would get if they tried to delete
a system table, so assumed it was a fine replacement. Edit: double checked, delete
throws an AccumuloException
but same message
Tested 2.1... Offlining a system table is an acceptable operation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can target 2.1 with these changes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did some more digging. The difference in Exceptions is where they are thrown, if validated in thrift (FateServiceHandler
) will get the AccumuloException
. So can move this validation from TableOperationsImpl
to FateServiceHandler
, where most of the table validation is done.
As for what ops are/aren't acceptable on system tables, in 2.1, can online/offline
any table that's not the ROOT table, can deleteRows
of any table that's not the METADATA table, and can bulkImportv2
any table that's not the ROOT table. The other ops like delete
, create
, etc. ensure that it's not a built in table (which is just META and ROOT in 2.1, I believe)
These were the 3 that stood out as weird to me. Seems like these should be all system tables, not just META or ROOT.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this could be optimized using the AccumuloTable
class, and have a contains
method on it? To see if it's a built-in system table in the accumulo
namespace. I also saw that there was a possibility that the scan ref table could actually be deleted, so it may not be safe to assume these tables are always existing. Since this is just an optimization, to short-circuit a check for built-in tables that should always exist, then it might be okay to just check these metadata tables and do the longer check for everything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding something to AccumuloTable
seems good to me.
I also saw that there was a possibility that the scan ref table could actually be deleted
I don't believe this can be done:
accumulo/server/manager/src/main/java/org/apache/accumulo/manager/FateServiceHandler.java
Lines 373 to 377 in 8353f3e
case TABLE_DELETE: { | |
TableOperation tableOp = TableOperation.DELETE; | |
validateArgumentCount(arguments, tableOp, 1); | |
String tableName = | |
validateName(arguments.get(0), tableOp, EXISTING_TABLE_NAME.and(NOT_BUILTIN_TABLE)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding something to
AccumuloTable
seems good to me.I also saw that there was a possibility that the scan ref table could actually be deleted
I don't believe this can be done:
accumulo/server/manager/src/main/java/org/apache/accumulo/manager/FateServiceHandler.java
Lines 373 to 377 in 8353f3e
case TABLE_DELETE: { TableOperation tableOp = TableOperation.DELETE; validateArgumentCount(arguments, tableOp, 1); String tableName = validateName(arguments.get(0), tableOp, EXISTING_TABLE_NAME.and(NOT_BUILTIN_TABLE));
Okay, that will ensure we can't do it in our API. Where I saw it, TableManager was being called directly, to delete its ZooKeeper contents. I guess that's not an issue from the public API or Fate RPC because of what you linked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. Yeah the concern is with the public API.
I'm still not entirely sure what should be done for this. Should probably make changes to both 2.1 and 4.0. The main concerns are:
- What ops are/aren't acceptable on system tables.
in 2.1, can online/offline any table that's not the ROOT table, can deleteRows of any table that's not the METADATA table, and can bulkImportv2 any table that's not the ROOT table... Seems like these should be all system tables, not just META or ROOT.
I'm wondering if it would be better to move all validation for the table names (e.g., ensuring it's not a system table, ensuring the table exists, doesn't exists, etc.) into TableOperationsImpl. Seems like some things are validated in TableOperationsImpl and some are validated in FateServiceHandler or both, which is a bit confusing. If we always validate in TableOperationsImpl we can avoid the RPC.
If FateServiceHandler.executeFateOperation
is called somewhere else other than table operations, this may not work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a separate discussion, I had mentioned to you directly @ctubbsii in regards to
in 2.1, can online/offline any table that's not the ROOT table, can deleteRows of any table that's not the METADATA table, and can bulkImportv2 any table that's not the ROOT table.
That I was wrong about offline
here, but I was thinking of deleteRows
. So here is the corrected statement:
in 2.1, can online/offline any table that's not the ROOT table, can deleteRows of any table that's not a METADATA table (METADATA, ROOT) (so this is correct in 2.1), and can bulkImportV2 to any table that's not the ROOT table
Since you shouldn't be able to offline
any system table through the table ops API, will change this PR to target 2.1.
This PR only pertains to offline
so discussion around the other ops will be pushed to other tickets/PRs most likely as discussion in the PR I make for #5427
Previously, could offline any table that was not ROOT. This commit prevents offline of any system table
6ab9108
to
0c6677e
Compare
I've updated this PR to target 2.1, as could offline METADATA in 2.1. Discussion around other potential issues with table operations has been pushed to #5474 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes the semantics of the RPC arguments in a bugfix release. It also too aggressively prevents all built-in tables from going offline/online. This prevents the correct behavior of the replication table in 2.1, which by default is offline, but can be turned online to enable replication. While this feature is deprecated and is removed in 3.0 and later, this completely breaks it in 2.1, since it is a built-in table that must be able to be turned online/offline on demand.
@@ -1463,7 +1463,7 @@ public void offline(String tableName, boolean wait) | |||
EXISTING_TABLE_NAME.validate(tableName); | |||
|
|||
TableId tableId = context.getTableId(tableName); | |||
List<ByteBuffer> args = Arrays.asList(ByteBuffer.wrap(tableId.canonical().getBytes(UTF_8))); | |||
List<ByteBuffer> args = Arrays.asList(ByteBuffer.wrap(tableName.getBytes(UTF_8))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This changes the semantics of the RPC argument from that of a tableId to that of a tableName. This makes this RPC incompatible across a bugfix release, preventing rolling upgrades, which is something we generally support for bugfix releases.
final TableId tableId = | ||
ClientServiceHandler.checkTableId(manager.getContext(), tableName, tableOp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tableId was already resolved on the client side. Resolving it a second time here, makes it possible for the client side to refer to a different table than that of the server side if some table renames/creates happened concurrently. It also creates an extra lookup that shouldn't be necessary.
The current code using the table ID was fine, but just needed to verify that the TableId didn't refer to the metadata table. Other builtin tables may still be available to go online/offline.
@ctubbsii - Thanks for the info. I was unaware of the replication table - thought the only built in tables in 2.1 were I avoid any RPC changes now, and added a new Validator. Thinking when this is merged into 4.0 will have to change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks correct to me, and I think changing to prevent all of the tables in the accumulo namespace when merging to main is sensible.
In 2.1: * Previously, could offline any table that was not ROOT. * This commit prevents offlining all system tables except for the replication table (this includes METADATA and ROOT). Does not prevent offlining all system tables in 2.1, as offlining the replication table is a desired feature. In main/4.x: * Previously, could offline any table that was not ROOT or METADATA. * This commit prevents offlining all system tables (this currently includes METADATA, ROOT, FATE, and SCAN_REF). The replication table is removed in 4.x, and we want to prevent offlining any of the currently existing system tables in 4.x.
noticed when working on #5427