Skip to content

MDEV-36554 Assertion `is_wsrep() == wsrep_on(mysql_thd)' failed #508

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: 11.4-retry-applying
Choose a base branch
from

Conversation

plampio
Copy link

@plampio plampio commented Jun 13, 2025

       in void trx_t::commit_in_memory(const mtr_t*)

This bug fix prevents debug assertion failure when Galera feature retry applying is enabled. This patch introduces also a general mechanism for temporarily disabling some debug assertions in InnoDB code.

The unwanted debug assertion failure may occur when a slave tries to apply again a WSREP transaction. The initial step in the retry is to rollback (locally) the failed transaction. During the rollback the WSREP transaction is temporarily marked as a non-WSREP transaction. When the rollback is completed, the transaction is marked a WSREP transaction again.

The failing assertion in InnoDB code (trx/trx0trx.cc) detects that a transaction that was flagged a WSREP transaction at creation is now a non-WSREP transaction. This causes the assertion failure.

           in void trx_t::commit_in_memory(const mtr_t*)

This bug fix prevents debug assertion failure when Galera feature
retry applying is enabled. This patch introduces also a general
mechanism for temporarily disabling some debug assertions in InnoDB
code.
@plampio plampio requested review from sjaakola and temeo June 13, 2025 12:19
@janlindstrom
Copy link

@plampio Can you improve commit message why assertion does not hold in retry applying and why we can't instead get it to hold on retry applying or why not?

@@ -1501,7 +1505,9 @@ TRANSACTIONAL_INLINE inline void trx_t::commit_in_memory(const mtr_t *mtr)
trx_finalize_for_fts(this, undo_no != 0);

#ifdef WITH_WSREP
ut_ad(is_wsrep() == wsrep_on(mysql_thd));
if (!mysql_thd || !wsrep_get_assert(mysql_thd, WSREP_ASSERT_INNODB_TRX)) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a function in wsrep service just for avoiding this assert seems a bit excessive to me. Would there be some more lightweight solution for this, like

/* Assert that any transaction which is started as a wsrep transaction, also commits
* or rolls back as a wsrep transaction. BF transactions may turn wsrep off temporarily
* for rollback when they are retried. */
ut_ad(is_wsrep() == wsrep_on(mysql_thd) || (in_rollback && wsrep_thd_is_BF(mysql_thd));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree Teemu and comment above makes things a lot clearer.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is a more light-weight solution.
I suggested this new function in anticipation that there would be similar problems in the future:
this new function would allow solving them easily.

Your suggestion is certainly simpler and nicer if we think that this problem is just one of a kind.

Copy link
Author

@plampio plampio Jun 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented @temeo 's suggestion but unfortunately it does not work.

The problem is that the trx->in_rollback flag is already false when the failing assert is reached:
the flag is false at the beginning of the trx_t::rollback_finish() from which the failing trx_t::commit() is called. Here is the call stack:

`#12 0x00007fe66fab7e96 in __GI___assert_fail (assertion=0x5571ab7dc298 "is_wsrep() == wsrep_on(mysql_thd) || (in_rollback && wsrep_thd_is_BF(mysql_thd, false))",
file=0x5571ab7dae18 "/home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc", line=1507, function=0x5571ab7dc1a0 "void trx_t::commit_in_memory(const mtr_t*)") at ./assert/assert.c:103

#13 0x00005571aaf8dd9f in trx_t::commit_in_memory (mtr=0x0, this=0x7fe63f3fec00) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc:1507

#14 trx_t::commit_low (this=0x7fe63f3fec00, mtr=0x0) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc:1602

#15 0x00005571aaf8deff in trx_t::commit_persist (this=0x7fe63f3fec00) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc:1616

#16 0x00005571aaf8e020 in trx_t::commit (this=0x7fe63f3fec00) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0trx.cc:1625

#17 0x00005571aaf7c38c in trx_t::rollback_finish (this=0x7fe63f3fec00) at /home/pekka/git/mariadb-server4/storage/innobase/trx/trx0roll.cc:65
`

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@plampio Why trx->in_rollback flag setting to false can't be moved later ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@janlindstrom Moving the trx->in_rollback flag setting to false later does not help because it is never set to true during this particular rollback. This flag is set to true only in some code paths associated with rollback.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-implemented the fix:

Modified the bug fix for MDEV-36554 by replacing the general
mechanism for disabling assertions in InnoDB code with solution
specific to this rollback issue.

The appliers sets a flag in the applier thread for the duration of the
local rollback. This is used in InnoDB code to detect a local
rollback.

mechanishm for disabling assertions in InnoDB code with solution
specific to this rollback issue.

The appliers sets a flag in the applier thread for the duration of the
local rollback. This is used in InnoDB code to detect a local
rollback.
Copy link

@janlindstrom janlindstrom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@plampio plampio self-assigned this Jul 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants