Previously, an active production Zulip server would experience a class
of deadlocks caused by two or more concurrent bulk update operations
on the UserMessage table.
This is because UPDATE ... SET ... WHERE statements that execute in
parallel take row-level UPDATE locks as they get results; since the
query plans may result in getting rows in different orders between two
queries, this can result in deadlocks.
Some databases allow ORDER BY on their UPDATE ... WHERE statements;
PostgreSQL does not. In PostgreSQL, the answer is to do a sub-select
with an ORDER BY ... FOR UPDATE to ensure consistent ordering on row
locks.
We do this all code paths using bitand or bitor as part of bulk
editing message flags, which should ensure that these concurrent
operations obtain row level locks on the table in the same order.
Fixes#19054.
This reverts commit 40fcf5a633.
This commit triggers bug that we haven't fully tracked down, where web
app clients will continually send `update_message_flags` requests,
that then send out via the events system "0 messages were marked as
read" notices, eventually leading to a load spike.
The Tornado part can likely be fixed by checking if
updated_message_ids is empty, but we need to track down the frontend
bug as well.
We were blindly adding / removing flag from UserMessages without
check if they even need to be updated.
This caused server to repeatedly update flags for messages which
already had been updated, creating a confusion for other clients
like mobile.
Fixes#22164