Commit Graph

9 Commits

Author SHA1 Message Date
Alex Vandiver 5654d051f7 worker: Split into separate files.
This makes each worker faster to start up.
2024-04-16 23:00:02 -07:00
Anders Kaseorg a50eb2e809 mypy: Enable new error explicit-override.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-10-12 12:28:41 -07:00
Anders Kaseorg 6988622fe8 ruff: Enable B023 Function definition does not bind loop variable.
Python’s loop scoping is misdesigned, resulting in a very common
gotcha for functions that close over loop variables [1].  The general
problem is so bad that even the Go developers plan to break
compatibility in order to fix the same design mistake in their
language [2].

Enable the Ruff rule function-uses-loop-variable (B023) [3], which
conservatively prohibits functions from binding loop variables at all.

[1] https://docs.python-guide.org/writing/gotchas/#late-binding-closures
[2] https://go.dev/s/loopvar-design
[3] https://beta.ruff.rs/docs/rules/function-uses-loop-variable/

Signed-off-by: Anders Kaseorg <anders@zulip.com>
2023-09-11 18:03:45 -07:00
Alex Vandiver 9f231322c9 workers: Pass down if they are running multi-threaded.
This allows them to decide for themselves if they should enable
timeouts.
2023-05-16 14:05:01 -07:00
Alex Vandiver 800e38016a queue_rate: Output to CSV, and run multiple prefetch values. 2021-11-16 11:48:50 -08:00
Anders Kaseorg 11741543da python: Reformat with Black, except quotes.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2021-02-12 13:11:19 -08:00
Anders Kaseorg 72d6ff3c3b docs: Fix more capitalization issues.
Signed-off-by: Anders Kaseorg <anders@zulip.com>
2020-10-23 11:46:55 -07:00
Alex Vandiver d47637fa40 queue: Set a max consume timeout with SIGALRM.
SIGALRM is the simplest way to set a specific maximum duration that
queue workers can take to handle a specific message.  This only works
in non-threaded environments, however, as signal handlers are
per-process, not per-thread.

The MAX_CONSUME_SECONDS is set quite high, at 10s -- the longest
average worker consume time is embed_links, which hovers near 1s.
Since just knowing the recent mean does not give much information[1],
it is difficult to know how much variance is expected.  As such, we
set the threshold to be such that only events which are significant
outliers will be timed out.  This can be tuned downwards as more
statistics are gathered on the runtime of the workers.

The exception to this is DeferredWorker, which deals with quite-long
requests, and thus has no enforceable SLO.

[1] https://www.autodesk.com/research/publications/same-stats-different-graphs
2020-10-06 17:26:14 -07:00
Alex Vandiver 8cf37a0d4b queue: Add a tool to profile no-op enqueue and dequeue actions. 2020-10-06 17:26:14 -07:00