forked from rDrama/rDrama
h/t @official-techsupport for digging into the regex performance and coming up with a pattern that greatly reduces backtracking. We see an approximately 2x speedup under typical loads, which is a major overall performance saving. Previously, censor_slurs was by far the most time-consuming function in the codebase under typical loads, second only to ORM DB accesses. It's still not ideal, but it is much better. Future options to improve this critical path further: 1) Precompute the slur-replaced HTML rather than recomputing it on each pageload. Storage is cheap. 2) Tokenize the HTML and replace plaintext words using O(1) exact-match lookups in a dict. |
||
---|---|---|
classes | ||
helpers | ||
routes | ||
templates | ||
tests | ||
__init__.py | ||
__main__.py | ||
cli.py |
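
Option 2 from the commit message (tokenize the HTML, then replace plaintext words via exact-match dict lookups) could look roughly like the sketch below. This is not the repository's implementation: `REPLACEMENTS`, `replace_words`, and the placeholder words are hypothetical, and the regex-based tag split is a simplification that does not handle comments, `<script>`/`<style>` bodies, or attribute values containing `>`.

```python
import re

# Hypothetical replacement table with placeholder words; the real word
# list and its storage are not shown in the source.
REPLACEMENTS = {"foo": "bar", "spam": "eggs"}

# A capturing group makes re.split keep the separators, so matches land
# at odd indices and the text between them at even indices.
TAG_SPLIT = re.compile(r"(<[^>]*>)")
WORD_SPLIT = re.compile(r"(\w+)")


def replace_words(html: str) -> str:
    """Replace whole words found outside of tags using dict lookups."""
    out = []
    for i, chunk in enumerate(TAG_SPLIT.split(html)):
        if i % 2 == 1:
            # Odd index: an HTML tag, copied through untouched.
            out.append(chunk)
            continue
        for j, token in enumerate(WORD_SPLIT.split(chunk)):
            if j % 2 == 1:
                # Odd index: a word; one O(1) lookup on its lowercase form.
                out.append(REPLACEMENTS.get(token.lower(), token))
            else:
                # Even index: whitespace or punctuation, kept as-is.
                out.append(token)
    return "".join(out)


print(replace_words('<p class="foo">Foo and spam!</p>'))
```

Each word costs one dict lookup regardless of how many entries the table holds, which is the appeal over one large alternation regex. The sketch does not preserve the original capitalization of a replaced word, and multi-word phrases would need a different tokenization.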