fix(start): guard db_logs vector transform against null regex capture #5126
The `db_logs` transform aborts with `expected string, got null` in `upcase!()` when `parse_regex` matches an event message but the `level` named group resolves to null. The fallback branch only covers regex failure (`err != null || parsed == null`), leaving a third path where the match succeeds but the capture is null. That path overwrites the would-be fallback with null and crashes on upcase. Observed under routine local dev load (Next.js dev server issuing service-role Postgres queries): 3,000+ aborted transforms in two minutes, cascading into Vector retry storms and Logflare `ErlSysMon` message-queue backpressure. Extend the fallback condition to also fire when `parsed.level` is null, and guard the assignment in the match branch, so `error_severity` always has a non-null string before `upcase!`.
Verified the fix locally against the same vector image ( Test config (
avallete left a comment
Thanks for your contribution!
Coverage Report for CI Build 24920463273
Coverage decreased (-0.02%) to 63.734%
Uncovered Changes: No uncovered changes found.
Coverage Regressions: 5 previously-covered lines in 1 file lost coverage.
💛 - Coveralls
What
In the `db_logs` transform of the Vector config, `upcase!()` aborts with `expected string, got null` when `parse_regex` matches an event message but the `level` named group resolves to null. The existing fallback branch only covers regex failure (`err != null || parsed == null`), so a third path (match succeeded, capture null) overwrites the `"info"` fallback with null and crashes on upcase.
Why
Running a Next.js dev server against the local Supabase stack with routine service-role Postgres queries produces 3,000+ aborted `db_logs` transforms in ~2 minutes. Vector retries the failures, Logflare's ingest queue backs up (`ErlSysMon` warnings for `:long_message_queue` and `:long_schedule` at 746ms), and the analytics container sustains 100% CPU. On a resource-constrained host (laptop, Docker at 8 GB) this cascades into swap thrashing and host unresponsiveness.
How
Extend the fallback condition to also fire when `parsed.level` is null, and guard the match-path assignment symmetrically, so `error_severity` always holds a non-null string before `upcase!()`.
Verification
Before: thousands of `function call error for "upcase" at (476:516): expected string, got null` errors from `db_logs` in the Vector container logs during dev activity.
After: zero transform failures under the same workload.
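The three paths described under What, and the guard described under How, can be modeled outside Vector. The following is a minimal Python sketch of the branch logic only. It is not the VRL from the config: the function names `severity_before` and `severity_after` are hypothetical, a dict with a `level` key stands in for the `parse_regex` result, and `str.upper()` raising on `None` stands in for `upcase!()` aborting on null.

```python
from typing import Optional


def severity_before(err: Optional[str], parsed: Optional[dict]) -> str:
    """Model of the original branches (assumed shape, not the actual VRL)."""
    severity = None
    # The fallback covers regex failure only.
    if err is not None or parsed is None:
        severity = "info"
    # The match path assigns the capture unconditionally, even when it is None.
    if parsed is not None:
        severity = parsed.get("level")
    # Stands in for upcase!(): raises AttributeError when severity is None.
    return severity.upper()


def severity_after(err: Optional[str], parsed: Optional[dict]) -> str:
    """Model of the guarded branches described under How."""
    level = parsed.get("level") if parsed is not None else None
    severity = None
    # The fallback now also fires when the match succeeded but the capture is null.
    if err is not None or parsed is None or level is None:
        severity = "info"
    # The match path only assigns a non-null capture.
    if parsed is not None and level is not None:
        severity = level
    return severity.upper()
```

In this model, a successful match with a null capture (`err=None, parsed={"level": None}`) raises in `severity_before` and falls back to the upcased default in `severity_after`. The regex-failure and normal-match paths behave the same in both.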