@pherjung does the above info help to add the bug to the next community meeting?
Things that have changed:
- reproducer in the works, not yet reproduced (race)
- watchdog trial does not catch all ‘deadlocks’
BUT - there is information about the state ngfd reaches when it hits the bug:
- the main thread's stack is corrupted while GStreamer is involved.
- ASAN build shows that the memory allocated in `n_core_register_sink` is overwritten by `n_core_play_request`, specifically for the libngf_gst plugin
- ASAN build crashes → so the bug is effectively gone, because systemd restarts ngfd.
I would even go as far as shipping ngfd with these ASAN changes, since it (1) fixes the bug by crash + restart and (2) has the chance to send rich-core data if crash reports are enabled :) (the only perceived downside is slower startup).
(I’m also pinging @jusa, in case you would be kind enough to assess the usefulness of the above stack, or advise how it can be improved.)