[@opentelemetry/instrumentation-fetch] Internal span setTimeout race condition prevents some metrics from being traced #4683
Labels
bug
Something isn't working
needs:reproducer
This bug/feature is in need of a minimal reproducer
priority:p2
Bugs and spec inconsistencies which cause telemetry to be incomplete or incorrect
What happened?
Steps to Reproduce
Expected Result
All network calls for that endpoint are traced
Actual Result
A percentage of these network calls are not traced
Additional Details
We have seen that one of our metrics is not reported 100% of the time, and we have narrowed the cause down to the use of `setTimeout` in `opentelemetry-instrumentation-fetch`. Sometimes routing happens quickly enough that the timeout callback has not run yet, so the span is never ended and therefore can never be flushed or collected on our side.

There is a comment in the fetch instrumentation code explaining that the timeout exists to wait for the "observer to collect information about resources", but overall we would prefer to get metrics that lack this resource information than none at all. If the `setTimeout` has to stay, it would be nice if the fetch instrumentation could end these spans when the page is being navigated away from, so that we get a higher count of spans.

Some docs on why `setTimeout` is not a reliable method for exact timing: https://developer.mozilla.org/en-US/docs/Web/API/setTimeout#reasons_for_delays_longer_than_specified
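
To make the suggestion about ending spans on navigation concrete, here is a minimal sketch of the idea. It is not the instrumentation's actual code: `PendingSpanTracker`, `endLater`, and `flushAll` are hypothetical names, and `EndableSpan` and `PageEventSource` are stand-ins I made up for a span and for `window`. The assumption is that spans waiting on the timeout are tracked somewhere, so a `pagehide` listener can end them early instead of losing them.

```typescript
// Stand-in for a span; only the end() method matters for this sketch.
interface EndableSpan {
  end(): void;
}

// Stand-in for an event source such as `window` in a browser.
interface PageEventSource {
  addEventListener(type: string, listener: () => void): void;
}

// Hypothetical helper, NOT part of @opentelemetry/instrumentation-fetch.
class PendingSpanTracker {
  // Spans whose deferred end has been scheduled but has not run yet.
  private pending = new Map<EndableSpan, ReturnType<typeof setTimeout>>();

  constructor(page: PageEventSource) {
    // "pagehide" fires when the user navigates away from the page.
    page.addEventListener("pagehide", () => this.flushAll());
  }

  // Schedule the span to end after delayMs (the deferred-end behaviour).
  endLater(span: EndableSpan, delayMs: number): void {
    const timer = setTimeout(() => {
      this.pending.delete(span);
      span.end();
    }, delayMs);
    this.pending.set(span, timer);
  }

  // End every span whose timer has not fired yet, exactly once each.
  flushAll(): void {
    this.pending.forEach((timer, span) => {
      clearTimeout(timer);
      span.end();
    });
    this.pending.clear();
  }
}
```

In a browser this would be constructed as `new PendingSpanTracker(window)`. Spans ended this way would presumably lack the resource timing information the timeout waits for, and ending a span on `pagehide` does not by itself guarantee that it is exported before the page goes away, so the exporter would likely need to flush as well.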