stopAndAccumulate/Cancel methods of FOG TimingDistributions coerce null timerids to 0
Categories
(Data Platform and Tools :: Glean: SDK, defect, P3)
Tracking
(Not tracked)
People
(Reporter: perry.mcmanis, Assigned: perry.mcmanis)
References
Details
Attachments
(1 file)
Based on our discovery in 1882110 we notice unexpected behavior in FOG.
Specifically, null
appears to coerce to 0
in the TimingDistribution metrics stopAndAccumulate
and cancel
methods. This is an issue because 0
is actually not only a valid timer ID, it is the first one assigned.
As a result, sometimes we are calling stop/cancel in abstracted stopwatches, which causes submission of a null
timerid. If 0 was already stopped, this results in an invalidState
error being recorded and in fact that is precisely what we observe with the two Fenix metrics Geckoview.page_reload_time
and Geckoview.page_load_time
There are a few related things that likely need to come from this:
- 1882110 mitigated in the short term by guarding stop/cancel
- [this bug] See if FOG can be modified so this behavior does not occur
- start timerIDs at 1 if not, in the SDK itself.
Comment 1•2 years ago
|
||
Maybe we can fix this in FOG to prevent or detect the coercion of null
to 0
. This is a correct conversion as it follows the WebIDL spec (which uses ECMA's ToNumber algorithm). At the cost of making the interface less sensible to someone reading it, we could try other mechanisms:
- Use a nullable timerid, so
null
gets reported to the C++ impl as null. It does give us a problem about what to do withnull
in C++ though (taps the bug 1691073 sign). - Use an opaque
TimerId
structure of some kind. This more closely follows the intent of the interface and it lets us hide the implementation of TimerId. It'd likely be a weightier thing to construct, store, and use these objects than a primitive.
But we could also (or instead) change the SDK to make errors of this sort harder to stumble over.
- We could, for instance, reserve TimerID
0
and never supply it. Then any method receiving it would reject it asinvalid_value
and be about its business. Not too different from the status quo where the misuse ofnull
would maybe work at most once, but then beinvalid_state
from then on. - We could make the TimerId type properly opaque at the glean-core layer. LBs would then have to decide how to supply timerids to their consumers if they're unable to supply opaque types themselves, ensuring they have solitary responsibility for managing any primitive values that might be coerced. This approach could have its down sides, though (taps the bug 1691073 sign again).
Given the constraints on a FOG-level solution and the rarity of coming across this in the wild at present, this is more properly an SDK bug to begin.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Updated•2 years ago
|
Comment 2•2 years ago
|
||
Comment 3•2 years ago
|
||
Assignee | ||
Updated•2 years ago
|
Description
•