Proposed Best Practice: Metrics #16

mdelaurentis · 2017-03-15T19:23:30Z

As a best practice, a Tap or Target should emit structured logging to a file, which can be used for monitoring or other analysis.

Both a Tap and a Target should accept an optional "metrics_path" key in its configuration file. If that key is present, the Tap or Target should write out a JSON map every time it makes an external request, for example to an API or database. If a Tap or Target that supports metrics is run without a "metrics_path" key, it should not fail, it should just not write metrics.

It is expected that whatever program runs the Tap or Target will provide the name of a file to write the metrics to, and that some program will consume the metrics from that location and store them somewhere. We could use either a named pipe or a regular file. If a named pipe is used, the wrapper program must ensure that something reads from the pipe continuously so as not to block the Tap. If a regular file is used, the wrapper program must ensure that it does not cause the disk to fill up.

The metric structure is designed to specifically capture information relevant to the types of external calls that a Tap or Target would typically make. All fields are optional. If some of these fields are hard for a particular Tap to track, we would rather have it report something than nothing.

success - Whether the call succeeded
duration - How long the request or call took, in milliseconds
num_bytes - Size of request or call payload in bytes
num_records - Size of request or call payload in terms of number of records
tags - (optional) Mapping from string keys to string values providing additional description of the request

Examples

Successful API call from a Tap:

{"success": true,
 "duration": 432,
 "num_bytes": 12345
 "num_records": 1234
 "tags": {
     "endpoint": "/users/{}/orders",
     "http_method": "GET",
     "http_response_code": 200
 }
}

Successful SELECT statement from a Tap:

{"success": true,
 "duration": 432,
 "num_bytes": 12345
 "num_records": 1234
 "tags": {
     "table": "users",
      "sql_command": "SELECT",
 }
}

Questions

Would it be simpler to get rid of the "tags" key and instead say that the metrics can include any arbitrary properties, which will be treated as tags?
This metrics format is specific to timing and counting requests to external systems. It also bundles several values (count, duration, success, bytes, records) into one data structure. Would it be better to emit a separate metric object for each of those values?

The text was updated successfully, but these errors were encountered:

karstendick · 2017-03-20T15:48:12Z

I'd rename duration to duration_ms to be explicit.

For SaaS integrations, it may be helpful to specify both an endpoint tag and a table tag, as some endpoints provide data for multiple tables.

Some endpoints provide no data, only metadata. In that case, would num_records be 0? And would num_bytes be the size of the http response or still 0?

I'd ditch the tags key and just make the interface be a bag of properties to keep things simple and flexible. I'd rather keep each metric record together rather than emitting a separate object for each key-value pair. It's meaningful that all the values refer to the same event (e.g. one http query or one SQL query), and it's worthwhile to keep that in one data point.

mdelaurentis · 2017-03-20T18:33:38Z

Thanks for the feedback @karstendick . I agree with most of your suggestion.

rename duration to duration_ms

yes

For SaaS integrations, it may be helpful to specify both an endpoint tag and a table tag, as some endpoints provide data for multiple tables.

That's a good point. If we ditch the tags key, then I suppose we would have both "endpoint" and "table" keys, and Taps could decide whether to use one, both, or neither. Alternatively, should we use a more generic name for the source of the data? Maybe just "source". That could be a URL or a table name.

Some endpoints provide no data, only metadata. In that case, would num_records be 0? And would num_bytes be the size of the http response or still 0?

I suppose either 0 or just leave it null. That way we can count that we made a request, but not suggest that that request would impact the record count.

I'd ditch the tags key and just make the interface be a bag of properties to keep things simple and flexible. I'd rather keep each metric record together rather than emitting a separate object for each key-value pair. It's meaningful that all the values refer to the same event (e.g. one http query or one SQL query), and it's worthwhile to keep that in one data point.

Yeah, I agree. Thanks.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposed Best Practice: Metrics #16

Proposed Best Practice: Metrics #16

mdelaurentis commented Mar 15, 2017 •

edited

karstendick commented Mar 20, 2017

mdelaurentis commented Mar 20, 2017

Proposed Best Practice: Metrics #16

Proposed Best Practice: Metrics #16

Comments

mdelaurentis commented Mar 15, 2017 • edited

Examples

Questions

karstendick commented Mar 20, 2017

mdelaurentis commented Mar 20, 2017

mdelaurentis commented Mar 15, 2017 •

edited