Logs: Allow the registered processor to be an independent pipeline #4010

pellared · 2024-04-24T08:57:50Z

What are you trying to achieve?

I want to allow users can have different processing for different exporters.
I want to allow users to use a processor for sampling.
For Go: I want to reduce the amount of heap allocations.

For Go Logs SDK, we want to have each registered processor to work as a separate, independent log record processing pipeline.

The records passed to OnEmit are passed by value (not by pointer). Therefore, any modification made to the record passed to a processor's OnEmit method will not be applied to the subsequent registered processor.

Users can use processor decorators to pass a modified log record to a wrapped processor, e.g.:

type mutatingProcessor struct {
  wrapped log.Processor
}

func (mutatingProcessor p) OnEmit(ctx context.Context, r Record) error {
  r.SetBody(log.StringValue("REDACTED"))
  return p.wrapped.OnEmit(ctx, r.Clone())
}

Thanks to it, the users can register two pipelines which do not interfere with each other. The user can have one pipeline which redacts secrets and exports (without batching) to stdout and then a second pipeline which exports via OTLP with batching (and without redaction):

log.NewLoggerProvider(
  log.WithProcessor(mutatingProcessor{ log.NewSimpleProcessor(stdoutExporter) }), // STDOUT exporter with simple processor, the log records' body fields are redacted
  log.WithProcessor(log.NewBatchProcessor(otlpExporter)), // OTLP exporter with a batch processor
)

What is more the Processor could be used then also to use it for sampling decisions. E.g. this is how a "minimum severity level processor" could look like: https://github.com/open-telemetry/opentelemetry-go/blob/65e032dd16b20e376eae5a7a5956fa811783389b/sdk/log/min_sev.go

By having the Processors independent by default the user could still use different composition patterns (fan-in, fan-out, middleware) to get the desired processing pipeline similarly to what can be achieved e.g. using slog.Handler (example helpers: https://github.com/samber/slog-multi).

At last, this design usually reduces the number of heap allocations (by 1 per each emit). If we pass a pointer to Record to allow mutations, then Go will allocate it on the heap (I was making a lot of experiments on the Processor interface and Record struct when working on open-telemetry/opentelemetry-go#4955).

Additional context.

The specification says that processors are part of the pipeline:

LogRecordProcessors can be registered directly on SDK LoggerProvider and they are invoked in the same order as they were registered.

Each processor registered on LoggerProvider is part of a pipeline that consists of a processor and optional exporter.

"part of a pipeline" is unclear.

The specification requires that log processors be provided a LogRecordProcessor:

OnEmit

[...]

Parameters:

logRecord - a ReadWriteLogRecord for the emitted LogRecord.

Does it mean that a modification done in one registered exporter MUST be applied (and visible) to the next registered processor?
If so then the OTel Go Logs SDK design would be against the specification. But then, how the user can have different processing for different exporters?

Or does it mean that we only mean to be able to modify the record locally so that e.g. we can pass it to a decorated processor?
If so then should we clarify the specification?

The proposed design is more usable, safer, more performant (for Go) compared to a design where we a log record is passed by a pointer instead passing by value. I believe that the proposed design is "in the spirit" of OpenTelemetry even though it would differ from the "tracing processors" design.

It is worth mentioning that this is the way how stable OTel C++ and alpha OTel Rust designed the Log Processors for performance reasons (to avoid unnecessary allocations).

Side note: This design would also allow more flexibility on handling Logger.Enabled if it would be supported in SDK by adding Enabled method to Processor (related to #3917).

Related issue: open-telemetry/opentelemetry-go#5219

Proposed solution

My proposal is to update the specification to allow such SDK design/implementation for (at least) performance reasons.

Alternative solution.

Even currently it should be possible to have different processing for different exporters and a processor for sampling. Nothing prevents users to create a processor decorator and "clone" records before calling wrapped processor (to mutate only for a "single" pipeline) or not call a wrapped processor (to achieve sampling).

I consider it more like a hack and for Go such design would be less performant as it would always cause the log record to be allocated on the heap.

I believe for other languages that are more "reference type oriented" it is not common to have so much care around memory allocations. However, I find it very important at least for C++, Go, Rust.

The text was updated successfully, but these errors were encountered:

pellared · 2024-04-24T09:12:18Z

CC @open-telemetry/specs-logs-approvers

pellared · 2024-05-06T07:37:15Z

I think it is designed in the same way in Rust Logs SDK (alpha status): https://github.com/open-telemetry/opentelemetry-rust/blob/main/opentelemetry-sdk/src/logs/log_processor.rs

@open-telemetry/rust-maintainers, can you please confirm?

lalitb · 2024-05-06T17:11:07Z

@open-telemetry/rust-maintainers, can you please confirm?

In the current otel-rust implementation, LogRecord copy/clone is passed to each processor, and they can modify it individually without affecting the other processor. However, there are some discussions to change it to the mutable reference to enable one processor result to be passed to another.

lalitb · 2024-05-06T17:41:53Z

And to add, the otel-cpp processor pipelines are also independent of each other. This is more to do with the fact that otel-cpp logs implementation avoids any intermediate (temporary) LogRecord allocation in the SDK, and the log information as provided by the application is directly serialized in the exporter format.

pellared · 2024-05-07T06:37:16Z

This is more to do with the fact that otel-cpp logs implementation avoids any intermediate (temporary) LogRecord allocation in the SDK

This is also one of the reasons behind the current OTel Go LogProcessor design.

I believe for other languages that are more "reference type oriented" this proposal may seem strange where it is not common to have so much care around memory allocations. However, I find it very important at least for C++, Go, Rust.

My proposal is to update the specification to allow such SDK design/implementation for (at least) performance reasons.

pellared · 2024-05-07T15:30:37Z

We can also say that if the "log record processor" is designed in such way that it accepts log records by value then it should be named "log record handler" (instead of "processor") to make the processing difference more visible.

tsloughter · 2024-05-07T20:29:52Z

I wanted to add that I just don't like how Processor is seemingly different for Logs than Spans. I'd much rather see Log Processors match Span Processors.

I also noticed Log Processors have similar language to what was once in Span Processor's spec but was removed, "The SDK MUST allow users to implement and configure custom processors and decorate built-in processors for advanced scenarios such as enriching with attributes." -- Span Processors also used to talk about "decorators" but it was removed because it was never fleshed out.

Defining decorators, or whatever, and how individual pipelines are built for spans and logs that can mutate the data as it flows through the individual pipeline makes much more sense to me than to have Log Processors get a ReadWriteRecord.

If nothing else maybe that "decorator" wording should be removed as it sounds like you are to use Processors themselves to "decorate" logs? Does anyone implement "decorators" on logs?

I'd also expect the diagram for Log Processors to be different from Span Processors to make clear that log records may change. Right now it looks as if there are 2 Processors (simple and batch) each receiving the exact same LogRecord that they then pass to an exporter. I know it is be showing that the processor could be one or the other and not that they both are in use there, but I think it could be misinterpreted.

tigrannajaryan · 2024-05-08T15:45:06Z

This was discussed in spec meeting and also in the TC meeting.

I believe the spec is not clear enough and can be interpreted in different ways leading to very different implementations (e.g. chained processors that can mutate data and see mutations of previous processor or parallel processors that don't see mutations of other processors).

We likely need to fix the spec. It is in a stable doc but is vague enough that I think it is a bug and should be fixed.

Since the spec is vague I am also worried that existing implementations in different language SDKs may have interpreted it differently and have made different choices. IMO, we need to start with auditing existing implementations and see where we stand currently and then decide how exactly we want to fix the spec (one of the options is that both "chained-allow-mutation" and "fanout-copy-to-parallel" should be possible like they are in the Collector pipelines).

CC @jmacd @jack-berg if you want to add something.

jmacd · 2024-05-08T16:13:48Z

Thank you @tigrannajaryan, I think this is a good summary. I have been working on problems associated with Samplers and have identified a number of connected issues. There is a strong desire to improve the ways that we compose SDKs and we recognize the need to improve our specification for how (a) components are plugins for modifying telemetry as-it-happens, such that all observers see their side-effects, (b) components are composed into independent pipelines capable of filtering and mutating data in pipeline-specific ways, (c) SDKs are permitted to offer memory-optimized SDK-APIs that recognize both of these factors (e.g., use of structs vs references).

I volunteered to work on this topic and will return with a proposal for fixing this bug in the spec. Thanks for bringing this up in the first place, @pellared.

pellared · 2024-05-10T07:58:53Z

Notice that the implementations which use a "log record references" (like Java, JavaScript, Python, .NET) can also achieve independent processor pipeline by creating a processor decorator which makes a deep-copy of the record and pass it to the decorated processor.

My proposal is to leave the implementation choice to the language SIG.

pellared · 2024-05-16T13:51:43Z

During the 05/14/2024 Spec SIG meeting @jmacd summarized:

We discussed this in a technical committee. We think it's an actual bug. [...] What we're going to conclude in that particular case is that there's freedom at the SDK level to change how the implementation of these processors works. There's no requirement that you have a reference that is a read-only pointer to a thing.

Should I create a PR to include this in the Logs SDK spec? I think this is inline with #4010 (comment)

pellared · 2024-05-22T18:26:09Z

Before proposing anything to the spec I plan to also explore the consequences of defining the log.Processor method as:

OnEmit(ctx context.Context, record Record) (Record, error)

instead of

OnEmit(ctx context.Context, record Record) error

The the SDK would pass the record returned by a processor to the subsequent registered processor.

pellared · 2024-05-23T09:27:59Z

Assigning myself per conversation with @jmacd.

pellared · 2024-05-23T09:59:48Z

Personally, I find

OnEmit(ctx context.Context, record Record) (Record, error)

as a bad design as the user have two ways of returning a mutated log record

by using a decorator pattern and passing the mutated record to the the wrapped processor
by returning the mutated record so that it is passed to the subsequent registered processors

The one bootstrapping the logger provider would have to know how it working to make sure it is properly composed to achieved the desired log processing pipeline.

Additionally, I feel it is strange that "sink processors" such as BatchProcessor nad SimpleProcessor would have to return anything.

I feel that just relying on decorators for processing is more consistent. If the users would like to mutate the record for many processors they could simply use/create a FanOutProcessor:

type FanoutProcessor struct {
	processors []log.Processor
}

// Fanout distributes records to multiple log processors.
func Fanout(processors...log.Processor) *FanoutProcessor {
	return &FanoutProcessor{
		processors : processors,
	}
}

func (p *FanoutProcessor) OnEmit(ctx context.Context, record Record) error {
	for _, processor := range p.processors {
		processor.OnEmit(ctx, record)
	}
}

Then the usage could look like following (mutate for all exporters):

log.NewLoggerProvider(
  log.WithProcessor(mutatingProcessor{ Fanout( // both exporters have mutated log records
    log.NewSimpleProcessor(stdoutExporter),
    log.NewBatchProcessor(otlpExporter),
  }),
)

Other users could still use the same mutatingProcessor if they would like to mutate log record for one exporter:

log.NewLoggerProvider(
  log.WithProcessor(mutatingProcessor{ log.NewSimpleProcessor(stdoutExporter) }), // STDOUT exporter with simple processor, the log records' body fields are redacted
  log.WithProcessor(log.NewBatchProcessor(otlpExporter)), // OTLP exporter with a batch processor
)

When following this design, the there is only one way of "processing/mutating" and the one who composes the log provider decides how the log processing pipeline should look like.

I want to also say that I find such design idiomatic as it follows the design of https://pkg.go.dev/log/slog#Handler and creating pipelines is very similar. What is more, the official Go Wiki has https://github.com/samber/slog-multi in its Resources for slog which uses the same patterns for composing log pipelines/handlers.

This is also the way how stable OTel C++ and alpha OTel Rust are designed.

pellared · 2024-05-23T10:16:05Z

@open-telemetry/technical-committee, is this still triage:deciding:community-feedback ?
Can you consider making it as accepted so that I can come up with a PR?

EDIT: I created #4062

pellared added the spec:logs Related to the specification/logs directory label Apr 24, 2024

pellared mentioned this issue Apr 24, 2024

Log Records passed to log processor OnEmit is non-mutable open-telemetry/opentelemetry-go#5219

Open

austinlparker added the triage:deciding:community-feedback label Apr 30, 2024

jmacd mentioned this issue May 9, 2024

Sensitive Data Redaction open-telemetry/oteps#255

Draft

jmacd mentioned this issue May 20, 2024

Add ability for SpanProcessor to mutate spans on end #4024

Open

5 tasks

pellared self-assigned this May 23, 2024

This was referenced May 28, 2024

Add isolating log record processor #4062

Open

Log record mutations are visible in next registered processors #4065

Open

[PoC] Change Processor.OnEmit to return record passed to next registered processor open-telemetry/opentelemetry-go#5470

Draft

pellared mentioned this issue Jun 5, 2024

Log record mutations do not have to be visible in next registered processors #4067

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Logs: Allow the registered processor to be an independent pipeline #4010

Logs: Allow the registered processor to be an independent pipeline #4010

pellared commented Apr 24, 2024 •

edited

OnEmit

pellared commented Apr 24, 2024

pellared commented May 6, 2024

lalitb commented May 6, 2024

lalitb commented May 6, 2024 •

edited

pellared commented May 7, 2024 •

edited

pellared commented May 7, 2024

tsloughter commented May 7, 2024

tigrannajaryan commented May 8, 2024

jmacd commented May 8, 2024 •

edited

pellared commented May 10, 2024

pellared commented May 16, 2024 •

edited

pellared commented May 22, 2024 •

edited

pellared commented May 23, 2024

pellared commented May 23, 2024 •

edited

pellared commented May 23, 2024 •

edited

Logs: Allow the registered processor to be an independent pipeline #4010

Logs: Allow the registered processor to be an independent pipeline #4010

Comments

pellared commented Apr 24, 2024 • edited

OnEmit

pellared commented Apr 24, 2024

pellared commented May 6, 2024

lalitb commented May 6, 2024

lalitb commented May 6, 2024 • edited

pellared commented May 7, 2024 • edited

pellared commented May 7, 2024

tsloughter commented May 7, 2024

tigrannajaryan commented May 8, 2024

jmacd commented May 8, 2024 • edited

pellared commented May 10, 2024

pellared commented May 16, 2024 • edited

pellared commented May 22, 2024 • edited

pellared commented May 23, 2024

pellared commented May 23, 2024 • edited

pellared commented May 23, 2024 • edited

pellared commented Apr 24, 2024 •

edited

lalitb commented May 6, 2024 •

edited

pellared commented May 7, 2024 •

edited

jmacd commented May 8, 2024 •

edited

pellared commented May 16, 2024 •

edited

pellared commented May 22, 2024 •

edited

pellared commented May 23, 2024 •

edited

pellared commented May 23, 2024 •

edited