exec and http sources add a .timestamp field to the event after decoding #20404

Open
brucejxz opened this issue Apr 30, 2024 · 3 comments
Labels
meta: good first issue Anything that is good for new contributors. source: exec Anything `exec` source related source: http_server Anything `http_server` source related type: bug A code related bug.

brucejxz commented Apr 30, 2024

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

When log namespacing is enabled, I expect only the event to be in `.` (the event root).

For the socket source, this is the case no matter if decoding configuration is specified or not.

For the exec and http sources, when decoding configuration is added, there is an additional .timestamp field. Also, when there is no decoding configuration, the message is available inside a .message key rather than at `.`.

Maybe there is a reason for the inconsistency, but it was definitely surprising to me. Also, I'm not sure what this additional timestamp field represents, since its value differs from %vector.ingest_timestamp.

Configuration

schema:
  log_namespace: true

sources:
  source_exec:
    type: exec
    command: ['echo', '{"log": "123"}']
    mode: scheduled
  source_http_server:
    type: http_server
    address: 0.0.0.0:80
  source_socket:
    type: socket
    address: 0.0.0.0:9000
    mode: tcp

sinks:
  sink_console:
    inputs:
      - source_*
    type: console
    encoding:
      codec: json
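
The configuration above is the "without decoding" variant; the decoding block used for the "with decoding" run is not shown in the report. A minimal sketch of what that variant presumably looks like, assuming the `json` codec was used on all three sources (an assumption, since the codec is not stated):

```yaml
sources:
  source_exec:
    type: exec
    command: ['echo', '{"log": "123"}']
    mode: scheduled
    # Assumed codec: the report does not say which one was configured.
    decoding:
      codec: json
  source_http_server:
    type: http_server
    address: 0.0.0.0:80
    decoding:
      codec: json
  source_socket:
    type: socket
    address: 0.0.0.0:9000
    mode: tcp
    decoding:
      codec: json
```

The `schema` and `sinks` sections stay the same as in the configuration above.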

Version

0.37.1

Debug Output

No response

Example Data

Commands to generate data

*exec runs automatically*
curl -XPOST -H "Content-Type: application/json" --data '{"log": "123"}' http://127.0.0.1
echo '{"log": "123"}' | nc localhost 9000

Without decoding

{"message":"{\"log\": \"123\"}"}
{"message":"{\"log\": \"123\"}"}
"{\"log\": \"123\"}"

With decoding

{"log":"123","timestamp":"2024-04-30T21:14:58.942264817Z"}
{"log":"123","timestamp":"2024-04-30T21:15:01.208653497Z"}
{"log":"123"}

Additional Context

No response

References

No response

Member

jszwedko commented May 1, 2024

Thanks for this report, @brucejxz. This does seem like a bug. The expected behavior, as you note, is that just the event ends up in `.` when log namespacing is enabled. The rest of the fields should be in metadata.

This added .timestamp is likely the same as %vector.ingest_timestamp.

Author

brucejxz commented May 1, 2024

Hey @jszwedko. Thanks for the reply. Glad to hear that this isn't expected. Also, I added this transform so you can see that the .timestamp field is actually different:

transforms:
  transform_remap:
    inputs:
      - source_*
    type: remap
    source: |
      .@vector = %vector

Which results in:

{"@vector":{"ingest_timestamp":"2024-05-01T20:27:35.876003474Z","source_type":"exec"},"log":"123","timestamp":"2024-05-01T20:27:35.875959041Z"}
{"@vector":{"ingest_timestamp":"2024-05-01T20:27:39.332853800Z","source_type":"http_server"},"log":"123","timestamp":"2024-05-01T20:27:39.332841489Z"}
{"@vector":{"ingest_timestamp":"2024-05-01T20:27:45.064226909Z","source_type":"socket"},"log":"123"}
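
Until the sources behave consistently, one possible stopgap is to drop the stray field in a remap transform. This is only a sketch and was not suggested by anyone in the thread: the transform name `transform_drop_timestamp` is hypothetical, and it assumes the decoded payload never carries a legitimate `timestamp` key of its own, because that key would be removed as well.

```yaml
transforms:
  transform_drop_timestamp:   # hypothetical name
    inputs:
      - source_*
    type: remap
    source: |
      # Only touch events from the two sources that show the extra field.
      if includes(["exec", "http_server"], %vector.source_type) {
        del(.timestamp)
      }
```

The console sink's `inputs` would then need to point at this transform rather than at `source_*`.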

Member

jszwedko commented May 1, 2024

Ah, yes, sorry I meant to say "mostly the same" since they are likely both set to "now".
