GCP PubSub Source: won't resume fetching messages if queue was empty for a while (Vector needs to be restarted) #20324

Open
vmm-afonso opened this issue Apr 17, 2024 · 0 comments
Labels
source: gcp_pubsub (Anything `gcp_pubsub` source related)
type: bug (A code related bug.)

Comments

@vmm-afonso

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

We are using Vector 0.37.0 (Helm chart 0.32.0, running in GKE) to forward events from a Pub/Sub subscription to an Elasticsearch sink that runs inside a GKE cluster. Our Pub/Sub topic only receives messages while a Cloud Run instance is running, which happens irregularly and not at fixed times. Vector appears to stop fetching events from this source once the queue has been empty for a while, and it does not pick up new messages until it is restarted. I could probably work around this by increasing the retention time on the topic and scheduling restarts of the Vector pod, but that is the costly option. Ideally, Vector would keep retrying the connection to this source every x minutes in case new messages have been added to the queue.
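
For illustration, the scheduled-restart workaround mentioned above could look roughly like the Kubernetes CronJob below. This is only a sketch under several assumptions that are not part of this report: the `vector-restart` name, the `vector` namespace, the six-hour schedule, and the `<kubectl_image>`, `<workload_kind>` and `<vector_workload_name>` placeholders are all hypothetical. It also assumes a `vector-restart` service account already exists with RBAC permission to restart that workload.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: vector-restart          # hypothetical name
  namespace: vector             # assumed namespace of the Vector release
spec:
  schedule: "0 */6 * * *"       # assumed interval: every six hours
  concurrencyPolicy: Forbid     # never run two restart jobs at once
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: vector-restart   # assumed to exist with the needed RBAC
          restartPolicy: Never
          containers:
            - name: kubectl
              image: <kubectl_image>           # any image that ships kubectl
              command:
                - kubectl
                - rollout
                - restart
                - "<workload_kind>/<vector_workload_name>"
                - -n
                - vector
```

This only masks the symptom: messages published between restarts still wait until the next restart before Vector fetches them.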

Configuration

api:

  address: 127.0.0.1:8686
  enabled: true
  playground: false

data_dir: /vector-data-dir

sinks:

  elasticsearch:
    api_version: auto
    auth: <auth>
    bulk:
      index: <index_name>-%Y-%m-%d
    compression: none
    doc_type: _doc
    endpoints:
    - <endpoint>
    healthcheck: true
    inputs:
    - transform_json
    mode: bulk
    suppress_type_name: true
    type: elasticsearch

sources:

  pubsub:
    endpoint: https://pubsub.googleapis.com
    project: <gcp_project>
    subscription: <pubsub_subscription>
    type: gcp_pubsub

transforms:

  transform_json:
    type: remap
    inputs:
      - pubsub  # originally listed as `elasticsearch`, which is the sink; the `pubsub` source is presumably intended
    drop_on_abort: true
    source: |
       . = parse_json!(.message)
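
For anyone trying to reproduce this, the `gcp_pubsub` source has a few timing options that control keepalives, retries, and stream polling. The fragment below is a sketch that spells them out alongside the source definition above. The values are what I understand to be the documented defaults, and it is an untested assumption that changing any of them affects the idle behaviour described in this issue.

```yaml
sources:
  pubsub:
    type: gcp_pubsub
    endpoint: https://pubsub.googleapis.com
    project: <gcp_project>
    subscription: <pubsub_subscription>
    # Seconds without received activity before a keepalive request is sent.
    keepalive_secs: 60
    # Seconds to wait between retry attempts after an error.
    retry_delay_secs: 1
    # How often, in seconds, the active streams are polled to decide
    # whether another stream should be opened.
    poll_time_seconds: 2
```

Varying one option at a time while the subscription sits idle would show whether any of them influences how the source recovers.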

Version

CHART: vector-0.32.0 APP VERSION: 0.37.0-distroless-libc

Debug Output

No response

Example Data

No response

Additional Context

No response

References

No response

@vmm-afonso vmm-afonso added the type: bug A code related bug. label Apr 17, 2024
@jszwedko jszwedko added the source: gcp_pubsub Anything `gcp_pubsub` source related label Apr 17, 2024