Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: Avoid initializing the snowplow tracker when usage stats are disabled or emitters can't be initialized #8256

Closed
jaceksan opened this issue Nov 7, 2023 · 13 comments

Comments

@jaceksan
Copy link

jaceksan commented Nov 7, 2023

Feature scope

CLI (options, error messages, logging, etc.)

Description

I need to share the same virtual env with Meltano and dbt, because I want to orchestrate them with Dagster.

Currently, the snowplow-tracker is required by both Meltano and dbt, but they require different incompatible versions.

Please, stop initializing the tracker if the usage stats are disabled.

@tayloramurphy
Copy link
Collaborator

@edgarrmondragon I know you did work on this. As I understand it there's not much we can do here b/c even though no events are sent the package is still there. Would we have to ship a version of Meltano w/o snowplow as a dependency to achieve this?

@edgarrmondragon
Copy link
Collaborator

@edgarrmondragon I know you did work on this. As I understand it there's not much we can do here b/c even though no events are sent the package is still there. Would we have to ship a version of Meltano w/o snowplow as a dependency to achieve this?

I think we should probably catch any errors raised during the initialization of the tracker, log an event at the warning level, then continue to execute the application. The biggest change would be updating all the places where the tracker is called to fire an event to make it instead a no-op.

Related:

@WillDaSilva
Copy link
Member

@edgarrmondragon To avoid updating every piece of code that calls/uses the tracker, we could stub its functionality within our tracker class if the underlying Snowplow tracker fails to initialize. The code that calls the tracker doesn't need to ensure it actually runs, so no functionality should be affected, save for the telemetry being disabled of course.

@edgarrmondragon
Copy link
Collaborator

@edgarrmondragon To avoid updating every piece of code that calls/uses the tracker, we could stub its functionality within our tracker class if the underlying Snowplow tracker fails to initialize. The code that calls the tracker doesn't need to ensure it actually runs, so no functionality should be affected, save for the telemetry being disabled of course.

@WillDaSilva That makes sense! So we could

  1. Catch any exceptions when initializing emitters here:
    for endpoint in endpoints:
    if not check_url(endpoint):
    logger.warning("invalid_snowplow_endpoint", endpoint=endpoint)
    continue
    parsed_url = urlparse(endpoint)
    emitters.append(
    Emitter(
    endpoint=parsed_url.hostname + parsed_url.path,
    protocol=parsed_url.scheme or "http",
    port=parsed_url.port,
    request_timeout=request_timeout,
    ),
    )
  2. Log a warning that the tracker will not be initialized

With this, Tracker.can_track will already make calls from the cli code no-ops.

cc @jaceksan

@edgarrmondragon edgarrmondragon changed the title feature: avoid initializing the snowplow tracker when usage stats are disabled feature: Avoid initializing the snowplow tracker when usage stats are disabled or emitters can't be initialized Nov 8, 2023
@dresslife-shbh
Copy link

@edgarrmondragon can i work on this?

@edgarrmondragon
Copy link
Collaborator

@edgarrmondragon can i work on this?

Yes, go for it 🙂

@edgarrmondragon
Copy link
Collaborator

edgarrmondragon commented Nov 24, 2023

Note that this will only fix some of the issues of installing dbt-core in the same virtual environment as Meltano. For example:

$ virtualenv venv
$ source venv/bin/activate
(venv) $ pip install meltano dbt-core
(venv) $ meltano init --no-usage-stats --force .
...
ImportError: cannot import name 'SelfDescribing' from 'snowplow_tracker'

The core issue is that both snowplow-tracker (required by meltano) and minimal-snowplow-tracker (required by dbt-core) install to the same path, i.e. site-packages/snowplow_tracker.

@jaceksan
Copy link
Author

I try to develop a solution based on your notes.
I may take some time, I am very busy with my official duties, you know ;-)

@edgarrmondragon
Copy link
Collaborator

I try to develop a solution based on your notes. I may take some time, I am very busy with my official duties, you know ;-)

No worries @jaceksan. I see you linked a dbt issue where they discuss bumping their own tracker package. I think that'd fix all the issues folks currently face, so hopefully it gets shipped 😄 (though dbt-labs/dbt-core#8680 is probably ngmi).

jaceksan added a commit to jaceksan/meltano that referenced this issue Dec 5, 2023
Closes issue meltano#8256

It may be a temporary woraround -
it may be removed once dbt-labs/dbt-core#8680 is merged.
jaceksan added a commit to jaceksan/meltano that referenced this issue Dec 5, 2023
Closes issue meltano#8256

It may be a temporary workaround -
it may be removed once dbt-labs/dbt-core#8680 is merged.
@nauxliu
Copy link

nauxliu commented Dec 31, 2023

I'm running into the same problem when trying to orchestrate dbt and meltano with Dagster.
Any updates on this? Or is there a workaround?

@edgarrmondragon
Copy link
Collaborator

I'm running into the same problem when trying to orchestrate dbt and meltano with Dagster. Any updates on this? Or is there a workaround?

@nauxliu I'm not familiar with dagster to know how you'd execute Meltano or dbt from within, but the recommendation at the moment is to ensure Meltano and dbt are installed each in its own separate virtual environment if possible.

jaceksan added a commit to jaceksan/meltano that referenced this issue Jan 30, 2024
Closes issue meltano#8256

It may be a temporary workaround -
it may be removed once dbt-labs/dbt-core#8680 is merged.
jaceksan added a commit to jaceksan/meltano that referenced this issue Jan 30, 2024
Closes issue meltano#8256

It may be a temporary workaround -
it may be removed once dbt-labs/dbt-core#8680 is merged.
jaceksan added a commit to jaceksan/meltano that referenced this issue Jan 30, 2024
Closes issue meltano#8256

It may be a temporary workaround -
it may be removed once dbt-labs/dbt-core#8680 is merged.
jaceksan added a commit to jaceksan/meltano that referenced this issue Jan 30, 2024
Closes issue meltano#8256

It may be a temporary workaround -
it may be removed once dbt-labs/dbt-core#8680 is merged.
jaceksan added a commit to jaceksan/meltano that referenced this issue Jan 30, 2024
Closes issue meltano#8256

It may be a temporary workaround -
it may be removed once dbt-labs/dbt-core#8680 is merged.
@edgarrmondragon
Copy link
Collaborator

FYI starting with dbt 1.8.0b3, it should be possible to run Meltano and dbt side by side in the same virtualenv, provided that Meltano is installed after dbt:

$ uv venv .venv --python python3.11
$ source .venv/bin/activate
$ (dbt-meltano) uv pip install 'dbt-core>=1.8.0b3'
$ (dbt-meltano) uv pip install meltano
$ (dbt-meltano) python -c 'from dbt.tracking import tracker'

@edgarrmondragon
Copy link
Collaborator

FYI starting with dbt 1.8.0b3, it should be possible to run Meltano and dbt side by side in the same virtualenv, provided that Meltano is installed after dbt:

$ uv venv .venv --python python3.11
$ source .venv/bin/activate
$ (dbt-meltano) uv pip install 'dbt-core>=1.8.0b3'
$ (dbt-meltano) uv pip install meltano
$ (dbt-meltano) python -c 'from dbt.tracking import tracker'

With this in mind, I'll close this issue. Feel free to comment if you're still encountering issues with the Snowplow tracker when dbt and Meltano are installed together.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants