Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SVG visualisation breaks on dependencies between tasks with the same name #3249

Open
ewallace-RI opened this issue Aug 26, 2023 · 6 comments

Comments

@ewallace-RI
Copy link
Contributor

I recently updated from 3.0.0 to 3.3.0 and found that in the SVG visualiser, my existing graphs are either hopelessly scrambled or fail to render anything except the top task, which turns out to be due to changes made in this PR. There are two basic issues:

  1. Visualisation is broken in the presence of dependencies between tasks with the same name, which the original PR intended to have fall back to the old style. There are two failure modes (details below) - either the same-name dependency is not recognized and the new style gives a nonsensical representation, or it is recognized, and the fallback to the old style fails to render anything.

  2. The new style of visualisation is much less readable in my use cases. I can see how it's an improvement for the example case in the PR, but for a graph where many different tasks with different names are expected to run concurrently, forcing each task onto a separate vertical layer (especially with randomized horizontal placement) makes the graph extremely difficult to parse visually. There should at least be an option to revert to the previous style, independent of any automatic behavior.

The two failure modes in point 1 depend on whether two tasks with the same name depend on each other directly, or only indirectly, separated by other tasks. The indirect case might look something like this:

import luigi

class TestTask(luigi.Task):
    def run(self):
        with self.output().open('w') as f:
            pass

    def output(self):
        return luigi.local_target.LocalTarget(path=f'{self.task_id}.done')



class Foo(TestTask):
    def requires(self):
        return Bar(second_time=True)


class Bar(TestTask):
    second_time = luigi.BoolParameter()
    def requires(self):
        if self.second_time:
            return Baz()

class Baz(TestTask):
    def requires(self):
        return Bar()

Here, we want to run Bar once, then run Baz, then run Bar again, then run Foo. The old style of visualization correctly renders this as a straight line:
image

In the current version, Bar(second_time=False) is not recognized as a dependency of Bar(second_time=True), and they are forced to the same depth, breaking the directedness of the graph:
image

In the case of a direct dependency, it gets worse. If we remove the intermediary Baz() task and let Bar(second_time=True) directly require Bar(second_time=False), like so:

import luigi

class TestTask(luigi.Task):
    def run(self):
        with self.output().open('w') as f:
            pass

    def output(self):
        return luigi.local_target.LocalTarget(path=f'{self.task_id}.done')



class Foo(TestTask):
    def requires(self):
        return Bar(second_time=True)


class Bar(TestTask):
    second_time = luigi.BoolParameter()
    def requires(self):
        if self.second_time:
            return Bar()

we get this:
image

Here, the same-name dependency is recognized, triggering the intended fallback to the old behavior, which computes the positions of all child tasks as (NaN, NaN):

image

@starhel
Copy link
Contributor

starhel commented May 7, 2024

I have encountered the same issue - I have tasks that iteratively trigger themselves (aggregating data in 7-day buckets -> 28-day buckets -> N-day buckets). @dlstadther would there be a chance for you to take a look at it when you have a moment (as reviewer of #3122)?

@dlstadther
Copy link
Collaborator

Based on this issue, i understand there to be rendering issues with the SVG visualization here in Luigi. I have no insight to provide towards a solution. I welcome any contributions to the project and I'll do my best to provide reviews (though time has been in short supply for a few years now as my day job no longer uses Luigi).

@ewallace-RI
Copy link
Contributor Author

To my mind, the solution would simply be to revert the changes from #3122, which broke the existing functionality for the sake of someone's aesthetic preference in their use case. Given that the change had been in place for 2 years by the time I upgraded and noticed it, I figured that would be an unpopular solution, and opted to just overwrite that .js file in my local installation.

I didn't see an obvious way to make the fallback to the old behavior in the presence of dependencies between same-named tasks work while keeping the new behavior otherwise, and since the new behavior is useless to me, I didn't have much motivation to put more effort into finding a fix that would preserve it.

@dlstadther
Copy link
Collaborator

@joooeey , being the author of the PR which is causing issues for the use cases mentioned above by @ewallace-RI and @starhel , do you see a clear path to resolution?

@starhel
Copy link
Contributor

starhel commented May 9, 2024

I've fixed case with (NaN, NaN) in; #3287
For now I have no time to find solution for non-directly dependent tasks.

@joooeey
Copy link
Contributor

joooeey commented May 10, 2024

I don't use Luigi anymore, so I don't have the time to contribute a tested solution. However, I can see that this line in the visualizer code only checks for direct dependencies to tasks of the same name. To also capture indirect dependencies, the $.each loop would have to be adjusted to build up an array of known names. Then child_node.name can be compared with that array.

#3287 is also necessary and it should be checked how that PR interacts with the change I propose in the previous paragraph.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants