dvc metrics diff --all: on same branch is empty #10429

Open
MaximilianTunk opened this issue May 16, 2024 · 2 comments
Labels
A: metrics Related to dvc metrics bug Did we break something? p3-nice-to-have It should be done this or next sprint

Comments

@MaximilianTunk

Bug Report

Description

Hello,
we found that dvc metrics diff --all outputs nothing if a_rev and b_rev refer to the same git commit. This happens whether the two arguments are exactly the same or are different types of reference (HEAD vs. branch_name, etc.).

Reproduce

  • Set up any DVC stage with metrics.
  • Run dvc metrics diff --all $(git rev-parse --abbrev-ref HEAD) HEAD

Expected

A metrics diff table that lists all values, each with diff = 0.0.

Environment information

Output of dvc doctor:

-------------------------
Platform: Python 3.8.10 on Linux-6.5.0-28-generic-x86_64-with-glibc2.34
Subprojects:
        dvc_data = 2.22.6
        dvc_objects = 1.4.9
        dvc_render = 0.7.0
        dvc_task = 0.4.0
        scmrepo = 1.6.0
Supports:
        http (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
        https (aiohttp = 3.9.3, aiohttp-retry = 2.8.3),
        s3 (s3fs = 2024.3.1, boto3 = 1.34.51)
Config:
        Global: /home/mtunkowitsch/.config/dvc
        System: /etc/xdg/xdg-ubuntu/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/nvme0n1p6
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/nvme0n1p6
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/6a15ed68432f0e8e4dba7e407082545a

Additional Information (if any):

While running the debugger, we noticed that metrics/diff.py:diff expects the results of metrics.show() to be keyed by the exact revs that were passed in. However, metrics/show.py:show uses the brancher to determine which revs to use, and the brancher groups revs that have the same SHA and joins their names.

This means that when we call dvc metrics diff --all main main, the brancher groups main and main and returns main,main.
Hence repo.metrics.show() outputs all metrics under the key main,main, and repo.metrics.diff() doesn't find any results for main.
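
The mismatch described above can be illustrated with a small, self-contained Python model. This is only a sketch of the behaviour we observed, not DVC's actual code: group_revs_by_sha, toy_show and toy_diff are hypothetical stand-ins for the brancher, metrics.show() and metrics.diff(), and the SHAs and metric values are made up.

```python
from collections import defaultdict


def group_revs_by_sha(revs, resolve):
    """Toy brancher: revs resolving to the same SHA share one comma-joined key."""
    by_sha = defaultdict(list)
    for rev in revs:
        by_sha[resolve(rev)].append(rev)
    return [",".join(names) for names in by_sha.values()]


def toy_show(revs, resolve, metrics_at):
    """Toy metrics.show(): results are keyed by the grouped rev name."""
    shown = {}
    for key in group_revs_by_sha(revs, resolve):
        first_name = key.split(",")[0]
        shown[key] = metrics_at(resolve(first_name))
    return shown


def toy_diff(a_rev, b_rev, shown):
    """Toy metrics.diff(): looks each rev up by the exact name it was given."""
    old = shown.get(a_rev, {})
    new = shown.get(b_rev, {})
    return {name: new[name] - old[name] for name in old.keys() & new.keys()}


# Made-up repository state: main and HEAD point at the same commit.
SHAS = {"main": "abc123", "HEAD": "abc123", "dev": "def456"}
METRICS = {"abc123": {"acc": 0.75}, "def456": {"acc": 0.5}}

# Same commit on both sides: the only key is "main,main", so the lookup misses.
same = toy_show(["main", "main"], SHAS.get, METRICS.get)
empty = toy_diff("main", "main", same)

# Different commits: each rev keeps its own key, so the diff is found.
different = toy_show(["dev", "main"], SHAS.get, METRICS.get)
ok = toy_diff("dev", "main", different)

print(list(same), empty, ok)
```

In this model the diff is empty whenever both revs resolve to the same SHA, which matches what we see with the real command. Any fix would presumably have to map the joined key back to the individual rev names, or avoid the grouping for this code path.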

@shcheklein shcheklein added the A: metrics Related to dvc metrics label May 19, 2024
@dberenbaum dberenbaum added bug Did we break something? p3-nice-to-have It should be done this or next sprint labels May 20, 2024
@dberenbaum
Contributor

@MaximilianTunk Great research! Since you have already dug this far, do you think you would be able to contribute a fix?

@MaximilianTunk
Author

@dberenbaum Thank you :) I'm interested in contributing a fix, but I have to admit it will be my first contribution ever to an open source GitHub project. I have a really nice colleague who is available to help whenever I have questions, so fixing it shouldn't be a problem. If I have any questions or updates about possible solutions, I'll add a comment to this thread!
