This repository has been archived by the owner on Mar 30, 2023. It is now read-only.

[feature request] create a dashboard for clone data #288

Open
caniszczyk opened this issue Mar 8, 2021 · 7 comments

@caniszczyk

GitHub has this info available via their built-in dashboards, e.g., https://github.com/cncf/devstats/graphs/traffic

I don't know what the API looks like to pull this, but since we have data for stars and forks, maybe we add that to the dashboard: https://kubevirt.devstats.cncf.io/d/3/stars-and-forks-by-repository?orgId=1

Maybe we call it 'stars-forks-and-clones' ;)? Or make a separate one for just clones.

@lukaszgryglicki
Member

I'll research this on Friday, is this OK? We don't use the GitHub API in DevStats; we use GitHub archives data.

@caniszczyk
Author

caniszczyk commented Mar 9, 2021 via email

@lukaszgryglicki
Member

lukaszgryglicki commented Mar 10, 2021

I'm doing some research, but I'm fairly sure we don't have that data in GitHub archives (which is DevStats' data source). I created this issue/question/feature request in the meantime to confirm, and I'm now digging through several hundred megabytes of GHA JSONs to see whether any data format updates added this info.

@lukaszgryglicki
Member

I've checked a few huge JSONs with a few grep-like approaches (they're over 2.5G in size when converted from ndjson to correct JSON), and I don't see any data that makes this feature request possible. I'll also wait for feedback on my feature request/issue from the previous post.
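For anyone who wants to repeat this kind of check without converting the ndjson first, here is a minimal sketch of a grep-like scan. It is not the script used above: `summarize_events` and `_collect_keys` are hypothetical helper names, and the only assumption about the data is that each line is one JSON event with a top-level `type` field. It counts event types and collects every key whose name contains a search term, so the matches can be inspected by hand.

```python
import json
from collections import Counter


def _collect_keys(node, needle, found):
    """Recursively record every dict key whose name contains `needle`."""
    if isinstance(node, dict):
        for key, value in node.items():
            if needle in key.lower():
                found.add(key)
            _collect_keys(value, needle, found)
    elif isinstance(node, list):
        for item in node:
            _collect_keys(item, needle, found)


def summarize_events(lines, needle="clone"):
    """Scan ndjson lines; return (event type counts, matching key names)."""
    types = Counter()
    found = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        event = json.loads(line)
        types[event.get("type", "unknown")] += 1
        _collect_keys(event, needle, found)
    return types, found


# Usage against a downloaded hourly dump (the file name is hypothetical):
#   import gzip
#   with gzip.open("2021-03-10-12.json.gz", "rt", encoding="utf-8") as fh:
#       types, keys = summarize_events(fh)
```

A key such as `clone_url` can show up in the result; that is a repository URL, not a clone count, so a match on the key name alone does not mean clone statistics are present.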

All I can consider here is a hybrid approach: make DevStats also call the GitHub API to get this data. Even then, I can only get clones for the last 14 days (see API docs), so I won't be able to get any historical data.
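As a rough illustration of what the API side of the hybrid approach could look like, here is a minimal sketch against the GitHub REST traffic endpoint `GET /repos/{owner}/{repo}/traffic/clones`. It is not DevStats code: the function names are hypothetical, the token is a placeholder, and it assumes the token has the push access to the repository that this endpoint requires.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"


def build_clones_request(owner, repo, token, per="day"):
    """Build the request for the traffic/clones endpoint (per: 'day' or 'week')."""
    url = f"{API_ROOT}/repos/{owner}/{repo}/traffic/clones?per={per}"
    return urllib.request.Request(url, headers={
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {token}",
    })


def parse_clones(payload):
    """Map each bucket's timestamp to a (count, uniques) tuple."""
    return {
        bucket["timestamp"]: (bucket["count"], bucket["uniques"])
        for bucket in payload.get("clones", [])
    }


def fetch_clones(owner, repo, token):
    """Call the API and return the parsed buckets (last 14 days only)."""
    request = build_clones_request(owner, repo, token)
    with urllib.request.urlopen(request) as response:
        return parse_clones(json.load(response))
```

Something like `fetch_clones("cncf", "devstats", token)` would return one entry per day in the window. Nothing older than the 14-day window is available from this endpoint.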

Should I proceed with that hybrid approach, @caniszczyk? If so, it will take a rather long time, as it's something totally new to implement.

I'll hold until I get feedback on what to do.

@lukaszgryglicki
Member

So @caniszczyk, the GHA maintainer confirmed that GHA doesn't have that data, so the only possibility is the hybrid approach described here. Please let me know if we want to proceed that way. I don't think it's a really good approach, though: we cannot get the historical data, we're limited to 14 days, and we'd need to call the GitHub API and maintain tokens for a few thousand GitHub repos. This will be slow and actually goes against the typical DevStats approach.
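To make the 14-day limitation concrete: history could only be built up from the day collection starts, by polling regularly and merging each window into what was stored before. The sketch below is an assumption about how that merge might look, not anything DevStats does. `merge_daily_clones` is a hypothetical name, and both arguments are assumed to be dicts mapping an ISO 8601 day timestamp to a `(count, uniques)` tuple.

```python
def merge_daily_clones(history, window):
    """Merge a freshly fetched 14-day window into previously stored buckets.

    Buckets in `window` overwrite stored ones with the same timestamp, on the
    assumption that a recent day's counts may still have been growing when it
    was first stored.
    """
    merged = dict(history)
    merged.update(window)
    # ISO 8601 UTC timestamps sort chronologically as plain strings.
    return dict(sorted(merged.items()))
```

If polling ever lapses for more than 14 days, the days that fell out of the window in between are lost, which is the gap in historical data described above.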

@caniszczyk
Author

caniszczyk commented Mar 11, 2021 via email

@lukaszgryglicki
Member

OK.
