Great Work! Also: Appearance Duration in Aggregated Data #1

Open
eric-langenberg opened this issue Sep 14, 2019 · 1 comment

Comments

@eric-langenberg

Great work on this project! It's a cool data set and an interesting project.

One small note on the aggregated data sets:

It looks like "appearance duration" is calculated by subtracting first_appearance from last_appearance. Depending on what you want to do with the data, that might not be the best definition, since stories sometimes pop out of and back into the trending data set. (This phenomenon seems to explain high outliers in the "appearance duration" column.)

Another way of measuring "duration" would be to calculate the duration of each row of the original data collected (rows seem to be collected roughly every 5 minutes, but not always), and then sum those durations for each unique story. I suspect this would generally be a more meaningful way to measure duration.
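
A rough sketch of the summed-duration idea, in case it helps. This is not the project's code: the function name `summed_durations`, the `(collected_at, story_id)` row shape, and the sample data are all made up for illustration. It assumes each collection snapshot "covers" the time until the next snapshot, and that the final snapshot covers a default 5 minutes.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summed_durations(rows, default_interval=timedelta(minutes=5)):
    """Sum per-snapshot durations for each story.

    rows: iterable of (collected_at, story_id) pairs (hypothetical layout).
    """
    rows = set(rows)  # drop exact duplicate rows so they are not counted twice
    times = sorted({collected_at for collected_at, _ in rows})

    # Each snapshot is credited with the gap until the next snapshot.
    span = {t: nxt - t for t, nxt in zip(times, times[1:])}
    if times:
        # Assumption: the last snapshot has no successor, so use the default.
        span[times[-1]] = default_interval

    totals = defaultdict(timedelta)
    for collected_at, story_id in rows:
        totals[story_id] += span[collected_at]
    return dict(totals)


# Made-up example: snapshots at 0, 5, 10, 17 and 22 minutes.
t0 = datetime(2019, 9, 14, 12, 0)
snapshots = [t0 + timedelta(minutes=m) for m in (0, 5, 10, 17, 22)]

# Story "b" is present in every snapshot; story "a" drops out and comes back.
rows = [(t, "b") for t in snapshots] + [(snapshots[i], "a") for i in (0, 1, 4)]

totals = summed_durations(rows)
print(totals["a"])  # 0:15:00
print(totals["b"])  # 0:27:00
```

With last_appearance minus first_appearance, both stories would get 22 minutes, even though "a" was missing from two of the five snapshots. The summed version separates them. How to treat the last snapshot and any long collection gaps is a judgment call; capping each gap at some maximum would be one option.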

But I don't think this metric was very important for the analyses you were running, so it's no big deal. Great work on this project.

@jackbandy
Collaborator

Hey Eric, I just got around to seeing this note. You are absolutely right that the time subtraction is probably not ideal for all analyses, and I will be mindful of this in any future analyses. Thanks for pointing this out.
