
Improve file size estimate #205

Merged
merged 4 commits into from
Oct 24, 2020

Conversation

sindresorhus
Copy link
Owner

@sindresorhus sindresorhus commented Oct 15, 2020

I tried out many different algorithms and variants, and in the end, found that this one produced the most accurate estimate:

  • Get all the normal frame time codes.
  • Pick 5 consecutive samples from 5 evenly distributed places in the video.
  • Convert with current settings.
  • Scale the result by the ratio of the original frame count to the number of frames used in the estimate.
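The steps above can be sketched roughly like this (a Python sketch for illustration only; Gifski is written in Swift, and `convert_sample` is a hypothetical callback standing in for the real converter):

```python
def estimate_file_size(frame_timecodes, convert_sample,
                       sample_count=5, samples_per_spot=5):
    """Estimate output size by converting short runs of consecutive
    frames taken from evenly distributed spots in the video.

    `convert_sample` converts a list of frame time codes with the
    current settings and returns the resulting size in bytes.
    """
    total = len(frame_timecodes)
    sampled = []
    for i in range(sample_count):
        # Evenly distributed starting points across the video.
        start = (i * total) // sample_count
        # Take a few consecutive frames from each spot, so inter-frame
        # compression behaves similarly to the real conversion.
        sampled.extend(frame_timecodes[start:start + samples_per_spot])
    sample_size = convert_sample(sampled)
    # Scale by the ratio of total frames to sampled frames.
    return sample_size * total / len(sampled)
```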

Fixes #41


Here is a build: Gifski - with estimate.zip

Please try it out on various videos and check the estimate against the final result. The most important part is that the estimate never shows less than the actual file size.

Note: The code and look are not done. It should be cleaned up a lot, but I would like to finalize the algorithm first.

Some potential optimization would be to run the 5 samples concurrently, but I don't plan to do that in this PR. I'll add a TODO comment.


Open questions:

  • Should we still show the naive estimate (the one we currently use) while the good estimate is being generated? Currently, the estimate takes 5–15 seconds. The downside of showing the naive one is that it can sometimes be woefully incorrect.

@kornelski
Copy link
Collaborator

Looks good.

I see you're adding 10% just in case. I have an idea to make it more scientific: measure frame sizes in bytes, and compute standard deviation of frame sizes. If the deviation is large, then the estimate is uncertain and should be inflated. If all frame sizes are about the same, then the estimate is likely to be accurate.
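The standard-deviation idea could look something like this (a hypothetical sketch, not code from this PR; the 10% base margin matches the flat fudge factor mentioned above):

```python
import statistics

def estimate_with_uncertainty(sample_frame_sizes, total_frames,
                              base_margin=0.10):
    """Estimate total size from sampled per-frame sizes (bytes),
    inflating the estimate more when frame sizes vary a lot."""
    mean = statistics.mean(sample_frame_sizes)
    stdev = statistics.pstdev(sample_frame_sizes)
    # Coefficient of variation: relative spread of frame sizes.
    # Zero when all frames compress to the same size.
    cv = stdev / mean if mean else 0.0
    # Inflate proportionally to the spread, on top of the base margin.
    return total_frames * mean * (1 + base_margin + cv)
```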

@kornelski
Copy link
Collaborator

Alternative solution, which I think we've discussed previously, is to start the actual final conversion in the background and use it for the estimate. When the user presses start, instead of restarting, just reuse the in-progress conversion. This will have the disadvantage of using frames from the beginning for the estimate, but OTOH it will make the conversion seem faster, since it gets a head start.

@sindresorhus
Copy link
Owner Author

sindresorhus commented Oct 16, 2020

I have an idea to make it more scientific: measure frame sizes in bytes, and compute standard deviation of frame sizes. If the deviation is large, then the estimate is uncertain and should be inflated. If all frame sizes are about the same, then the estimate is likely to be accurate.

That's a good idea. I'll try it out.

This will have the disadvantage of using frames from the beginning for the estimate

I tried that too and the estimate was much worse. We really need to evenly spread out samples to get an accurate estimate. I'd rather have a more accurate estimate over a slightly shorter conversion time.

@sunshinejr
Copy link
Contributor

@sindresorhus this is looking really good! I tried a few recordings from apps (this is mostly my use case for Gifski), and the file size was always a bit bigger, but not by much. E.g. I saw the naive estimate at 59 MB, which then updated to 35.1 MB, and it generated 34.9 MB. This is a really big improvement for me. And personally, I'd skip the "naive" one and maybe add a loader or an "Estimating file size" text?

@kornelski
Copy link
Collaborator

kornelski commented Oct 20, 2020

Regarding naive estimate:

  • Show a min-max range. "20MB-50MB" makes it clearer how imprecise it is.

  • Once you get any better estimate, remember the ratio between the naive and the better estimate. Apply that ratio to later naive estimates. This will give you decent estimates in real time for changing parameters.
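The ratio suggestion in the second bullet could be sketched like this (a hypothetical illustration; names are made up and this is not code from the PR):

```python
class CalibratedEstimator:
    """Remember the ratio between the last accurate estimate and the
    naive estimate at that moment, then apply it to later naive
    estimates so parameter changes update the shown size instantly."""

    def __init__(self):
        self.ratio = 1.0  # Uncalibrated: pass naive estimates through.

    def record_accurate(self, naive, accurate):
        # Called once the slow, sampled estimate finishes.
        if naive:
            self.ratio = accurate / naive

    def corrected(self, naive):
        # Cheap, instant estimate for the current settings.
        return naive * self.ratio
```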

@sindresorhus
Copy link
Owner Author

Show a min-max range. "20MB-50MB" makes it clearer how imprecise it is.

How should we get the range? Just expand the current naive estimate both ways to get a lower/upper bound, or do you have something more clever in mind?

Once you get any better estimate, remember the ratio between the naive and the better estimate. Apply that ratio to later naive estimates. This will give you decent estimates real time for changing parameters.

That's a good idea.

@kornelski
Copy link
Collaborator

For lower/upper guesstimate I had in mind plugging in different constants/assumptions into the algorithm.

@sindresorhus
Copy link
Owner Author

For lower/upper guesstimate I had in mind plugging in different constants/assumptions into the algorithm.

Let's continue this in #130.

@sindresorhus
Copy link
Owner Author

I don't have time to implement and test all these things right now, but I've opened an issue to track them: #211

@sindresorhus
Copy link
Owner Author

I think it's more important to get this out there now. A lot of people have complained about inaccurate estimates.

@sindresorhus sindresorhus merged commit e6c97bc into master Oct 24, 2020
@sindresorhus sindresorhus deleted the better-estimate branch October 24, 2020 14:19