Make the build reproducible #182

ia0 · 2020-10-09T10:20:40Z

git clone && ./setup.sh && ./deploy should be reproducible at any time for a given commit. This is currently not the case, at least because we don't commit a Cargo.lock.

This is blocking #151. The workflow is currently disabled by #181.


jmichelp · 2020-10-12T07:41:10Z

This means we should really go for two branches:

On the dev branch, the workflow should reset and compile twice in a row and check that the hashes are identical. A mismatch would mean that randomly generated data ends up in the firmware.
On master, we should only merge from dev, with a workflow on merge that generates a Cargo.lock file and computes the hashes.
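
The "compile twice and compare hashes" check could be sketched roughly as follows. This is only an illustration, not the project's workflow: `build` is a hypothetical callable that resets the tree, compiles, and returns the path of the produced firmware image, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from pathlib import Path
from typing import Callable


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def builds_match(build: Callable[[], Path], runs: int = 2) -> bool:
    """Run `build` `runs` times and report whether all artifacts hash the same.

    `build` is a hypothetical callable (an assumption of this sketch) that
    resets the working tree, compiles, and returns the artifact path.
    """
    # Each artifact is hashed right after its build, so `build` may reuse
    # the same output path on every run.
    digests = {sha256_of(build()) for _ in range(runs)}
    return len(digests) == 1
```

A check like this only catches nondeterminism within a single point in time; it does not detect drift caused by dependencies or the compiler changing between two dates.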

WDYT?

ia0 · 2020-10-12T08:52:54Z

This issue is not about a problem in the continuous integration setup or in the test itself. Both are working fine, although they could be improved as you describe. This issue is about the build not being reproducible, as indicated in the title and the description.

Maybe the definition of "the build is reproducible" needs to be clarified to explain the problem and how to check it.

A build is reproducible with time window T (e.g. 1 year) if the following property holds: For any point t in time, for all commits c in the repository, if the time of the commit tc is between t - T and t then build(t, c) = build(tc, c) where build(t, c) builds commit c at point t in time.

Because we can only run build(now, c) with now the current time, we can't check the property as stated. So we have to keep artifacts like build(ta, c) for each commit c where ta is the time when the artifact was built (usually a bit before the time of the commit tc). Then we can test that the build is reproducible with the following steps: Let now be the current time, for all commits c in the repository, if the time of the artifact ta is between now - T and now then build(now, c) should be equal to the artifact build(ta, c).
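
The test described above can be sketched as a small function. The names here (`Artifact`, `non_reproducible`, `rebuild`) are hypothetical and exist only for illustration; `rebuild` stands for build(now, c) and is assumed to return a digest of the freshly built output.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Artifact:
    commit: str         # commit identifier c
    built_at: datetime  # ta, the time the artifact was built
    digest: str         # digest of build(ta, c)


def non_reproducible(
    artifacts: Iterable[Artifact],
    rebuild: Callable[[str], str],
    window: timedelta,
    now: datetime,
) -> List[str]:
    """Return the commits whose rebuild at `now` differs from the stored artifact.

    Only artifacts with now - window <= built_at <= now are checked;
    older artifacts fall outside the time window T and are skipped.
    """
    failures = []
    for artifact in artifacts:
        in_window = now - window <= artifact.built_at <= now
        if in_window and rebuild(artifact.commit) != artifact.digest:
            failures.append(artifact.commit)
    return failures
```

An empty result means the build is reproducible for every stored artifact inside the window; any returned commit is a counterexample.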

This test is currently failing as demonstrated in #180 which does a no-op change (which means that the commit is equal to its previous commit and thus both can be used interchangeably). With the notations above we have PR = master and build(t_PR, PR) != build(t_master, master), i.e. build(t_PR, master) != build(t_master, master) with t_PR - t_master < T for any reasonable T.

gendx · 2020-10-12T16:19:17Z

I don't think this level of formalism is necessary to understand the issue. It's rather clear that, over time, changes in any of the dependencies affect reproducibility, and therefore we can't achieve reproducibility over time without pinning to a Cargo.lock. Likewise, changes in the compiler affect reproducibility, and that's why we pin a compiler version.

So I think pinning a Cargo.lock and a compiler version are reasonable measures, with an extra workflow running as a cron job to notify when dependencies are outdated, or when the compiler is outdated. For the compiler, because a new nightly exists pretty much every day, it would probably be more reasonable to be notified when it's more than N days old (as opposed to "not the latest").
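
As a rough sketch of the "more than N days old" check for the compiler: the functions below assume the pinned toolchain is written in the plain `nightly-YYYY-MM-DD` form, and both function names are hypothetical. Names with a target suffix, or channels such as `stable`, are deliberately treated as "age unknown".

```python
from datetime import date, datetime
from typing import Optional


def nightly_age_days(toolchain: str, today: date) -> Optional[int]:
    """Return the age in days of a `nightly-YYYY-MM-DD` toolchain name.

    Returns None when the name is not in that exact form (an assumption
    of this sketch; other toolchain spellings are not handled).
    """
    prefix = "nightly-"
    if not toolchain.startswith(prefix):
        return None
    try:
        pinned = datetime.strptime(toolchain[len(prefix):], "%Y-%m-%d").date()
    except ValueError:
        return None
    return (today - pinned).days


def is_outdated(toolchain: str, today: date, max_age_days: int) -> bool:
    """Report whether the pinned nightly is more than `max_age_days` old."""
    age = nightly_age_days(toolchain, today)
    return age is not None and age > max_age_days
```

A cron workflow could call `is_outdated` with the pinned name and the chosen N, and open a notification only when it returns true.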

WDYT?

ia0 · 2020-10-12T16:49:13Z

> So I think pinning a Cargo.lock and a compiler version

Yes, we should do that.

> an extra workflow running as a cron job to notify when dependencies are outdated, or when the compiler is outdated

This improvement is orthogonal to the reproducibility problem. However, a cron job that tests the reproducibility of old commits (maybe just some tags, to limit computational cost) would be useful to track reproducibility issues.

gendx · 2020-10-14T15:46:31Z

> However a cron job that tests the reproducibility of old commits (maybe just some tags to avoid computational costs) would be useful to track reproducibility issues.

This could be useful. However, if an old commit X turns out not to be reproducible, it will be non-reproducible forever and we cannot fix it. The only option is to create a different commit Y.

ia0 · 2020-10-14T15:56:31Z

Yes, and that would work well with a release branch process. The cron job would check the last commit of every release branch. If a release is not reproducible (for example, because the retention period of the nightly compiler that was used has expired), then we push a commit that updates the compiler version for that release branch.

kaczmarczyck · 2023-12-14T12:35:41Z

#667 suggests having binary releases. We might want to revisit reproducibility if we go for it.

ia0 assigned ia0 and gendx and unassigned ia0 Oct 9, 2020

jmichelp added the enhancement New feature or request label Oct 12, 2020

ia0 added bug Something isn't working and removed enhancement New feature or request labels Oct 12, 2020

gendx removed their assignment Oct 12, 2020

ia0 mentioned this issue Oct 12, 2020

Disable reproducible workflow #181

Merged

kaczmarczyck mentioned this issue Dec 14, 2023

Suggested environment to be able to build/compile #667

Closed
