Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Boot nor fallback #1244

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from
Draft

Boot nor fallback #1244

wants to merge 2 commits into from

Conversation

esden
Copy link

@esden esden commented Sep 1, 2023

This PR adds a special handler that allows bootloader fallback updates on nor (aka. QSPI) flash.

The first patch implements this feature using the current version of flashcp. It will work on systems that check the full integrity of the bootloader before trying to load it. It will not work right on systems that only check the bootloader header. This is where the second patch comes in. It uses a new option in flashcp that delays writing of a specified amount of bytes till the very end.

I have written and submitted the necessary patch to mtd-utils project.

The new slot type is called boot-nor-fallback and we pass two mtd devices to the handler by separating them with a colon :. First device is the primary and the second device is the secondary. Using two mtd devices avoids us having to also add the ability to write at an offset to flashcp.

Resolves: #1235

Signed-off-by: Piotr Esden-Tempski <[email protected]>
@ejoerns ejoerns added the enhancement Adds new functionality or enhanced handling to RAUC label Sep 1, 2023
@jluebbe
Copy link
Member

jluebbe commented Sep 1, 2023

Is this nor-specific or could it also work on other flashes (like NAND)?

I'm not sure yet if overloading the device parameter with ':' is a good idea. Everywhere else in RAUC it's just a single path. An alternative would be to allow an additional option to specify the secondary MTD.

@esden
Copy link
Author

esden commented Sep 1, 2023

I think it should be able to use all the flashes supported by the mtd subsystem. I do think that flashcp can write to all of them including NAND. I just have QSPI to test with though.

Yeah having an extra device_fallback config variable would be a good solution. I just wanted to keep the patch rather less than more invasive. And adding a config variable means adding things to the config struct and I did not want to touch it. :)

If you want me to change to the use of a separate variable I am happy to do it.

@esden
Copy link
Author

esden commented Sep 13, 2023

As a heads up. The patch for flashcp that this feature requires is now merged up stream: http://lists.infradead.org/pipermail/linux-mtd/2023-September/101077.html

@jluebbe jluebbe self-assigned this Sep 19, 2023
@jluebbe jluebbe self-requested a review September 19, 2023 07:42
Copy link
Member

@jluebbe jluebbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs documentation:

"Destination device '%s' not found", plan->target_slot->device);
return FALSE;
if (g_strcmp0(plan->target_slot->type, "boot-nor-fallback") == 0) {
devices = g_strsplit(plan->target_slot->device, ":", -1);
Copy link
Member

@jluebbe jluebbe Sep 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This g_strsplit should not be at multiple places. Instead, the handling for multiple devices should be explicit in the RaucSlot struct and should be split during parsing in config_file.c. For consistency, it should use g_key_file_get_string_list, which splits at ;. If only one device is given, it should behave as before, storing it in the gchar *device. Otherwise, a new gchar **devices would be used and device would be NULL.

This way, it's explicit if one or multiple devices are configured. Then, we'd need a consistency check at the end of load_config() which ensures that only boot-nor-fallback passes two devices. All others would need only one.

"fallback"
};
int first_part_desc_index = 1;
const gsize header_size = 512; /* TODO: this value should be settable through the system config file. */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this needs to be configurable.

My suggestion would be a header-size field in the system config and a guint64 header_size in RaucSlot, similar to how region_size works.

*
* We are only checking the header area, as it is the last thing written.
*/
res = check_if_area_is_clear(devices[0], 0, header_size, &primary_clear, &ierror);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compared to block devices, NOR flash doesn't have atomic writes for 512 byte blocks. That results in some additional cases that need to be handled.

  • Power loss while writing the primary header (after successfully writing the secondary image)
  • Primary header is incomplete, but not clear
  • Secondary image is used for booting
  • Second update attempt erases secondary image, as primary header is not clear ⇒ temporarily no valid image

This means we'd need a better check to detect if the header is incomplete/corrupt. For simple cases, it might be enough if the last byte of the header region is != 0.

  • Power loss while writing the primary header (after successfully writing the secondary image)
  • Primary header is incomplete, but seems valid to the ROM code
  • Boot fails

This depends on the SoC: For headers with a proper checksum, this shouldn't be a problem. If not it probably needs SoC specific research on how it actually behaves for partially written headers.

An alternative solution would be to keep some state in the RAUC data partition, to be able to detect the different cases.

If these corner-cases are acceptable in your use case, I'd also be OK with only the current approach of checking if the header is clear. In that case, there needs to be a big warning in the documentation, though.

@jluebbe jluebbe assigned esden and unassigned jluebbe Oct 24, 2023
@jluebbe jluebbe marked this pull request as draft October 24, 2023 10:00
@jluebbe
Copy link
Member

jluebbe commented Oct 24, 2023

@esden What's your current status for this PR? We don't mind keeping it open for as long as you want to work on it.

@daterdots
Copy link

@jluebbe - @esden and I are talking about this. We hope to get it into good mergable shape soon, but we need at least until February 2024 if that's OK with you. Other stuff is slamming us.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adds new functionality or enhanced handling to RAUC
Projects
None yet
Development

Successfully merging this pull request may close these issues.

boot fallback slot type for nor flash
4 participants