Change order in which blocks are erased and rewritten #144

Open
LegacyNsfw opened this issue Feb 29, 2020 · 6 comments

Comments

@LegacyNsfw
Owner

The current approach works, but a different approach would be more resilient to interruptions.

The guidance below is from PeteS / Loud160 (of LS Droid fame).

Write the boot block 1st and then write the next higher chip segment, increasing until you get to the end. The last segment of the chip should be the last thing you write.

For calibration you should write the $8000 range 1st and then the upper block after that.
If you write the boot block and the rest of the PCM is blank, the PCM will still boot and can be flashed over the DLC.

So once the boot block is written the PCM will be able to boot at any point regardless of how much of the chip is flashed.

This is assuming you erase the entire chip in 1 pass. If you have 1/2 of 1 OS and 1/2 of another it's not gonna boot.

But if you wipe the entire chip and write just the boot block you're fine.
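
A minimal sketch of the ordering described above, assuming the boot block sits at the lowest address so that sorting by start address puts it first and the top block last. The `Block` type, the block names, and the addresses are hypothetical and only illustrate the idea; they are not taken from the app's code.

```python
from typing import NamedTuple


class Block(NamedTuple):
    name: str
    start: int  # first address of the block
    size: int   # length in bytes


def full_write_order(blocks):
    """Lowest address first, so the boot block leads and the top block is last."""
    return sorted(blocks, key=lambda b: b.start)


def recoverable(written_names):
    """Per the guidance above: on a wiped chip, a written boot block is enough."""
    return "boot" in written_names


# Hypothetical layout for illustration only; check the real flash map before use.
layout = [
    Block("os_top", 0x70000, 0x10000),
    Block("boot", 0x00000, 0x04000),
    Block("os_low", 0x20000, 0x10000),
    Block("calibration", 0x08000, 0x18000),
]
order = [b.name for b in full_write_order(layout)]
print(order)
```

With this order, an interruption after any completed write still leaves the boot block in place.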

@antuspcm
Collaborator

antuspcm commented Feb 29, 2020

I'd suggest we stage an OS flash and go from low to high, skipping the boot block for an OS flash. Then we have another class of flash for the boot block, and do it last. So full flash becomes something like:

  1. erase and rewrite cal
  2. erase and rewrite OS
  3. erase and rewrite boot

Steps 1 and 2 would be ordered from low to high blocks; the boot block is a single block.

This way we get the suggested advantage of the check bytes at the end of the CAL and OS ranges functioning properly as PeteS describes, but if someone has insufficient bench voltage for the erase or something goes wrong with the first segment attempt, the boot sector remains intact. This way the user has the option to short the pins and trigger recovery mode and try again. Most people won't need this - some major commercial software erases the whole chip and then does the flash - but if it saves a couple of people, I think it's worth it.
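
As a rough sketch of that staging, the plan could be built as a list of (operation, address) pairs. The function names and the addresses below are hypothetical; the only thing taken from the proposal is the order: calibration, then OS, each low to high, then the single boot block.

```python
def erase_and_write(blocks):
    """Expand a list of block addresses into (operation, address) pairs."""
    return [(op, addr) for addr in blocks for op in ("erase", "write")]


def staged_full_flash(cal_blocks, os_blocks, boot_block):
    """Calibration, then OS (each low to high), then the single boot block."""
    plan = []
    plan += erase_and_write(sorted(cal_blocks))
    plan += erase_and_write(sorted(os_blocks))
    plan += erase_and_write([boot_block])
    return plan


# Hypothetical addresses, chosen only to make the ordering visible.
plan = staged_full_flash(
    cal_blocks=[0x08000],
    os_blocks=[0x30000, 0x20000],
    boot_block=0x00000,
)
for op, addr in plan:
    print(f"{op:5s} 0x{addr:05X}")
```

If the first erase fails, nothing in the plan has touched the boot block yet, which is the property the proposal is after.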

@BarryHuffman

BarryHuffman commented Jul 8, 2021

Optionally, if there's space, one could mirror the "boot" region to the end of the flash.

This is how recovery mode is done with GPT, the replacement for MBR tables. In the event that the first boot region is messed up, you can check the second region and, if it is valid, copy it back over the first.
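
A small sketch of that primary/backup check, assuming each copy can be validated against a known checksum. CRC32 is used here only as a stand-in; how the PCM actually validates a boot region is not stated in this thread, and `recover_boot` is a hypothetical helper.

```python
import zlib


def is_valid(region: bytes, expected_crc: int) -> bool:
    """Stand-in validity check; the real PCM check is an unknown here."""
    return zlib.crc32(region) == expected_crc


def recover_boot(primary: bytes, backup: bytes, expected_crc: int):
    """Return (image to use, whether the primary copy needs restoring)."""
    if is_valid(primary, expected_crc):
        return primary, False
    if is_valid(backup, expected_crc):
        return backup, True   # restore the primary from the mirrored copy
    return None, False        # neither copy is usable


good = bytes(range(16))
crc = zlib.crc32(good)
image, needs_restore = recover_boot(b"\xff" * 16, good, crc)
```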

@LegacyNsfw
Owner Author

That's an interesting idea.

One of the other things Pete suggested was to NOT write the top block first (which we currently do) because it contains a signature that the PCM checks during boot. If the signature is valid, the PCM thinks all is well... but if the flash was interrupted, we're in the "half of one OS, half of another" state that he mentioned. Whereas if that block is erased but not yet written, the PCM boots into recovery mode. So we should really erase that block early but only write it as the last step.

So, integrating that with Antus' proposal to do the calibration first as a sanity check:

  1. erase and write calibration
  2. erase the top block
  3. erase and write new OS blocks, starting from the top-1 block and working down to the boot block
  4. write the top block
  5. reboot
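
To make the interruption behaviour of these five steps easier to reason about, here is a sketch that builds the plan and replays a prefix of it. The block names are hypothetical, and `lower_blocks` is simply every non-calibration block below the top block in ascending order; whether that includes the boot block is left to the caller.

```python
def five_step_plan(cal_blocks, lower_blocks, top_block):
    plan = []
    for b in cal_blocks:                   # step 1: erase and write calibration
        plan += [("erase", b), ("write", b)]
    plan.append(("erase", top_block))      # step 2: erase the top block
    for b in reversed(lower_blocks):       # step 3: top-1 block downward
        plan += [("erase", b), ("write", b)]
    plan.append(("write", top_block))      # step 4: write the top block
    plan.append(("reboot", None))          # step 5
    return plan


def top_block_written(plan, top_block, steps_done):
    """Replay the first `steps_done` operations; is the top block holding data?"""
    written = True  # the old image is still in place before the flash starts
    for op, block in plan[:steps_done]:
        if block == top_block:
            written = (op == "write")
    return written


plan = five_step_plan(["cal"], ["boot", "os1", "os2"], "top")
```

Between step 2 and step 4 the replay reports the top block as blank, which is the window in which an interrupted flash should land in recovery mode rather than boot a mixed image.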

Now, adding the two-step boot-block proposal...

The boot block is 16kb and the upper blocks are 64kb (AMD flash chip) or 128kb (Intel flash chip), so we could put the boot block into the top block and just leave the rest as 00 or FF. If the write gets interrupted in that state, the signature at the top of the flash would be invalid and the PCM would boot into recovery mode.
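
The padding itself is simple arithmetic. A sketch, using the sizes quoted above and a hypothetical helper name:

```python
BOOT_BLOCK_SIZE = 16 * 1024  # boot block size quoted above


def pad_boot_image(boot: bytes, block_size: int, fill: int = 0xFF) -> bytes:
    """Place the boot image at the start of a top-block-sized buffer.

    The remainder is filled with `fill` (00 or FF), so the signature
    area at the top of the block stays invalid.
    """
    if len(boot) != BOOT_BLOCK_SIZE:
        raise ValueError("boot image must be exactly 16 KB")
    if block_size < BOOT_BLOCK_SIZE:
        raise ValueError("target block is smaller than the boot block")
    return boot + bytes([fill]) * (block_size - BOOT_BLOCK_SIZE)


amd_image = pad_boot_image(bytes(BOOT_BLOCK_SIZE), 64 * 1024)
intel_image = pad_boot_image(bytes(BOOT_BLOCK_SIZE), 128 * 1024, fill=0x00)
```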

So the sequence would be:

  1. erase and rewrite calibration
  2. write the boot block into the top block (padded with 00 or FF)
  3. write new OS blocks, starting from the top-1 block and working downward
  4. when we reach the boot block, just tell kernel to copy the first 16kb from the top block
  5. poll until the kernel says it's done
  6. erase and rewrite the top block
  7. reboot
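
Step 5 needs a polling loop with a timeout so the app does not wait forever if the kernel never reports completion. A sketch, where `is_done` stands in for whatever request the app would send to the kernel (that message does not exist yet and is an assumption here):

```python
import time
from typing import Callable


def poll_until_done(
    is_done: Callable[[], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Call `is_done` until it returns True or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False  # caller decides whether to retry or abort
        sleep(interval_s)


# Simulated kernel that reports completion on the third query.
responses = iter([False, False, True])
finished = poll_until_done(lambda: next(responses), sleep=lambda s: None)
```

The timeout and interval values are placeholders; the right numbers depend on how long the kernel takes to copy 16kb.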

I'm more inclined to ship this in stages:

First do the 5-step process, release it, confirm that there are no surprises. (And before release, confirm that the PCM boots into recovery mode if the flash process is interrupted.) This is just a matter of changing the sequence of steps during the flash process, so it's not a huge change.

Then do the 7-step process. After the 5-step stuff has been implemented, adding the extra steps for the boot block should also be a fairly small change. But it will require new kernel code.

@antuspcm
Collaborator

I would suggest not using flash to hold a temporary copy of the boot block, for 2 reasons.

  1. It adds to the wear and tear on the flash chip. The newer flash chips have much better wear life, but word of mouth is that the original early P01s have poor life and flashing is much more dangerous. The idea might be to make flashing safer, but since it doesn't seem to be much of a problem in the first place we might cause more bricks than we save.

  2. It complicates the kernel code further, and the app code too. Especially if we expect to support V6 PCMs later, I think the code complexity isn't worth it.

If we really wanted to do something like this we could copy the first 4k of the boot block (which would be enough for a recovery boot, I think) into RAM and copy it out of RAM at write time. But I still think the rewards don't outweigh the negatives. My preference would be to keep it simple. Perhaps boot block 2nd to last, and final OS block last. If the boot block fails there isn't going to be anything to test the signature in the OS block anyway on boot.
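
The wear argument can be made concrete by counting erase cycles per block in each plan. This is a sketch with hypothetical block names; it assumes the top block must be erased before the temporary boot copy is written to it, since flash has to be erased before it is rewritten.

```python
from collections import Counter


def pairs(blocks):
    return [(op, b) for b in blocks for op in ("erase", "write")]


def simple_plan(cal, os_blocks, boot):
    """Boot block 2nd to last, final OS block last."""
    return pairs(list(cal) + list(os_blocks[:-1]) + [boot, os_blocks[-1]])


def mirrored_plan(cal, os_blocks, boot):
    """Boot block staged in the top block, then the top block rewritten."""
    top = os_blocks[-1]
    plan = pairs(cal)
    plan += pairs([top])                        # temporary boot copy
    plan += pairs(reversed(os_blocks[:-1]))     # top-1 block downward
    plan += pairs([boot])                       # kernel copies from top block
    plan += pairs([top])                        # final top block contents
    return plan


def erase_counts(plan):
    return Counter(block for op, block in plan if op == "erase")


cal, os_blocks, boot = ["cal"], ["os1", "os2", "os_top"], "boot"
simple = erase_counts(simple_plan(cal, os_blocks, boot))
mirrored = erase_counts(mirrored_plan(cal, os_blocks, boot))
```

Under these assumptions the mirrored plan erases the top block twice per flash, while the simple plan erases every block once.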

@LegacyNsfw
Owner Author

Good points.

@BarryHuffman

BarryHuffman commented Jul 24, 2021

Especially if we expect to support V6 PCMs

👀 If that's on the roadmap, I have a couple V6 PCM vehicles to test 👨‍🔬
