Data loss with file Hardlinks on NTFS when overwriting files #715
Comments
This is a somewhat grey area. If we think of the "overwrite" action as "remove the existing file and create a new one from scratch" then yes, the connection should be broken.
I agree. However I have always thought that for the purposes of file ops it is "remove the existing file and create a new one from scratch", whereas for opening / editing a file with a tool, it would be "empty the existing file and write a new data stream into it". So FAR's copy/extract routine is definitely targeting the first case and thus it seemed natural to me it should behave accordingly.
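To make the two readings of "overwrite" concrete, here is a minimal Python sketch (not Far Manager's code; the function names are made up for illustration). It creates a file with one extra hardlink, overwrites the original with empty content using each strategy, and then reads the data back through the other link.

```python
import os
import tempfile


def overwrite_in_place(path, data):
    """'Empty the existing file and write a new data stream into it'."""
    # Mode "wb" truncates the existing file object, so every hardlink
    # pointing at that object sees the new (possibly empty) content.
    with open(path, "wb") as f:
        f.write(data)


def overwrite_by_replace(path, data):
    """'Remove the existing file and create a new one from scratch'."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    # The name now points at a brand-new file; other links keep the old data.
    os.replace(tmp, path)


def demo():
    """Return what the *other* hardlink contains after each kind of overwrite."""
    results = {}
    strategies = (("in_place", overwrite_in_place), ("replace", overwrite_by_replace))
    with tempfile.TemporaryDirectory() as d:
        for name, overwrite in strategies:
            original = os.path.join(d, name + "_a.txt")
            link = os.path.join(d, name + "_b.txt")
            with open(original, "wb") as f:
                f.write(b"precious data")
            os.link(original, link)      # second name for the same file
            overwrite(original, b"")     # overwrite with a zero-length file
            with open(link, "rb") as f:
                results[name] = f.read()
    return results


if __name__ == "__main__":
    print(demo())
```

With the in-place strategy the second link ends up empty as well, which matches the behavior reported in this issue; with the replace strategy the second link still holds the original data.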
This was precisely my case. Perhaps, therefore, an additional option or set of options for link handling in general could be added to the copy, move, and extract dialogs, since one can unwittingly cause data loss by assuming one behavior when the actual behavior is different. I think a similar argument extends to symbolic links (I am not sure of FAR's current behavior in this regard), and there are quite a few extra cases possible:
Personally I think only the first 2 options make practical sense without gotchas - first option being the default, while the second option applying when Copy symlink's contents flag is selected.
In fact perhaps several link-related options could be consolidated under either a new button (accessible similar to filter for example) or put directly on the Copy/Move/Extract dialogs:
Alternatively, we probably can show a confirmation like "file exists and has hardlinks, what to do? break link / update all".
Yes, probably. But the issue is with multiple files (when you need to overwrite multiple files, some of which have hard links), especially if you also want the answer remembered, so that, for example, files with hardlinks are all overwritten while files without hardlinks are skipped. That may be difficult to implement with a single confirmation dialog. So perhaps a separate confirmation dialog should be displayed when overwriting hardlinks, independent of the regular file overwrite dialog. Then, if both kinds of files are encountered, with and without hardlinks, 2 overwrite prompts would be shown: the regular one for normal files (as happens currently) and a new one for hardlinks.
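A rough sketch of how the "two independent prompts with remembered answers" idea could be modeled, in Python. Everything here is hypothetical (`Action`, `OverwritePolicy`, and the `ask` callback are illustration names, not anything in Far Manager); the point is only that a remembered answer is stored per kind of file rather than globally.

```python
from enum import Enum


class Action(Enum):
    OVERWRITE = "overwrite"
    SKIP = "skip"


class OverwritePolicy:
    """Keeps one remembered answer per file kind: 'hardlinked' or 'plain'."""

    def __init__(self, ask):
        # ask(kind, path) -> (Action, remember) stands in for the dialog.
        self._ask = ask
        self._remembered = {}

    def decide(self, path, is_hardlinked):
        kind = "hardlinked" if is_hardlinked else "plain"
        if kind in self._remembered:
            return self._remembered[kind]   # no prompt needed
        action, remember = self._ask(kind, path)
        if remember:
            self._remembered[kind] = action
        return action
```

If the user answers "overwrite, remember" for the first hardlinked file and "skip, remember" for the first plain file, every later file is handled without a prompt, and the two answers never interfere with each other.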
Sounds like a sound principle, file extraction should probably do a generic replace for overwriting, and shouldn't try to be clever. If someone sets up space-saving links manually, and wants it to be robust, symlinks should be used instead (despite the nuisances on Windows): they are much better suited for human use. If those links were generated by a tool (I tend to think hardlinks should generally be, as an internal implementation detail), then they're likely easy to regenerate, too, or shouldn't have been tampered with in the first place. (The risk of breaking such a setup by manually unpacking files over it, IOW tampering with the guts of a (tool-managed) setup, already falls into the "warranty void" category.)
I'm not sure if telling whether a file has hardlinks is actually always cheap/free. On some filesystems it isn't even really viable.
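For reference, a minimal Python sketch of the check being discussed, assuming the filesystem reports a link count at all. `has_other_hardlinks` is a made-up helper name; it relies on `os.stat().st_nlink`, which costs one extra metadata query per file and may not be meaningful on filesystems that do not track link counts.

```python
import os
import tempfile


def has_other_hardlinks(path):
    """True if more than one directory entry refers to this file.

    Assumption: the filesystem reports a real link count in st_nlink.
    """
    return os.stat(path).st_nlink > 1


def demo():
    """Return the check's result before and after adding a second link."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "a.txt")
        with open(original, "wb") as f:
            f.write(b"data")
        before = has_other_hardlinks(original)
        os.link(original, os.path.join(d, "b.txt"))
        after = has_other_hardlinks(original)
    return before, after


if __name__ == "__main__":
    print(demo())
```

Whether this is cheap enough to run on every file during a large copy is exactly the open question raised above; the sketch only shows that the information is available, not that it is free.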
Far Manager version
3.0.6074.0 x64
OS version
10.0.22621
Other software
No response
Steps to reproduce
Expected behavior
The overwritten files should now be of zero size (and contain metadata of that file from the archive). Most importantly, it should not have any hardlinks any more. All the other hardlinks to the original file should be unaffected.
Actual behavior
The file does get overwritten, kind of. That is, the file is now zero length, but it still has hardlinks. Moreover, for some time (until you try opening the file) the other hardlinks report a non-zero size; eventually, however, the new zero size seems to trickle down, and soon all hardlinks are of zero size.
This is very serious behavior, as the inconsistent result could easily lead to data loss.
The data loss occurs once such a zero-length file is then deleted: the other hardlinks (to which it was still mistakenly "linked") now "get synchronized" and all report 0 size, leading to data loss!
I have confirmed that the same behavior occurs even with regular File Copy with Overwrite routine (so it does not have to be extraction from archive).
Remarks