Recovery data from chunks without metadata :) #572

asyslinux · 2024-03-18T16:01:36Z

Hello, dear MooseFS developers. I know your website says that recovering data from chunks without metadata is extremely difficult.

However, the chunks are all I have left: the server was hacked, and the metadata and its backups were destroyed.

I only need to recover the "jpg" files from the chunks. They are all smaller than 64MB.

Previous MooseFS 3.0.117 configuration:

goal = 1

hdd-1,3,5 = chunkserver 1 = LABEL A
hdd-2,4,6 = chunkserver 2 = LABEL B

I still have .chunkdb files left...

I am a developer, so I can write a recovery utility myself, but I have a couple of questions:

How can chunk files be read directly? Is there a defined format, with bytes, characters, or delimiters between the files stored inside a chunk file? I am comfortable working with bytes, so please tell me how to work with chunk files directly, if that is possible.
How exactly are real files distributed among chunk files? Can a real file smaller than, say, 32MB be split, within one chunkserver, across several chunk files on one hard drive, or even across chunk files on different hard drives?

If the chunk format has separators (some standard or custom marshaller/unmarshaller that you use), and if small files are stored whole in a specific chunk file on a specific hdd rather than split into small parts across many chunk files, then I can easily write a utility that reads the bytes between delimiters in each chunk file, detects from the start of the data that it is, say, a "jpg" picture, and restores my photo archive of 10 years.

I hope you can help me with information about the chunk file format, or at least point me to the part of your source code that deals with it.

Thank you very much.

deltabweb · 2024-03-21T08:38:28Z

Hi,
To quickly answer your first question, the chunks are prefixed with an 8KB header.
I had to recover jpg files recently as well; for small files, you can recover the data with tail -c +8193 chunk.mfs > image.jpg
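
For anyone who prefers a script to the tail one-liner, here is a minimal Python sketch of the same idea. It assumes the 8KB header size mentioned above applies to your chunk files, and it adds a check that the payload starts with the JPEG signature bytes FF D8 FF. The function names (strip_chunk_header, looks_like_jpeg, extract_if_jpeg) are hypothetical and not part of MooseFS.

```python
from pathlib import Path

# Assumption: 8 KB header, as stated in the comment above.
HEADER_SIZE = 8 * 1024
# A JPEG starts with the SOI marker FF D8, followed by FF of the next marker.
JPEG_MAGIC = b"\xff\xd8\xff"


def strip_chunk_header(data: bytes, header_size: int = HEADER_SIZE) -> bytes:
    """Return everything after the header (same effect as tail -c +8193)."""
    return data[header_size:]


def looks_like_jpeg(payload: bytes) -> bool:
    """True if the payload begins with the JPEG signature."""
    return payload.startswith(JPEG_MAGIC)


def extract_if_jpeg(chunk_path: Path, out_path: Path) -> bool:
    """Write the chunk payload to out_path if it looks like a JPEG."""
    payload = strip_chunk_header(chunk_path.read_bytes())
    if not looks_like_jpeg(payload):
        return False
    out_path.write_bytes(payload)
    return True
```

Like the tail command, this copies everything after the header, so any bytes stored after the end of the image are kept as well.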

asyslinux · 2024-03-21T15:15:34Z

@deltabweb, thanks for the advice. https://github.com/asyslinux/irec is a utility that simply finds the start and end bytes of each jpg and then recovers jpg files smaller than 64MB from any raw, device, or chunk file, so there is no need to skip the first 8KB header. The recovery is now in progress; I will post the result in a few days.
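
The start/end marker approach described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual code of the irec utility: it scans a byte buffer for the JPEG start marker (FF D8 FF) and takes the first end marker (FF D9) that follows within 64MB. The name carve_jpegs is hypothetical. Taking the first FF D9 is a simplification: images with an embedded thumbnail contain an inner end marker and would be cut short, which is one reason dedicated tools parse the JPEG structure instead.

```python
SOI = b"\xff\xd8\xff"  # JPEG start-of-image marker plus next marker prefix
EOI = b"\xff\xd9"      # JPEG end-of-image marker
MAX_SIZE = 64 * 1024 * 1024  # only files smaller than 64MB are of interest


def carve_jpegs(data: bytes, max_size: int = MAX_SIZE) -> list[tuple[int, int]]:
    """Return (start, end) byte offsets of candidate JPEGs found in data."""
    found = []
    pos = 0
    while True:
        start = data.find(SOI, pos)
        if start == -1:
            break
        # Look for the first end marker within max_size bytes of the start.
        end = data.find(EOI, start + len(SOI), start + max_size)
        if end == -1:
            # No end marker in range: skip this start and keep scanning.
            pos = start + 1
            continue
        end += len(EOI)
        found.append((start, end))
        pos = end
    return found
```

Each pair can then be written out with something like `out.write(data[start:end])`. Because the scan searches for the signature anywhere in the buffer, the header does not need to be skipped first.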

asyslinux · 2024-03-22T19:03:41Z

An update on recovering jpg files: my default recovery tool is now the standard recoverjpeg utility available in Linux distros, plus a small shell script. It is a better tool for recovering jpg files than a custom script or program. New manual here: https://github.com/asyslinux/irec

asyslinux · 2024-03-23T10:59:49Z

Success story. I recovered most files smaller than 64MB from the MooseFS chunks: jpg, png, webp, docs, archives, and much more. photorec can recover 300+ different file types. The files are recovered without their real filenames.

I have released a manual for recovering all file types from MooseFS with photorec: https://github.com/asyslinux/irec
Maybe someone will find this useful. The developers can close the issue and keep this manual for other people.

Thanks.
