[Feature Request]: Recursive unpacking? #92

Open
1 task done
jindroush opened this issue Jun 27, 2022 · 6 comments

@jindroush

bit7z version: 3.1.x
7-zip version: v19.00
7-zip DLL used: 7z.dll
MSVC version: 2019
Architecture: x86_64
Windows version: Windows 10

Bug description

Is it possible to recursively unpack several layers of archives without actually dropping intermediary files on disk?
And a second question: is it possible to extract files with extractMatching, but store the output in a flat directory rather than several levels deep?

Situation:
I have VirtualBox running Linux on a Windows host. After turning off the VM, I need a few files from the guest:

So, there is a VDI file (layer 0), which contains an MBR (layer 1), which contains an EXT4 (layer 2). From the EXT4, I need to extract files from level 5 of nested directories.
I have written a test app which extracts these, and it works, but the VDI file is huge, and so are the MBR file and the EXT4 file, unnecessarily tripling the disk space needed to extract a few megabytes of files, which I want to avoid.

Is my "hunch" right that both of my questions can only be answered by 'directly calling 7z.dll'?


@rikyoz
Owner

rikyoz commented Jun 28, 2022

Hi!

> Is it possible to recursively unpack several layers of archives without actually dropping intermediary files on disk?

Unfortunately, no. Or rather, not in your use case. For a small archive, you could extract it to a stream or a buffer and then extract again from that. But that is obviously not a feasible approach for big archives.

> And a second question: is it possible to extract files with extractMatching, but store the output in a flat directory rather than several levels deep?

This will actually be possible from the next version of the library, in which you can use the setRetainDirectories(false) method to disable the re-creation of the directory structure inside the archive when extracting it. But unfortunately, it's not possible in bit7z v3.1.x.
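As a rough illustration of what flat output means for the target paths, here is a minimal sketch. It is not bit7z's implementation: `flatTarget` is a hypothetical helper name, and the sketch assumes that in-archive paths use `/` as the separator and that no two items share a file name.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not a bit7z function): maps a path stored inside an
// archive to a target directly under outDir, dropping all parent directories.
// Assumes '/'-separated item paths and no name collisions between items.
fs::path flatTarget(const fs::path& outDir, const std::string& itemPath) {
    assert(!itemPath.empty());
    return outDir / fs::path(itemPath).filename();
}
```

With this mapping, an item stored five directories deep, such as `a/b/c/d/e/notes.txt`, would land at `out/notes.txt`. A real implementation would also need a policy for items whose file names collide.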

> Situation:
> I have VirtualBox running Linux on a Windows host. After turning off the VM, I need a few files from the guest:
>
> So, there is a VDI file (layer 0), which contains an MBR (layer 1), which contains an EXT4 (layer 2). From the EXT4, I need to extract files from level 5 of nested directories.
> I have written a test app which extracts these, and it works, but the VDI file is huge, and so are the MBR file and the EXT4 file, unnecessarily tripling the disk space needed to extract a few megabytes of files, which I want to avoid.

This seems to be a problem similar to #90, only on a much bigger scale.

> Is my "hunch" right that both of my questions can only be answered by 'directly calling 7z.dll'?

I'm not entirely sure that 7-zip DLLs provide any immediate API for this kind of operation, but yes, probably it's achievable only by directly calling 7z.dll functions.
I'm trying to study the 7-zip source code and its really poor documentation, but I haven't found anything yet.
Moreover, I'm still not entirely sure how bit7z might provide both a flexible and easy-to-use API for this kind of task.
But it's a feature that I definitely want to implement, just probably not in the short term. Or at least, it all depends on how easily it can be implemented.

@rikyoz changed the title from "[Bug]: Recursive unpacking?" to "[Feature Request]: Recursive unpacking?" on Jun 28, 2022
@jindroush
Author

jindroush commented Jun 28, 2022

I think it should work like this:
Instead of calling 'extract', a function deferredExtract would be called, returning a stream object with partial functionality: it would implement read and forward-only seek. Such a stream could be an input to another deferredExtract call. At the deepest level, the stream from deferredExtract would be passed to a function dropDeferredToDisk (which would only copy from the input stream to a file on disk).
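The proposal can be sketched in plain C++ as follows. This is only an illustration of the chaining idea, not an existing bit7z or 7-zip API: `ForwardStream`, `MemoryStream`, `SliceStream` and `demoThreeLayers` are hypothetical names, the "archive format" is a toy one in which the inner payload sits at a known offset and length, and `dropDeferredToDisk` writes to any `std::ostream` (a `std::ofstream` in real use).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Read-only stream that can only move forward, as in the proposal.
class ForwardStream {
public:
    virtual ~ForwardStream() = default;
    // Copies up to n bytes into dst; returns the number copied (0 at end).
    virtual std::size_t read(char* dst, std::size_t n) = 0;
    // Forward seek, implemented by reading and discarding bytes.
    std::size_t skip(std::size_t n) {
        char scratch[256];
        std::size_t done = 0;
        while (done < n) {
            const std::size_t got = read(scratch, std::min(sizeof scratch, n - done));
            if (got == 0) break;  // underlying stream ended early
            done += got;
        }
        return done;
    }
};

// Layer 0 stand-in: bytes held in memory instead of a file on disk.
class MemoryStream : public ForwardStream {
    std::string data_;
    std::size_t pos_ = 0;
public:
    explicit MemoryStream(std::string data) : data_(std::move(data)) {}
    std::size_t read(char* dst, std::size_t n) override {
        const std::size_t k = std::min(n, data_.size() - pos_);
        std::copy_n(data_.data() + pos_, k, dst);
        pos_ += k;
        return k;
    }
};

// Toy "decoder": exposes length bytes of the parent, starting at offset.
class SliceStream : public ForwardStream {
    std::unique_ptr<ForwardStream> parent_;
    std::size_t remaining_;
public:
    SliceStream(std::unique_ptr<ForwardStream> parent, std::size_t offset, std::size_t length)
        : parent_(std::move(parent)), remaining_(length) {
        assert(parent_ != nullptr);
        parent_->skip(offset);
    }
    std::size_t read(char* dst, std::size_t n) override {
        const std::size_t got = parent_->read(dst, std::min(n, remaining_));
        remaining_ -= got;
        return got;
    }
};

// Returns a lazy stream over an inner item; nothing is copied until it is read.
std::unique_ptr<ForwardStream> deferredExtract(std::unique_ptr<ForwardStream> parent,
                                               std::size_t offset, std::size_t length) {
    return std::make_unique<SliceStream>(std::move(parent), offset, length);
}

// Copies the innermost stream to the output; returns the number of bytes written.
std::size_t dropDeferredToDisk(ForwardStream& in, std::ostream& out) {
    char buffer[4096];
    std::size_t total = 0;
    while (true) {
        const std::size_t got = in.read(buffer, sizeof buffer);
        if (got == 0) break;
        out.write(buffer, static_cast<std::streamsize>(got));
        total += got;
    }
    return total;
}

// Three nested layers; only the innermost payload is ever materialized.
std::string demoThreeLayers() {
    std::unique_ptr<ForwardStream> vdi =
        std::make_unique<MemoryStream>("HDR[mbr:<ext4:hello>]END");
    auto mbr = deferredExtract(std::move(vdi), 8, 12);   // "<ext4:hello>"
    auto file = deferredExtract(std::move(mbr), 6, 5);   // "hello"
    std::ostringstream disk;
    dropDeferredToDisk(*file, disk);
    return disk.str();
}
```

The sketch works only because each inner offset is known before reading. A format whose index has to be located by seeking backwards would not fit a forward-only stream without buffering.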

@rikyoz
Owner

rikyoz commented Jun 29, 2022

> I think it should work like this: instead of calling 'extract', a function deferredExtract would be called, returning a stream object with partial functionality: it would implement read and forward-only seek. Such a stream could be an input to another deferredExtract call. At the deepest level, the stream from deferredExtract would be passed to a function dropDeferredToDisk (which would only copy from the input stream to a file on disk).

Uhm yeah, I think this might be a good API!
Thank you for the suggestion!

@kenkit
Contributor

kenkit commented Aug 8, 2022

I tried this with libarchive; it's not as easy as it looks, and some archive formats won't be supported because they don't provide seekable streams.
It's an interesting feature, especially considering I implemented it on top of some curl-supported protocols. Unfortunately, extracting the second layer requires retrieving the whole second file within the archive. You should look into my app, qtapp, under the remote archive tab.
You just input the URL and can list up to two layers, I think. You don't have to install the whole thing, just qtapp.
It's not perfect, but it works to some extent.
http://github.com/kenkit/neon_service/releases/latest

@kenkit
Contributor

kenkit commented Aug 8, 2022

image
This should be the page of interest.
Just putting a raw zip file on HTTP is a good example. Google Drive links require auth, which is not working correctly for now, but e.g. MediaFire zip archive links will work; just copy the link from the download button.

@rikyoz
Owner

rikyoz commented Aug 14, 2022

> I tried this with libarchive; it's not as easy as it looks, and some archive formats won't be supported because they don't provide seekable streams. It's an interesting feature, especially considering I implemented it on top of some curl-supported protocols. Unfortunately, extracting the second layer requires retrieving the whole second file within the archive. You should look into my app, qtapp, under the remote archive tab. You just input the URL and can list up to two layers, I think. You don't have to install the whole thing, just qtapp. It's not perfect, but it works to some extent. http://github.com/kenkit/neon_service/releases/latest

Interesting, I'll take a look into it for sure!
Thank you!
