Is there a possibility of good-quality compression for images containing text? #38

Open
Akash7789 opened this issue Dec 30, 2021 · 2 comments

Comments

@Akash7789

Hi, I tried this project in Google Colab and tested it on a .jpg image taken from one of my physics ebooks. I used the HIFIC-low model because the Colab notebook says it gives the best compression ratio. After decompression I got a .png image in which all of the text was unrecognizable. Does this mean the program only works well on content that does not contain text?
Or is it possible for it to do well on this type of image? Would training it on a custom dataset (built specifically for this type of image) help?
If so, how should the dataset be structured?

@Justin-Tan
Owner

This model tends not to do well on image regions with high-frequency detail, e.g. faces and text. Presumably this is because OpenImages does not contain many examples of such images, or perhaps because, psychologically, we are extremely sensitive to slight variations in facial and text structure. It would be interesting to see whether training on a text-heavy dataset, e.g. pages of an ebook, would allow this model to compress text well without modification, but I don't know off the top of my head.
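
For anyone who wants to see what "high-frequency detail" means numerically, here is a rough sketch that measures how much of an image's spectral energy sits above a chosen spatial frequency. It is not part of this repo: `high_freq_fraction` and the 0.25 cycles/pixel cutoff are illustrative assumptions, and the two test images are synthetic stand-ins (a smooth gradient, and 1-pixel stripes as a crude proxy for thin text strokes), not real ebook pages.

```python
import numpy as np

def high_freq_fraction(gray, cutoff=0.25):
    """Fraction of spectral energy above `cutoff` cycles/pixel (Nyquist is 0.5).

    `gray` is a 2-D array of pixel intensities. The name and the default
    cutoff are illustrative choices, not anything defined by the model.
    """
    gray = np.asarray(gray, dtype=np.float64)
    gray = gray - gray.mean()                      # remove the DC component
    power = np.abs(np.fft.fft2(gray)) ** 2         # 2-D power spectrum
    fy = np.fft.fftfreq(gray.shape[0])[:, None]    # vertical frequencies
    fx = np.fft.fftfreq(gray.shape[1])[None, :]    # horizontal frequencies
    radius = np.sqrt(fx ** 2 + fy ** 2)            # radial frequency per bin
    total = power.sum()
    if total == 0:                                 # constant image: no detail
        return 0.0
    return float(power[radius > cutoff].sum() / total)

size = 64
# Smooth horizontal gradient: most energy at low frequencies.
ramp = np.tile(np.linspace(0.0, 1.0, size), (size, 1))
# Alternating 1-pixel columns: all energy at the Nyquist frequency.
stripes = np.tile((np.arange(size) % 2).astype(float), (size, 1))

smooth_score = high_freq_fraction(ramp)
stripe_score = high_freq_fraction(stripes)
print(f"gradient: {smooth_score:.3f}  stripes: {stripe_score:.3f}")
```

The stripes should score close to 1 and the gradient close to 0. Running the same function on a grayscale crop of a text page versus a natural photo gives a quick, informal sense of how different the two kinds of content are, though it says nothing by itself about how the model will behave.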

@Akash7789
Author

@Justin-Tan Could you please describe the format/folder structure of the OpenImages dataset? I searched on Google but could not find anything about it. I am planning to train the model on ebooks.
