Incorrect aspect ratio in output of img2img.py (and similars: img2img_color.py, video2video.py and video2video_color.py) #21

Open
quark67 opened this issue Feb 21, 2024 · 0 comments

From the demo image (dimension: 976×538, aspect ratio = 976/538 = 1.81):

python3 img2img.py --num_cols 100 --language general --mode complex --background white --output data/output.png gives:

output

which is a picture of dimension 1200×515 (aspect ratio = 1200/515 = 2.33, which is very different from 1.81).

The reason is this line of code: cell_height = scale * cell_width (in line 36 of img2img.py).

The ratio cell_height / cell_width needs to equal the ratio char_height / char_width, so the previous line becomes:

cell_height = (char_height / char_width) * cell_width.
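To see why the two ratios must match, here is a minimal sketch of the grid arithmetic. It assumes, purely for illustration, a 12×19 px glyph and scale = 2 for the old rule; these values are not taken from the repository, and the function name grid_aspect is hypothetical. It also assumes each character occupies char_w × char_h pixels in the output and ignores any cropping done after drawing.

```python
def grid_aspect(width, height, num_cols, char_w, char_h, cell_ratio):
    """Return (num_rows, aspect ratio of the character grid).

    cell_ratio is cell_height / cell_width, the quantity this issue changes.
    """
    cell_width = width / num_cols
    cell_height = cell_ratio * cell_width
    num_rows = int(height / cell_height)
    # Assumption: each character occupies char_w x char_h pixels in the output.
    return num_rows, (num_cols * char_w) / (num_rows * char_h)


# Illustrative values (assumed, not measured): 12x19 px glyph, scale = 2.
WIDTH, HEIGHT, NUM_COLS = 976, 538, 100
CHAR_W, CHAR_H, SCALE = 12, 19, 2

old_rows, old_aspect = grid_aspect(WIDTH, HEIGHT, NUM_COLS, CHAR_W, CHAR_H, SCALE)
new_rows, new_aspect = grid_aspect(
    WIDTH, HEIGHT, NUM_COLS, CHAR_W, CHAR_H, CHAR_H / CHAR_W
)

print(f"input aspect: {WIDTH / HEIGHT:.2f}")
print(f"old rule: {old_rows} rows, grid aspect {old_aspect:.2f}")
print(f"new rule: {new_rows} rows, grid aspect {new_aspect:.2f}")
```

With cell_ratio equal to char_h / char_w, the glyph size cancels out and the grid aspect reduces to width / height, up to the rounding of num_rows.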

Moreover, the line char_width, char_height = font.getsize(sample_character) generates a warning:
DeprecationWarning: getsize is deprecated and will be removed in Pillow 10 (2023-07-01). Use getbbox or getlength instead.

So in utils.py, every line similar to char_width, char_height = font.getsize("◊") (with various values for ◊) needs to be replaced by:

char_bbox = font.getbbox("◊")
char_width = char_bbox[2] - char_bbox[0]
char_height = char_bbox[3]

(Caution: char_bbox[1] is deliberately not subtracted in the code above. Strangely, "bottom" really gives the height. See python-pillow/Pillow#7802.)
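Since the same three lines would be repeated for every font, they could be wrapped in a small helper. The sketch below is only a suggestion: char_size and FakeFont are hypothetical names, and FakeFont is a stand-in with a made-up bounding box so that the example runs without a font file.

```python
def char_size(font, character):
    """Return (width, height) for one character, following this issue's convention.

    `font` is any object with a Pillow-style getbbox(text) method that
    returns (left, top, right, bottom).
    """
    left, top, right, bottom = font.getbbox(character)
    # `top` is intentionally unused: the height is taken as `bottom`,
    # as proposed in this issue, rather than bottom - top.
    return right - left, bottom


class FakeFont:
    """Hypothetical stand-in so the sketch runs without a font file."""

    def getbbox(self, text):
        return (0, 4, 12, 19)  # made-up bounding box


char_width, char_height = char_size(FakeFont(), "A")
print(char_width, char_height)
```

With Pillow installed, the font object returned by get_data should be usable in place of FakeFont, since it provides getbbox.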

This correction must also be made in line 44 of img2img.py: replace char_width, char_height = font.getsize(sample_character) with

char_bbox = font.getbbox(sample_character)
char_width = char_bbox[2] - char_bbox[0]
char_height = char_bbox[3]

So, by reordering some of the calculations (because computing cell_height requires the value of char_height / char_width), I suggest this correction to img2img.py (excerpt from the main function):

def main(opt):
    if opt.background == "white":
        bg_code = 255
    else:
        bg_code = 0
    char_list, font, sample_character, scale = get_data(opt.language, opt.mode)
    num_chars = len(char_list)
    num_cols = opt.num_cols
    image = cv2.imread(opt.input)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    height, width = image.shape
    char_bbox = font.getbbox(sample_character)
    char_width = char_bbox[2] - char_bbox[0]
    char_height = char_bbox[3]
    cell_width = width / opt.num_cols
    #cell_height = scale * cell_width
    cell_height = (char_height/char_width) * cell_width
    num_rows = int(height / cell_height)
    if num_cols > width or num_rows > height:
        print("Too many columns or rows. Use default setting")
        cell_width = 6
        #cell_height = 12
        cell_height = (char_height/char_width) * cell_width
        num_cols = int(width / cell_width)
        num_rows = int(height / cell_height)
    #char_width, char_height = font.getsize(sample_character)
    out_width = char_width * num_cols
    out_height = scale * char_height * num_rows
    out_image = Image.new("L", (out_width, out_height), bg_code)
    draw = ImageDraw.Draw(out_image)

For comparison, the old code for the same portion was:

def main(opt):
    if opt.background == "white":
        bg_code = 255
    else:
        bg_code = 0
    char_list, font, sample_character, scale = get_data(opt.language, opt.mode)
    num_chars = len(char_list)
    num_cols = opt.num_cols
    image = cv2.imread(opt.input)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    height, width = image.shape
    cell_width = width / opt.num_cols
    cell_height = scale * cell_width
    num_rows = int(height / cell_height)
    if num_cols > width or num_rows > height:
        print("Too many columns or rows. Use default setting")
        cell_width = 6
        cell_height = 12
        num_cols = int(width / cell_width)
        num_rows = int(height / cell_height)
    char_width, char_height = font.getsize(sample_character)
    out_width = char_width * num_cols
    out_height = scale * char_height * num_rows
    out_image = Image.new("L", (out_width, out_height), bg_code)
    draw = ImageDraw.Draw(out_image)

So, with the modified code, python3 img2img.py --num_cols 100 --language general --mode complex --background white --output data/NewOutput.png gives:

NewOutput

The dimensions of this corrected image are 1200×648, and its aspect ratio is 1200/648 = 1.85, which is close to the 1.81 aspect ratio of the original image.
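The figures above can be checked with a few lines of arithmetic. This sketch only uses the image dimensions reported in this issue; the helper names are hypothetical.

```python
def aspect(width, height):
    """Aspect ratio as width divided by height."""
    return width / height


def relative_error(value, reference):
    """Relative deviation of value from reference."""
    return abs(value - reference) / reference


original = aspect(976, 538)   # demo input image
before = aspect(1200, 515)    # output with cell_height = scale * cell_width
after = aspect(1200, 648)     # output with the proposed cell_height

err_before = relative_error(before, original)
err_after = relative_error(after, original)

print(f"before the fix: {err_before:.1%} off")
print(f"after the fix:  {err_after:.1%} off")
```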

To make the difference more visible, I scaled the output image so that its width matches that of the input image, then displayed it in an image editor with a transparent background over the input image:

Before the correction:

image

After the correction:

image
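The overlay comparison can also be expressed numerically: after scaling each output to the input width of 976 px, how far does its height fall short of the 538 px input? A minimal sketch using only the dimensions reported above (height_at_width is a hypothetical helper):

```python
def height_at_width(out_width, out_height, target_width):
    """Height of an image after uniform scaling to target_width."""
    return out_height * target_width / out_width


INPUT_WIDTH, INPUT_HEIGHT = 976, 538
outputs = {"before": (1200, 515), "after": (1200, 648)}

scaled_heights = {
    label: height_at_width(w, h, INPUT_WIDTH) for label, (w, h) in outputs.items()
}

for label, scaled in scaled_heights.items():
    print(
        f"{label}: {scaled:.0f} px tall vs {INPUT_HEIGHT} px input "
        f"({INPUT_HEIGHT - scaled:.0f} px short)"
    )
```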