Korean Character Decomposition Issue in CVAT on Dockerized Ubuntu Server #7788

devwonhee · 2024-04-19T04:26:08Z

Environment:

CVAT Version: dev (latest at the time of reporting)
Docker Version: 26.0.1
OS: Ubuntu 20.04.6 LTS
Browser: Naver Whale V3.25.232.19

Description:
We are experiencing an issue with Korean text input in CVAT running in a Docker container on an Ubuntu server. When we enter labels in Korean, each syllable is split into its constituent consonants and vowels (jamo). For example, typing "사과" (apple) results in "ㅅ ㅏ ㄱ ㅘ" appearing on the labels.

Steps to reproduce:

Launch CVAT in Docker on an Ubuntu server.
Enter the OCR labeling mode.
Input Korean characters to label an image.
Observe that the characters decompose into separate consonants and vowels.

Expected behavior:
The label should show fully composed Korean syllables such as "사과", without any decomposition.

Actual behavior:
Korean characters are split into consonants and vowels ("ㅅ ㅏ ㄱ ㅘ").

Attempts to fix:

Tested on multiple browsers (Chrome, Firefox, and Naver Whale) without success.
Checked Docker container settings for any locale or encoding issues.

Additional information:
This issue seems to relate to how CVAT or the underlying system in the Docker container handles Korean character encoding. Any help or pointers from the community would be greatly appreciated.
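
To help narrow this down, a small check along the following lines could show what a saved label actually contains: precomposed syllables, conjoining jamo (which NFC normalization would recombine into syllables), or compatibility jamo (which NFC leaves unchanged). This is only a sketch. `classify_hangul` is a hypothetical helper and not part of CVAT, and it assumes the label text has already been copied out of the UI or an exported annotation file. We have not confirmed where in CVAT the decomposition happens.

```python
import unicodedata

# Unicode block ranges (end value is exclusive in Python's range).
CONJOINING_JAMO = range(0x1100, 0x1200)  # Hangul Jamo block, produced by NFD
COMPAT_JAMO = range(0x3130, 0x3190)      # Hangul Compatibility Jamo block
SYLLABLES = range(0xAC00, 0xD7A4)        # precomposed Hangul syllables


def classify_hangul(text: str) -> dict:
    """Count the kinds of Hangul code points in `text` (hypothetical helper)."""
    counts = {"syllable": 0, "conjoining_jamo": 0, "compat_jamo": 0}
    for ch in text:
        cp = ord(ch)
        if cp in SYLLABLES:
            counts["syllable"] += 1
        elif cp in CONJOINING_JAMO:
            counts["conjoining_jamo"] += 1
        elif cp in COMPAT_JAMO:
            counts["compat_jamo"] += 1
    # True when the text is already in NFC (composed) form.
    counts["is_nfc"] = unicodedata.normalize("NFC", text) == text
    return counts


if __name__ == "__main__":
    samples = {
        "expected": "사과",
        "nfd": unicodedata.normalize("NFD", "사과"),
        "as displayed": "\u3145 \u314f \u3131 \u3158",  # ㅅ ㅏ ㄱ ㅘ
    }
    for name, text in samples.items():
        print(name, [f"U+{ord(c):04X}" for c in text], classify_hangul(text))
```

If a label reports only `conjoining_jamo`, applying `unicodedata.normalize("NFC", ...)` to it should restore the syllables, which would point to a normalization step somewhere in the pipeline. If it reports `compat_jamo` separated by spaces, the individual keystrokes were probably stored as typed, which would point to input-method (IME composition) handling in the browser UI rather than to the container's locale.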

devwonhee closed this as completed May 17, 2024

devwonhee reopened this May 17, 2024
