bug with emoji as bookmark, Javascript SDK, eastAsia #326

Open
branaway opened this issue May 8, 2024 · 11 comments

Comments

@branaway

branaway commented May 8, 2024

<speak version='1.0' xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang='en-US'>

物理学 <bookmark mark="😀" /> <break time="1750ms" />已经从任何事物都是“如露亦如电,应作如是观”这个方向往佛学的境界上又靠近一步了。世界上可能存在着类似灵魂的东西,它在人生结束之后不死,只是回到宇宙中的某个地方去了。这种观念跟唯识的根本-阿赖耶识学说是相一致的。

This resulted in a bad word boundary: ">已". The closing angle bracket of the bookmark tag was treated as part of a word.

@Kerry-LinZhang

Kerry-LinZhang commented May 8, 2024

Hi @branaway, you can try `speechConfig.SetProperty(PropertyId.SpeechServiceResponse_RequestPunctuationBoundary, "false");` to control the output of punctuation marks. After setting it to false, punctuation boundary events are no longer emitted.

Currently this is enabled by default, so an extra punctuation mark is output here.

@branaway
Author

branaway commented May 9, 2024

Well, the '>' is part of the tag, not a punctuation mark.

@branaway
Author

branaway commented May 9, 2024

GitHub's comment system removed the '<' bookmark '>' tag from my post.

@branaway
Author

branaway commented May 9, 2024

I was using a bookmark tag in my test, with an emoji (😁) as the bookmark name. As a result, the closing right angle bracket of the tag was treated as part of the audible text.

@ForrestGumb
Collaborator

Is your input for a real product scenario? Why do you use bookmark and word boundary events together? Can you use just one of them? The word boundary event is the better choice if both work for you. A bookmark may change the readout if you put it in an improper place (for example, in the middle of a word).
If you do need bookmark, change that to may also resolve your problem.

@branaway
Author

branaway commented May 9, 2024

In fact, the emojis are generated by an AI chatbot. I transform the embedded emojis into bookmarks (with the emojis as the 'mark' attributes) and pick them up in the time series of words while playing back the audio stream. I tend to think the server did not process the emojis well: they may take extra bytes of space, which would cause the misaligned word boundaries.
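
The emoji-to-bookmark transformation described above might look roughly like the sketch below. It is not the author's actual code. The helper name `emojisToBookmarks` and the `emoji-N` naming scheme are hypothetical, and the regular expression is only a rough emoji matcher. The sketch gives each bookmark an ASCII-only mark name and keeps the emoji in a lookup table, which may sidestep the problem by keeping emoji out of the SSML entirely. It does not XML-escape the surrounding text.

```javascript
// Hypothetical helper: replace each emoji in `text` with a bookmark tag whose
// mark name is ASCII-only, and remember which emoji each name stands for.
function emojisToBookmarks(text) {
  const marks = new Map();
  // Rough emoji matcher: a pictographic code point, optionally followed by a
  // variation selector or by ZWJ-joined pictographs. The `u` flag makes the
  // pattern operate on code points instead of UTF-16 code units.
  const emojiPattern =
    /\p{Extended_Pictographic}(?:\uFE0F|\u200D\p{Extended_Pictographic})*/gu;
  let counter = 0;
  const body = text.replace(emojiPattern, (emoji) => {
    const name = `emoji-${counter++}`;
    marks.set(name, emoji);
    return `<bookmark mark="${name}"/>`;
  });
  return { body, marks };
}
```

When a bookmark event arrives during playback, `marks.get(name)` recovers the original emoji for display.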

@branaway
Author

branaway commented May 9, 2024

BTW, the server would "read" the meaning of each emoji if it were embedded verbatim in the text stream, which is something I'd rather process myself.

@ForrestGumb
Collaborator

So you would like to get a bookmark event for each emoji. I guess you do some post-processing, like replacing them with sound effects, right?
It may be a service issue: the service may not handle the extended bytes of emoji well. Do you really need the word boundary event at the same time, or do you only need the bookmark event?

@branaway
Author

In my case, I keep track of the bookmarks and the word stream and display the proper visual effects in a timely fashion. My suggestion is that the developers review the code that handles the bookmark tag, particularly when it contains emojis.
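
Tracking bookmarks and words together can be sketched as merging two event lists by audio offset. This is a minimal illustration on plain objects and does not call the Speech SDK. The function name `buildTimeline` is hypothetical, and the `audioOffset` and `text` field names are assumed to match what the SDK's word boundary and bookmark events report.

```javascript
// Merge word-boundary and bookmark events into one timeline ordered by audio
// offset. Inputs are plain objects of the assumed shape { audioOffset, text }.
function buildTimeline(wordEvents, bookmarkEvents) {
  const words = wordEvents.map((e) => ({
    kind: "word",
    audioOffset: e.audioOffset,
    text: e.text,
  }));
  const marks = bookmarkEvents.map((e) => ({
    kind: "bookmark",
    audioOffset: e.audioOffset,
    text: e.text,
  }));
  // Bookmarks are listed first, and sort is stable in modern engines, so a
  // bookmark sharing an offset with a word stays ahead of that word.
  return [...marks, ...words].sort((a, b) => a.audioOffset - b.audioOffset);
}
```

A player can then walk the merged list as audio time advances and trigger a visual effect for each `bookmark` entry.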

@ForrestGumb
Collaborator

I can reproduce the issue you reported with the SSML below. We will investigate on the service side.
物理学已经从任何事物。

@ForrestGumb
Collaborator

For Unicode characters at U+10000 and above (outside the Basic Multilingual Plane), the word boundary offset returned by the TTS service is wrong. This is not limited to bookmarks; it affects all SSML.
Root cause: the TTS service parses SSML with libxml2, and libxml2 has an offset issue.
There is no clear plan to fix this issue.
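
The thread does not say which unit the service's offsets drift by, so the following only illustrates why characters at U+10000 and above are a common source of offset mismatches. Such a character is one code point but two UTF-16 code units and four UTF-8 bytes. The helper names `measure` and `codePointOffsetToUtf16` are hypothetical, and the conversion is a sketch that assumes offsets are counted in code points.

```javascript
// Measure the same string in the three units that commonly get mixed up.
function measure(text) {
  return {
    utf16Units: text.length, // what JavaScript string indices count
    codePoints: Array.from(text).length, // string iteration walks code points
    utf8Bytes: new TextEncoder().encode(text).length,
  };
}

// Convert an offset counted in code points (an assumption about the reported
// offsets) into a UTF-16 index usable with String.prototype.slice.
function codePointOffsetToUtf16(text, codePointOffset) {
  let utf16Index = 0;
  let seen = 0;
  for (const ch of text) {
    if (seen === codePointOffset) break;
    utf16Index += ch.length; // 1 for BMP characters, 2 for surrogate pairs
    seen++;
  }
  return utf16Index;
}
```

Comparing `measure("😀")` with `measure("已")` shows the difference: the emoji needs a surrogate pair in UTF-16, while the CJK character does not.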
