
Is gathering all text being generated appropriate #132

Open
turtle0x1 opened this issue Apr 18, 2024 · 5 comments

@turtle0x1

turtle0x1 commented Apr 18, 2024

It seems all user input text is now being gathered as of this commit? eebdcc6

Adding analytics is annoying because we'll have to patch them out / add more firewall rules, but siphoning off user data like this feels very wrong if I've understood it correctly.

This could leak sensitive business information if meta voice was used as part of some kind of internal system. Please consider adding a --disable-metrics flag/env or removing the gathering of input text altogether (the other stuff is "by the by" - if you just want trends you shouldn't need "text").
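
To illustrate the "trends without text" point, here is a minimal sketch of an event payload that records only aggregate counts. The function name `build_event` and the field names are hypothetical; this is not the project's actual telemetry code.

```python
def build_event(text: str, include_text: bool = False) -> dict:
    """Build a hypothetical telemetry payload.

    Only aggregate counts are recorded by default; the raw input
    text is attached only when the caller explicitly opts in.
    """
    event = {
        "num_chars": len(text),          # length of the input in characters
        "num_words": len(text.split()),  # whitespace-separated word count
    }
    if include_text:
        event["text"] = text
    return event
```

With something like this, usage trends (input length, word count) would still be visible while the input itself never leaves the network unless someone opts in.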

@vatsalaggarwal
Contributor

Telemetry is optional. You can set ANONYMIZED_TELEMETRY=False in an .env file at the root level of this repo. We provide these instructions when you run inference, and they are additionally provided in a README here: https://github.com/metavoiceio/metavoice-src/tree/main/fam/telemetry
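
For readers wondering what such an opt-out usually looks like in code, below is a rough sketch of reading the `ANONYMIZED_TELEMETRY` variable. It assumes the `.env` file has already been loaded into the process environment, that telemetry is on when the variable is unset, and that a few false-like spellings are accepted. The helper name `telemetry_enabled` is made up for illustration and the repo's real check may differ.

```python
import os


def telemetry_enabled(env=os.environ) -> bool:
    """Return False when ANONYMIZED_TELEMETRY is set to a false-like value.

    Assumption: telemetry defaults to enabled when the variable is unset.
    """
    raw = env.get("ANONYMIZED_TELEMETRY", "True")
    return raw.strip().lower() not in {"false", "0", "no", "off"}
```

A caller would then wrap any event submission in `if telemetry_enabled(): ...` so that nothing is sent when the variable is set to `False`.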

@vatsalaggarwal
Contributor

vatsalaggarwal commented Apr 18, 2024

(The text is useful for us to debug errors folks are facing, etc., but if more people complain we can look into options... FYI @sidroopdaska)

@turtle0x1
Author

turtle0x1 commented Apr 18, 2024

We'll still have to patch it out; taking the "text" is just weird. What if we put sensitive business numbers with our company name or user information through this? It shouldn't leave the network, "anonymized" or otherwise.

turtle0x1 changed the title from "Is gathering the text being synathised appropriate" to "Is gathering all text being generated appropriate" Apr 18, 2024

@MethanJess

@vatsalaggarwal I really don't see how collecting the text can help "debug errors"
If someone is facing a problem with text, then they can just manually report it :/
Even then, I think this has more bad than good to it.

@sidroopdaska
Contributor

Sure thing @MethanJess & @turtle0x1. I'll add a flag to avoid telemetry altogether. Will share a PR tomorrow.
