how to generate wide sparse features #73
Comments
I mean that every single-field categorical feature has its own vocabulary, so multiple fields have multiple vocabularies. The vocabulary of the multi-hot sparse feature is then the union of those vocabularies, and it is used to index the values from all fields. Alternatively, just hash a string such as "field_name:categorical feature value" to index each categorical feature; this may cause some collisions, but there is no need to maintain the whole vocabulary.
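The two indexing options described above can be sketched in plain Python. This is only an illustration, not DeText code: the field names, the helper names `build_union_vocab` and `hash_index`, and the bucket count of 1000 are all assumptions made for the example.

```python
import hashlib


def build_union_vocab(field_vocabs):
    """Give every (field, value) pair a unique index in one shared vocabulary."""
    index = {}
    for field in sorted(field_vocabs):
        for value in sorted(field_vocabs[field]):
            index[(field, value)] = len(index)
    return index


def hash_index(field, value, num_buckets):
    """Map "field:value" to a bucket with a stable hash (collisions are possible)."""
    key = f"{field}:{value}".encode("utf-8")
    digest = hashlib.md5(key).hexdigest()
    return int(digest, 16) % num_buckets


# Hypothetical per-field vocabularies and one example record.
field_vocabs = {"country": {"us", "cn"}, "device": {"ios", "android", "web"}}
example = {"country": "cn", "device": "web"}

# Option 1: look up each value in the union vocabulary.
vocab = build_union_vocab(field_vocabs)
vocab_indices = sorted(vocab[(f, v)] for f, v in example.items())

# Option 2: hash "field:value" into a fixed number of buckets.
hashed_indices = sorted(hash_index(f, v, 1000) for f, v in example.items())
```

The union vocabulary gives collision-free indices but has to be stored and kept in sync with the data. The hashed version needs only the bucket count, at the cost of occasional collisions. `hashlib` is used instead of the built-in `hash()` because the latter is randomized per process for strings.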
Hi @kiminh, I assume your question is about DeText-TF2. In DeText TF2, each sparse feature field (the wide part) is a multi-hot vector. This vector should be generated by the user beforehand (e.g. by hashing). The vocab size can be passed to DeText through nums_sparse_ftrs. The vocab for each field is independent of the others; there is no correlation between them.
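As a rough sketch of what "generated by the user beforehand" could look like, the snippet below hashes each field into its own multi-hot vector with its own size. The field names, sizes, and helper functions are hypothetical, and the exact input encoding DeText expects is not shown here; the per-field sizes only play the role that the reply assigns to nums_sparse_ftrs.

```python
import hashlib


def stable_hash(text, num_buckets):
    """Deterministic hash of a string into [0, num_buckets)."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def multi_hot(values, vocab_size):
    """Build a 0/1 vector of length vocab_size with a 1 at each hashed value."""
    vec = [0] * vocab_size
    for value in values:
        vec[stable_hash(value, vocab_size)] = 1
    return vec


# Hypothetical fields; each has its own independent index space and size.
field_sizes = {"skills": 16, "industries": 8}
record = {"skills": ["python", "sql"], "industries": ["software"]}

features = {
    field: multi_hot(record[field], size)
    for field, size in field_sizes.items()
}
```

Because every field is hashed into its own vector, the same index in two different fields carries unrelated meanings, which matches the statement that the vocabularies are independent.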
Hi, I'm confused about how to generate the wide sparse features. Here is my understanding: combine the multi-field categorical features to form the multi-hot sparse feature. Is the index then generated by a hash value, or in a similar way such as label encoding?