gowe - Word Embedding Utilities for Go

gowe is a Go package for using word embeddings.

Motivation

Existing Go packages for word embeddings inspired this package, but most have not been updated for a while.

This motivates the creation of a new package built for modern Go (1.22+).

API

Import using:

import "github.com/jackiedeng0/gowe"

Load a plaintext file into a float32 model:

model := newFloatModel[float32]()
err := model.FromPlainFile("glove.6B.50d.txt", false)
// You can retrieve this model at https://github.com/stanfordnlp/GloVe/
// 'false' because the file doesn't have a "<size> <dim>" description

// Get the vector embedding for a word
fmt.Println(model.Vector("cat"))
// [1.45281 -0.50108 -0.53714 -0.015697 0.22191 ... ]

// Get the similarity (cosine) between two words
fmt.Printf("%0.3f\n", model.Similarity("cat", "dog"))
// 0.922

// Within a list of words, exhaustively search and rank the N most similar words
words := []string{"dog", "apple", "lincoln", "whisker", "road", "cheetah"}
nearest, err := model.NNearestIn("cat", words, 3)
fmt.Println(nearest)
// [dog cheetah apple]

Load a plaintext file into a quantized int model (int8, int16, and int32 are supported):

model := newIntModel[int16]()
err := model.FromPlainFile("glove.6B.50d.txt", false, 5.0)
// Requires an additional float64 argument for the maximum magnitude of any
// scalar value - in this case, it was 5.0. For a normalized model, this would
// be 1.0

Load a binary file into a float model and an int model, respectively:

floatModel := newFloatModel[float32]()
err := floatModel.FromBinaryFile("model.bin", 32)
// A binary file always includes the description, so we only need to specify
// the bitSize of the floating-point values in the file

intModel := newIntModel[int8]()
err = intModel.FromBinaryFile("model.bin", 32, 2.0)

Status

Load plaintext model files as float64 embedding models
Float and Int generic vector types
Quantization and Dequantization
Loading models as any vector type
Loading binary model files
