Sentence Similarity Calculator

This repo contains various ways to calculate the similarity between source and target sentences. You can choose the pre-trained model to use, such as ELMo, BERT, or Universal Sentence Encoder (USE).

You can also choose the method used to compute the similarity:

1. Cosine similarity
2. Manhattan distance
3. Euclidean distance
4. Angular distance
5. Inner product
6. TS-SS score
7. Pairwise-cosine similarity
8. Pairwise-cosine similarity + IDF
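
Methods 1 to 5 operate on a pair of fixed-size sentence vectors. The following is a minimal NumPy sketch, assuming each sentence has already been encoded into a 1-D vector by one of the models. The function names are illustrative and not taken from this repo, and angular distance is shown in one common form (arccos of the cosine, divided by pi).

```python
import numpy as np


def cosine_similarity(a, b):
    # Cosine of the angle between the two vectors, in [-1, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def manhattan_distance(a, b):
    # Sum of absolute coordinate differences (L1 distance).
    return float(np.sum(np.abs(a - b)))


def euclidean_distance(a, b):
    # Straight-line (L2) distance.
    return float(np.linalg.norm(a - b))


def angular_distance(a, b):
    # Angle between the vectors, scaled to [0, 1].
    # Clipping guards against rounding slightly outside [-1, 1].
    cos = np.clip(cosine_similarity(a, b), -1.0, 1.0)
    return float(np.arccos(cos) / np.pi)


def inner_product(a, b):
    # Unnormalized dot product; sensitive to vector magnitude.
    return float(np.dot(a, b))


# Toy vectors standing in for sentence embeddings.
src = np.array([1.0, 0.0])
tgt = np.array([0.0, 1.0])

for fn in (cosine_similarity, manhattan_distance, euclidean_distance,
           angular_distance, inner_product):
    print(fn.__name__, fn(src, tgt))
```

Note that cosine similarity and inner product grow with similarity, while the three distances shrink, so rankings must be sorted in opposite directions.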

You can experiment with (number of models) x (number of methods) combinations!
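
The pairwise methods (7 and 8) are not spelled out in this README. One plausible reading, in the style of BERTScore-like greedy token matching, is sketched below: each token embedding in one sentence is matched to its most similar token in the other, and the matches are averaged, optionally with IDF weights. Whether this repo computes exactly this is an assumption, and `pairwise_cosine` and its parameters are hypothetical names.

```python
import numpy as np


def pairwise_cosine(src_tokens, tgt_tokens, src_weights=None, tgt_weights=None):
    """Greedy token-matching score between two sentences.

    src_tokens, tgt_tokens: 2-D arrays, one row per token embedding.
    src_weights, tgt_weights: optional per-token weights (for example IDF
    values); None means every token counts equally.
    """
    # Normalize rows so that a dot product equals cosine similarity.
    src = src_tokens / np.linalg.norm(src_tokens, axis=1, keepdims=True)
    tgt = tgt_tokens / np.linalg.norm(tgt_tokens, axis=1, keepdims=True)

    sim = src @ tgt.T  # sim[i, j] = cosine(src token i, tgt token j)

    # Best match for every source token and for every target token.
    src_cov = np.average(sim.max(axis=1), weights=src_weights)
    tgt_cov = np.average(sim.max(axis=0), weights=tgt_weights)

    if src_cov + tgt_cov <= 0:
        return 0.0
    # Harmonic mean of the two directions, like an F1 score.
    return float(2 * src_cov * tgt_cov / (src_cov + tgt_cov))


# Toy token embeddings: two source tokens, one target token.
src_tokens = np.array([[1.0, 0.0], [0.0, 1.0]])
tgt_tokens = np.array([[2.0, 0.0]])

plain = pairwise_cosine(src_tokens, tgt_tokens)
weighted = pairwise_cosine(src_tokens, tgt_tokens, src_weights=[3.0, 1.0])
print(plain, weighted)
```

In the weighted call, the first source token is treated as rarer (higher weight), so its perfect match contributes more and the score rises.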

Installation

This project was developed in a conda environment.
After cloning this repository, you can install all the dependencies listed in requirements.txt with bash install.sh:

conda create -n sensim python=3.7
conda activate sensim
git clone https://github.com/Huffon/sentence-similarity.git
cd sentence-similarity
bash install.sh

Usage

To test your own sentences, you should fill out corpus.txt with sentences as below:

I ate an apple.
I went to the Apple.
I ate an orange.
...

Then, choose the model and method used to calculate the similarity between source and target sentences:

python sensim.py
    --model    MODEL_NAME  [use, bert, elmo]
    --method   METHOD_NAME [cosine, manhattan, euclidean, inner,
                            ts-ss, angular, pairwise, pairwise-idf]
    --verbose  LOG_OPTION (bool)
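
For readers who want to see how these options fit together, here is a minimal argparse sketch that accepts the same flags and choices. It is not the actual sensim.py; the defaults and the `str2bool` helper are assumptions made for illustration.

```python
import argparse

MODELS = ["use", "bert", "elmo"]
METHODS = ["cosine", "manhattan", "euclidean", "inner",
           "ts-ss", "angular", "pairwise", "pairwise-idf"]


def str2bool(value):
    # Hypothetical helper: accept common spellings of true/false.
    lowered = value.lower()
    if lowered in ("true", "t", "yes", "1"):
        return True
    if lowered in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser(description="Sentence similarity (sketch)")
    # Defaults below are assumptions, not taken from the repo.
    parser.add_argument("--model", choices=MODELS, default="use")
    parser.add_argument("--method", choices=METHODS, default="cosine")
    parser.add_argument("--verbose", type=str2bool, default=False)
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--model", "bert", "--method", "angular", "--verbose", "true"]
)
print(args.model, args.method, args.verbose)
```

Restricting `--model` and `--method` with `choices` makes argparse reject a misspelled name before any model is loaded.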

Examples

This section shows an example result of sentence-similarity.
There is no silver bullet that computes a perfect similarity between sentences, so you should run various experiments with your own dataset.
- Caution: the TS-SS score might not suit the sentence similarity task, since the method was originally devised to calculate the similarity between long documents.
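
For context on the caution above, TS-SS is commonly defined as the product of a triangle area (TS) and a sector area (SS) built from the two vectors. The sketch below follows that common definition as understood here, with the usual 10-degree offset added to the angle; it is an illustration, not this repo's implementation, and `ts_ss` is a hypothetical name.

```python
import numpy as np


def ts_ss(a, b):
    """TS-SS dissimilarity: 0 for identical vectors, larger when they differ."""
    norm_a, norm_b = np.linalg.norm(a), np.linalg.norm(b)
    cos = np.clip(np.dot(a, b) / (norm_a * norm_b), -1.0, 1.0)

    # Angle between the vectors plus a 10-degree offset, in radians.
    theta = np.arccos(cos) + np.radians(10.0)

    # Triangle area spanned by the two vectors.
    triangle = norm_a * norm_b * np.sin(theta) / 2.0

    # Sector area: radius is Euclidean distance plus magnitude difference.
    ed = np.linalg.norm(a - b)
    md = abs(norm_a - norm_b)
    sector = np.pi * (ed + md) ** 2 * (np.degrees(theta) / 360.0)

    return float(triangle * sector)


a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 1.0, 3.5])

base = ts_ss(a, b)
scaled = ts_ss(10 * a, 10 * b)  # same angle, 10x larger magnitudes
print(base, scaled)
```

Unlike cosine similarity, the score depends on vector magnitudes as well as the angle: scaling both vectors by 10 multiplies the result by 10^4 in this sketch.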
Result:
