sklearn-deltatfidf

DeltaTfidfVectorizer for scikit-learn.

Delta TFIDF was proposed in an article by Justin Martineau and Tim Finin and is usually associated with sentiment classification or polarity detection of text.
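To make the idea concrete, here is a minimal pure-Python sketch of the weighting as it is commonly stated: a term's count in a document is multiplied by the difference between its IDF in the positive class and its IDF in the negative class. This is an illustration, not this package's implementation. The function name `delta_tfidf`, the whitespace tokenizer, and the add-one smoothing (used to avoid division by zero for terms missing from one class) are all assumptions made for the example.

```python
import math
from collections import Counter


def delta_tfidf(docs, labels, positive_label=1):
    """Return one {term: score} dict per document (illustrative sketch only)."""
    tokenized = [doc.split() for doc in docs]  # assumption: whitespace tokens
    n_pos = sum(1 for y in labels if y == positive_label)
    n_neg = len(labels) - n_pos

    # Per-class document frequency: in how many documents of each class
    # does a term appear at least once?
    df_pos, df_neg = Counter(), Counter()
    for tokens, y in zip(tokenized, labels):
        target = df_pos if y == positive_label else df_neg
        target.update(set(tokens))

    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            # count * (IDF in positive class - IDF in negative class),
            # with add-one smoothing (an assumption of this sketch).
            term: tf * (
                math.log2((n_pos + 1) / (df_pos[term] + 1))
                - math.log2((n_neg + 1) / (df_neg[term] + 1))
            )
            for term, tf in counts.items()
        })
    return scores


data = ['word1 word2', 'word2', 'word2 word3', 'word4']
labels = [1, -1, -1, 1]
for doc, weights in zip(data, delta_tfidf(data, labels)):
    print(doc, weights)
```

With this sign convention, terms seen mostly in positive documents get negative scores and terms seen mostly in negative documents get positive scores, while terms spread evenly across both classes score near zero.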

Usage

from sklearn import svm
from sklearn.pipeline import Pipeline
from sklearn_deltatfidf import DeltaTfidfVectorizer

# the vectorizer is supervised: pass the class labels along with the texts
v = DeltaTfidfVectorizer()
data = ['word1 word2', 'word2', 'word2 word3', 'word4']
labels = [1, -1, -1, 1]
v.fit_transform(data, labels)

# you can use it in pipelines as usual
pipe = Pipeline([
    ('vectorizer', DeltaTfidfVectorizer()),
    ('clf', svm.LinearSVC())
])
pipe.fit(data, labels)

Installation

With pip:

$ pip install sklearn-deltatfidf

From source:

$ git clone https://github.com/r-m-n/sklearn-deltatfidf.git
$ cd sklearn-deltatfidf
$ python setup.py install