🆖 A lightweight Swift library for building probability models of n-grams.

mathewsanders/Tally-Walker

Tally & Walker

Tally & Walker is a lightweight Swift library for building probability models of n-grams.

Quick example

Build a frequency model of n-grams by observing example sequences:

// Create a model out of any type that adopts the `Hashable` protocol
var weatherModel = Tally<Character>()

// Observe sequences of items to build the probability model
weatherModel.observe(sequence: ["🌧","🌧","🌧","🌧", "☀️","☀️","☀️","☀️"])

// Check the overall distributions of items observed
weatherModel.distributions()
// Returns:
// [(probability: 0.5, element: "🌧"),
//  (probability: 0.5, element: "☀️")]

// Check to see what items are expected to follow a specific item
weatherModel.elementProbabilities(after: "🌧")
// Returns:
// [(probability: 0.75, element: "🌧"),
//  (probability: 0.25, element: "☀️")]

weatherModel.elementProbabilities(after: "☀️")
// Returns:
// [(probability: 0.75, element: "☀️"),
//  (probability: 0.25, element: .unseenTrailingItems)]
//
// `.unseenTrailingItems` is an element, which instead of representing an
// item, is a marker that indicates that the sequence continues but, based
// on the sequences we have observed, we don't know what items come next
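The probabilities above can be reproduced by hand by counting what follows each item. The sketch below is a standalone illustration and does not use Tally or claim to mirror its internals. `Next` and `transitionProbabilities` are hypothetical names, and it assumes a bigram model (each item depends only on the one before it), that the end of the observed sequence counts as an unknown continuation, and a recent Swift toolchain (Swift 5).

```swift
// Hypothetical stand-in for "what came next": a real item, or a marker
// meaning the observed sequence ended before we saw a follower.
enum Next<Item: Hashable>: Hashable {
    case item(Item)
    case unseenTrailing
}

// Counts every follower of `target` in `sequence` and converts the counts
// into relative frequencies.
func transitionProbabilities<Item: Hashable>(
    in sequence: [Item],
    after target: Item
) -> [Next<Item>: Double] {
    var counts: [Next<Item>: Int] = [:]
    for (index, element) in sequence.enumerated() where element == target {
        let next: Next<Item> = index + 1 < sequence.count
            ? .item(sequence[index + 1])
            : .unseenTrailing
        counts[next, default: 0] += 1
    }
    let total = Double(counts.values.reduce(0, +))
    // If `target` never occurs, `counts` is empty and nothing is divided.
    return counts.mapValues { Double($0) / total }
}

let observed: [Character] = ["🌧", "🌧", "🌧", "🌧", "☀️", "☀️", "☀️", "☀️"]
let afterRain = transitionProbabilities(in: observed, after: "🌧")
let afterSun = transitionProbabilities(in: observed, after: "☀️")
```

Rain is followed by rain three times and by sun once, giving 0.75 and 0.25. Sun is followed by sun three times, and the final sunny day has no observed follower, which is where the 0.25 for the trailing marker comes from.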

Generate new sequences from a random walk through the probability model:

// Create a walker from a frequency model
var walker = Walker(model: weatherModel)

// Create four weeks of 7-day forecasts
for _ in 0..<4 {
  let forecast = walker.fill(request: 7)
  print(forecast)
}

// Prints:
// ["☀️", "☀️", "🌧", "🌧", "🌧", "🌧", "🌧"]
// ["☀️", "☀️", "🌧", "☀️", "☀️", "🌧", "☀️"]
// ["🌧", "🌧", "☀️", "☀️", "☀️", "☀️", "☀️"]
// ["☀️", "☀️", "☀️", "☀️", "☀️", "☀️", "☀️"]
//
// Although the overall distributions of rainy days and sunny days are equal,
// we don't want to generate a sequence based on a coin flip. Instead, we
// expect that tomorrow's weather is more likely to be the same as today's,
// that rainy and sunny days will cluster, and that over time the numbers
// of rainy days and sunny days will approach each other.
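To make the idea of a random walk concrete, here is a minimal standalone sketch of weighted sampling over a transition table. It is not Walker's implementation: `weightedChoice`, `walk`, and the `transitions` table are hypothetical, and the table is a made-up symmetric one (sun is followed by rain 25% of the time) chosen only for illustration. It assumes Swift 5 for the random number APIs.

```swift
// Hypothetical helper: returns a key with probability proportional to its
// weight, or nil when no key has a positive weight.
func weightedChoice<Item: Hashable, G: RandomNumberGenerator>(
    from weights: [Item: Double],
    using generator: inout G
) -> Item? {
    let positive = weights.filter { $0.value > 0 }
    let total = positive.values.reduce(0, +)
    guard total > 0 else { return nil }
    var threshold = Double.random(in: 0..<total, using: &generator)
    var last: Item?
    for (item, weight) in positive {
        if threshold < weight { return item }
        threshold -= weight
        last = item
    }
    // Only reachable through floating-point rounding; fall back to the last key.
    return last
}

// Hypothetical walk: start from `start` and repeatedly sample the next item
// from the row of the table that belongs to the current item.
func walk(
    from start: Character,
    length: Int,
    transitions: [Character: [Character: Double]]
) -> [Character] {
    guard length > 0 else { return [] }
    var generator = SystemRandomNumberGenerator()
    var result: [Character] = [start]
    while result.count < length,
          let current = result.last,
          let options = transitions[current],
          let next = weightedChoice(from: options, using: &generator) {
        result.append(next)
    }
    return result
}

// Assumed transition table, not taken from the model above.
let transitions: [Character: [Character: Double]] = [
    "🌧": ["🌧": 0.75, "☀️": 0.25],
    "☀️": ["☀️": 0.75, "🌧": 0.25],
]
let forecast = walk(from: "☀️", length: 7, transitions: transitions)
print(forecast)
```

Because each step is drawn from the row for the current item rather than from the overall distribution, runs of the same weather are more likely than alternation, which matches the clustering described above.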

Documentation

  • [Tally options](Documentation/1. Tally.md)
  • [Saving models](Documentation/2. Saving models.md)
  • [Normalizing items](Documentation/3. Normalizing items.md)
  • [Tally and complex types](Documentation/4. Tally and complex types.md)
  • [Walker options](Documentation/5. Walker.md)

Examples

  • Weather Playground: A Playground with the weather example used above.
  • [Predictive Text](/Examples/Predictive Text): A proof-of-concept using Tally to recreate iOS QuickType predictive suggestions.

Roadmap

  • Build models from observed training examples
  • Model either continuous or discrete sequences
  • Option to set the size of n-grams used
  • Generic type - works on any Hashable item
  • List probability for next item in sequence
  • List probability for next sequence of items in sequence
  • List most frequent n-grams
  • Persist model using Core Data
  • Add pseudocounts to smooth infrequent or unseen n-grams
  • Normalize items as they are observed
  • Tag observed sequences with metadata/category to provide context
  • Approximate matching to compare item sequences
  • Include common sample training data
  • Generate new sequence from random walk
  • Generate sequences from biased walk
  • Semi-random walk that biases towards a target length of a discrete sequence
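
As an illustration of the pseudocount item in the roadmap, the sketch below shows additive (Laplace) smoothing in general terms. It says nothing about how Tally does or will implement the feature. `smoothedProbabilities`, its parameters, and the extra "❄️" item are hypothetical, and the code assumes Swift 5.

```swift
// Hypothetical helper: adds `pseudocount` to every item's count before
// normalizing, so items that were never observed still get a small,
// non-zero probability.
func smoothedProbabilities<Item: Hashable>(
    counts: [Item: Int],
    vocabulary: Set<Item>,
    pseudocount: Double = 1.0
) -> [Item: Double] {
    // Cover every known item, including any counted item missing from `vocabulary`.
    let items = vocabulary.union(counts.keys)
    let observedTotal = Double(counts.values.reduce(0, +))
    let total = observedTotal + pseudocount * Double(items.count)
    guard total > 0 else { return [:] }
    var result: [Item: Double] = [:]
    for item in items {
        result[item] = (Double(counts[item, default: 0]) + pseudocount) / total
    }
    return result
}

// Followers of a rainy day from the quick example, plus a snow item that
// was never observed (an assumption made for this illustration).
let followerCounts: [Character: Int] = ["🌧": 3, "☀️": 1]
let smoothed = smoothedProbabilities(
    counts: followerCounts,
    vocabulary: ["🌧", "☀️", "❄️"]
)
```

With a pseudocount of 1 the raw counts 3, 1, and 0 become 4, 2, and 1 out of 7, so the unseen item is no longer treated as impossible.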

Requirements

  • Xcode 8.0
  • Swift 3.0
  • Target >= iOS 10.0

Author

Made with ❤️ by @permakittens

Contributing

Feedback and contributions for bug fixes or improvements are welcome. Feel free to submit a pull request or open an issue.

License

MIT
