Transformer and Generative Pre-Training

This repo contains (or will shortly contain) a PyTorch implementation of the Transformer architecture (Vaswani et al., 2017), as well as experiments with generative pre-training (Radford et al., 2018; Devlin et al., 2018).
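Until the full implementation lands, here is a minimal sketch of the model's core operation, scaled dot-product attention as defined in Vaswani et al. (2017). The function name and tensor shapes are illustrative assumptions, not the repo's actual API.

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V  (Vaswani et al., 2017).

    q, k, v: (batch, heads, seq_len, d_k) tensors (shapes are an assumption);
    mask broadcasts to the (batch, heads, seq_q, seq_k) score matrix,
    with True marking valid positions.
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # (batch, heads, seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(~mask, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v, weights
```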

The repo also contains the slides for a presentation given at the Scientific Discussions of the Intact Data Lab.

TODO

  • create a training setup similar to the Vaswani et al. paper (see the learning-rate sketch below)
  • add dropout
  • use BPE to encode sentences
  • preprocess data using spaCy
  • train on WMT and the Cornell Movie-Dialogs Corpus
  • add label smoothing (see the loss sketch below)
  • implement beam search (see the decoding sketch below)
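For the training setup, section 5.3 of Vaswani et al. (2017) pairs Adam (β1 = 0.9, β2 = 0.98, ε = 1e-9) with the warmup schedule lrate = d_model^-0.5 · min(step^-0.5, step · warmup^-1.5). A hedged sketch of one way to wire this up with LambdaLR; the Linear model is a stand-in for the actual Transformer, not the repo's code.

```python
import torch

def noam_lambda(d_model=512, warmup=4000):
    """Learning-rate factor from Vaswani et al. (2017), section 5.3."""
    def factor(step):
        step = max(step, 1)  # LambdaLR starts at step 0; avoid 0 ** -0.5
        return d_model ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)
    return factor

model = torch.nn.Linear(512, 512)  # stand-in for the Transformer being trained
optimizer = torch.optim.Adam(model.parameters(), lr=1.0,  # base lr of 1.0 so the
                             betas=(0.9, 0.98), eps=1e-9)  # lambda sets the rate
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=noam_lambda())
# call scheduler.step() after each optimizer.step()
```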
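For label smoothing, Vaswani et al. use ε_ls = 0.1. A minimal manual sketch of a smoothed cross entropy; the ignore_index padding convention is an assumption about the repo's data format, and recent PyTorch also exposes this directly via the label_smoothing argument of F.cross_entropy.

```python
import torch
import torch.nn.functional as F

def label_smoothing_loss(logits, target, eps=0.1, ignore_index=0):
    """Cross entropy against a smoothed target distribution.

    logits: (N, vocab) raw scores; target: (N,) class indices.
    The true class gets probability 1 - eps; the remaining eps is
    spread uniformly over the other classes.
    """
    n_classes = logits.size(-1)
    log_probs = F.log_softmax(logits, dim=-1)
    smooth = torch.full_like(log_probs, eps / (n_classes - 1))
    smooth.scatter_(1, target.unsqueeze(1), 1.0 - eps)
    loss = -(smooth * log_probs).sum(dim=-1)
    mask = target != ignore_index  # drop padding positions (assumed convention)
    return loss[mask].mean()

# Recent PyTorch equivalent:
# F.cross_entropy(logits, target, label_smoothing=0.1, ignore_index=0)
```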
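For beam search, a compact length-capped sketch with no length normalization; step_fn, bos_id, and eos_id are hypothetical placeholders for whatever decoding interface the model ends up exposing.

```python
import torch

@torch.no_grad()
def beam_search(step_fn, bos_id, eos_id, beam_size=4, max_len=50):
    """Keep the beam_size highest-scoring partial sequences at each step.

    step_fn(tokens) -> (vocab,) log-probabilities for the next token given a
    list of token ids (a placeholder interface, not the repo's actual API).
    """
    beams = [([bos_id], 0.0)]  # (token ids, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == eos_id:      # finished beams carry over unchanged
                candidates.append((tokens, score))
                continue
            log_probs = step_fn(tokens)   # (vocab,)
            top_lp, top_ids = log_probs.topk(beam_size)
            for lp, idx in zip(top_lp.tolist(), top_ids.tolist()):
                candidates.append((tokens + [idx], score + lp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(t[-1] == eos_id for t, _ in beams):
            break
    return beams[0][0]
```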

References

  • Vaswani et al., 2017. Attention Is All You Need.
  • Radford et al., 2018. Improving Language Understanding by Generative Pre-Training.
  • Devlin et al., 2018. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.