hululuzhu/fun-paper-sharing

Fun papers to read

A primer of large language models

Slides 07/2022, covering:

  • 2017 Google Transformer
  • 2018 GLUE/SuperGLUE
  • 2018 Google BERT
  • 2018 OpenAI GPT-1
  • 2018 OpenAI GPT-2
  • 2019 Google T5
  • 2020 OpenAI GPT-3
  • 2020 HuggingFace decoding algorithms
  • 2021 OpenAI Codex
  • 2021 OpenAI Math paper
  • 2021 DeepMind Gopher
  • 2021 Google & others BIG-bench
  • 2022 OpenAI ML Parallelism guide
  • 2022 OpenAI InstructGPT
  • 2022 DeepMind AlphaCode
  • 2022 Google LaMDA
  • 2022 Google PaLM
  • 2022 DeepMind Chinchilla
  • 2022 Google Minerva (pathways)
  • 2022 Salesforce CodeRL

Evolution of positional encoding in the Transformer

Slides 06/2022, covering:

  • Learned embedding like BERT
  • Sinusoidal pos embedding like vanilla Transformer
  • Relative position embedding
  • Rotary position embedding (RoPE)
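
Of the schemes above, the sinusoidal encoding from the vanilla Transformer is the easiest to write down: PE[pos, 2i] = sin(pos / 10000^(2i/d)) and PE[pos, 2i+1] = cos(pos / 10000^(2i/d)). A minimal pure-Python sketch (the function name is mine; real implementations vectorize this over a tensor):

```python
import math

def sinusoidal_positional_encoding(max_len, d_model):
    """Sinusoidal positional encoding as in the vanilla Transformer.

    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    Returns a max_len x d_model list of lists.
    """
    pe = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)       # even dimensions use sine
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)  # odd dimensions use cosine
    return pe
```

Because the encoding depends only on position and dimension, it needs no training, in contrast to the learned embeddings used by BERT.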

Alpha series of DeepMind

V2 slides 11/2022, covering the Alpha-series papers

10+ 'classic' NLP papers

V2 'Fun' version slides 04/2022

  • General Deep Learning related

    | Year / arXiv ID | Title |
    | --------------- | ----- |
    | 2014 | Dropout: A Simple Way to Prevent Neural Networks from Overfitting |
    | 1412.6980 | Adam: A Method for Stochastic Optimization |
    | 1502.01852 | Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification |
    | 1503.02531 | Distilling the Knowledge in a Neural Network |
    | 1502.03167 | Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift |
    | 2016 | Deep Neural Networks for YouTube Recommendations |
  • Deep Learning NLP related

    | Year / arXiv ID | Title |
    | --------------- | ----- |
    | 1301.3781 | Efficient Estimation of Word Representations in Vector Space |
    | 1409.3215 | Sequence to Sequence Learning with Neural Networks |
    | 1706.03762 | Attention Is All You Need |
    | 1810.04805 | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding |
    | 1804.07461 | GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding |
    | 1910.10683 | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer |
    | 2005.14165 | Language Models are Few-Shot Learners |

DouZero & PerfectDou

Paper: https://arxiv.org/abs/2106.06135