
Pallas Fusion: GPU Turbocharged 🚀

@erfanzar erfanzar released this 16 May 09:33
· 147 commits to main since this release

EasyDeL version 0.0.65

  • New Features

    • Pallas Flash Attention is now available on CPU/GPU/TPU via FJFormer, with bias support.
    • ORPO Trainer has been added.
    • WebSocket Serve Engine.
    • EasyDeL is now 30% faster on GPUs.
    • JAX-Triton is no longer required to run GPU kernels.
    • You can now specify the backward kernel implementation for Pallas Attention.
    • EasyDeL must now be imported as easydel instead of EasyDel.
  • New Models

    • The OpenELM model series is now available.
    • The DeepseekV2 model series is now available.
  • Fixed Bugs

    • cuDNN FlashAttention bugs are now fixed.
    • 8-bit parameter quantization for the Llama3 model has received many improvements.
    • Splash Attention bugs on TPUs are now fixed.
    • Dbrx model bugs are fixed.
    • DPOTrainer bugs are fixed (dataset creation).
  • Known Bugs

    • Splash Attention won't work on TPUv3.
    • Pallas Attention won't work on TPUv3.
    • You need to install flash_attn to convert HF DeepseekV2 to EasyDeL (due to a bug in the original authors' DeepseekV2 implementation).
    • Some examples are outdated.
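
The import rename listed under New Features can break scripts that must run against both older and newer installs. A minimal sketch of one way to handle that is shown here; the helper `import_first_available` is hypothetical (it is not part of EasyDeL), and the only names taken from these notes are the new module name `easydel` and the legacy name `EasyDel`.

```python
import importlib


def import_first_available(names):
    """Return the first module in `names` that imports cleanly, else None.

    Hypothetical helper for illustration only; not an EasyDeL API.
    """
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Module missing (or failed to import); try the next candidate.
            continue
    return None


# Try the new name first (0.0.65 and later), then the legacy name.
easydel = import_first_available(["easydel", "EasyDel"])
if easydel is None:
    print("EasyDeL is not installed")
```

For code that only targets 0.0.65 or later, a plain `import easydel` is simpler and is likely the better choice.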

Full Changelog: 0.0.63...0.0.65