AutoGluon: Leveraging Text, Images, and the Kitchen Sink to solve complex ML problems in a few lines of code

Automated machine learning (AutoML) offers the promise of translating raw data into accurate predictions without the need for significant human effort, expertise, and manual experimentation. In this workshop, we introduce AutoGluon, a state-of-the-art and easy-to-use toolkit that empowers multimodal AutoML. Unlike most AutoML systems, which focus on tabular tasks with categorical and numerical features, we consider supervised learning tasks on various types of data, including tabular features, text, images, time series, and their combinations. We will introduce the real-world problems that AutoGluon can help you solve within three lines of code, along with the fundamental techniques adopted in the toolkit. Rather than diving deep into the mechanisms underlying each individual ML model, we emphasize how you can take advantage of a diverse collection of models to build an automated ML pipeline. Our workshop will also cover the techniques behind automatically building and training deep learning models, which are powerful yet cumbersome to manage manually.
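To make the "three lines of code" claim concrete, here is a minimal sketch of a tabular workflow using AutoGluon's `TabularDataset` and `TabularPredictor`. The helper name `fit_and_predict`, the file names, and the label column are placeholders chosen for illustration and are not taken from the workshop materials. The import is placed inside the function only so the snippet can be loaded on a machine where AutoGluon is not yet installed.

```python
def fit_and_predict(train_path, test_path, label):
    """Train AutoGluon on one CSV file and predict on another (illustrative helper)."""
    # Lazy import: requires `pip install autogluon` at call time, not at load time.
    from autogluon.tabular import TabularDataset, TabularPredictor

    train_data = TabularDataset(train_path)                    # 1. load the training table
    predictor = TabularPredictor(label=label).fit(train_data)  # 2. train and ensemble models
    return predictor.predict(TabularDataset(test_path))        # 3. predict on new rows


# Hypothetical usage; "train.csv", "test.csv", and "target" are placeholder names:
# predictions = fit_and_predict("train.csv", "test.csv", label="target")
# print(predictions.head())
```

The three numbered lines are the whole workflow; feature preprocessing, model selection, and ensembling happen inside `fit` with default settings.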

Join us at PyData Seattle 2023, held at the Microsoft Conference Center, on Wednesday, April 26th, 9:00-10:30am PDT, in St. Helens.

Note: The GitHub repository for this website is available at https://github.com/autogluon/pydata2023-autogluon-workshop .

Schedule

Each section ends with a 10-15 minute Q&A. In addition, hands-on notebooks are provided after each session for attendees to try out asynchronously.

| Topic | Speaker | Time (PDT) | Slides | Cheatsheet |
| --- | --- | --- | --- | --- |
| Introduction + AutoGluon Tabular | Nick Erickson | 9:00AM -- 9:40AM | link | tabular-cheatsheet, docs |
| AutoGluon Multimodal | Nick Erickson | 9:40AM -- 9:55AM | link | multimodal-cheatsheet, docs |
| AutoGluon EDA | Alexander Shirkov | 9:55AM -- 10:15AM | link | docs |
| Additional QA + Feedback | All speakers | 10:15AM -- 10:30AM | - | |

Section Outline and Materials

AutoGluon Tabular

  • AutoML Basics: Discussion of core AutoML principles and historical background (including early AutoML toolkits such as AutoWeka and auto-sklearn)
  • History of competition ML and how it influenced the design of modern AutoML systems
  • Discussion of model combination strategies (stacking, bagging, model aggregation)
  • Constraint satisfaction and engineering for a performance envelope (accuracy, speed, compute resources)
  • Benchmark comparisons showcasing the advancement of AutoML systems in recent years, compared both to earlier AutoML systems and to human data scientists (4 AutoML frameworks, 104 OpenML datasets, 10 Kaggle datasets)
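
The model combination strategies listed above can be illustrated with a small, dependency-free sketch. It is a generic toy example and not AutoGluon's implementation: three simple regressors are evaluated with k-fold out-of-fold predictions (the mechanism behind k-fold bagging), and a greedy weighted ensemble is then fitted on those predictions. The data, the base learners, and all function names are assumptions made for illustration.

```python
import math
import random
from statistics import mean


def fit_mean(xs, ys):
    m = mean(ys)
    return lambda x: m  # always predicts the training mean


def fit_linear(xs, ys):
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return lambda x: my + slope * (x - mx)  # 1-D least-squares line


def fit_nearest(xs, ys):
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]  # 1-nearest neighbor


def mse(preds, ys):
    return mean((p - y) ** 2 for p, y in zip(preds, ys))


def out_of_fold(fit, xs, ys, k=5):
    """Predict every point with a model trained on the k-1 folds that exclude it."""
    n = len(xs)
    oof = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, n, k):  # indices held out in this fold
            oof[i] = model(xs[i])
    return oof


def greedy_weights(oof, ys, steps=25):
    """Greedy ensemble selection with replacement; keeps the best step seen."""
    names = list(oof)
    counts = dict.fromkeys(names, 0)
    total = [0.0] * len(ys)
    best_score, best_weights = float("inf"), None
    for step in range(1, steps + 1):
        def score_with(name):
            return mse([(t + p) / step for t, p in zip(total, oof[name])], ys)
        pick = min(names, key=score_with)  # model whose addition scores best
        score = score_with(pick)
        counts[pick] += 1
        total = [t + p for t, p in zip(total, oof[pick])]
        if score < best_score:
            best_score = score
            best_weights = {name: counts[name] / step for name in names}
    return best_weights


# Synthetic 1-D regression data (made up for this example).
random.seed(0)
xs = [random.uniform(0, 10) for _ in range(60)]
ys = [2 * x + 3 * math.sin(x) + random.gauss(0, 0.5) for x in xs]

base_learners = {"mean": fit_mean, "linear": fit_linear, "nearest": fit_nearest}
oof = {name: out_of_fold(fit, xs, ys) for name, fit in base_learners.items()}
weights = greedy_weights(oof, ys)
blend = [sum(weights[name] * oof[name][i] for name in oof) for i in range(len(ys))]

single_scores = {name: mse(preds, ys) for name, preds in oof.items()}
ensemble_score = mse(blend, ys)
print("out-of-fold MSE per model:", single_scores)
print("ensemble weights:", weights)
print("ensemble out-of-fold MSE:", ensemble_score)
```

Because the ensemble weights are fitted on out-of-fold predictions, each base model is judged on data it did not see during training. In a full bagging setup, the k fold models would also be kept and their predictions averaged at inference time; this sketch omits that step for brevity.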

AutoGluon Multimodal

  • Foundation models for image and text
  • Real-world multimodal problems
  • Fusion techniques and multimodal distillation

Advanced Topics

  • Exploratory data analysis

Hands-on Notebooks

For hands-on tutorials, we provide notebooks for you to try out AutoGluon via SageMaker Studio Lab or Google Colab.

All notebooks can be found in the notebooks folder of the repository.

Check out the AutoGluon website and get started!