This repository has been archived by the owner on Aug 2, 2018. It is now read-only.
Modeling Lifecycle with ACME Occupancy Detection and Cloudera

srowen/cdsw-simple-serving

Data science is more than just modeling. The complete data science lifecycle also includes data engineering and model deployment. This project offers a simplified yet credible example of all three elements, as implemented using Apache Spark, the Cloudera Data Science Workbench, and JPMML / OpenScoring.

In this project, the ACME corporation is productionizing a connected-house platform. Part of this service requires predicting the occupancy of a room given sensor readings.

This example project includes simplified examples of:

  • Data Engineering
    • Ingest
    • Cleaning
  • Data Science
    • Modeling
    • Tuning and evaluation
  • Model Serving
    • Model management
    • Testing
    • REST API
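
As a rough illustration of how the three stages listed above fit together, here is a minimal sketch in plain Python. It is not the project's code: the project itself uses Apache Spark, the Cloudera Data Science Workbench, and JPMML / OpenScoring, whereas this sketch uses only the standard library. The sample readings, the field names (`temperature_c`, `co2_ppm`), the single-threshold "model", and all function names are assumptions made up for illustration.

```python
import json

# Hypothetical sensor readings as CSV lines: temperature_c, co2_ppm, occupied.
# These values are invented for illustration and are not from the project.
RAW_ROWS = [
    "21.0,450,0",
    "21.2,,0",       # missing CO2 reading, dropped during cleaning
    "22.5,900,1",
    "23.0,1100,1",
    "20.8,430,0",
]


def clean(rows):
    """Data engineering: parse CSV lines and drop rows with missing fields."""
    parsed = []
    for line in rows:
        fields = line.split(",")
        if len(fields) != 3 or any(f.strip() == "" for f in fields):
            continue
        parsed.append((float(fields[0]), float(fields[1]), int(fields[2])))
    return parsed


def fit_co2_threshold(data):
    """Data science: choose the CO2 threshold with the best training accuracy.

    A stand-in for real modeling and tuning; a single threshold is assumed
    here only to keep the sketch short.
    """
    candidates = sorted({co2 for _, co2, _ in data})

    def accuracy(threshold):
        hits = sum((co2 >= threshold) == bool(label) for _, co2, label in data)
        return hits / len(data)

    return max(candidates, key=accuracy)


def serve(threshold, request_json):
    """Model serving: score one JSON request, as a REST handler might."""
    reading = json.loads(request_json)
    occupied = reading["co2_ppm"] >= threshold
    return json.dumps({"occupied": bool(occupied)})


data = clean(RAW_ROWS)
threshold = fit_co2_threshold(data)
response = serve(threshold, '{"temperature_c": 22.0, "co2_ppm": 1000}')
print(response)
```

Keeping each stage behind its own function mirrors the split into separate modules: the serving step depends only on the fitted model (here, a single number), not on how the data was ingested or cleaned.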

Requirements

Get Started

To continue, review the documentation for each of the three modules, which explains what the module shows and how to run it.
