gerhard1050/Applying-Data-Science-Using-SAS


Applying Data Science - Business Case Studies using SAS

Applying data science methods to real-life business use cases. Companion and download site for the SAS Press book "Applying Data Science - Business Case Studies using SAS" by Gerhard Svolba, available at amazon.com.

  • Presentations of all case studies (and more) have also been recorded in my Data Science Webinar on YouTube.
  • Contributions at SAS Communities on content from this book

Encoding of CLASS Variables in Regression Analysis - Better understand the ORDINAL encoding

Display the hidden estimate for the reference category in EFFECT coding for better interpretability

%CALC_REFERENCE_CATEGORY displays the "hidden" coefficient in EFFECT encoding for CLASS variables

Simulate timeseries data with a SAS DATA Step and SAS Functions

Automatically highlight data-driven events with reference lines in line-charts

Overview

This is the new SAS Press book by Gerhard Svolba. It contains 8 case studies in 28 practical chapters with business explanations, methodological considerations, and lots of SAS code.

Why you want to read this book:

  • This book reflects the author's enthusiasm for using analytical and data science methods to solve business questions and for implementing the solutions in SAS.
  • It shows you the benefits of analytics, how to gain more insight into your data, and how to make better decisions. In eight entertaining and real-world case studies, Svolba combines data science and advanced analytics with business questions, illustrating them with data and SAS code.
  • Written for business analysts, statisticians, data miners, data scientists, and SAS programmers, Applying Data Science bridges the gap between high-level, business-focused books that skimp on the details and technical books that only show SAS code with no business context.

This book is written for several groups of readers.

  • Business Analysts and Business Experts: Businesspeople can review the examples and see what can be achieved with analytical methods. They get insight into the power of analytics and the additional findings that these methods can generate. They might not study the SAS implementation and the code in much detail. Instead, they can hand the implementation examples to their data scientists as a quick start for applying the methods.
  • Statisticians, Data Miners, Data Scientists, and Quantitative Experts: This group might be interested in seeing how analytical methods can be applied to real-world business questions. They learn how analytical methods that are established in one industry might be applied to other areas. They see practical situations and constraints that they can expect to encounter when they apply data science methods.
  • SAS Programmers: The book contains a lot of SAS code, including SAS macros, SAS DATA step code for data preparation, SAS analytics procedures, and SAS graph procedures. In this code, SAS programmers can find new ways to solve certain problems in SAS and transfer the solutions in these examples to their day-to-day problems.

Changes, Improvements and Typos in the printed version of the book

Please send any findings, potential typos, and necessary changes to the author. Email: [email protected]

Note that the downloadable code files on GitHub already contain these changes.

  • Pages 16 and 17: The parentheses around HAZARD and MAXTIME in the PROC LIFETEST statement are placed incorrectly. Here is the correct version for both PROC LIFETEST calls in sections 1.7.2 and 1.7.3:

PROC LIFETEST DATA=employees plots=(hazard(bandwidth=3)) maxtime=120;

Thanks to Nicole Fox for pointing this out!

  • Page 54: The DATA step at the beginning of section 3.5.2 should use the variable _T_ instead of the variable TIME. Here is the correct version:

data employees_expanded;
  set employees;
  do _T_ = 1 to duration;
    if _T_ NE duration then Event = 0;
    else Event = Resigned;
    output;
  end;
run;
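
To see what this DATA step produces, here is a minimal sketch on a made-up input. The EmpID column and the two data rows are hypothetical and are not taken from the book; only the variables duration, Resigned, _T_, and Event come from the erratum above. Each employee gets one output row per period, and Event stays 0 until the last period, where it takes the value of Resigned.

```sas
/* Hypothetical toy input (not data from the book):          */
/* employee 1 resigned after 3 periods, employee 2 is still  */
/* employed (censored) after 2 periods.                      */
data employees;
  input EmpID duration Resigned;
  datalines;
1 3 1
2 2 0
;
run;

/* Expand to one row per employee and period, as in the erratum. */
data employees_expanded;
  set employees;
  do _T_ = 1 to duration;
    if _T_ NE duration then Event = 0;  /* periods before the last one */
    else Event = Resigned;              /* last period carries the outcome */
    output;
  end;
run;

/* Inspect the expanded rows. */
proc print data=employees_expanded noobs;
  var EmpID _T_ Event;
run;
```

With this toy input the expanded table has five rows: three for employee 1 (Event = 0, 0, 1) and two for employee 2 (Event = 0, 0).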