Business Intelligence and Data Warehousing (Blended)

juanbretti/gmbd_bidw

Business Intelligence and Data Warehouse Course (blended)

This repository contains all necessary inputs to run the course hands-on labs.

Repository contents (by session)

  • Additional articles and documents
  • MySQL Workbench Schemas
  • ETL processes
  • Datasets
  • Tableau files
  • Videos

Software Installation

  • Data Warehouse: MySQL (database) and MySQL Workbench (database modeling and SQL development)
  • ETL: Pentaho Data Integration (PDI)
  • Business Intelligence/Data Visualization: Tableau Desktop

Steps

Install Java

  • Check whether Java is already installed on your system. If you have more than one version, uninstall all of them and follow the steps below. If you already have Java JDK v8, you do not need to follow the steps.
  • Download Java JDK v8 from: http://www.oracle.com/technetwork/java/javase/downloads/index-jsp-138363.html (in our case: Java SE 8u231 or later). You may need to create an Oracle account to download the JDK.
  • Run the installer and follow its instructions.
  • [Optional] Instead of the Oracle Java JDK, you can use:
    • Amazon Corretto, specifically version 8. Choose the right installer for your OS. It is a long-term-support, production-ready distribution of the Open Java Development Kit (OpenJDK) supported by Amazon.
    • OpenJDK.
  • Remember to use only one JDK version.

If you have problems with Oracle Java, uninstall it and switch to Amazon Corretto or OpenJDK.
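
To confirm which Java version is active after the installation, a small shell sketch like the one below may help. It assumes that `java` is on the PATH and that a JDK 8 reports a version string starting with `1.8`. The function names `is_java8` and `current_java_version` are illustrative and not part of the course material.

```shell
#!/bin/sh
# Return success (0) when the given version string belongs to Java 8.
# Assumption: Java 8 builds report versions of the form 1.8.0_NNN.
is_java8() {
    case "$1" in
        1.8.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Print the version string of the java found on PATH (empty if none).
# `java -version` writes to stderr, hence the 2>&1 redirection.
current_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}
```

For example, `is_java8 "$(current_java_version)" && echo "Java 8 is active"` prints the message only when the active JDK is version 8.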

Install MySQL and MySQL Workbench

  • Download the right version of MySQL and MySQL Workbench for your OS (in our case: MySQL Community Server 8.0.19 and MySQL Workbench 8.0.19 or later). Check in advance if your system is supported: MySQL and MySQL Workbench.
  • Download the program(s) for your specific OS:
  • MySQL configuration: During installation you will set the password for the root user (choose iembd2020). Choose legacy password encryption: PDI and Tableau only support legacy password encryption, not the new strong encryption available in MySQL 8, so select this option until strong encryption is supported. If you forget the password, you can change it from System Preferences (on Mac), or by using MySQL Workbench or reinstalling (on Windows).

Note: on Microsoft Windows there is a single installer; on Mac there are two files.

Remember to start the server to be able to use the database. Open MySQL Workbench and create a new connection using the right user and password and the standard parameters for configuration.
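
To verify that the server is running and that the root account uses legacy password encryption, a sketch along these lines could be used. It assumes the `mysql` command-line client is on the PATH; the function names are hypothetical. In MySQL 8, the legacy plugin is named `mysql_native_password`, while the newer default is `caching_sha2_password`.

```shell
#!/bin/sh
# Build the SQL that lists the authentication plugin of a given user.
# Intended for trusted input only (the user name is not escaped).
auth_plugin_query() {
    printf "SELECT user, host, plugin FROM mysql.user WHERE user = '%s';" "$1"
}

# Run the query with the mysql client as root; -p prompts for the password.
# Fails with a connection error if the server has not been started.
show_auth_plugin() {
    mysql -u root -p -e "$(auth_plugin_query "$1")"
}
```

Running `show_auth_plugin root` should list `mysql_native_password` in the plugin column if legacy encryption was selected during installation.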

Install PDI

We will use the community edition of Pentaho Data Integration (a.k.a. PDI, previously known as Kettle). It can be downloaded from this link (in our case: pdi-ce-8.3.0.0-371.zip).

  • Download the file, unzip it and follow these instructions:
    • [Mac] Move the data-integration folder into the Applications folder
    • [Windows] Move the data-integration folder into the C:\ folder
  • Open PDI
    • [Windows] Double-click spoon.bat inside the data-integration folder. Optional: create a shortcut.
    • [Mac] Open the terminal and execute:
cd /Applications/data-integration/
./spoon.sh
  • [Optional, Recommended, Mac] Enable Data Integration.app as a double-click app by removing its quarantine attribute from the terminal:
sudo xattr -dr com.apple.quarantine /Applications/data-integration/Data\ Integration.app
  • Configuring a JDBC Connection to MySQL 8.0.19 Using PDI:
    • Download the MySQL 8.0.19 (or later) JDBC driver (select Platform Independent, zip) to the computer running Pentaho from https://dev.mysql.com/downloads/connector/j/
    • Unzip the file mysql-connector-java-8.0.19.zip and open the extracted folder (the version number in the file name matches the driver you downloaded)
    • Copy mysql-connector-java-8.0.19.jar to the Pentaho lib folder. [Windows]: C:\data-integration\lib. [Mac OS]: …/Applications/data-integration/lib
    • Configure a Generic Database connection in Pentaho: (1) Connection URL: jdbc:mysql://localhost:3306/<database_name> (initially the only database is sys, so substitute <database_name> with sys) (2) Driver Class Name: com.mysql.cj.jdbc.Driver (3) User and password: the ones configured during the MySQL installation
    • If the connection fails because the server time zone value 'AEST' (or another time zone) is unrecognized or represents more than one time zone, use this URL instead: jdbc:mysql://localhost:3306/<database_name>?useLegacyDatetimeCode=false&serverTimezone=UTC
  • [Not required, only if you use MySQL 5.x] Install MySQL 5.x plugin for PDI:
    • Open PDI
    • Go to the Tools menu > Marketplace > MySQL Plugin and install it
    • Restart PDI
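
The two connection URL variants from the JDBC configuration steps above can be assembled with a small helper. This is only a sketch: the function name `jdbc_url` is hypothetical, and the host and port are fixed to the localhost:3306 values used in this guide.

```shell
#!/bin/sh
# Build a MySQL JDBC URL for PDI's Generic Database connection.
# $1 = database name (e.g. sys)
# $2 = optional server time zone (e.g. UTC); when given, the time zone
#      parameters from the workaround above are appended.
jdbc_url() {
    db=$1
    tz=${2:-}
    url="jdbc:mysql://localhost:3306/${db}"
    if [ -n "$tz" ]; then
        url="${url}?useLegacyDatetimeCode=false&serverTimezone=${tz}"
    fi
    printf '%s\n' "$url"
}
```

Calling `jdbc_url sys` gives the plain URL, while `jdbc_url sys UTC` gives the variant with the time zone parameters, ready to paste into the Connection URL field.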

Install Tableau Desktop

Student licenses are available through the Academic Partnership. Tableau has versions for Mac and Windows. Follow these instructions:

  • Download the latest version of Tableau Desktop here.
  • Copy the Tableau Desktop license from campus.
  • Install the software following the instructions on the screen.
  • Update your license in the application: Help menu -> Manage Product Keys
  • Download the driver for MySQL from here

(Optional) Atom

In case you need a multipurpose text editor, I recommend Atom.

FAQ

Is there a Pentaho Release Product Version Matrix?

Yes! You can find it here.

Any recommendation for MySQL SQL syntax?

Yes, check the MySQL™ Notes for Professionals book and the MySQL documentation.

Any tutorials for MySQL Workbench?

Yes, check the MySQL Workbench Manual.

Any book for SQL?

Yes, check SQL Notes for Professionals.

How can I have this repository?

Fork it using GitHub and GitHub Desktop. Are you interested in how GitHub works? Start here.
