
Data Engineering with Google Cloud Platform - Second Edition, published by Packt

License

MIT (LICENSE) and Apache-2.0 (LICENSE_APACHE)

PacktPublishing/Data-Engineering-with-Google-Cloud-Platform-Second-Edition


Data Engineering with Google Cloud Platform - Second Edition


This is the code repository for Data Engineering with Google Cloud Platform - Second Edition, published by Packt.

A guide to leveling up as a data engineer by building a scalable data platform with Google Cloud

What is this book about?

The second edition of Data Engineering with Google Cloud Platform builds on the first edition, offering greater clarity and depth to data professionals navigating the intricate landscape of data engineering.

This book covers the following exciting features:

  • Load data into BigQuery and materialize its output
  • Focus on data pipeline orchestration using Cloud Composer
  • Formulate Airflow jobs to orchestrate and automate a data warehouse
  • Establish a Hadoop data lake, generate ephemeral clusters, and execute jobs on the Dataproc cluster
  • Harness Pub/Sub for messaging and ingestion for event-driven systems
  • Apply Dataflow to conduct ETL on streaming data
  • Implement data governance services on Google Cloud
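As a rough illustration of the orchestration idea in the list above, the sketch below uses only Python's standard library (`graphlib`, Python 3.9+) rather than Airflow or Cloud Composer. The task names and the `run_pipeline` helper are hypothetical and are not taken from the book's code. The only point is that an orchestrator runs tasks in an order that respects their dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical task names; each task maps to the set of tasks it depends on.
PIPELINE = {
    "extract_source_data": set(),
    "load_to_warehouse": {"extract_source_data"},
    "build_reporting_table": {"load_to_warehouse"},
    "run_data_quality_checks": {"build_reporting_table"},
}


def execution_order(dependencies):
    """Return task names in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


def run_pipeline(dependencies, actions):
    """Call each task's function in dependency order and return that order."""
    order = execution_order(dependencies)
    for task in order:
        actions[task]()
    return order
```

In a real deployment, each task would be an operator in an Airflow DAG and the scheduler would handle retries and scheduling; this sketch covers only the ordering step.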

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigation

All of the code is organized into folders, for example, Chapter03.
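To find the chapter folders in a local clone, a minimal sketch along these lines may help. It assumes the folders follow a `ChapterNN` naming pattern, which is inferred from the single example `Chapter03` and not confirmed by the repository; `chapter_folders` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the "Chapter03" example.
CHAPTER_PATTERN = re.compile(r"Chapter\d{2}")


def chapter_folders(repo_root):
    """Return the names of chapter folders directly under repo_root, sorted."""
    root = Path(repo_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and CHAPTER_PATTERN.fullmatch(entry.name)
    )
```

Calling `chapter_folders(".")` from the root of the clone lists whichever folders match the assumed pattern.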

The code will look like the following:

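from sklearn.ensemble import RandomForestClassifier
# X_train and y_train are assumed to be an already-prepared feature matrix and label vector
random_forest_classifier = RandomForestClassifier(n_estimators=100)
random_forest_classifier.fit(X_train, y_train)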

Who this book is for: Data analysts, IT practitioners, software engineers, and data enthusiasts looking to build a successful data engineering career will find this book invaluable. Experienced data professionals who want to start using Google Cloud to build data platforms will also get clear guidance on how to navigate the path. Whether you're a beginner who wants to explore the fundamentals or a seasoned professional seeking to learn the latest data engineering concepts, this book is for you.

With the following software and hardware, you can run all the code files in the book (Chapters 3-12).

Software and Hardware List

| Chapter | Software required | OS required              |
| ------- | ----------------- | ------------------------ |
| 3-12    | GCP account       | Windows, macOS, or Linux |
| 3-12    | Python            | Windows, macOS, or Linux |


Get to Know the Author

Adi Wijaya, a strategic cloud data engineer at Google with over a decade of experience, brings his expertise to the pages of this book. Throughout his career, Adi has built scalable data pipelines and optimized data processing workflows for various industries. Valuing the importance of sharing, Adi is also a dedicated educator and mentor. He has conducted numerous workshops and training sessions, giving aspiring data professionals the skills and knowledge they need to excel. Building on the recognition the first edition received, Adi aims to further empower readers and provide them with the tools they need to thrive in the dynamic field of data engineering.
