EPISODE · Jul 24, 2023 · 21 MIN

Fundamentals of Apache Spark | Think Tech #14

from Think Tech · host Shivam Mohan

Join me in this episode of Think Tech for a tour of Apache Spark, the distributed computing engine built for big data processing. The episode covers Spark's core concepts and inner workings, starting with the in-memory architecture that makes fast, real-time or near-real-time processing possible. Listeners learn about Spark's fault-tolerant master/worker model, the role partitions play in parallel processing, and the three main data abstractions: RDD, DataFrame, and Dataset. The episode also explains Transformations and Actions and the part each plays in optimizing data processing workflows, then introduces the SparkSession as the entry point and the execution modes (Client, Cluster, and Local) suited to different scenarios. Overall, it serves as a primer on Apache Spark and its contributions to big data processing.
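
For readers who want a concrete picture of the Transformations and Actions split mentioned in the description, here is a toy sketch in plain Python. It is not Spark code and is not taken from the episode: `ToyDataset` and its methods are hypothetical names invented for illustration, and the round-robin partitioning is an assumption. It only models the general idea that transformations are recorded lazily and that an action triggers the work, partition by partition.

```python
class ToyDataset:
    """Toy stand-in for a partitioned, lazily evaluated dataset (hypothetical, not a Spark API)."""

    def __init__(self, partitions, steps=()):
        self.partitions = [list(p) for p in partitions]
        self.steps = tuple(steps)  # recorded transformations, not yet executed

    @classmethod
    def from_list(cls, data, num_partitions):
        # Assumption: a simple round-robin split into partitions.
        return cls([data[i::num_partitions] for i in range(num_partitions)])

    # Transformations: return a new dataset with one more recorded step.
    def map(self, fn):
        return ToyDataset(self.partitions, self.steps + (("map", fn),))

    def filter(self, predicate):
        return ToyDataset(self.partitions, self.steps + (("filter", predicate),))

    def _run_partition(self, partition):
        # Apply every recorded step to a single partition.
        rows = iter(partition)
        for kind, fn in self.steps:
            rows = map(fn, rows) if kind == "map" else filter(fn, rows)
        return list(rows)

    # Actions: run the recorded steps and return a concrete result.
    def collect(self):
        return [row for part in self.partitions for row in self._run_partition(part)]

    def count(self):
        return sum(len(self._run_partition(part)) for part in self.partitions)


seen = []  # records every value the map step actually touches


def double(x):
    seen.append(x)
    return x * 2


ds = ToyDataset.from_list(list(range(10)), num_partitions=3)
pipeline = ds.map(double).filter(lambda x: x % 4 == 0)
ran_before_action = len(seen)  # still 0: transformations alone do no work

result = sorted(pipeline.collect())  # the action triggers execution
print(ran_before_action, result)
```

In real Spark the recorded steps form a plan that the engine can optimize before running it across worker nodes; the toy version only shows the deferred-execution idea.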

Frequently Asked Questions

How long is this episode of Think Tech?

This episode is 21 minutes long.

When was this Think Tech episode published?

This episode was published on July 24, 2023.

Can I download this Think Tech episode?

Yes, you can download this episode by clicking the download button on the episode player, or subscribe to the podcast in your preferred podcast app for automatic downloads.