Invited Talk/Tutorial: Big Data Analytics with Spark

by admin last modified Jul 25, 2018 05:11 PM

Big Data Analytics with Spark

Prof. Mark C. Lewis

Department of Computer Science, Trinity University, San Antonio, Texas, USA
Author of the "Programming and Problem-Solving Using Scala" series of books

Date & Time: August 1 (Wednesday), 2018; 09:20am - 10:20am
Location: Galleria B

We live in an age of data, which presents us with the challenge of finding meaning in all of it. Google's MapReduce, as embodied in the Hadoop implementation, ushered in the era of big data analytics by providing a standard system for analyzing data across a cluster with good fault tolerance. Hadoop achieves this by writing results to disk after each reduce step, which provides fault tolerance but at a high cost in speed. The Spark framework sits in the Hadoop ecosystem as an alternative to plain MapReduce that performs more of its operations in memory and can therefore run much faster. Standard benchmarks have shown it running as much as 100x faster than Hadoop.

Attendees of this tutorial will be introduced to the Spark framework. We will work through a number of example problems, showing how they can be solved using the operations provided by both Resilient Distributed Datasets (RDDs) and the Dataset abstraction of Spark SQL. We will also work through several machine learning examples using the Spark ML library.
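To give a feel for the map and reduce style of operation described above, here is a minimal word-count sketch in plain Python. It is not Spark or Hadoop code and is not taken from the tutorial; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are hypothetical, chosen only for illustration. It runs on a single machine and keeps every intermediate result in memory, which loosely mirrors the Spark approach, whereas Hadoop MapReduce would write results to disk after each reduce step.

```python
from collections import defaultdict
from functools import reduce


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    return [(word.lower(), 1) for line in lines for word in line.split()]


def shuffle_phase(pairs):
    """Group the emitted values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Combine the values collected for each key into a single count."""
    return {key: reduce(lambda a, b: a + b, values) for key, values in groups.items()}


def word_count(lines):
    # Intermediate results are passed along in memory rather than written to disk.
    return reduce_phase(shuffle_phase(map_phase(lines)))


# Hypothetical sample input, for illustration only.
sample = ["big data analytics with Spark", "Spark keeps data in memory"]
counts = word_count(sample)
```

In a real cluster framework the three phases would be distributed across machines; this sketch only shows the shape of the computation.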

Mark Lewis has been in the Department of Computer Science at Trinity University since 2001. His courses focus on programming and programming languages, including web development, as well as simulation and scientific computing. He has been the lead author of over 30 papers spanning a range of topics and venues, from planetary ring dynamics in the journal Icarus to the SIGCSE annual conference proceedings. He is also the author of several Scala textbooks published by CRC Press, and the tutorial videos on his Scala-focused YouTube channel have over 1 million views.
