Nonfiction
Book
0 Holds on 1 Copy
Availability
Details
PUBLISHED
EDITION
DESCRIPTION
xxvi, 576 pages : illustrations ; 24 cm
ISBN/ISSN
LANGUAGE
NOTES
Includes index
Part 1. Gentle overview of big data and Spark. What is Apache Spark? -- A gentle introduction to Spark -- A tour of Spark's toolset -- Part 2. Structured APIs : DataFrames, SQL, and datasets. Structured API overview -- Basic structured operations -- Working with different types of data -- Aggregations -- Joins -- Data sources -- Spark SQL -- Datasets -- Part 3. Low-level APIs. Resilient distributed datasets (RDDs) -- Advanced RDDs -- Distributed shared variables -- Part 4. Production applications. How Spark runs on a cluster -- Developint Spark applications -- Deploying Spark -- Monitoring and debugging -- Performance tuning -- Part 5. Streaming. Stream processing fundamentals -- Structured streaming basics -- Event-time and stateful processing -- Structured streaming in production -- Part 6. Advanced analytics and machine learning. Advanced analytics and machine learning overview -- Preprocessing and feature engineering -- Classification -- Regression -- Recommendation -- Unsupervised learning -- Graph analytics -- Deep learning -- Part 7. Ecosystem. Language specifics : Python (PySpark) and R (SparkR and sparklyr) -- Ecosystem and community