Large Scale Machine Learning with Spark
Data processing, implementing related algorithms, tuning, scaling up, and finally deploying are crucial steps in optimising any application. Spark can handle large-scale batch and streaming data, works out when to cache data in memory, and can process it up to 100 times faster than Hadoop-based MapReduce. This means predictive analytics can be applied to both streaming and batch data to develop complete machine learning (ML) applications much more quickly, making Spark an ideal candidate for large data-intensive applications. This book focuses on design, engineering, and scalable solutions using ML with Spark. First, you will learn how to install Spark with all the new features from the latest Spark 2.0 release. Moving on, you’ll explore important concepts such as advanced feature engineering with RDDs and Datasets. After studying how to develop and deploy applications, you will see how to use external libraries with Spark.
Key Features
Get the most up-to-date book on the market that focuses on design, engineering, and scalable solutions in machine learning with Spark 2
Use Spark's machine learning library in a big data environment
Learn to develop high-value applications at scale with ease and a personalized design

Book Description
Scaling out and deploying algorithms, interactions, and clustering are crucial steps in the process of optimizing any application. By maintaining and streaming data, Spark can figure out when to cache data in memory, and can run up to 100x faster than Hadoop and Mahout. This means data streaming and analytics jobs can run and complete a lot quicker, making Spark ideal for large data-intensive applications.
This book focuses on design, engineering, and scalable solutions in machine learning with Spark. You will learn how to install Spark with all the new features of the latest version, Spark 2. You will also get to grips with Spark MLlib and Spark ML and their implementations of machine learning algorithms. Moving ahead, we'll explore important concepts such as DataFrames and advanced feature engineering.
After studying more about the development and deployment of an application, you will also find out about the other external libraries available for your data analysis.

What you will learn
Gain a solid theoretical understanding of machine learning algorithms and techniques for new and unknown datasets
Set up and configure Spark, and develop your first Spark application using Scala, Java, and SparkR
Use ML and MLlib to implement practical, large-scale machine learning pipelines and applications, including collaborative filtering, classification, regression, clustering, association rule mining, Twitter sentiment analysis, and dimensionality reduction
Scale up your machine learning application on a large cluster or even a cloud computing environment such as Amazon EC2
Enhance the performance of your machine learning models
Tune your machine learning models using cross-validation, grid search, hyperparameter tuning, and train-validation split
Deal with large-scale text data, including feature extraction and using text data as input to machine learning models
Develop machine learning applications for real-time streaming data using Spark Streaming