Scala Data Analysis Cookbook

Author: Arun Manivannan
Publisher: Packt Publishing Ltd
ISBN: 9781784394998
Size: 17.37 MB
Format: PDF, ePub, Docs

Navigate the world of data analysis, visualization, and machine learning with over 100 hands-on Scala recipes.

About This Book
- Implement Scala in your data analysis using features from Spark, Breeze, and Zeppelin
- Scale up your data analytics infrastructure with practical recipes for Scala machine learning
- Recipes for every stage of the data analysis process, from reading and collecting data to distributed analytics

Who This Book Is For
This book shows data scientists and analysts how to leverage their existing knowledge of Scala for quality and scalable data analysis.

What You Will Learn
- Familiarize yourself with and set up the Breeze and Spark libraries and use their data structures
- Import data from a host of possible sources and create dataframes from CSV
- Clean, validate, and transform data using Scala to pre-process numerical and string data
- Integrate quintessential machine learning algorithms using the Scala stack
- Bundle and scale up Spark jobs by deploying them into a variety of cluster managers
- Run streaming and graph analytics in Spark to visualize data, enabling exploratory analysis

In Detail
This book will introduce you to the most popular Scala tools, libraries, and frameworks through practical recipes around loading, manipulating, and preparing your data. It will also help you explore and make sense of your data using stunning and insightful visualizations and machine learning toolkits. Starting with introductory recipes on utilizing the Breeze and Spark libraries, get to grips with how to import data from a host of possible sources and how to pre-process numerical, string, and date data. Next, you'll get an understanding of concepts that will help you visualize data using the Apache Zeppelin and Bokeh bindings in Scala, enabling exploratory data analysis. Discover how to program quintessential machine learning algorithms using the Spark ML library. Work through steps to scale your machine learning models and deploy them into a standalone cluster, EC2, YARN, and Mesos. Finally, dip into the powerful options presented by Spark Streaming and machine learning for streaming data, as well as utilizing Spark GraphX.

Style and approach
This book contains a rich set of recipes that covers the full spectrum of interesting data analysis tasks and will help you revolutionize your data analysis skills using Scala and Spark.

Scala Guide For Data Science Professionals

Author: Pascal Bugnion
Publisher: Packt Publishing Ltd
ISBN: 9781787281035
Size: 14.89 MB
Format: PDF, Docs

Scala will be a valuable tool to have on hand during your data science journey for everything from data cleaning to cutting-edge machine learning.

About This Book
- Build data science and data engineering solutions with ease
- An in-depth look at each stage of the data analysis process, from reading and collecting data to distributed analytics
- Explore a broad variety of data processing, machine learning, and genetic algorithms through diagrams, mathematical formulations, and source code

Who This Book Is For
This learning path is perfect for those who are comfortable with Scala programming and now want to enter the field of data science. Some knowledge of statistics is expected.

What You Will Learn
- Transfer and filter tabular data to extract features for machine learning
- Read, clean, transform, and write data to both SQL and NoSQL databases
- Create Scala web applications that couple with JavaScript libraries such as D3 to create compelling interactive visualizations
- Load data from HDFS and Hive with ease
- Run streaming and graph analytics in Spark for exploratory analysis
- Bundle and scale up Spark jobs by deploying them into a variety of cluster managers
- Build dynamic workflows for scientific computing
- Leverage open source libraries to extract patterns from time series
- Master probabilistic models for sequential data

In Detail
Scala is especially good for analyzing large sets of data, as the scale of the task doesn't have any significant impact on performance. Scala's powerful functional libraries can interact with databases and build scalable frameworks, resulting in the creation of robust data pipelines. The first module introduces you to Scala libraries to ingest, store, manipulate, process, and visualize data. Using real-world examples, you will learn how to design scalable architecture to process and model data, starting from simple concurrency constructs and progressing to actor systems and Apache Spark. After this, you will also learn how to build interactive visualizations with web frameworks.

Once you have become familiar with all the tasks involved in data science, you will explore data analytics with Scala in the second module. You'll see how Scala can be used to make sense of data through easy-to-follow recipes. You will learn about Bokeh bindings for exploratory data analysis and quintessential machine learning algorithms with the Spark ML library. You'll get a sufficient understanding of Spark Streaming, machine learning for streaming data, and Spark GraphX.

Armed with a firm understanding of data analysis, you will be ready to explore the most cutting-edge aspect of data science: machine learning. The final module teaches you the A to Z of machine learning with Scala. You'll explore Scala for dependency injections and implicits, which are used to write machine learning algorithms. You'll also explore machine learning topics such as clustering, dimensionality reduction, Naive Bayes, regression models, SVMs, neural networks, and more.

This learning path combines some of the best that Packt has to offer into one complete, curated package. It includes content from the following Packt products:
- Scala for Data Science, Pascal Bugnion
- Scala Data Analysis Cookbook, Arun Manivannan
- Scala for Machine Learning, Patrick R. Nicolas

Style and approach
A complete package with all the information necessary to start building useful data engineering and data science solutions straight away. It contains a diverse set of recipes that cover the full spectrum of interesting data analysis tasks and will help you revolutionize your data analysis skills using Scala.

Serverless Computing in der AWS Cloud

Author: Niko Köbler
Publisher: entwickler.Press
ISBN: 9783868027808
Size: 14.92 MB
Format: PDF, Mobi

Serverless is the new trending topic in cloud computing. Serverless does not mean that servers are no longer needed. Rather, the idea is to concentrate on running your code at the level of individual functions and to leave the management of server instances, availability, and scalability to the cloud. In addition, with serverless you pay only for what you actually use, not for mere provisioning or availability. Use resources, not servers or systems. Infrastructure is implicitly present. The book introduces the concepts of serverless computing using the AWS (Amazon Web Services) cloud as an example and describes when serverless is a sensible approach. AWS Lambda coined the term serverless, but serverless is more than just running functions as a service (Function-as-a-Service, or FaaS for short). Serverless thinking also extends to API gateways, data stores, Amazon DynamoDB, and other components. Practical examples help readers get started in the serverless world.