



Data analytics with Hadoop / Benjamin Bengfort
Title: Data analytics with Hadoop : an introduction for data scientists
Material Type: printed text
Authors: Benjamin Bengfort, Author
Publisher: Beijing : O'Reilly
Publication Date: 2016
Pagination: xvi, 268 p.
Size: 24 cm
ISBN (or other code): 978-1-491-91370-3
General note: Includes bibliographical references and index
Languages: English (eng)
Original Language: English (eng)
Descriptors: Apache Hadoop; Big data; Data mining; Electronic data processing - Distributed processing; File organization (Computer science)
Class number: 006.312
Abstract: Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of deployment, operations, or software development usually associated with distributed computing, you’ll focus on particular analyses you can build, the data warehousing techniques that Hadoop provides, and higher order data workflows this framework can produce. Data scientists and analysts will learn how to perform a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You’ll also learn about the analytical processes and data systems available to build and empower data products that can handle, and actually require, huge amounts of data.
- Understand core concepts behind Hadoop and cluster computing
- Use design patterns and parallel analytical algorithms to create distributed data analysis jobs
- Learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase
- Use Sqoop and Apache Flume to ingest data from relational databases
- Program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames
- Perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark’s MLlib
Contents note: The age of the data product; An operating system for big data; A framework for Python and Hadoop streaming; In-memory computing with Spark; Distributed analysis and patterns; Data mining and warehousing; Data ingestion; Analytics with higher-level APIs; Machine learning; Summary : doing distributed data science
Record link: https://library.seeu.edu.mk/index.php?lvl=notice_display&id=17989
Copies
Barcode      Call number           Media type          Location                     Section  Status
1702-002338  006.312 Ben-Dat 2016  General Collection  Library "Max van der Stoel"  English  Available
1702-002339  006.312 Ben-Dat 2016  General Collection  SEEU Library Skopje          English  Available

Hadoop / Tom White
Title: Hadoop : the definitive guide
Material Type: printed text
Authors: Tom White, Author
Edition statement: 3rd edition
Publisher: Beijing : O'Reilly
Publication Date: 2012
Pagination: xxiii, 657 p.
Layout: ill.
Size: 24 cm
ISBN (or other code): 978-1-449-31152-0
General note: Includes index
Languages: English (eng)
Original Language: English (eng)
Descriptors: Apache Hadoop; File organization (Computer science)
Class number: 005.74
Abstract: "Hadoop: The Definitive Guide" provides a comprehensive and detailed guide to the Hadoop ecosystem. The first three chapters provide an overview and history of the Hadoop project and introduce its two primary components: HDFS and MapReduce. The following chapters build on that introduction and cover the architecture of HDFS and MapReduce in depth, including availability, compression, file systems, building MapReduce applications, jobs, and tasks. The author covers these core components in a progressive and digestible format. In addition to description, technical details, and recommendations, each chapter provides working examples that build chapter upon chapter to illustrate each concept. The examples are well presented and easy to understand, and they consistently reuse the data and use cases from previous chapters, so the reader does not need to absorb a new use case for each example, which would distract from the example’s message. Two chapters are devoted to configuring and operating a Hadoop cluster; they are augmented by three appendices that cover installation and preparation for the example code. Working through the example setup in combination with these chapters should prepare the reader for their own Hadoop implementation. Pig, HBase, and ZooKeeper are addressed in later chapters but not in the same depth as HDFS and MapReduce. Each of these additional tools deserves its own reference, and many are available. For readers who have covered the previous chapters, the author’s introduction to these three tools and the accompanying examples will be enough to get started.
Contents note: Meet Hadoop; MapReduce; The Hadoop distributed filesystem; Hadoop I/O; Developing a MapReduce application; How MapReduce works; MapReduce types and formats; MapReduce features; Setting up a Hadoop cluster; Administering Hadoop; Pig; Hive; HBase; ZooKeeper; Sqoop; Case studies; Installing Apache Hadoop; Cloudera's distribution including Apache Hadoop; Preparing the NCDC weather data
Record link: https://library.seeu.edu.mk/index.php?lvl=notice_display&id=17951
Copies
Barcode      Call number          Media type          Location             Section  Status
1702-002265  005.74 Whi-Had 2012  General Collection  SEEU Library Skopje  English  Available

Hadoop in practice / Alex Holmes
Title: Hadoop in practice
Material Type: printed text
Authors: Alex Holmes, Author
Publisher: Shelter Island, N.Y. : Manning
Publication Date: 2012
Pagination: xxiii, 511 p.
Layout: ill.
Size: 24 cm
ISBN (or other code): 978-93-511-9742-3
General note: Includes index (p. 475-487)
Languages: English (eng)
Original Language: English (eng)
Descriptors: Apache Hadoop; Electronic data processing - Distributed processing; File organization (Computer science)
Class number: 005.74
Abstract: "Hadoop in Practice" collects 85 Hadoop examples and presents them in a problem/solution format. Each technique addresses a specific task you'll face, like querying big data using Pig or writing a log file loader. You'll explore each problem step by step, learning both how to build and deploy that specific solution and the thinking that went into its design. As you work through the tasks, you'll find yourself growing more comfortable with Hadoop and at home in the world of big data.
Contents note: Preface; Acknowledgments; About this book; About the cover illustration; Background and fundamentals; Data logistics; Big data patterns; Beyond MapReduce; Summary; Appendix; Index
Record link: https://library.seeu.edu.mk/index.php?lvl=notice_display&id=17953
Copies
Barcode      Call number          Media type          Location             Section  Status
1702-002266  005.74 Hol-Had 2012  General Collection  SEEU Library Skopje  English  Available