Posts

Showing posts from August, 2017

Introduction to Apache Spark

What is Apache Spark?

Reasons behind the invention of Apache Spark:
• Exploding data
• Data manipulation speed

Several shortcomings of Hadoop are:
• Adherence to its MapReduce programming model
• Limited programming language API options
• Not a good fit for iterative algorithms such as machine learning algorithms
• Pipelining of tasks is not easy

What is Spark

Apache Spark is an open source data processing framework for performing big data analytics on a distributed computing cluster.

Spark Features

Spark has several advantages over other big data and MapReduce technologies such as Hadoop and Storm. Spark is faster than MapReduce and offers low latency because it performs fewer disk input and output operations. Spark can compute in memory, which makes data processing much faster than MapReduce. Unlike Hadoop, Spark maintains the intermediate results in memory rather than writing every intermediate outpu...
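The difference between writing every intermediate result to disk and keeping it in memory can be illustrated with a small sketch. This is plain Python, not Spark's or Hadoop's API. The function names (`run_with_disk`, `run_in_memory`) and the three-stage pipeline are hypothetical, chosen only to show the idea under the assumption that each stage's output feeds the next stage.

```python
import json
import os
import tempfile


def square(values):
    """Stage 1: square every value."""
    return [v * v for v in values]


def add_one(values):
    """Stage 2: add one to every value."""
    return [v + 1 for v in values]


# Hypothetical three-stage pipeline; the last stage reduces the list to a single number.
STAGES = [square, add_one, sum]


def run_with_disk(data, stages, workdir):
    """Persist each intermediate result to disk and read it back before the next stage."""
    disk_writes = 0
    current = data
    for i, stage in enumerate(stages):
        result = stage(current)
        path = os.path.join(workdir, f"stage_{i}.json")
        with open(path, "w") as f:
            json.dump(result, f)  # intermediate output goes to disk
        disk_writes += 1
        with open(path) as f:
            current = json.load(f)  # next stage starts by reading it back
    return current, disk_writes


def run_in_memory(data, stages):
    """Pass each intermediate result straight to the next stage without touching disk."""
    current = data
    for stage in stages:
        current = stage(current)
    return current, 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(run_with_disk([1, 2, 3], STAGES, workdir))
    print(run_in_memory([1, 2, 3], STAGES))
```

Both functions compute the same answer. The only difference is the number of disk round trips, which is the cost that grows with every extra stage or iteration in a disk-based pipeline.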

Cluster Computing and why is it used in Big Data

Introduction

Big data is a term for data sets that are so large or complex that traditional data processing application software is inadequate to deal with them. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating and information privacy. We will talk about big data on a fundamental level and also take a high-level look at some of the processes and technologies currently being used.

What Is Big Data?

An exact definition of "big data" is difficult to nail down because different people use it quite differently. Generally speaking, big data is:
• large datasets
• the category of computing strategies and technologies that are used to handle large datasets

In this context, "large dataset" means a dataset too large to reasonably process or store with traditional tooling or on a single computer. This means that the common scale of big datasets is constantly shifting and may vary s...