Bigdata

Big Data Platform

Here I compare the best-known vendors that offer a Hadoop platform for the enterprise. Below is a typical vision of a big data analytics architecture.


Data Analysis

A friend of mine asked me what data analysis is. This is a simple but difficult question. It is simple because we talk about data analysis all the time and everywhere. It is difficult because there are so many ways of explaining it at different times. In the era of big data, I think data analysis has the following three areas. Flatten Analysis: Analysis is performed on a static data set from a single-dimensional view.


Data Lake Stages

Edd recently published a very impressive blog post about how the Hadoop ecosystem influences the data lake in the enterprise. It discusses the following four stages of an enterprise's data evolution toward the dream of a data lake. I also share some thoughts of my own as additions. Stage 1 - Life Before Hadoop. In this stage, the enterprise data architecture has the following characteristics: applications stand alone with their databases; some applications contribute data to a data warehouse; analysts run reporting and analytics in the data warehouse. What's more:


Moving to the Spark

It has been a while since this blog was last updated, because 2014 was a really busy year. Now that I have almost completed my first book, I think it is the right time to start a new journey in big data for real-time processing. The big data ecosystem has changed greatly over the past two years, and the speed of big data processing has become a hot topic over the past year. When Hadoop entered the era of YARN, it became more like a distributed computing infrastructure.
