Modern Data Warehouse Architecture

With BI trends shifting and the Big Data wave everywhere, many organizations have started initiatives to explore how Big Data fits into their existing setup. To leverage the data ecosystem to its fullest potential, it is necessary to think ahead and introduce new technology pieces in the right places. That way, in the long run, both business and IT will reap the benefits.

Here’s an interesting prediction by Gartner:

By 2020, information will be used to reinvent, digitalize or eliminate 80% of business processes and products from a decade earlier.

Imagine all the time, money and effort you’ll save on your existing data and infrastructure components if the Big Data implementation goes well. The architecture diagram below is a conceptual design of how you can leverage the computational power of the Hadoop ecosystem in your traditional BI / data warehousing processes, alongside real-time analytics and data science. They call it a data lake; the warehouse is old school now.

[Figure: Modern DW Architecture]

Alright, having a Hadoop ecosystem saves computation time and provides all the bells and whistles of real-time analytics, but how does it save money?

How do I begin with Hadoop?

“Tell me and I forget. Teach me and I remember. Involve me and I learn.”

- Benjamin Franklin

I’m a big fan of practical learning; “implement as you learn” is my mantra for learning anything. Because Hadoop is open source, it gives you the best opportunity to get your hands dirty as you read about it. There are plenty of free resources online to help you get started, and in this post I’m going to list some of the good ones I’ve come across.

Getting Started with Hadoop

Depending on how deeply you want to learn and explore Hadoop, you can enroll in any of the free online fundamentals courses offered by Big Data University or watch the video tutorials from edureka on YouTube. Neither source requires you to sign in with a corporate email ID, and both give a basic overview of what Hadoop is. And of course, the documentation provided by Apache helps in understanding it in detail; alternatively, you can read the Yahoo Hadoop tutorial.