OLAP / Hadoop / Pentaho

Data Warehouse planning and implementation

Accurate analysis of data is critical to the success of any business strategy, and recognising patterns within data is one of the most challenging tasks a business can face. Business data in any given industry is usually collected over a period of time using a variety of techniques. It may take the form of raw server logs, logs collected through the application layer, structured database records, unstructured data collected from social networking sites, or data already analysed in spreadsheets. Given the volume of data that businesses deal with today, both speed and accuracy are of paramount importance.

Modelling a data warehouse begins with defining the business needs and then using ETL techniques to lay the building blocks of the DW. Big data tools like Hadoop can be used to crunch the data and perform the analysis. Finally, creating OLAP cubes on top of the warehouse helps product owners make better decisions. I have gained experience of working with big data in the ad serving industry, using Pentaho for OLAP cubes and Hadoop with Python scripts to analyse data on EC2.
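As a rough illustration of the Hadoop-with-Python step, the sketch below counts ad events per campaign in the map/reduce style that Hadoop Streaming scripts typically follow. The log layout (tab-separated timestamp, campaign id and event type), the sample records and all function names are assumptions made for this example; they are not taken from a real project. It runs locally, with a sort standing in for Hadoop's shuffle phase.

```python
from itertools import groupby

# Hypothetical log layout: timestamp <TAB> campaign_id <TAB> event_type
SAMPLE_LOG = [
    "2012-01-01T10:00:00\tcampaign_a\timpression",
    "2012-01-01T10:00:01\tcampaign_a\timpression",
    "2012-01-01T10:00:02\tcampaign_a\tclick",
    "2012-01-01T10:00:03\tcampaign_b\timpression",
    "this line is malformed and should be skipped",
]


def mapper(lines):
    """Emit ((campaign_id, event_type), 1) for every well-formed log line."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 3:
            continue  # ignore malformed records instead of failing the job
        _timestamp, campaign_id, event_type = fields
        yield (campaign_id, event_type), 1


def reducer(pairs):
    """Sum the counts for each key; the input must already be sorted by key."""
    for key, group in groupby(pairs, key=lambda pair: pair[0]):
        yield key, sum(count for _key, count in group)


def run_local(lines):
    """Simulate map -> shuffle/sort -> reduce in a single process."""
    mapped = sorted(mapper(lines), key=lambda pair: pair[0])
    return dict(reducer(mapped))


if __name__ == "__main__":
    for (campaign_id, event_type), total in sorted(run_local(SAMPLE_LOG).items()):
        print(f"{campaign_id}\t{event_type}\t{total}")
```

On a real cluster the mapper and reducer would normally be separate scripts that read from standard input and write to standard output, and the aggregated totals would then be loaded into the warehouse as part of the ETL.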

Data Warehouse Projects

  • Unanimis Consulting Limited - Designing OLAP cubes (fact and dimension tables) presented in Pentaho, and writing custom Python scripts to analyse data on a Hadoop cluster that feeds the DW, for the online ad serving industry
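
To show what fact and dimension tables look like in practice, here is a minimal sketch of a star schema for ad delivery, built in an in-memory SQLite database, with the kind of roll-up query an OLAP cube answers. The table names, columns and figures are all hypothetical, and SQLite simply stands in for whatever database sits behind the warehouse; this is not the schema used in the project above.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and two dimension tables, then load sample rows."""
    conn.executescript(
        """
        CREATE TABLE dim_campaign (
            campaign_key  INTEGER PRIMARY KEY,
            advertiser    TEXT,
            campaign_name TEXT
        );
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            day      TEXT,
            month    TEXT
        );
        CREATE TABLE fact_ad_delivery (
            date_key     INTEGER REFERENCES dim_date(date_key),
            campaign_key INTEGER REFERENCES dim_campaign(campaign_key),
            impressions  INTEGER,
            clicks       INTEGER
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_campaign VALUES (?, ?, ?)",
        [(1, "Acme", "Spring sale"), (2, "Acme", "Summer sale"), (3, "Globex", "Launch")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [
            (20120101, "2012-01-01", "2012-01"),
            (20120102, "2012-01-02", "2012-01"),
            (20120201, "2012-02-01", "2012-02"),
        ],
    )
    conn.executemany(
        "INSERT INTO fact_ad_delivery VALUES (?, ?, ?, ?)",
        [
            (20120101, 1, 1000, 10),
            (20120102, 2, 500, 5),
            (20120201, 1, 800, 8),
            (20120101, 3, 300, 3),
        ],
    )


def delivery_by_advertiser_and_month(conn):
    """Roll the fact table up along the campaign and date dimensions."""
    query = """
        SELECT c.advertiser, d.month, SUM(f.impressions), SUM(f.clicks)
        FROM fact_ad_delivery AS f
        JOIN dim_campaign AS c ON c.campaign_key = f.campaign_key
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY c.advertiser, d.month
        ORDER BY c.advertiser, d.month
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    for row in delivery_by_advertiser_and_month(connection):
        print(row)
```

The fact table holds only keys and additive measures, so the same rows can be rolled up by advertiser, campaign, day or month without changing the stored data.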