Wednesday, January 6, 2010

Top Tips For Designing a Data Warehouse

By Kimberlie Hutson

Data warehousing is the process whereby your business collates its data to help you make better business decisions and improve your data analysis capability. A lot of businesses hold their data in separate source systems, which makes it difficult to report across the enterprise. The data may also not be clean enough to support decision making, which is an issue you may need to address. You may have to look at the performance of reporting too, as we all need our data as quickly as possible.
Because of these factors, a lot of businesses struggle with data warehousing: the thought process behind building a data warehouse is different from that behind other types of system. When you get it right, though, it can have a massive positive impact on your business.

Although you may need a data warehouse to deploy business intelligence, modern business intelligence tools actually mask the complexities of the underlying data very well for end users. This means that you've got one primary source of information, and this is where you should begin with your business intelligence. Your IT department may object to running queries directly off their source systems, but modern technology will not struggle with this task, and because disk space is so cheap, you can replicate data very easily.

A data warehouse appliance is an all-in-one solution of hardware, database software and networking software. It's a very clever bit of kit that takes data from disk, filters it and presents it so that you can run queries against terabyte databases (which means it's more suited to larger databases and solutions). Even if you're using a data warehouse appliance, you still need to make sure it's designed, built and delivered correctly, and even after it's completed, your user requirements may still be changing. In the interest of planning for the future, it's best to optimise the design as far as you can, then use the appliance to give you the performance and scalability you may need at a later date. The appliance does, though, allow you to process your data much more efficiently.

It's not uncommon to struggle with designing a data warehouse. One of the most popular ways of implementing data warehouses now is the star schema (there's debate in the industry as to whether that's right or wrong). In a star schema, a central fact table holds the transactions and links to surrounding dimension tables that hold the descriptive detail. It's popular mainly because it's assumed that our transactions, our facts, are so vast that we need to make each record as narrow as possible.
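To make the star schema idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample rows (fact_sales, dim_customer, dim_product and so on) are invented for illustration and are not from any particular system. It simply shows a narrow fact table that holds only keys and measures, with the descriptive detail kept in the dimensions.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema: one fact table, two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_customer (
            customer_key INTEGER PRIMARY KEY, name TEXT, region TEXT);
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
        -- The fact table holds only keys and numeric measures,
        -- so each row stays as narrow as possible.
        CREATE TABLE fact_sales (
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            product_key  INTEGER REFERENCES dim_product(product_key),
            quantity     INTEGER,
            amount       REAL);
    """)
    # Hypothetical sample data, purely for illustration.
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                     [(1, "Acme Ltd", "North"), (2, "Bolt plc", "South")])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                     [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     [(1, 1, 10, 100.0), (1, 2, 1, 25.0), (2, 1, 4, 40.0)])
    return conn


def sales_by_region(conn):
    """Join the fact table to a dimension and total the sales per region."""
    return conn.execute("""
        SELECT c.region, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_customer AS c ON c.customer_key = f.customer_key
        GROUP BY c.region
        ORDER BY c.region
    """).fetchall()
```

Calling sales_by_region(build_star_schema()) joins the facts to the customer dimension and returns one total per region. A real warehouse would have far more dimensions and rows, but the shape of the query stays the same.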

Slowly changing dimensions are a mechanism by which we store history in the dimension data that we record. For example, when data changes in your operational systems, we normally overwrite it because we're not really interested in the previous version of that record. For historical reporting purposes, though, we want to know what that record was previously, so in the data warehouse we version those records. These versioned records are called slowly changing dimensions.
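As a rough sketch of what versioning a record can look like, the Python below keeps every version of a dimension row with a valid-from and valid-to date, an approach commonly referred to as a Type 2 slowly changing dimension. The function names (apply_scd2, as_of) and the field names are assumptions made up for this example, and a real warehouse would do this in the database or the ETL tool rather than in a Python list.

```python
from datetime import date


def apply_scd2(dimension, natural_key, new_attrs, effective):
    """Add a new version of a dimension row instead of overwriting it."""
    current = next(
        (row for row in dimension
         if row["natural_key"] == natural_key and row["valid_to"] is None),
        None,
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return dimension  # nothing has changed, so keep the current version
        current["valid_to"] = effective  # close off the old version
    dimension.append({
        "surrogate_key": len(dimension) + 1,  # warehouse-owned key per version
        "natural_key": natural_key,           # key from the operational system
        "attrs": dict(new_attrs),
        "valid_from": effective,
        "valid_to": None,                     # None marks the current version
    })
    return dimension


def as_of(dimension, natural_key, when):
    """Return the attributes that were in force for a record on a given date."""
    for row in dimension:
        if (row["natural_key"] == natural_key
                and row["valid_from"] <= when
                and (row["valid_to"] is None or when < row["valid_to"])):
            return row["attrs"]
    return None
```

For example, if a hypothetical customer "C1" is loaded with one city on 1 January 2009 and a different city on 1 January 2010, the dimension ends up with two rows, and as_of can report which city applied on any date in between.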

The ETL tool is there to move data from A to B and process it. Traditionally people have handcrafted this process, especially if they're comfortable with the source code, but most managers will prefer to have the processing centralised, and therefore more manageable, in a tool. You're going to want a data warehouse that can evolve and change with your business, so you'll need a tool set that allows you to adapt your system rapidly. The ETL tools of today (which are very GUI-based, centralised and multi-user) enable people to change data and document the processing as they build it, making them rapid development tools. As with any implementation of a new system, it's imperative that you get user buy-in very quickly. This may be easier if you involve your staff from the beginning of the project by asking them what they feel they need.
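For comparison, a handcrafted version of moving data from A to B might look something like the Python sketch below. The three steps (extract, transform, load), the field names and the cleaning rules are all assumptions chosen for illustration. A real ETL tool wraps the same steps in a GUI with scheduling, documentation and multi-user support.

```python
def extract(source_rows):
    """Read the raw rows from the source system (here, just a list)."""
    return list(source_rows)


def transform(rows):
    """Clean the rows: drop records with no name, tidy names, parse amounts."""
    cleaned = []
    for row in rows:
        name = row.get("name", "").strip()
        if not name:
            continue  # skip records that fail the basic quality check
        cleaned.append({
            "name": name.title(),
            "amount": round(float(row["amount"]), 2),
        })
    return cleaned


def load(rows, target):
    """Append the cleaned rows to the target and report how many were loaded."""
    target.extend(rows)
    return len(rows)


def run_etl(source_rows, target):
    """Run extract, transform and load in order."""
    return load(transform(extract(source_rows)), target)
```

Passing a list of dictionaries with "name" and "amount" fields to run_etl, along with an empty list as the target, loads only the rows that pass the quality check and returns the number loaded.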

At IT Performs we pride ourselves on our expertise in data warehousing. We started specialising in this field back in the mid-1990s, meaning our customers can avoid many of the pitfalls, reduce the risk and gain far more value from their investment and their data.
