Publication of C Data Warehouse Code


17/11/2002

Today I am pleased to announce the publication of a suite of C code which has been used to load large dimensional data warehouses.

Today there are many ETL tools available in the marketplace. Surprisingly, there seem to be more of them now than there were a while ago. Each of the major vendors (Oracle, IBM, Microsoft, Business Objects, Cognos) has decided to add an ETL tool of some sort to its suite of products. There are also some independent ETL tool vendors out there (Informatica, Ascential and Sagent, to name a few). Most of these independent companies are trying to get out of the way of Microsoft's DTS.

We have also had a number of very good books produced on star schema database design, and there has been a great deal of discussion on forums like the dwlist. However, there is little public information as to how to actually go about writing the code to build a large star schema database.

The ETL tool vendors all have their own way of approaching the problem. Most of these vendors, and the consultants who work with the tools, would have you believe it is just a case of letting the tool generate an integer key for you and keeping that one integer key in the dimension table to translate the real key into an integer key. The main reason for this is that it is pretty easy to do. In fact, doing that kind of star schema is trivial. But it makes building summaries quite a bit harder. What keys do you use for the summary level? It turns out that it's pretty hard to maintain multiple levels of information and multiple sets of keys using the ETL tools.

Today I am taking one small step to help reduce the dearth of information as to how to build a star schema data warehouse. I am publishing a full suite of example code which will extract simple data like invoice orders from a staging area and place it into a star schema data warehouse. I have also published a Type 1 dimension table maintenance program for a simple customer record.
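For readers who have not met the terms, here is a minimal sketch of what Type 1 maintenance of a customer dimension involves. It is not the published code. It is an in-memory illustration only, and every name in it (`CustomerDim`, `CustomerTable`, `dim_lookup`, `dim_apply_type1`, the column names and sizes) is a hypothetical chosen for this example. A real load program would read from the staging area and write to database tables. The idea it shows is the general one: the dimension table translates the real key into an integer key, a new real key gets a new integer key, and a changed attribute simply overwrites the old value with no history kept.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_CUSTOMERS 100          /* illustrative capacity only */

/* One row of a hypothetical customer dimension. */
typedef struct {
    int  customer_key;             /* integer key generated by the load */
    char customer_code[16];        /* real key from the source system */
    char name[41];
    char city[31];
} CustomerDim;

typedef struct {
    CustomerDim rows[MAX_CUSTOMERS];
    int count;                     /* rows currently in use */
    int next_key;                  /* next integer key to hand out */
} CustomerTable;

typedef enum { DIM_UNCHANGED, DIM_UPDATED, DIM_INSERTED, DIM_FULL } DimResult;

void dim_init(CustomerTable *t)
{
    assert(t != NULL);
    t->count = 0;
    t->next_key = 1;               /* 0 is reserved to mean "not found" */
}

/* Translate a real key into its integer key; returns 0 if not found. */
int dim_lookup(const CustomerTable *t, const char *code)
{
    assert(t != NULL && code != NULL);
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->rows[i].customer_code, code) == 0)
            return t->rows[i].customer_key;
    }
    return 0;
}

/* Type 1 maintenance: insert unknown customers, overwrite changed ones. */
DimResult dim_apply_type1(CustomerTable *t, const char *code,
                          const char *name, const char *city)
{
    assert(t != NULL && code != NULL && name != NULL && city != NULL);

    for (int i = 0; i < t->count; i++) {
        CustomerDim *r = &t->rows[i];
        if (strcmp(r->customer_code, code) != 0)
            continue;
        if (strcmp(r->name, name) == 0 && strcmp(r->city, city) == 0)
            return DIM_UNCHANGED;
        /* Overwrite in place: the integer key stays the same. */
        snprintf(r->name, sizeof r->name, "%s", name);
        snprintf(r->city, sizeof r->city, "%s", city);
        return DIM_UPDATED;
    }

    if (t->count == MAX_CUSTOMERS)
        return DIM_FULL;

    /* New real key: allocate the next integer key and add a row. */
    CustomerDim *r = &t->rows[t->count++];
    r->customer_key = t->next_key++;
    snprintf(r->customer_code, sizeof r->customer_code, "%s", code);
    snprintf(r->name, sizeof r->name, "%s", name);
    snprintf(r->city, sizeof r->city, "%s", city);
    return DIM_INSERTED;
}
```

In this sketch a fact load would call `dim_lookup` with the customer code on each invoice row and store the returned integer key on the fact record. The linear search is only for brevity. With a large dimension you would want an index or a sorted in-memory array instead.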
You can actually download this code, compile it and run it. Though this suite of code manages only a very simple star, the same code has been implemented (in COBOL) on many large star schema databases. The code is clean, efficient and demonstrates clearly all that is necessary to manage multi-level fact tables.

I've decided to publish in the most standard Windows/Unix programming language around, C, so that the largest number of people can read the code. All those of my vintage are welcome to read the COBOL!!

I expect that a large number of people out there will be interested in looking at this code. The first group of people most likely to be interested are all those who have bought Ralph Kimball's books. This is because the code I have published is the standardised code you need to load a typical star schema database as Ralph has been discussing for some time now…

Press Release
From the desk of Peter Nolan

I can still recall the frustration our team experienced back in 1994 when we tried to write this code for the first time. It was very hard for us to develop because we just didn't know how to make it work. Now you can have it, for free!!

You can read more details and download the code from my Downloads page.

All the best. I sincerely hope that by publishing this code, people new to star schema data warehousing can learn more, faster, and that we can produce more successful data warehouse projects.

Best Regards
Peter Nolan

