Tuesday, May 03, 2005

data warehousing

http://www-1.ibm.com/grid/pdf/GW510-5041-00F.pdf

Data warehousing: I thought it would be useful to find an analogy for the ETL process. ETL stands for Extract (take data from various, potentially heterogeneous sources), Transform (modify the structure of the data so that it meets the integrity requirements of the data warehouse (DW) system) and Load (write the data into the DW as records). To understand the whole process better, think of John, a book retailer. John's business depends to a large extent on people's ability to find the right books in his shop. To make this easier, John has to perform three basic functions.

First, he has to procure the books from various sources: different publishers, each with its own invoicing, billing and delivery mechanisms, and perhaps libraries for used books. This is the Extract function.

Second, he has to catalog the books so that his shop has a proper information structure and he knows where to find what. This is the Transform function.

Third, he has to shelve the books systematically, so that they are not lying around in a mess and people can search or browse for them. This is the Load function.
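
To make the analogy a bit more concrete, here is a minimal Python sketch of John's three functions. Everything in it is an assumption made up for illustration: the two sample sources (a publisher's CSV export and a library's list), their field names, the `books` table and the function names. An in-memory SQLite database stands in for the data warehouse. It is only meant to show the shape of an ETL run, not how a real DW tool does it.

```python
import csv
import io
import sqlite3

# Hypothetical source 1: a publisher's CSV export (prices in currency units).
PUBLISHER_CSV = "Title,Author,Price\nDune,Frank Herbert,9.99\nEmma,Jane Austen,7.50\n"

# Hypothetical source 2: a library's used-book list with different field names
# and prices already in cents.
LIBRARY_ROWS = [
    {"name": "  ulysses ", "writer": "James Joyce", "cost_cents": 450},
    {"name": "", "writer": "Unknown", "cost_cents": 100},  # no title: rejected later
]


def extract():
    """Extract: pull raw records from each source, tagged with where they came from."""
    for row in csv.DictReader(io.StringIO(PUBLISHER_CSV)):
        yield "publisher", row
    for row in LIBRARY_ROWS:
        yield "library", row


def transform(source, row):
    """Transform: map a raw record to one common (title, author, price_cents) shape.

    Returns None when the record fails the integrity check (empty title).
    """
    if source == "publisher":
        title, author = row["Title"], row["Author"]
        price_cents = round(float(row["Price"]) * 100)
    else:
        title, author, price_cents = row["name"], row["writer"], row["cost_cents"]
    title = title.strip().title()
    if not title:
        return None
    return (title, author.strip(), price_cents)


def load(conn, records):
    """Load: write the cleaned records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "title TEXT NOT NULL, author TEXT NOT NULL, price_cents INTEGER NOT NULL)"
    )
    conn.executemany("INSERT INTO books VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn):
    """Run all three steps and return how many records were loaded."""
    transformed = (transform(source, row) for source, row in extract())
    cleaned = [record for record in transformed if record is not None]
    load(conn, cleaned)
    return len(cleaned)


if __name__ == "__main__":
    warehouse = sqlite3.connect(":memory:")
    print(run_etl(warehouse), "books loaded")
    for book in warehouse.execute("SELECT * FROM books ORDER BY title"):
        print(book)
```

In John's terms, `extract` is the procurement from publishers and libraries, `transform` is the cataloging (every book gets the same kind of catalog card, and a book without a title is turned away), and `load` is putting the books on the shelves where customers can query for them.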