The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition) Review by Michael Tozer
Joining the Debate
The title for this review of Ralph Kimball's book was chosen with a set purpose in mind. Within the past several days, I have finished reading both Kimball's data warehousing tome, reviewed here, and that of Bill Inmon. Ostensibly, a debate rages within the corporate data warehousing community between the disciples of Kimball's and Inmon's competing approaches. For this reason, it was interesting and enlightening to read both books in quick succession. It is also important to note that my assessment of the debate is influenced by over twenty-five years' worth of experience in the disciplines of logical data modeling and relational database design. And this is the reason for my choice of review title. I employ a term from the relational world to draw attention to the fact that, when we talk about databases today, we must still reasonably do so from a relational perspective. And it is this important perspective that is so evidently absent from Kimball's approach.
Kimball's concept is founded on the notion of a "dimensional model" for database. Quite interestingly, Kimball pleads ignorance as to the actual origins of this dimensional approach. With this, I can be of assistance. In the early days of the Decision Support Software industry, there was a product known as Express. I believe the vendor was Management Decision Sciences, Inc., or something like that. This product competed, at one level, with IFPS, the Interactive Financial Planning System, which was sort of like fancy Fortran, and at another level with the then emerging world of relational database software. I still remember meetings from back in the early '80s when proponents of Express would argue passionately that data ought to be organized in "cubes", the forerunner, Ralph, of dimensions. Now, when you pinned down the technical folks advocating such an approach, they would finally admit that what they were talking about was really nothing more than a fancy array processor. That's what it was. And that is the essence of this whole "dimensional model" concept.
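To make the "fancy array processor" point concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the products, regions, months, and sales figures are invented, the function names are hypothetical, and nothing here is taken from Express or from Kimball's book. It shows a "cube" as a plain three-dimensional array addressed by position, alongside the same facts flattened into relational-style rows.

```python
# Hypothetical sample data: dimension members and sales figures are invented.
products = ["widget", "gadget"]
regions = ["east", "west"]
months = ["jan", "feb", "mar"]

# The "cube": a plain three-dimensional array, sales[product][region][month].
cube = [
    [[10, 12, 9], [7, 8, 11]],  # widget: east, west
    [[4, 6, 5], [3, 2, 8]],     # gadget: east, west
]


def cube_lookup(product, region, month):
    """Read one cell by translating member names into array positions."""
    i = products.index(product)
    j = regions.index(region)
    k = months.index(month)
    return cube[i][j][k]


def cube_to_rows():
    """Flatten the array into relational-style rows, one row per cell."""
    return [
        (p, r, m, cube[i][j][k])
        for i, p in enumerate(products)
        for j, r in enumerate(regions)
        for k, m in enumerate(months)
    ]


def rows_total(rows, region):
    """Sum sales for one region from the row form (a filter plus a sum)."""
    return sum(amount for _, r, _, amount in rows if r == region)
```

Calling `cube_lookup("gadget", "west", "mar")` reads a single array cell, while `rows_total(cube_to_rows(), "east")` answers a question from the row form. The information is the same in both layouts; only the addressing scheme differs.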
It is interesting to compare and contrast the approaches taken by Inmon and Kimball in their respective books on Data Warehousing. Inmon acknowledges that a debate exists. He also respectfully cites Kimball's contributions to the debate within the body of his text. Kimball is silent on the identity of his rival. And this silence really speaks volumes. He, Kimball, that is, is also strangely silent even on the efficacy of a relational design for any warehouse data structure, finally conceding that you may permit such a thing in a "staging area". But you mustn't let your users know about it. This is the strangest sort of censorship of important corporate data I've ever encountered. Consider the following: Suppose we work for an organization with, say, seven million customers. Should we not, in this instance, have a relational database table somewhere that has seven million rows, one row representing each customer? And should not this table be readily available to our user community? These questions are intended to be rhetorical. However, on reading Kimball's book, we judge that he, and his followers, would strongly resist such a common-sense line of reasoning.
Kimball's book is noteworthy insofar as he does present many interesting, and potentially useful, designs. However, his mute avoidance of the essence of the ongoing debate says all we really need to know about his outreach. Were the good Dr. Codd, inventor of the Relational Model for Database, alive today, it seems clear that he would give Ralph Kimball a good scolding, and direct him to stick to end user analysis, leaving actual issues of database design to more fully arrived professionals.