New multi layer datawarehouse

We are currently discussing a new structure for our data warehouse. The current situation is that we have roughly 2 layers. Staging and the layer that contains partially denormalized tables, the T or end tables. I always say that the T stands for Terminus, Latin for “Endpoint” but I now believe that this means “limit.” …My latin is no longer what it used to be … 😉 We have had a Dim fact model here and there but never implemented it consistently. We use these T tables for ad hoc questions and for building reports. Of course it is true that one T table can feed the other. This is currently difficult for newcomers to follow. The current setup is around 15 years old. And is fine but is not transparent enough for the business and external accountability.

In detail, a ‘T table’ is therefore a commonly used combination of staging tables joined together, for example, we have a T_production daily that contains all production. Wide in number of columns and with many rows. Nowadays the servers have no more problems with this. YES!! 1 table to rule them all and in darkness bind them… (Ban of the Ring) That’s the idea.

Current levels datawarehouse (very simple version)=>

Proposed multilayer model=>

Staging

Staging level seems obvious to me, this contains a copy of the production databases. There is a desire to do this incrementally and / or in real time. I leave this aside for a moment.

Business layer

We have a discussion about this. The proposal is very similar to what we are doing now in the T tables layer, only redesigned. This layer is seen as a preparation for the dim/fact layer that is more intended for the end user.

People from our department will be able to directly work on the business layer because our experience shows that for SQL people, dimfacts are not really pleasant schema’s to work with. Nevertheless, we wonder whether we should do this, since this is not mentioned in the standard holy bibles….Kimball and others…

Dim / fact layer (self service)

The source system of production is very complicated. I also worked with competing systems and they were a lot easier. It strikes me that people with years of background in this system when they start modeling dimensionally, they tend to follow that complexity. What then arises are fact tables with 15 dimensions. My proposal is to reduce this as much as possible. No less than 5 dimensions per facttable 😉

Unfortunately, a system to be designed and accepted by the project members depends on a compromise between the members. I do not yet know which way it is going. The tendency is now to avoid disagreements so after repeating opinions several times they are not repeated again. But if the goal is to build something better, everyone should talk and listen to each other. Welcome to the human factor … I have experienced a project several times that would rebuild the data warehouse and they all fail by human factors. But not yet!!!