AlarmMeRSI version 17 ‘Happy New Year’ in production

Yesterday I published version 17 ‘Happy New Year’ on Google Play. In this version I added two more alarm types:

-One-time alarm: a single alarm on a given day.

-Random alarm: an alarm at a random moment within the chosen time interval.
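To illustrate the idea behind the random alarm, here is a minimal sketch in Python. It is not the code of the app (which runs on Android); the function name `random_alarm_time` and its parameters are made up for this example, and it assumes the alarm moment is drawn uniformly, in whole seconds, from the chosen interval.

```python
import random
from datetime import datetime, timedelta
from typing import Optional


def random_alarm_time(start: datetime, end: datetime,
                      rng: Optional[random.Random] = None) -> datetime:
    """Return a random moment between start and end, both inclusive.

    Hypothetical helper: the real app may pick its moment differently.
    """
    if end < start:
        raise ValueError("end must not be before start")
    rng = rng or random.Random()
    # Work in whole seconds so the result is easy to show to a user.
    span_seconds = int((end - start).total_seconds())
    return start + timedelta(seconds=rng.randint(0, span_seconds))


if __name__ == "__main__":
    # Example: one alarm somewhere during office hours.
    begin = datetime(2020, 1, 1, 9, 0)
    finish = datetime(2020, 1, 1, 17, 0)
    print(random_alarm_time(begin, finish))
```

Passing in a seeded `random.Random` makes the choice repeatable, which is handy when testing.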

In the old version the alarm was lost when the phone was restarted. Now an active alarm is loaded again after a restart. The only way to get rid of the alarm is to tap the garbage can button in the app.

Happy New Year!

New multi layer datawarehouse

We are currently discussing a new structure for our data warehouse. At the moment we have roughly 2 layers: staging, and a layer of partially denormalized tables, the T or end tables. I always say that the T stands for Terminus, Latin for “endpoint”, but I now believe that it means “limit.” …My Latin is no longer what it used to be … 😉 We have had a dim/fact model here and there but never implemented it consistently. We use these T tables for ad hoc questions and for building reports. One T table can of course feed another, which is currently difficult for newcomers to follow. The current setup is around 15 years old. It works fine, but it is not transparent enough for the business and for external accountability.

In detail, a ‘T table’ is a commonly used combination of staging tables joined together. For example, we have a T_production daily that contains all production: wide in number of columns and with many rows. Servers nowadays have no trouble with this. YES!! 1 table to rule them all and in the darkness bind them… (The Lord of the Rings) That’s the idea.
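As a rough illustration of what building such a T table looks like, here is a minimal sketch using Python's built-in `sqlite3` module. Everything in it is an assumption made for the example: the staging tables `stg_production` and `stg_client`, their columns, the sample rows and the table name `T_production_daily` are invented, and our real tables are far wider.

```python
import sqlite3


def load_sample_staging(conn: sqlite3.Connection) -> None:
    """Create two tiny, made-up staging tables to join."""
    conn.executescript("""
        CREATE TABLE stg_production (
            production_id   INTEGER PRIMARY KEY,
            production_date TEXT,
            client_id       INTEGER,
            quantity        INTEGER
        );
        CREATE TABLE stg_client (
            client_id   INTEGER PRIMARY KEY,
            client_name TEXT,
            region      TEXT
        );
        INSERT INTO stg_production VALUES
            (1, '2019-01-02', 10, 5),
            (2, '2019-01-02', 10, 3),
            (3, '2019-01-03', 99, 7);
        INSERT INTO stg_client VALUES (10, 'Alpha', 'North');
    """)


def build_t_production_daily(conn: sqlite3.Connection) -> None:
    """Rebuild the wide T table by joining the staging tables."""
    conn.executescript("""
        DROP TABLE IF EXISTS T_production_daily;
        CREATE TABLE T_production_daily AS
        SELECT p.production_id, p.production_date, p.quantity,
               p.client_id, c.client_name, c.region
        FROM stg_production AS p
        -- LEFT JOIN keeps production rows whose client is unknown.
        LEFT JOIN stg_client AS c ON c.client_id = p.client_id;
    """)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_sample_staging(connection)
    build_t_production_daily(connection)
    for row in connection.execute("SELECT * FROM T_production_daily"):
        print(row)
```

The left join is a deliberate choice in this sketch: a production row without a matching client still ends up in the T table, with empty client columns, so the row count matches the staging table.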

Current levels datawarehouse (very simple version)=>

Proposed multilayer model=>

Staging

The staging level seems obvious to me: it contains a copy of the production databases. There is a desire to load it incrementally and/or in real time. I leave that aside for the moment.

Business layer

We have a discussion about this. The proposal is very similar to what we are doing now in the T tables layer, only redesigned. This layer is seen as a preparation for the dim/fact layer that is more intended for the end user.

People from our department will be able to work directly on the business layer, because our experience shows that for SQL people dim/fact schemas are not really pleasant to work with. Nevertheless, we wonder whether we should do this, since it is not mentioned in the standard holy bibles… Kimball and others…

Dim / fact layer (self service)

The source system of production is very complicated. I have also worked with competing systems and they were a lot easier. It strikes me that people with years of background in this system tend to follow that complexity when they start modeling dimensionally. What then arises are fact tables with 15 dimensions. My proposal is to reduce this as much as possible. No fewer than 5 dimensions per fact table 😉

Unfortunately, a system that has to be designed and accepted by the project members depends on a compromise between those members. I do not yet know which way it is going. The tendency now is to avoid disagreements, so after an opinion has been voiced several times it is not repeated again. But if the goal is to build something better, everyone should talk and listen to each other. Welcome to the human factor … Several times I have seen projects that set out to rebuild the data warehouse, and they all failed because of human factors. But not yet!!!

Staged rollout AlarmMeRSI

In my normal work as a BI analyst I have experience maintaining databases with a user base of 10 000 people. Disruptions are deadly for users’ confidence. When things go wrong and the trust of a large user population is shaken, you constantly receive questions and comments that can take up a large part of your working day. Of course I prefer not to have that at all in my private time. I do Android in my private time… and other things…

At work, reaching a large group of users and learning the wishes of the business is easy, at least physically. It is clear who is approachable, and you know where the management is and where your users are. With Google Play, that is of course impossible for a beginner to know. You know nothing at all outside your own circle.

Translated to the Android platform, I assume that a negative experience weighs even more heavily. After all, a user does not have to use your app, so negative experiences count more than positive ones. For the time being I opt for a staged rollout in the countries with a Dutch-speaking community, plus Denmark because some old friends live there. To leave room for handling issues and to see whether there is any interest at all, I posted on Facebook and LinkedIn that my app was launched in recent days. At work I also made this known, for now only to a relatively small circle of close colleagues.

Android App in play!

Last week Archimedes launched his first app in Google Play! AlarmMeRSI is an app which we use to prevent RSI from happening (repetitive strain injury, for more information go here.)

It is a periodic alarm after which you can, for instance, stand up from behind your desk and do your exercises (or get yourself some coffee).

This is the link to the app in the Google Play store: Dutch version or the English version

Datawarehouse containerization

We were asked to deliver something with regard to the data architecture and modeling to be used in the new data warehouse of my employer. The organization recently merged with another one, and that partner's systems and processes are still separate from the main data warehouse.

In the new environment, the following must be taken into account:

-the different cultures of the groups of staff that come from each of the merger partners.

-the organization that is spread over several locations and several departments.

-the existing landscape that must continue to deliver, the show must go on.

-the technology and the environment require a flexible set-up that can last for several years.

With regard to the cultures:

-The largest merger partner consists of a self-managing team of experienced employees with a more ad hoc way of working. Speed of work is central. Demand driven.

-The smallest merger partner consists of 2 teams that are organized according to the principles of a demand/supply organization, so they rely heavily on externally hired staff. They also provide the new management.

To address all these factors, we propose the following:

– Leave the current data warehouse intact, “the show must go on.”

– Add the data of the smallest merger partner as quickly as possible so that a single version of the truth is created.

-Organize the total data landscape in containers, each with its own staff and its own freedom to organize its work processes without affecting other containers. It is also easier to work with a DTAP pipeline (development, test, acceptance, production; OTAP in Dutch) if there are no interfaces between those containers, or at least only clearly defined ones. Experience shows that DTAP test processes that touch everything can hardly be carried out anymore. So we want to reduce the complexity. The linking pin between these containers should then be formed by the Meta data repository & job control center.

The containers that we see before us are:

–Self service BI / Dashboards based on star models, perhaps to a large extent simply copied from the models that already exist. Use a DTAP (OTAP) pipeline for this Self Service BI container as standard.

–Data Science section. This is easily forgotten but must be named. It seems to me that this should also be set up with a development part and a production part. If nothing goes to production, that must be made clear as well. In principle this concerns science and long-term studies without guarantee of results.

–Applications. There is currently a development towards real-time information provision, because the supplier of the production system no longer supports any regular reports and there are major problems with the validation of registration. Operational information and ad hoc data earn money for the organization, so this is important. For real time only flat tables matter; star modelling should not be used. Timeliness and lead time are important in this container.

–Meta data repository & job control center

An automated repository that maintains meta data, performance and usage of all components of the data warehouse for auditing and maintenance and prioritizing datawarehouse processes.
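To make this a little more concrete, here is a small Python sketch of what such a repository could record. It is only an illustration under my own assumptions: the class and method names are invented, and ranking components by how often they are queried (and then by average run time) is just one possible way to prioritize data warehouse processes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ComponentStats:
    """Meta data for one table, job or report (hypothetical structure)."""
    name: str
    container: str
    run_seconds: List[float] = field(default_factory=list)
    query_count: int = 0

    @property
    def average_run_seconds(self) -> float:
        if not self.run_seconds:
            return 0.0
        return sum(self.run_seconds) / len(self.run_seconds)


class MetaRepository:
    """Keeps performance and usage per component, across containers."""

    def __init__(self) -> None:
        self._components: Dict[str, ComponentStats] = {}

    def register(self, name: str, container: str) -> None:
        self._components[name] = ComponentStats(name, container)

    def record_run(self, name: str, seconds: float) -> None:
        self._components[name].run_seconds.append(seconds)

    def record_usage(self, name: str, queries: int = 1) -> None:
        self._components[name].query_count += queries

    def priorities(self) -> List[str]:
        # Assumed rule: most used first, then slowest first, then by name.
        ranked = sorted(
            self._components.values(),
            key=lambda c: (-c.query_count, -c.average_run_seconds, c.name),
        )
        return [c.name for c in ranked]

    def unused(self) -> List[str]:
        # Components nobody queries are candidates for clean-up.
        return sorted(
            c.name for c in self._components.values() if c.query_count == 0
        )


if __name__ == "__main__":
    repo = MetaRepository()
    repo.register("T_production_daily", "Applications")
    repo.register("old_report", "Applications")
    repo.record_run("T_production_daily", 120.0)
    repo.record_usage("T_production_daily", 5)
    print(repo.priorities(), repo.unused())
```

In a real setup these figures would of course come from the database and scheduler logs instead of manual calls.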

In principle, the boundaries between the containers are not fixed. The core of the split is that between an “old” unmanaged part and a “new” to be built managed container landscape. Depending on the success and degree of exploitation of these containers, shrinkage or growth, including the allocated FTE, the boundary between the containers can be moved.

So what is a container? 😉

Inspiration for the setup in containers comes from our practical experience within a large, complex data warehouse environment and from the developments within cloud architecture (see, for example, docker.com). Basically, this means putting a certain group of data warehouse activities into 1 environment, including all dependencies, so that people and resources do not get in each other’s way.

Step 1 of applying “containerization” is to stake out the boundaries (put picket posts) within the existing (largest) data warehouse environment.