From time to time we receive an interesting question from people interested in using the Datavault Builder: Does the Datavault Builder DWH automation tool supports 3NF? The interesting answer is: yes.
Though we have actively decided against a 3NF core because we believe that splitting the data into individual functions such as keys (hubs), relationships (links), and context/attributes (satellites) solves many problems with 3NF DWHs.
This includes the creation of models, which can now be created from different starting points and can easily grow together, as well as easier maintenance in case of model changes and additionally also the independence of individual loading routes. And these are only three of many advantages.
So, we usually ask what the business requirement is to use 3NF. Most of the answers can be summarized that the data consumers were used to a 3NF core and are not sure if they can also handle queries from a Data Vault model.
Use Interfaces
This is the moment to emphasize clearly: nobody should query the core directly. In my opinion, this was already not the case for Inmon models and applies even more explicitly to Data Vault cores.
Clearly, interfaces must be created for the queries. These bring the data into a format that is optimized for queries and reports.
Very often these interfaces are created as a dimensional model since many modern reporting tools are optimized for this. However, since we have a higher level of normalization in the Data Vault, we can also provide interfaces at a finer level like 3NF.
How to create a 3NF Layer
If you have captured the expected cardinalities in the data modeling of the data vault (one-to-many, many-to-many, etc.) you even have all the information to deterministically translate a data vault model into a 3NF interface:
• The hub business key or hash is assigned to the PK
• All many-to-one or one-to-one connected hub business keys/hashes become foreign keys
• All fields from the satellites become attributes
Whether you create this as an As-of-Now or As-of-Then view doesn’t matter. You can simply go through the list of hubs and create the 3NF view per hub without any user input. If you now add a satellite in the Data Vault, you simply have to regenerate the view for that hub. Easy to create, easy to adjust.
Since we offer APIs for all functions in the Datavault Builder, it is easy to use them to create interface views fully automatically according to your own rules (e.g. do the foreign keys have an FK_ prefix, or are they named exactly the same as the target field or use the BK or hash as PK). In theory, the data consumers will never know that their data is stored in Data Vault format.
Data Vault and 3NF?
So our answer is: yes we support 3NF to present the data, but always use a Data Vault to store it. This could well be the message to convince also data consumers who want the assurance that they can continue to work with existing approaches.
And if that doesn’t work you could even hide it from their eyes how you organize your backroom. After all, when we go shopping, we don’t visit the mall’s warehouse but are happy about the well-stocked shelves.
7 november (online seminar op 1 middag)Praktische tutorial met Alec Sharp Alec Sharp illustreert de vele manieren waarop conceptmodellen (conceptuele datamodellen) procesverandering en business analyse ondersteunen. En hij behandelt wat elke data-pr...
11 t/m 13 november 2024Praktische driedaagse workshop met internationaal gerenommeerde trainer Lawrence Corr over het modelleren Datawarehouse / BI systemen op basis van dimensioneel modelleren. De workshop wordt ondersteund met vele oefeningen en pr...
18 t/m 20 november 2024Praktische workshop met internationaal gerenommeerde spreker Alec Sharp over het modelleren met Entity-Relationship vanuit business perspectief. De workshop wordt ondersteund met praktijkvoorbeelden en duidelijke, herbruikbare ...
26 en 27 november 2024 Organisaties hebben behoefte aan data science, selfservice BI, embedded BI, edge analytics en klantgedreven BI. Vaak is het dan ook tijd voor een nieuwe, toekomstbestendige data-architectuur. Dit tweedaagse seminar geeft antwoo...
De DAMA DMBoK2 beschrijft 11 disciplines van Data Management, waarbij Data Governance centraal staat. De Certified Data Management Professional (CDMP) certificatie biedt een traject voor het inleidende niveau (Associate) tot en met hogere niveaus van...
3 april 2025 (halve dag)Praktische workshop met Alec Sharp [Halve dag] Deze workshop door Alec Sharp introduceert conceptmodellering vanuit een non-technisch perspectief. Alec geeft tips en richtlijnen voor de analist, en verkent datamodellering op c...
10, 11 en 14 april 2025Praktische driedaagse workshop met internationaal gerenommeerde spreker Alec Sharp over herkennen, beschrijven en ontwerpen van business processen. De workshop wordt ondersteund met praktijkvoorbeelden en duidelijke, herbruikba...
15 april 2025 Praktische workshop Datavisualisatie - Dashboards en Data Storytelling. Hoe gaat u van data naar inzicht? En hoe gaat u om met grote hoeveelheden data, de noodzaak van storytelling en data science? Lex Pierik behandelt de stromingen in ...
Deel dit bericht