In the world of BI/DW, we sometimes have to assemble an end-to-end architecture starting from one of two directions. We can begin by identifying the performance metrics, or some of the existing legacy artifacts (reports, screenshots, etc.), that we want to collect and monitor first. Alternatively, we can compile an inventory of the data available to us and then see what metrics we could derive from it. (Of course, everything must be based on specific requirements from the clients, for those who swear by a solid SDLC framework!) In reality, we very often have to swing between both options to arrive at an optimal solution.
This post explores an interpretation of a technique that may have been long forgotten, but that can be very valuable when you need to design suitable data structures in the data warehouse based on report or screenshot artifacts from your clients.
The technique described here is the good old JSD developed by Michael Jackson in the 1970s (not the 'Bad' M Jackson of modern days!). We will not apply all three steps of the JSD framework. Instead, we use only the first, the Modeling Stage, to help us untangle the layout and contents of some reports, in order to define one or more data structures that will later live in the data warehouse.
The technique itself is simple: by reviewing the artifacts, you produce a series of diagrams describing the final desired structures, using only sequences, iterations, and selections. Below are the basic diagram notations available to you.
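For readers without the diagrams to hand, the three constructs can be sketched as a small tree structure. This is only an illustration in Python; the helper names (`seq`, `iter_`, `sel`) are my own shorthand, not official JSD notation:

```python
# A minimal sketch of the three JSD constructs as a Python tree.
# Helper names are illustrative, not part of Jackson's notation.

def seq(*parts):
    # Sequence: each part occurs exactly once, in order.
    return ("seq", list(parts))

def iter_(part):
    # Iteration: the part occurs zero or more times (the box marked * in JSD).
    return ("iter", part)

def sel(*options):
    # Selection: exactly one of the options occurs (the boxes marked o in JSD).
    return ("sel", list(options))

# Example: a report is a sequence of a title and an iteration of detail lines,
# where each line is a selection of either a sales row or a subtotal row.
report = seq("title", iter_(sel("sales_row", "subtotal_row")))
```

Any of the primitives drawn later in this post can be written down the same way, which is handy when you want to compare two primitives for structural similarity.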
Let’s start by applying the JSD technique to a sample report from the IBM Cognos 10.2 suite, as shown below.
- Translate the above report into four primitives as follows:
Primitive #1 represents the top three outputs.
The following alternative primitive is acceptable, but will be rejected because the report title indicates that the high-level view should start from a Financial Year.
Primitive #2 represents the Top 10 Sales Staff.
Primitive #3 represents the Revenue by Product Type.
Primitive #4 represents the Top 10 Sales Staff by Quantity Sold.
- We then align all of the above primitives by similar levels of data relationship.
- Then, finally, normalize them into the following two main data structures.
We are now ready to translate the above structures into appropriate fact and aggregate tables (i.e. big and small star schemas) in our data warehouse.
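As a rough sketch of where this leads, the two normalized structures might translate into one fact table plus dimensions (the big star) and one pre-aggregated table (the small star). All table and column names below are hypothetical, since the actual report fields appear only in the images; the sketch uses an in-memory SQLite database purely to show the shape:

```python
import sqlite3

# Hypothetical star schemas derived from the normalized structures.
# Table and column names are illustrative, not taken from the actual report.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_financial_year (year_key INTEGER PRIMARY KEY, year_label TEXT);
CREATE TABLE dim_sales_staff    (staff_key INTEGER PRIMARY KEY, staff_name TEXT);
CREATE TABLE dim_product_type   (product_type_key INTEGER PRIMARY KEY, type_name TEXT);

-- Big star: one grain covering revenue and quantity by year, staff, product type.
CREATE TABLE fact_sales (
    year_key         INTEGER REFERENCES dim_financial_year(year_key),
    staff_key        INTEGER REFERENCES dim_sales_staff(staff_key),
    product_type_key INTEGER REFERENCES dim_product_type(product_type_key),
    revenue          REAL,
    quantity_sold    INTEGER
);

-- Small star: pre-aggregated revenue by product type within a financial year,
-- serving the "Revenue by Product Type" primitive directly.
CREATE TABLE agg_revenue_by_product_type (
    year_key         INTEGER REFERENCES dim_financial_year(year_key),
    product_type_key INTEGER REFERENCES dim_product_type(product_type_key),
    revenue          REAL
);
""")
```

The "Top 10" outputs then become simple ORDER BY ... LIMIT 10 queries against the fact table, rather than structures of their own.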
The remaining effort is to determine whether (and how) we can seed the above tables with any data at all!
Sometimes, the old artifacts might need to be translated into new disguises in the new data warehouse. You would then need to apply the same modeling exercise to these new reports to obtain an overall picture of the final data structures before you can comfortably build your star schemas!
By the way, this technique is also good for XML modeling, just in case you need to stream out data for consumption by other applications.
The above structure would go with the following basic XML document.
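As an illustration of the idea, here is one way such a document might be emitted using Python's standard library. The element names and sample values are assumptions on my part, mirroring the JSD structure (a Financial Year sequence containing an iteration of sales staff), not the actual XML from the post:

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout mirroring the JSD structure:
# financialYear is the sequence root; salesStaff holds the iteration of persons.
root = ET.Element("financialYear", label="2012/13")
staff = ET.SubElement(root, "salesStaff")

person = ET.SubElement(staff, "person", name="A. Seller")
ET.SubElement(person, "revenue").text = "125000"
ET.SubElement(person, "quantitySold").text = "340"

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Note how directly the sequence/iteration/selection diagram maps onto element nesting, which is why the JSD modeling stage transfers so naturally to XML design.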