Tuesday, March 3, 2009

It all starts with a good data model...

Yesterday I was asked to assist in the development of a custom display that was proving difficult. By all accounts, the display should have been simple to build. It was basically a two tier report with the following structure:
  • 1. Top Level Entity
    • 1.1 Child Entity
    • 1.2 Child Entity
  • 2. Top Level Entity
    • 2.1 Child Entity
    • 2.2 Child Entity
When I saw the desired format, I imagined a data model that represented a Set of Sets. The first set being the set of top level entities. Each of those entities, would have an attribute that is a set of the Child Entities. A model like this lends itself very well to representing a 2 tier tree and it's easy to imagine how additional tiers would be added.

Instead, what I found was a model where an entity type held a attribute for the top level set and a different set attribute for each possible value of the top level set. Bear with me as I try toclear up the confusion I most likely have created. Suppose that we have a set of food groups (Vegetable, Fruit, Meat, Grain). for each food group, we also want to record examples of foods
  • Vegetable
    • Carrots
    • Peas
  • Fruit
    • Apple
    • Banana
    • Orange
  • Meat
    • Chicken
    • Beef
  • Grain
    • Wheat
    • Rice
There are really two acceptable ways to model this.

Set of Sets
MyEntityType
  • foodGroups (Set of FoodGroup)
FoodGroup
  • name (String)
  • foods (Set of Food)

Single Set with Reference To grouping type
MyEntityType
  • FoodItems (Set of Food)
Food
  • name (String)
  • foodGroup (entity of FoodGroup)

Both approaches allow for a direct navigation of the the data model. In the real life example, I saw a model that resembles this:

MyEntityType
  • foodGroups (set of FoodGroup)
  • meats (Set of Meat)
  • vegetables (Set of Vegetable)
  • grains (Set of Grain)
  • fruits (Set of Fruit)

To make matters even more challenging, the types Meat, Vegetable, Grain, and Fruit were similar in structure but not exactly the same as one another. While I can see how this might make sense in certain circumstances, it is a difficult model to work with because there is no uniformity. Another concern is that there are attributes and types that model the possible data values of FoodGroup. What happens if another food group is added? The unfortunate truth is that in order to add another entity of FoodGroup, the model itself would need to be updated.

Even if it was known that no additional food groups were ever going to be added to the list, using this model to build the display requires convoluted logic.

All of this is meant to reinforce the point that data models matter. A good data model makes everything easier. Views, Custom Searches, custom logic are all easier to implement and maintain which reduces the overall cost of ownership. Putting time into a solid data model is a wise investment.

I hope I haven't confused anyone with this post. Pictures would certainly help illustrate my point and one day I might find the time to include them. Also, if you see any similarities in your own data model, it's purely coincidental ;-)  

Cheers!

No comments:

Post a Comment