Friday, June 5, 2009

What’s in a name?

Or more accurately what’s in an ID?

ID formats can vary widely from one system to another. In many of the legacy systems I’ve seen, these IDs do a whole lot more than uniquely identify the Protocol or Grant: they also contain embedded data. Now, I’m a bit of a purist when it comes to the role of an ID in any system. My preference is that IDs do their one job and do it well: uniquely identify something. Clean, clear, and to the point. If there is other information that should be part of the protocol or grant application, it’s easy enough to define additional properties to hold it. Each property can be clear in purpose and provide the flexibility to be presented, searched, and sorted however the application demands.

I’ve seen many proposed ID formats that embed other information, such as the year the protocol was created, the version number, and yes, even the protocol status. All of these are better off being distinct data elements and not part of an ID. I can offer some practical reasons why I feel this way:

  1. An ID must remain constant throughout the life of the object it is identifying
    The purpose of an ID is to provide the user a way to uniquely refer to an object. We come to depend upon this value, and things would get confusing if we were never sure whether an ID we used before would still work. If additional data is embedded into an ID, there is a real risk that the ID will have to change because the embedded value changes. If this happens, all trust in the correctness of the ID is lost.
  2. Don’t force the user to decode an ID in order to learn more about an object
    It’s easier to read separate fields than it is to force the user to decode a sometimes cryptic abbreviation of data. My preference would be to store each field individually and, wherever it makes sense to do so, display the additional fields alongside the ID. Keeping the fields separate also allows for the ability to search, sort, and display the information however you wish.
  3. All required data may not be known at the time the ID is set
    If some of the additional information embedded in the ID is not known until the user provides it, there is a good chance that it isn’t known when the ID needs to be generated. This can happen quite easily, because the ID is set at the time the object is created. Addressing this issue can get tricky depending upon the timing of creation so it’s best to avoid the problem by not embedding data.
  4. When using a common ID format, devoid of additional data, the ID generation implementation can be highly optimized
    This is the case within Click Commerce Extranet, and altering the default implementation can have an adverse performance impact. We know this to be true because our own implementation has evolved over time. That evolution was driven in part by the attempt to strike a balance between an easy-to-generate unique ID, such as a GUID, and one that is human-readable and easier to remember, but also by the need to avoid system-level contention when multiple IDs must be generated at the same time.

The ID format we use in the Extranet product is the result of attempting to strike a balance between performance, uniqueness, and human readability. Also, since IDs are unique within a type, we introduced the notion of a Type-specific ID Prefix.
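To make the trade-off concrete, here is a rough sketch in Python of the two ends of that spectrum. This is not the Extranet implementation; the prefix table and in-memory counters are assumptions for illustration only.

import itertools
import uuid
from collections import defaultdict

# Hypothetical per-type prefixes -- the real product defines its own.
TYPE_PREFIXES = {"Protocol": "PRO", "Grant": "GRN"}

# One counter per type, standing in for whatever persistent,
# contention-safe sequence a production system would use.
_counters = defaultdict(lambda: itertools.count(1))

def next_readable_id(type_name: str) -> str:
    """Short, human-readable, unique within a type -- but it depends on a
    shared counter, which is where contention can creep in."""
    prefix = TYPE_PREFIXES.get(type_name, type_name[:3].upper())
    return f"{prefix}{next(_counters[type_name]):08d}"

def next_guid_id() -> str:
    """Trivially unique and contention-free, but hard for a person to
    read, remember, or repeat over the phone."""
    return str(uuid.uuid4())

print(next_readable_id("Protocol"))   # e.g. PRO00000001
print(next_guid_id())                 # e.g. 3f2c9b1e-8d4a-...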

Often the biggest challenges in adopting a new ID convention aren’t technical at all; they’re human. When transitioning from one system to another (Click Extranet isn’t really different in this respect), there are a lot of changes thrust upon the user, and change is difficult for many. Users may have become accustomed to mentally decoding the ID to learn more about the object but, in my experience, avoiding the need to do that by keeping the values separate ultimately makes for an easier system to use and comprehend.

Cheers!

Thursday, May 14, 2009

Modeling a multiple choice question

Sometimes the fact that a problem can be solved in many ways isn’t a good thing, as it forces you to weigh your options when only one approach encapsulates current best practices. One such case is how to model a multiple choice question. You have a need to allow the user to select from a list of items. Your example is “safety equipment”, but it could really be anything. In your question, you pose two possible approaches and ask which is the preferred implementation:

  1. Separate Boolean fields for each possible choice, or
  2. A Selection CDT and an attribute that is a Set of that CDT

While you could make both approaches work, there are significant advantages to using a Selection CDT that is populated with data representing all possible choices rather than separate Boolean fields.

I’ll use an example to demonstrate. Suppose you want to have the researcher specify which types of safety equipment will be used, and the types to choose from are gloves, lab coats, and eyeglasses (I’ll intentionally keep the list short for this example, but it would obviously be longer in real life).*

In option 1, you would define the following Boolean properties:

  • gloves
  • labCoat
  • eyeglasses

You could either define them directly on the Project, or, more likely, create a Data Entry CDT called “ProtectiveEquipment” and define the attributes there. Then you would define an attribute on Protocol named “protectiveEquipment”, which is an entity reference to the ProtectiveEquipment CDT. Once the data model is defined, you can add the fields for each property to your view.

It’s pretty straightforward, but not the path I would recommend. The reason I say this is that by modeling in this way, you would embed the list of choices directly into the data model, which means that if the choice list changes, you would have to update the data model and anything dependent upon it. Embedding data into the data model itself should really be avoided if at all possible.

The same functional requirements could be met by option 2. With this approach, you would define a Selection CDT named “ProtectiveEquipment” and add custom attributes to it that may look something like this:

  • name (String) – this will hold the descriptive name of the specific piece of safety equipment. Recall that you automatically get an attribute named “ID”, which could be used to hold a short unique name (or just keep the system-assigned ID)
  • You could add other attributes if there was more information you wanted to associate with the type of equipment, such as cost, weight, etc.

Then, on the Data Tab in the Custom Data type editor for the ProtectiveEquipment CDT, you can add all the possible types of equipment. There will be one entity for each type meaning, for this example, one each for gloves, lab coat, and eyeglasses.

The last step in completing the data model would be to then add an attribute to your protocol named “protectiveEquipment”. This attribute would be a Set of ProtectiveEquipment so it can hold multiple selections.
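To make the contrast concrete, here is a rough Python sketch of the two shapes. The class and attribute names stand in for the CDT definitions; this is illustrative only, not actual product syntax.

from dataclasses import dataclass, field

# Option 1: the choices are baked into the type. Adding "face shield"
# later means changing this class, the views that show it, and any
# code that reads these fields.
@dataclass
class ProtectiveEquipmentFlags:
    gloves: bool = False
    labCoat: bool = False
    eyeglasses: bool = False

# Option 2: the choices are rows of data. Adding "face shield" later
# just means adding one more ProtectiveEquipment entity.
@dataclass(frozen=True)
class ProtectiveEquipment:            # the Selection CDT
    id: str
    name: str

EQUIPMENT_CHOICES = [                 # entered on the CDT's Data Tab
    ProtectiveEquipment("GLOVES", "Gloves"),
    ProtectiveEquipment("LAB_COAT", "Lab coat"),
    ProtectiveEquipment("EYEGLASSES", "Eyeglasses"),
]

@dataclass
class Protocol:
    # "protectiveEquipment" is a Set of ProtectiveEquipment entities.
    protectiveEquipment: set = field(default_factory=set)

study = Protocol()
study.protectiveEquipment.add(EQUIPMENT_CHOICES[0])   # the researcher picks gloves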

Next you can add the protectiveEquipment attribute to your Project view. In doing this, you have a couple of options for how the control is presented to the user. You can specify you want it to display as a Check Box list, in which case the user would see a list of checkbox items, one per ProtectiveEquipment entity, or you could use a chooser, in which case the user would be presented with an “Add” button and they could select the equipment from a popup chooser. If the number of choices is small (less than 10, for example) the checkbox approach works well. If the number of different protectiveEquipment types can get large, you’re better off going the way of the chooser. Both visual representations accomplish the same thing in the end but the user experience differs.

So why is option 2 better?

  1. The list of choices can be altered without having to modify the type definition. You have the option of versioning the list of entities in source control so that they are delivered as part of a patch to your production system, or only versioning the type definition and allowing the list to be administered directly on production.
  2. Views will not have to change as the list of choices changes.
  3. Code does not have to change as the list of choices changes. Your code avoids the need to reference properties that (in my opinion) are really data rather than “schema”; instead, it simply references the custom attributes.
  4. You have the ability to manage additional information for each choice, which you may or may not need now, but which is easy to add because you’re starting with the right data model.

(Note: These reasons are also why I would recommend against defining properties of type “list”)

I hope this explanation is clear. If not, please let me know.

Let’s say, for the sake of argument, that you had the additional requirement that the user must not only say they will use eyeglasses but also have to specify how many they will use. This is easily accomplished but changes the recommended approach a bit. To support that requirement, you would set up the following data model.

ProtectiveEquipment (Selection CDT)

  • Name (string)

ProtectiveEquipmentUsedOnProtocol (Data Entry CDT)

  • EquipmentType (Entity of ProtectiveEquipment)
  • Count (integer)

Protocol

  • protectiveEquipmentUsed (set of ProtectiveEquipmentUsedOnProtocol)
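Sketched the same way as before (illustrative Python only, with a list standing in for the Set):

from dataclasses import dataclass, field

@dataclass(frozen=True)
class ProtectiveEquipment:                  # Selection CDT
    name: str

@dataclass
class ProtectiveEquipmentUsedOnProtocol:    # Data Entry CDT
    equipment_type: ProtectiveEquipment     # entity reference to the selection
    count: int

@dataclass
class Protocol:
    protectiveEquipmentUsed: list = field(default_factory=list)

# The researcher records that 12 pairs of eyeglasses will be used.
study = Protocol()
study.protectiveEquipmentUsed.append(
    ProtectiveEquipmentUsedOnProtocol(ProtectiveEquipment("Eyeglasses"), count=12)
)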

You would then add the Protocol attribute protectiveEquipmentUsed to your view. When rendered the user will be presented with a control that includes an Add button. When the user clicks that button, a popup window will be displayed that prompts the user for the equipment type (which can either be a drop down list, radio button list, or chooser field) and count. You can define a view for the ProtectiveEquipmentUsedOnProtocol to make the form look nice since the default system generated form is kind of bland.

I hope this helps. Let me know if you’d like me to clarify anything.

Cheers!

* Thanks to The Ohio State University for the example

CCC 2009 Day 2 - Lessons learned and the Road Ahead

After getting only 5 hours of sleep in 3 days, I couldn’t summon the energy to post a CCC Day 2 update sooner. Now the conference is over and I’m jetting back to Portland. Since I have a few hours to kill, it’s time for me to post a CCC recap.

The day’s agenda was basically split into two parts:

  1. A series of presentations on lessons learned from implementing everything from IRB to Clinical Trials.
  2. Presentations from Click on the road ahead for Click Products.

I attended the IACUC and Clinical Trials lessons learned and found them both very interesting. Personally, I’m still coming up to speed on Clinical Trials, so this was another opportunity for me to wade around in unfamiliar terminology, but, just as with wading into a pool of cold water, I’m getting used to it and the shock to my system is diminishing. There were presentations from both Utah and the Research Institute at Nationwide Children’s Hospital on Clinical Trials, and it was a good opportunity to learn from others’ first-hand experiences. My biggest takeaway was that this solution, more than any other, involves so many different groups that reaching a consensus on how a certain feature should work is very difficult. When planning a CTPT implementation, the cost of politics and “design by committee” should not be underestimated. I was pleased to hear that both institutions have worked through most of those challenges.

The session on IACUC was presented by MCW and was very good as well. I had planned to attend DJ’s SF424 update so that I could heckle but I’m glad I stayed to hear about IACUC. I’ll just have to give DJ a hard time back in Portland.

The rest of the day was DJ time. He presented an update on Extranet 5.6 and followed with a discussion of future development efforts. I was personally gratified to see that many of the 5.6 enhancements drew cheers from the CCC attendees. For me, that kind of positive feedback makes the hard work the Click Engineering team put in even more worthwhile.

The conference wrapped up in typical fashion with an open call for suggestions for future improvements. All the familiar faces chimed in with suggestions, some old and some new, and I was glad to see some new contributors. It’s the open dialog from as many CCC members as possible that continues to drive the product forward in the right direction.

A chance for some final conversations as the crowd thinned out and then CCC 2009 came to a close. I left feeling good about the work we’ve done (knowing that there’s always more to do) and impressed with what all of you have accomplished. My number 1 thought for making next year’s conference even better is for Click to deliver Solution-level presentations that demonstrate new enhancements, development trends, best practices, and future roadmap discussions. While the Solutions aren’t general platform products like Extranet, the exchange of ideas about how to make them better would be very valuable to Click and, I assume, to everyone in the CCC community, as it’s an opportunity for the collective group to share ideas.

I truly enjoyed meeting everyone once again and learning from your experiences. Thank you! This is something I missed in the 2 years I was away and I’m looking forward to doing it all again. I wonder where it will be next time…

Cheers!

Tuesday, May 12, 2009

CCC – Day 1 recap

It’s now 1:22 AM and I’m bringing day 1 of CCC 2009 to a close. It was a good day of sessions dominated by what I decided were two common themes: Reporting and Integration. Now, to be fair, these topics were on the agenda, but based upon how many times the topics came up in both the presentations and the all important small group conversations, these are clearly problems looking for a solution.

The day kicked off (after the keynote address) with a session on reporting. Martin led off with a discussion of PDF generation. Martin’s presentation highlighted some challenges in generating PDF documents when the end user has full control over the Word document format. Microsoft Word has issues in converting to PDF in certain cases. The case Martin demonstrated was when the document had text wrapping around an embedded image. In the generated PDF, the text bled into the image, making it difficult to read. He made a point to say that this was a Microsoft issue rather than an issue with the Click software, but to the end user it really doesn’t matter where the problem lies. What should concern the end user is that the problem exists at all. A workaround for the occasional problem in converting a Word document to PDF is to print the document to a PDF driver in order to generate the PDF. This approach leverages Adobe’s print driver for PDF generation rather than Microsoft document conversion and achieves consistently better formatting in these special cases. The downside is that the end user (typically an IACUC or IRB administrator) must then upload the generated PDF. A small price to pay for a properly formatted PDF document, but annoying nonetheless.

I followed with a review of the different ways to define and develop reports. I won’t bore you with the details here as the entire presentation will be posted to the ClickCommerce.com website for this CCC meeting. The point I wanted to stress in my presentation is that the notion of reporting carries with it a broad definition. Reports are anything that provides information to the reader. It includes informational displays that either summarize information or provide specific detail. It can be presented in a wide variety of formats. There really are no restrictions. The goal of a “report” is to provide answers to the questions held by the targeted users and it’s important to first understand what those questions are. Reports can be delivered in a variety of ways, from online-displays, to ad-hoc query results, to formatted documents that adhere to a pre-specified structure. A report is not one thing – it’s many.

David M. followed up on this same topic during his afternoon session, providing more detail on the type of reporting Michigan has implemented to track operational efficiencies. Karen from Ocshner also contributed with how they report their metrics and cited some very impressive efficiency numbers. Given their rate of growth over the last few years, their ability to maintain the level of responsiveness in their IRB is something that will continue to impress me long after this conference is over.

My session on integration approaches combined with Johnny Brown’s report on their progress toward a multi-application-server model highlighted the challenges in managing a distributed enterprise. There clearly is a need to establish best practices around this topic. The questions raised during both sessions were excellent and provided me with several topics to cover in future posts.

Unfortunately I missed the last two sessions as I had to attend to other matters, but from what I heard, the session on usability improvements presented by Jenny and Caleb was a big hit. Even DJ saw things in that presentation that got him thinking about ways to enhance the base Extranet product. Some of that was presented at C3DF and I agree, the judicious use of DHTML and Ajax really goes a long way to improving the overall user experience.

All the sessions were good but I especially enjoyed the small group discussions before and during dinner. I had the pleasure of dining with representatives from The Ohio State University, University of British Columbia, and Nationwide Children’s. If any of you are reading this, thanks for the great conversation and I’m looking forward to the “pink tutu” pictures.

The day was capped off by a survey of local nightlife. Thanks to our UMich hosts for being our guide and keeping us out late. I now have some new stories to tell.

Tomorrow’s almost here – time to get some sleep.

Cheers!

Social Networking – The old fashioned way

It’s day 1 at the annual CCC conference and it's great to see so many of our customers all in one place. Sitting in the ballroom at the University of Michigan on a beautiful sunny day, I’m struck by the notion that there’s nothing better than meeting face to face. With all the hype about “Social Networking”, where all contact is virtual, it’s refreshing to see that the old ways still seem to work best. Don’t get me wrong. I’m a fan of email, web conferences, chat and the like, but not to the exclusion of a real face-to-face exchange. I’m looking forward to seeing some great presentations, but I’m even more excited about what I hope will be many hallway conversations, where I get to learn about what you’re up to. I’ll try to post as often as I can throughout the conference, but for now let me just say thanks to those of you who journeyed to participate in this old social custom.

Monday, April 20, 2009

The SDLC, Part 3 – Common pitfalls when applying configuration updates

You’ve followed the recipe to the letter…only to discover you’re out of propane for the barbecue.

You brush and floss just like you’re supposed to…but the dentist tells you that you have a cavity anyway.

You’ve conducted your workflow development with as much rigor and care as humanly possible…but your configuration update fails to apply.

Sometimes things just don’t go your way. This is true with all things and, when it happens, it often leaves you scratching your head. When it comes to failed Configuration Updates, it’s sometimes difficult to figure out what went wrong, but there are some common pitfalls that affect everyone eventually. I’ll discuss a few of the more common ones with the hope that you are one of the fortunate ones who can avoid pain by learning from the experiences of others.

Pitfall #1: Developing directly on Production

The whole premise of the SDLC is that development only takes place in the development environment, and nowhere else. While this sounds simple, it’s a policy that is frequently broken. Workflow configuration is a rich network of related objects. Every time you define a relationship through a property that is either an entity reference or a set, you extend the “object network”. In fact, your entire configuration is really an extension of the core object network provided by the Extranet product.

The Extranet platform is designed from the ground up to manage this network, but its world is scoped to a single instance of a store. The runtime is not, and in the case of the SDLC should not be, aware of other environments. This means that it can make no assumptions about objects that exist in the store to which the configuration update is applied. It must assume that the object network in development reflects the object network in staging and production. It’s trivially easy to violate this assumption by configuring workflow directly on production. If that’s done, the assumption that the state of production reflects the state of development at the time the current round of development began no longer holds, and the Configuration Update is likely to fail.

Errors in the Patch Log that indicate you may be a victim of this pitfall will often refer to an inability to find entities or establish references.

One common cause for such an error is when you add a user to the development store but there is no corresponding user with the same user ID on production. Some objects include a reference to their owner. In the case of Saved Searches, for example, the owner will be the developer that created the saved search. In order to successfully install the new saved search on the target store, that same user must also exist there.

Troubleshooting this type of problem is tedious and sometimes tricky because it’s often necessary to unravel a portion of the object network. It’s a good idea to do whatever you can to avoid the problem in the first place.
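One cheap safeguard is a pre-flight check that compares the user IDs on development against production before the update is built. The sketch below is purely illustrative: it assumes you have exported the user IDs from each store into plain text files (one ID per line); the file names are hypothetical, not a product feature.

# Hypothetical pre-flight check: flag accounts that exist on the
# development store but have no match on production.

def load_ids(path: str) -> set:
    """Read one user ID per line from an exported list."""
    with open(path) as f:
        return {line.strip().lower() for line in f if line.strip()}

dev_users = load_ids("dev_user_ids.txt")     # assumed export from development
prod_users = load_ids("prod_user_ids.txt")   # assumed export from production

missing = sorted(dev_users - prod_users)
if missing:
    print("User IDs on development with no match on production:")
    for user_id in missing:
        print(f"  {user_id}")
else:
    print("All development user IDs exist on production.")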

Bottom Line: Only implement your workflow on the development store and make sure that all developers have user accounts on development and production (TIP: You don’t need to make them Site Managers on production).

Pitfall #2: Not applying the update as a Site Manager

If your update fails to apply and you see a message that has this in the log entry:

Only the room owner(s) can edit this information.

you are probably not specifying the credentials for a site manager account when Applying the Update.

This can happen when a user is not provided in the Apply Update command via the Administration Manager, or when the provided user is not a Site Manager. The installation of new or updated Page Templates causes edit permission checks for each component on the page template, and unless the active user is a Site Manager, those checks will likely fail.

Bottom Line: Always specify a site manager user when applying a Configuration Update. Technically this isn’t always required, depending upon the contents of the Configuration Update, but it’s easy to do, so make a habit of doing it every time.

----

More pitfall avoidance tips next time….

Cheers!

Sunday, April 19, 2009

Limitations in my blogging approach, and what, if anything, to do about them

I’d like to take a minor time out from the SDLC discussion to solicit feedback on how to make this blog more useful. In my post back on March 2nd, I described how I’m actually hosting this blog at http://ResearchExtranet.blogspot.com and then exposing it on the Click Commerce site at Tom's Blog via an RSS Viewer component. This approach, while allowing me to use a wide array of authoring tools, does have some limitations for the reader. The two I have found the most inconvenient are:

  1. BlogSpot recently decided to include an invisible, pixel-sized image in every post so they can track readership. This seemingly innocuous change causes the browser to display a security warning, so the user sees a message that looks something like this: “This webpage contains content that will not be delivered using a secure HTTPS connection, which could compromise the security of the entire webpage.”

    This happens because http://research.clickcommerce.com is SSL secured for authenticated users and the source URL for the tracking image is not. Though the warning doesn’t translate into a problem actually seeing the blog post, it is annoying. The friendly people at BlogSpot have informed me that they are looking into providing better support for SSL enabled sites. I’m hopeful they will provide a solution so I’m inclined to wait this one out if you are willing to suffer the wait with me. Please let me know if this inconvenience is a major issue for you.
  2. There have been times when an image would have done a better job than mere words in making my point. To allow the image to be viewable no matter where you read my blog (ClickCommerce.com, BlogSpot, or your favorite blog reader), the image needs to be hosted somewhere that is available to all. I’ve avoided using images because of the mixed content warning that results when presenting an image from a site other than the site where the blog post is viewed. So I put it to you: do you view the blog from locations other than ClickCommerce.com? Would you be willing to see and dismiss the mixed content warning in order to get the benefit of embedded images? An alternative would be for me to post knowledgebase articles and use the blog posts to introduce them. It’s not quite as convenient as having it all in one place, but it would avoid the issue with the warning. Please send me your thoughts on how you’d like to see this blog move forward.

And now back to regularly scheduled programming….

Cheers!
- Tom

Saturday, April 18, 2009

The SDLC Part 2 – Process Studio and Source Control

 

Last time I introduced the notion of the recommended Software Development Lifecycle (SDLC). Now it’s time to get a bit more specific.

As mentioned last time, the best way to support a disciplined development process is to make use of three distinct environments: Development, Staging, and Production. Each environment can be made up of either a single server or multiple servers. While there is no requirement that each environment be like the others, it is recommended that your staging environment match production as closely as possible so that experience gained from testing your site on staging will reflect the experience your users will have on the production system. It’s also useful because this will best enable you to use your staging server(s) as a warm spare in case of catastrophic failure of the production site.
Further Reading….

FAQ: Everything You Wanted to Know about Source Control Integration But Were Afraid to Ask

HOWTO: Apply Large Configuration Updates

To go into everything you can do when configuring and implementing your workflow processes would take more time than I have here and there are several good articles and online reference guides available in the knowledgebase. We also offer both introductory and advanced training courses. Instead I’ll focus on how to manage the development process.

A key principle of the SDLC is that development only takes place in the development environment and not on staging or production. The work you do in the development environment gets moved to staging so it can be tested through a configuration update. A configuration update is a zip file that includes the full set of changes made during development that need to be tested then deployed to production. In order to accurately identify the changes that should be built into the configuration update, each individual change is versioned in a repository using Microsoft Visual Source Safe.

Making a change or enhancement to workflow configuration begins by checking out the elements from the configuration repository using a tool called Process Studio. Once the elements are checked out, development takes place using the web-based tools, Entity Manager, or Site Designer. Before a change is considered complete, it is tested locally on the development server. If everything works as expected, the changes are checked back into source control using Process Studio. This process repeats itself for all changes.

When the development of all intended fixes/enhancements is complete, it’s time to put them to the test. While developers are expected to test their changes in the development environment before they are checked into source control, official testing should never be done on development. The reason for this is that the development environment is not a good approximation of production. Developers, in the course of their work, make changes to data and the environment that make it hard to use the test results as a predictor of how the changes will behave on production. Instead, a configuration update is created using Process Studio so it can be applied to staging for official testing. Before applying the update to Staging, it’s a good idea to refresh the staging environment from the most recent production backup. This gives you the best chance of understanding how the changes will behave on production.

If issues are discovered during testing on staging, the process is reset.

  1. The issues are fixed in development (check-out, fix, check-in),
  2. A new configuration update is built,
  3. Staging is restored from a production backup,
  4. The configuration update is applied to staging
  5. The changes are tested

If all the tests pass, the exact same configuration update that was last applied to staging is then applied to production. Though not required, it’s a good idea at this point to refresh your development environment with a new backup of production. The closer the development environment is to the real thing, the fewer issues you’ll have going forward.
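For clarity, the whole promote cycle can be summarized as pseudo-orchestration. Every function in the sketch below is a placeholder for a manual step or your own automation; none of them are actual Process Studio or Extranet APIs, and the file and account names are made up.

def build_configuration_update() -> str:
    print("Process Studio: build the configuration update from source control")
    return "configuration_update.zip"           # hypothetical file name

def restore_from_production_backup(environment: str) -> None:
    print(f"Restore {environment} from the most recent production backup")

def apply_update(environment: str, update: str, site_manager: str) -> None:
    print(f"Apply {update} to {environment} as {site_manager}")

def tests_pass(environment: str) -> bool:
    print(f"Run the official test pass on {environment}")
    return True                                  # placeholder for your testers' verdict

def promote(site_manager_account: str) -> None:
    while True:
        update = build_configuration_update()
        restore_from_production_backup("staging")
        apply_update("staging", update, site_manager_account)
        if tests_pass("staging"):
            break
        # Tests failed: fix on development (check-out, fix, check-in), then repeat.
    apply_update("production", update, site_manager_account)
    restore_from_production_backup("development")   # optional, but recommended

promote("a_site_manager_account")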

At this point, development can begin on the next set of features, fixes, and enhancements. And the cycle repeats…

To learn more about the role of source control in your development lifecycle, please read the following article:

FAQ: Everything You Wanted to Know about Source Control Integration But Were Afraid to Ask

That article does an excellent job describing all core principles and processes. Of course, not everything goes as planned as you apply updates to your staging and production sites. Next time, I’ll discuss some common challenges and how to troubleshoot when issues do arise.

Cheers!

Sunday, April 5, 2009

The Software Development Lifecycle – Part 1

Well, it was inevitable. My goal of posting at least weekly to this blog is being threatened. It’s been over a week since my last post so it’s time to pick it back up again.

This week at Click was certainly a busy one and it made me realize that it’s time for a refresher on our recommended Software Development Life Cycle (SDLC). All software development follows a repeated cycle, sort of like the “Wet Hair, Lather, Rinse, Repeat” instructions on your shampoo bottle – simple, but effective. Generally speaking, software development follows  a simple cycle as well:

Define –> Design –> Implement –> Test –> Deploy –> Repeat

This is true no matter the technology or tools. Working with Click Commerce Extranet base solutions is no different. Putting the cycle into practice requires discipline, familiarity with the development tools, and an ability to troubleshoot problems when they arise. Over the next couple of posts, I’ll be describing the Click Research and Healthcare SDLC. Along the way, I’ll highlight common problems and how to address them. Hopefully this will lead us to a discussion on how best to handle the concurrent development of multiple solutions, which is the topic of a panel discussion I’ll be hosting at the upcoming C3 conference. So…let’s get started.

Three Environments
To effectively practice the SDLC, three environments are required:

  1. Development
    This is where all active development takes place. Developers typically will work together in this environment, benefitting from and leveraging each other’s work. All work is formally versioned through the use of Source Control integration via a tool called Process Studio. We’ll be discussing the use of Process Studio in more detail a bit later. This is the only environment where development should take place.
  2. Staging (Test)
    This environment is ideally a mirror image of the production environment and is used as a place to test completed development before it is deemed ready for production use. It also can serve as a warm standby environment just in case there are issues with the production site that can’t immediately be resolved.
  3. Production
    This is the live system and the only site end users will use to perform their daily tasks.

Work performed in the development server is packaged up into what’s called a Configuration Update, which can then be applied to Staging, where it is tested, and, if all the tests pass, to Production. For more information on what is included in a Configuration Update, check out the following Knowledgebase Article:

INFO: Understanding Configuration Updates in Click Commerce Extranet

Next time, we’ll talk about how configuration updates are built and special things to consider in order to make sure they can be correctly applied.

Tuesday, March 24, 2009

Ghosts in the machine

It's our goal to provide a product that makes the configuration of typical workflow processes relatively easy to implement, deploy, and maintain. The challenge in doing this is to also provide a tool set with all the flexibility you need to model your processes. The end result is a powerful application with some sharp edges.

I'd like to talk about one such sharp edge, but first let me set up the discussion by sharing with you a problem we encountered this past week. It all started with an observation that data was changing unexpectedly. There were apparently ghosts in the machine.

Values in custom attributes on ProjectStatus, which were set as part of the configuration and should never change under normal use, were changing nonetheless. Keeping things simple, let's say the type definition looked like this:

ProjectStatus

  • ID
  • customAttributes

ProjectStatus_CustomAttributesManager

  • canEdit (Boolean)

The canEdit attribute is used by the security policies to help determine whether the project is editable based upon its status. Its value is set at design time, but it was discovered that the canEdit values in the site were different from what was originally defined, causing the project to be editable when it shouldn't be (or not editable when it should be). Let's keep things simple by only using three states of the Study project type:

name                    canEdit
In-Preparation          true
Submitted For Review    false
Approved                false

In the site, there was an administrative view, available to Site Managers, that allowed for a manual override to the project's status. The View had the following fields on it:

Field                      Qualified Attribute
Project ID                 Study.ID
Project name               Study.name
Project Status             Study.status (Entity of ProjectStatus; Select list)
Project Status Name        Study.status.ID (String; text field)
Project Status Can Edit    Study.status.customAttributes.canEdit (Boolean; check box)

This form is very simple but creates a serious data integrity problem. The purpose of this view is to facilitate the manual setting of project status, but it does more than that. It also sets the ID and canEdit values of the new status to match what is displayed in the form. This is because the Project Status ID and canEdit fields are not displayed as read-only text. They are, instead, actual form values that are sent to the server when the form is submitted. Simply changing a project from Approved to In-Preparation also causes the ID and canEdit properties on the In-Preparation state to change to Approved and false respectively, even if the user never alters the initially displayed values for those form fields.

Looking at the form, it's easy to see how this could happen. As the form is submitted, the project status reference from the project is changed to the new project status entity. Then, that reference is used to update the ID and canEdit values.

The resolution is simple. The ID and canEdit values on the form should be displayed as read-only text rather than as active input fields. By making that small change, the ID and canEdit values are purely informational, as intended, and are not values posted to the server when the form is submitted.
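To illustrate the mechanism, here is a stripped-down Python sketch, with plain dictionaries standing in for the real entities and form engine. It is not how the product is implemented, just the shape of the problem and the fix.

# Illustrative only: dictionaries stand in for ProjectStatus entities and
# the posted form; this is the shape of the problem, not the product code.
statuses = {
    "In-Preparation": {"ID": "In-Preparation", "canEdit": True},
    "Approved":       {"ID": "Approved",       "canEdit": False},
}
project = {"status": statuses["Approved"]}

def submit(form: dict) -> None:
    # The form handling first repoints the entity reference...
    project["status"] = statuses[form["status"]]
    # ...then writes every other posted field back along its path.
    if "status.ID" in form:
        project["status"]["ID"] = form["status.ID"]
    if "status.customAttributes.canEdit" in form:
        project["status"]["canEdit"] = form["status.customAttributes.canEdit"]

# Broken view: ID and canEdit are live inputs, pre-filled from the old "Approved" status.
submit({"status": "In-Preparation",
        "status.ID": "Approved",
        "status.customAttributes.canEdit": False})
print(statuses["In-Preparation"])   # {'ID': 'Approved', 'canEdit': False} -- corrupted

# Fixed view: ID and canEdit render as read-only text, so they are never posted at all.
statuses["In-Preparation"].update(ID="In-Preparation", canEdit=True)   # undo the damage
submit({"status": "In-Preparation"})
print(statuses["In-Preparation"])   # {'ID': 'In-Preparation', 'canEdit': True} -- intact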

This is a simple example, but the problem is difficult to discover after the fact. The richness of the data model and the number of paths that can be used to reach a specific attribute can occasionally make troubleshooting challenging.

This example really represents a specific pattern of configuration issue. Any time you see a single view that includes both a field for an entity reference and edit fields for attributes on the entity being referred to, you are putting "ghosts in the machine"... but now you know the ghosts are really simple to keep away.

Cheers!