Thursday, December 17, 2009

Single point of configuration

In a software project, especially a Java one, you would typically use various third-party libraries to ease and speed up development.

However, every library is likely to have its own configuration file or approach to configuration.

Let's look at a typical Java project. A web-based project would have a web configuration file (XML). It is highly likely to use logging, maybe configured with a properties file. There is surely database connectivity, and since a password is involved, you might want to define it in the application server. And if you are using Spring, you might have a bunch more XML files to configure.

So that is at least three or four files to configure.

Now let's say you are a very modular person. You like to have things configurable too. So you have a configuration file for the LDAP settings, one for the security policies, one for default email templates, and one for mail servers.

Now, imagine that a new developer comes along and needs to update the configuration of a bunch of things. The changes would be spread across various files.

Or imagine a fresh deployment, which requires configuring database connectivity, LDAP, and the email server. That is three files instead of one. And maybe it is not even a new production server, but just a test server.

I was introduced to the WordPress system. To configure a new copy of WordPress, all one needs to do is configure just one file. The file contains all the settings. I did not look into it, but it would also make sense for plugins to expose their settings in there too.

The conclusion I draw from this? Try to centralize an application's configuration in a single location. Try to make it a facade for all the subsystems, even though they might be third-party plugins, libraries, or systems. This helps with managing settings and reduces the learning curve for new developers or users.
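As a rough illustration of the facade idea, here is a minimal sketch in Java. It assumes a single properties source in which each subsystem's keys share a prefix (for example "mail." or "ldap."). The class name AppConfig, the method forSubsystem, and the key names are all made up for this example and do not come from any library.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical facade: one properties source, sliced per subsystem by key prefix.
class AppConfig {
    private final Properties all = new Properties();

    // Loads every setting from one block of properties text
    // (in a real application this would be the single configuration file).
    AppConfig(String propertiesText) {
        try {
            all.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the settings under "prefix." with the prefix stripped,
    // e.g. "mail.host" becomes "host" in the slice for "mail".
    Properties forSubsystem(String prefix) {
        Properties slice = new Properties();
        String dotted = prefix + ".";
        for (String key : all.stringPropertyNames()) {
            if (key.startsWith(dotted)) {
                slice.setProperty(key.substring(dotted.length()), all.getProperty(key));
            }
        }
        return slice;
    }
}
```

Each subsystem (mail, LDAP, logging and so on) would then be handed only its own slice, for example config.forSubsystem("mail"), while the person deploying the application edits just one file.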

Thursday, December 10, 2009

Making audit data friendly

Auditing is a fascinating topic to me. So much so that I'm writing another blog post on it.

So, let's assume that we have a UI screen that allows a user to update his particulars. And once the user submits, the data are pushed out to be updated.

The logical conclusion here is that there should be an audit entry indicating that the user initiated an update. Simple, easy. So far so good.

Now, what if the update looked simple, but was not? The user might have updated his name, password and telephone number. And assume that the password has to be pushed to an external legacy security system. The telephone has to be pushed to the company address book system. The name to the company address book system, as well as the HR system.

Each update could fail independently of the others. And since they actually push updates to external systems, we really should audit them too.

So now we actually have four audit entries. One for the user clicking submit, one for the security system update, one for the address book system, and one for the HR system.

But how do we know that the other three updates belong to the same user-initiated update action?

The problem gets worse when you have another background workflow that performs similar updates. So you have a system-initiated update, a user-initiated update, and a bunch of interleaved sub-updates. They could be on the same user, or on different ones. The focus here is the audit log. Imagine how painful it would be to decipher which update belongs to which action!

We need to group all related actions. In the example above, the user initiated action could generate an action transaction id. And all the sub-actions should be tagged with the same action transaction id. This allows us to view and perhaps even search the audit log by action transactions, giving us greater control.
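A minimal sketch of this grouping in Java could look like the following. It assumes an in-memory log and a random UUID as the action transaction id. The names AuditLog, beginAction, record and findByAction are invented for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical in-memory audit log that tags every entry with an action transaction id.
class AuditLog {
    // One audit record; actionId ties it to the action that triggered it.
    static final class Entry {
        final String actionId;
        final String actor;
        final String message;

        Entry(String actionId, String actor, String message) {
            this.actionId = actionId;
            this.actor = actor;
            this.message = message;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Records the top-level action and returns the new id that sub-actions must reuse.
    String beginAction(String actor, String message) {
        String actionId = UUID.randomUUID().toString();
        record(actionId, actor, message);
        return actionId;
    }

    // Records a sub-action under an existing action transaction id.
    void record(String actionId, String actor, String message) {
        entries.add(new Entry(actionId, actor, message));
    }

    // Returns all entries of one action, in the order they were recorded.
    List<Entry> findByAction(String actionId) {
        List<Entry> result = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.actionId.equals(actionId)) {
                result.add(entry);
            }
        }
        return result;
    }
}
```

The user clicking submit would call beginAction once, and each push to the security, address book and HR systems would call record with the returned id. Even when a background workflow interleaves its own entries, findByAction still returns only the entries that belong together.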

And as a random thought, perhaps audit entries could be categorized by detail level too.

Let's extend the update example further. Assume that an approval workflow is involved. There will be audit entries for request submission, approvals, escalations and resolution due to timeout, provisioning, and so on.

We could have a level of audit for request workflow. These are request submissions, approvals, escalations, etc.

Next, a level for provisioning. The target systems being updated, and the data changes.

Depending on what you want to see, you could view a 'business action' audit or a 'technical' audit.
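The two levels could be sketched in Java roughly as below. This is only an illustration under the assumption that every entry carries exactly one level. The names LeveledAudit, Level.BUSINESS and Level.TECHNICAL are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical audit log where each entry is tagged with a detail level.
class LeveledAudit {
    // BUSINESS: request submissions, approvals, escalations.
    // TECHNICAL: provisioning, target systems updated, data changes.
    enum Level { BUSINESS, TECHNICAL }

    static final class Entry {
        final Level level;
        final String message;

        Entry(Level level, String message) {
            this.level = level;
            this.message = message;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void record(Level level, String message) {
        entries.add(new Entry(level, message));
    }

    // Returns only the messages of the requested level, in recorded order.
    List<String> view(Level level) {
        List<String> messages = new ArrayList<>();
        for (Entry entry : entries) {
            if (entry.level == level) {
                messages.add(entry.message);
            }
        }
        return messages;
    }
}
```

A manager reviewing approvals would ask for view(Level.BUSINESS), while someone troubleshooting a failed push would ask for view(Level.TECHNICAL).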

This is something I noticed missing in the few system customization projects I worked on. We usually had to build it in ourselves. And I always wonder why nobody thought to build it in from the start.

I'll probably still be pondering auditing in business applications for a while. I hope to be involved in building an audit-friendly application someday.

Thursday, December 3, 2009

Audit on XML data

I recently had to implement a feature for a particular requirement. Due to the nature of the product, a particular value had to be serialized into XML so that it could pass through a particular adapter layer before being persisted. And just before persistence, the code would deserialize the XML data so that it could be written to the corresponding database tables.

To be clear, the database does not store XML data. The reason we had to do the serialization and deserialization is that the boundary of the adapter code accepts only int, boolean, string, and list of strings. We needed a list of objects. So our only escape route was a list of strings.

It was definitely not pretty during auditing.

There is a built-in auditing feature. When data passes through the adapter, all old and new values are audited.

But now that the data is in XML, we get, well, a long string of XML. It was not too clear which value had actually changed. One had to copy both the old and new XML strings into a text file, do some formatting, and then do a diff on them.

The application was rather smart. Before passing the values to the adapter, it would first do its own diff. If there were no changes, the adapter would not receive the values. The audit would not log these unchanged values either.

This, however, poses a problem with my XML data at times. Sometimes, the original value came out with the attributes in a particular order. And when saving, the order might change (perhaps I added or removed something, or there was something different in the serialization). The attribute values remain the same, yet the order changes, resulting in the application thinking that a change had occurred.

Imagine the frustration when you have done a diff on two values, only to realize that they were the same!
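One way to avoid such false positives is to compare a normalized form of the XML instead of the raw strings. The sketch below is not what the application did. It is a hypothetical helper (XmlCanonicalizer, with made-up methods canonicalize and sameContent) built only on the standard Java DOM parser. It sorts attributes by name and drops whitespace-only text, and it assumes that the order of child elements is significant. The output is meant for comparison only: values are not re-escaped, and CDATA sections and comments are ignored.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper that makes attribute order irrelevant when comparing XML strings.
class XmlCanonicalizer {
    // Renders the XML with attributes sorted by name and whitespace-only text removed.
    static String canonicalize(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml))).getDocumentElement();
            StringBuilder out = new StringBuilder();
            write(root, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static void write(Element element, StringBuilder out) {
        out.append('<').append(element.getTagName());
        // A TreeMap iterates its keys in sorted order, which fixes the attribute order.
        Map<String, String> sorted = new TreeMap<>();
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            sorted.put(attribute.getNodeName(), attribute.getNodeValue());
        }
        for (Map.Entry<String, String> attribute : sorted.entrySet()) {
            out.append(' ').append(attribute.getKey())
               .append("=\"").append(attribute.getValue()).append('"');
        }
        out.append('>');
        // Child elements keep their original order; only non-blank text is kept, trimmed.
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                write((Element) child, out);
            } else if (child.getNodeType() == Node.TEXT_NODE
                    && !child.getNodeValue().trim().isEmpty()) {
                out.append(child.getNodeValue().trim());
            }
        }
        out.append("</").append(element.getTagName()).append('>');
    }

    // True when both strings describe the same XML, ignoring attribute order and layout.
    static boolean sameContent(String oldXml, String newXml) {
        return canonicalize(oldXml).equals(canonicalize(newXml));
    }
}
```

With something like this, the change check could call sameContent on the old and new values before deciding that an update happened, and the audit could store the canonical form so that a plain text diff shows only real changes.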

Auditing at the level of an XML string is a bad idea. It always has been, and always will be. Had it not been for the built-in functionality and the limited time, we probably would have reworked that part of the application.