Monday, March 22, 2010

Application Internationalization and Localization

First off, a definition of what these are.

An application usually can and will be used by people from diverse cultures and languages. So there will be demand for the application to speak to them in their own culture (date and money formats) and language.

The act of creating culture and language specific versions is known as localization. But before that, you have to perform internationalization, which extracts everything locale specific (strings and formats) out of the application.

I have done two larger projects that involved internationalization of applications, along with various smaller ones, but only one that actually involved localization.

And they sure are a pain to do. The concepts are very simple. Extract all strings from the existing application and put them in a property file. Then define a format string for each money and date format in use, and place these in the property file as well.
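As a rough sketch of that extraction step, the property files can be pictured as one lookup table per locale. The locale codes, message ids, and helper names below are made up for illustration, and plain dicts stand in for real property files.

```python
from datetime import date

# Hypothetical per-locale "property files", shown as dicts for brevity.
# Every key and value here is an illustrative assumption.
CATALOGS = {
    "en_US": {
        "form.submit.approver.label": "Approver name",
        "format.date": "%m/%d/%Y",
        "format.money": "${amount:,.2f}",
    },
    "en_GB": {
        "form.submit.approver.label": "Approver name",
        "format.date": "%d/%m/%Y",
        "format.money": "GBP {amount:,.2f}",
    },
}


def message(locale, message_id):
    """Look up a string or format string by its message id."""
    return CATALOGS[locale][message_id]


def format_date(locale, value):
    """Format a date using the locale's extracted date format string."""
    return value.strftime(message(locale, "format.date"))


def format_money(locale, amount):
    """Format an amount using the locale's extracted money format string."""
    return message(locale, "format.money").format(amount=amount)


print(format_date("en_US", date(2010, 3, 22)))  # 03/22/2010
print(format_date("en_GB", date(2010, 3, 22)))  # 22/03/2010
print(format_money("en_US", 1234.5))            # $1,234.50
```

The application code never contains a literal label or format any more. It only knows message ids.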

But if you treat each string as a unique entry, you tend to end up with a lot of duplicated strings. A message like "Approver name" could appear in both a form submission page, and a form status page.

It is extremely tempting at this point, to simply allow the two placeholders to use the same string entry. And in fact, that's a perfectly reasonable and valid approach.

Until the day comes when you have to change the text itself. Imagine that someone decides to use the message "Choose an approver" in the form submission page. Before changing the message, he first has to evaluate the impact of the change. Like, how many other parts of the UI are using the same string entry?

Only when no other part of the UI uses the same string can the developer make the change safely. But if it is actually used in many other areas, he has to duplicate the existing entry, give it a new message id, and update only that.

So be careful about merging and reusing messages. Do it only if both are the same message, within the same context. A message string used during the selection of an approver and one used to display the approver might have the same text, but they are in different contexts.
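To make the point concrete, here is a tiny sketch with one message id per context. The ids are hypothetical. Both entries start with the same text, yet changing one leaves the other alone.

```python
# One entry per context, even though the text starts out identical.
# The message ids are illustrative assumptions.
messages = {
    "form.submit.approver.label": "Approver name",  # user is selecting an approver
    "form.status.approver.label": "Approver name",  # page is displaying the approver
}

# The submission page gets new wording; no impact analysis is needed,
# because nothing else shares this id.
messages["form.submit.approver.label"] = "Choose an approver"

print(messages["form.status.approver.label"])  # Approver name
```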

Another point is to hire someone who is really proficient in both the source and the target language. The last thing you want to worry about is grammar and spelling mistakes.

And lastly, avoid concatenation of messages. You might have a message that goes like this: "Your account expired on XXX. Would you like to renew?"

It is tempting to have two strings here, "Your account expired on ", and "Would you like to renew?". Do that, and you might wish you had never been involved in the project when the message changes. One day the message might be changed to "XXX: Account expired. Renew now?"

It is better and cleaner, instead, to use a single string with placeholders like this: "Your account expired on {0}. Would you like to renew?". That way, you get the absolute flexibility of having the formatted value anywhere in the message, and you get to keep the message as a whole. During localization to other languages, the context of the message is clearer to the person working on it as well.
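A minimal sketch of the placeholder approach, using Python's `str.format` (which happens to accept the same `{0}` syntax). The template keys and the `render` helper are made up for illustration.

```python
from datetime import date

# Whole messages with a positional placeholder, never concatenated fragments.
# The keys "original" and "revised" are illustrative assumptions.
templates = {
    "original": "Your account expired on {0}. Would you like to renew?",
    "revised": "{0}: Account expired. Renew now?",
}


def render(template_key, expiry):
    """Fill the single placeholder with the formatted expiry date."""
    return templates[template_key].format(expiry.strftime("%Y-%m-%d"))


print(render("original", date(2010, 3, 22)))
print(render("revised", date(2010, 3, 22)))
```

The calling code is identical for both wordings. Only the entry in the property file changes, so moving the date to the front of the sentence costs nothing.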

Granted, what I have described are simple principles and techniques, but doing the actual work is never easy. Especially when you are extracting message ids as you develop, in a team environment. "I need a new message, does it exist? I have no idea. I'll just add it and hope someone will use it later". And then down the road the same message, in the same context, might have five entries, all used in different parts of the UI, just because they were added by five different people.

So lastly, I proposed the addition of an internationalization owner, who owns, manages, and gives out message ids. Tell the person the message you want, and the context, and the person decides if it should use an existing message id, or a new one.

I have not seen it at work, but it sounds feasible. Any comments?

Monday, March 15, 2010

Have an effective bug tracking system

Seriously. A bug tracking system is essential. I don't mean a software application. Just a process will do.

When a bug is reported, detailed steps of how the bug occurred should be provided. Screenshots would be helpful. The actors involved should be named as well. For example, for a request workflow, state the requestor, assignee, approver, and so on.

And of course, if this is a testing phase, try to have multiple test accounts. When a bug occurs, stop using the test account involved in the bug. This allows the developer to look into the data as it was when the bug occurred.

Though of course, in most cases that could not be done.

Back to the system. A bug will go through various phases. It will be open, fix in progress, testing, and closed, for the basic status list. It should be noted that the bug should always be closed by the group who reported the bug, not the developers themselves. This removes any future conflicts between the client and the vendor.
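Even without a software application, it helps to write the process down precisely. Here is a small sketch of the status list as a state machine. The class names, the group names, and the "testing back to open" transition for a failed test are my own assumptions for illustration.

```python
from enum import Enum


class Status(Enum):
    OPEN = "open"
    FIX_IN_PROGRESS = "fix in progress"
    TESTING = "testing"
    CLOSED = "closed"


# Allowed transitions. Reopening from TESTING is an assumption, covering
# the case where the fix fails its test.
ALLOWED = {
    Status.OPEN: {Status.FIX_IN_PROGRESS},
    Status.FIX_IN_PROGRESS: {Status.TESTING},
    Status.TESTING: {Status.CLOSED, Status.OPEN},
    Status.CLOSED: set(),
}


class Bug:
    def __init__(self, reporter_group):
        self.reporter_group = reporter_group
        self.status = Status.OPEN

    def move_to(self, new_status, acting_group):
        """Change status, enforcing that only the reporting group may close."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(f"cannot go from {self.status.value} to {new_status.value}")
        if new_status is Status.CLOSED and acting_group != self.reporter_group:
            raise PermissionError("only the group that reported the bug may close it")
        self.status = new_status


bug = Bug(reporter_group="client")
bug.move_to(Status.FIX_IN_PROGRESS, "vendor")
bug.move_to(Status.TESTING, "vendor")
bug.move_to(Status.CLOSED, "client")  # the same call with "vendor" would raise
```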

So, we have a 'new' bug. The user submits it with a description of what he was trying to do, what happened, and what SHOULD happen. I feel that it is essential that all three are submitted. Many times I encounter vague steps of what a user did, and what happened, but the part on what should happen is left out. So I have no idea which part of what happened was the bug.

A priority level is tagged with each bug as well, ranging from fatal (cannot proceed and is a show stopper) to minor (works but is an annoyance). Most people use a number system, from 1 to 4, and depending on preference, 1 could be the fatal bugs.
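If the reports are collected in any structured form, the three required parts and the priority can be enforced up front. A minimal sketch follows. The field names are my own, and the "MAJOR" level is an assumption added only to fill out the 1 to 4 scale.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    FATAL = 1     # cannot proceed, show stopper
    CRITICAL = 2
    MAJOR = 3     # assumed name for the third level
    MINOR = 4     # works, but is an annoyance


@dataclass
class BugReport:
    attempted: str   # what the user was trying to do
    actual: str      # what happened
    expected: str    # what SHOULD have happened
    priority: Priority

    def __post_init__(self):
        # Reject reports that leave any of the three descriptions blank.
        for field_name in ("attempted", "actual", "expected"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"'{field_name}' must be filled in")


report = BugReport(
    attempted="Submit a leave request as requestor A",
    actual="Page shows a server error",
    expected="Request is saved and routed to approver B",
    priority=Priority.FATAL,
)
```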

It is likely that users will rate every bug as critical or fatal. The user would usually want everything changed until it suits their taste, and if no one manages the defect list, be prepared to be sucked into a never-ending whirlpool of changes. Negotiate and discuss the priority of issues with the user. Give and take on some of them. Most users are reasonable, but there are always those few unreasonable ones, especially when politics comes into play.

At times, rather than doing a time and effort consuming fix, a workaround can be suggested. Update the bug status with the workaround suggestion, and throw it back to the user for review.

Now for the case where a fix is needed. The bug should belong to a single developer at any time. And this is actually where I wanted to complain at the start of this post.

The bug was originally assigned to someone else. But as that developer was assigned too many fixes, the others started to help out. Without consulting the original assignee first, bug fixes get duplicated. Time and effort are wasted. Confusion arises.

Do a reassignment of defects with the original owner around. Please.

To carry on with the discussion of the system. After a bug is fixed, it should go through a round of internal testing. The bug status should be updated accordingly. After it is verified to be fixed, it should be thrown back to the user for testing, and then readied for production.

This is another point of the bug fix cycle to be careful about. Sometimes, the fix exposes another, unrelated bug. And the user would report that as part of the current bug, and keep the bug open.

Avoid that! As far as the facts are concerned, the reported bug was fixed! Any further bugs or changes should be reported as a new bug or change request. This is especially important if the team is committed by contract to fix all bugs reported before a deadline, which is considered a milestone completion. Some clients might actually wish the milestone to not be reached, to avoid payment.

In conclusion, do take care in setting up a proper bug/defect tracking system. Without one, a profitable project could turn into a resource hogger and waster (if these are real words).

Monday, March 8, 2010

Avoid doing updates as delete/insert

Recently, for a project, we had interactions with a database schema in the following form.

A User has multiple Positions, each of which has multiple Roles.

Now, here, we took the quickest way out. Whenever a single entry in Role was changed or added, we simply did the following:

  1. Delete all Positions of User
  2. Delete all Roles of User (Role had a column entry of User ID as well)
  3. Reinsert all Positions of User
  4. Reinsert all Roles of User.

Why did we do that? Well, quite simply, because we could not tell from each object entry whether it was an update or an insert, and there was no way to detect a deletion, since the consumer of such an object actually removes the role from the position array to indicate a removal.

An alternative that sprang to mind was to do a select again before the updates, and do a delta compare. That seemed like an awful lot of work to do.
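For what it is worth, the delta compare is less work than it sounded at the time. A rough sketch, assuming the freshly selected rows are keyed by id and an incoming role without an id is new. The function name and the dict layout are made up for illustration.

```python
def plan_changes(stored, incoming):
    """Compare freshly selected rows against the incoming objects.

    stored:   dict mapping id -> row dict, as just read from the database
    incoming: list of row dicts; a row whose "id" is None is new
    Returns (inserts, updates, deletes).
    """
    inserts = [row for row in incoming if row.get("id") is None]
    incoming_by_id = {row["id"]: row for row in incoming if row.get("id") is not None}
    # An update is an existing id whose content differs from the stored row.
    updates = [row for row_id, row in incoming_by_id.items()
               if row_id in stored and stored[row_id] != row]
    # A deletion is a stored id that the consumer dropped from the array.
    deletes = sorted(row_id for row_id in stored if row_id not in incoming_by_id)
    return inserts, updates, deletes


stored = {
    1: {"id": 1, "name": "Reader"},
    2: {"id": 2, "name": "Writer"},
    3: {"id": 3, "name": "Admin"},
}
incoming = [
    {"id": 1, "name": "Reader"},      # unchanged, no statement needed
    {"id": 2, "name": "Editor"},      # changed
    {"id": None, "name": "Auditor"},  # new
]                                     # id 3 was removed by the consumer

inserts, updates, deletes = plan_changes(stored, incoming)
```

Only the rows that actually changed produce statements, so the unchanged role with id 1 is never touched and keeps its primary key.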

So this delete/insert approach works, but it proved to be the wrong approach.

It creates excessive and unnecessary strain on the database, in terms of redo logs. Depending on how you configure the database, the redo logs might be considerable, especially if you get hit with a 'power user' of say, 300 positions with 200 roles each. Sure, unreasonable example, but if some logic went wrong somewhere...

Also, such delete/insert cycles bump our generated primary key ids up significantly fast. I have no idea what happens when they reach the maximum limit, if there is one.

And finally, because the primary key id changes so very often, another way has to be introduced to uniquely identify each entry. It could be another running sequence number, or a composite key of a few values. You cannot rely on the primary key id, which becomes somewhat redundant.

A compromise could have been reached.

It might have been better to mark an entry with 'tags'. E.g., a role without an id indicates an insert. A role with an id and 'isUpdate' set to true is an update, while a role with an id and 'isDelete' set to true indicates a deletion. Rather than work with it transparently by removing objects from an array, use flags to indicate the operation to perform on each entry. This is more work for the consumer of the object, but better than the alternatives.
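A minimal sketch of how the persistence side could read those tags. The `operation_for` name and the dict representation are assumptions for illustration, while 'isUpdate' and 'isDelete' are the flags described above. Letting the delete flag win when both flags are set is also my own choice.

```python
def operation_for(role):
    """Decide which statement a tagged role entry needs.

    Returns "INSERT", "UPDATE", "DELETE", or None when nothing changed.
    """
    if role.get("id") is None:
        return "INSERT"
    if role.get("isDelete"):  # assumed precedence if both flags are set
        return "DELETE"
    if role.get("isUpdate"):
        return "UPDATE"
    return None


roles = [
    {"id": None, "name": "Auditor"},
    {"id": 2, "name": "Editor", "isUpdate": True},
    {"id": 3, "name": "Admin", "isDelete": True},
    {"id": 1, "name": "Reader"},
]
operations = [operation_for(role) for role in roles]
```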

Monday, March 1, 2010

Allow for business policy changes

I was reading over Robert Martin's PDF on design problems and solutions to accounting software. I have yet to finish the PDF, but the problem they described was so familiar to me in some ways.

It has to do with policy/rule changes, along with scheduled tasks.

Case in point. Two companies have a contract between them for the sale of a particular material at, say, $x per piece. Payment is usually after delivery, and there is a scheduled delivery two months later.

One month later, the companies sign a new contract, at $y per piece. So, when the materials are delivered, which rate should be used?

It is entirely likely that the answer is, it depends on what the business wants for that particular contract. For some companies it might be $x, others at $y.

So when we develop such a system, we must first take care to not bind the delivery to a fixed rate of $x.

On the other hand, we should not bind it to the current value from the contract either. These are such major decisions that the system should avoid making them.

In such a case, it would help to escalate the issue to a business user, informing them of the situation, and the choices.

Such a system is of course much more complicated, but it is likely what businesses want. They want automation, and the ability to intervene.

But remember though, for every intervention and decision, it should be recorded and audited, as part of the request history for future reports and audit purposes.

There are of course many other similar cases where the variables of a system when a task is scheduled are very different from when it is executed. Email sending, account creation and deletion, role assignment, etc. (mostly in the context of identity management, since that is what I'm concentrating on).

But the solutions are similar. Avoid binding values at schedule time. Verify that all is the same at execution time. If anything differs, leave it to the business users to decide. They may decide to sell at a loss now to earn more later. The system can never advise or decide that.
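To round this off, here is a rough sketch of that principle using the delivery example. All names are hypothetical. The rate seen at schedule time is kept only for comparison, a difference is escalated instead of resolved by the system, and every escalation and decision goes into an audit trail.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class ScheduledDelivery:
    contract_id: str
    rate_at_schedule: Decimal  # kept for comparison only, not binding
    quantity: int


audit_log = []  # stands in for a real, persistent request history


def execute(delivery, current_rate, decide=None):
    """Bill a delivery, escalating if the rate changed since scheduling.

    decide is a callback standing in for the business user's choice; it
    receives (rate_at_schedule, current_rate) and returns the rate to use.
    Returns the amount billed, or None while a decision is pending.
    """
    if current_rate == delivery.rate_at_schedule:
        rate = current_rate
    elif decide is None:
        audit_log.append(("escalated", delivery.contract_id, current_rate))
        return None
    else:
        rate = decide(delivery.rate_at_schedule, current_rate)
        audit_log.append(("decided", delivery.contract_id, rate))
    return rate * delivery.quantity


delivery = ScheduledDelivery("C-1", Decimal("10"), 100)
pending = execute(delivery, Decimal("12"))  # rate changed, so escalate
billed = execute(delivery, Decimal("12"), decide=lambda old, new: old)
```

Here the business user chose to honour the old rate. The system only carried out and recorded that choice.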