Monday, March 22, 2010

Application Internationalization and Localization

First off, a definition of what these are.

An application usually can and will be used by people from diverse cultures who speak different languages. So there will be demand for the application to speak to them in their own culture (date and money formats) and language.

The act of creating culture- and language-specific versions is known as localization. But before that, you have to perform internationalization: extracting everything culture- or language-specific, such as strings and formats, out of the application.

I have done two larger projects that involved internationalization of applications, along with various smaller ones, but only one that actually involved localization.

And they sure are a pain to do, even though the concepts are very simple. Extract every string from the existing application and put it in a property file. Then define a format string for each money and date format in use, and place these in the property file as well.
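As a rough illustration of that extraction step, here is a minimal sketch in Python. The key names (`approver.name`, `date.format`, `money.format`), the two locales, and the `load_properties` helper are all made up for this example; a real project would normally use whatever resource bundle mechanism its platform provides.

```python
from datetime import date

# Hypothetical property file contents, one per locale (key=value per line).
EN_US = """
# en_US
approver.name=Approver name
date.format=%m/%d/%Y
money.format=${amount:,.2f}
"""

EN_GB = """
# en_GB
approver.name=Approver name
date.format=%d/%m/%Y
money.format=£{amount:,.2f}
"""


def load_properties(text):
    """Parse key=value lines into a dict, skipping blanks and # comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


us = load_properties(EN_US)
gb = load_properties(EN_GB)
when = date(2010, 3, 22)

# The code never hard-codes a format; it only looks one up by key.
us_date = when.strftime(us["date.format"])            # "03/22/2010"
gb_date = when.strftime(gb["date.format"])            # "22/03/2010"
us_money = us["money.format"].format(amount=1234.5)   # "$1,234.50"
gb_money = gb["money.format"].format(amount=1234.5)   # "£1,234.50"
```

The point is only that the strings and formats live outside the code, so supporting a new culture means adding a file rather than editing the application.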

But if you treat each string as a unique entry, you tend to end up with a lot of duplicated strings. A message like "Approver name" could appear on both a form submission page and a form status page.

It is extremely tempting at this point to simply let the two places use the same string entry. And in fact, that's a perfectly reasonable and valid approach.

Until the day comes when you have to change the text itself. Imagine someone decides to use the message "Choose an approver" on the form submission page. Before changing the message, he first has to evaluate the impact of the change. Like, how many other parts of the UI are using the same string entry?

Only when no other part of the UI uses the same string can the developer make the change safely. But if it is actually used in many other areas, he has to duplicate the existing entry, give it a new message id, and update only that.

So be careful about merging and reusing messages. Do it only if both are the same message, within the same context. A message used during the selection of an approver and one used to display the approver might have the same string, but they are in different contexts.
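One way to picture this is to put the context into the message id itself. The sketch below uses invented ids (`form.submit.approver.label`, `form.status.approver.label`); the naming scheme is an assumption for illustration, not a standard.

```python
# Two entries with identical text today, kept apart because they are
# used in different contexts (ids are hypothetical).
messages = {
    "form.submit.approver.label": "Approver name",
    "form.status.approver.label": "Approver name",
}

# Later, the submission page wording changes. Only its own entry is
# touched, so there is no impact analysis to do on other pages.
messages["form.submit.approver.label"] = "Choose an approver"

submit_label = messages["form.submit.approver.label"]
status_label = messages["form.status.approver.label"]
```

The cost is a duplicated string to translate; the benefit is that each context can be reworded without hunting down every other user of the entry.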

Another point: really do hire someone who is proficient in both the language being translated from and the language being translated to. The last thing you want to worry about is grammar and spelling mistakes.

And lastly, avoid concatenation of messages. You might have a message that goes like this: "Your account expired on XXX. Would you like to renew?"

It is tempting to have two strings here, "Your account expired on ", and "Would you like to renew?". Do that, and you might wish you had never been involved in the project when the message changes. One day the message might be changed to "XXX: Account expired. Renew now?"

It is better and cleaner, instead, to use a single string with placeholders like this: "Your account expired on {0}. Would you like to renew?". That way, you get the absolute flexibility of having the formatted value anywhere in the message, and you get to keep the message as a whole. During localization to other languages, the context of the message is clearer to the person working on it as well.
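Here is a small sketch of the difference, using Python's `str.format`, which happens to accept the same `{0}` style of placeholder. The `render` helper and the sample date are invented for the example.

```python
def render(template, *args):
    """Fill positional placeholders such as {0} in a message template."""
    return template.format(*args)


expiry = "22 March 2010"

# The original wording and the later rewording are both single entries.
original = "Your account expired on {0}. Would you like to renew?"
reworded = "{0}: Account expired. Renew now?"

# The calling code is identical in both cases; only the entry changed.
before = render(original, expiry)
after = render(reworded, expiry)

# With concatenation, the position of the date is baked into the code,
# so the reworded message cannot be produced without a code change.
concatenated = "Your account expired on " + expiry + ". Would you like to renew?"
```

Here `before` and `concatenated` come out the same, but only the placeholder version survives the rewording by editing the property file alone.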

Granted, what I have described are simple principles and techniques, but doing the actual work is never easy. Especially when you 'extract message ids' as you go while developing in a team environment. "I need a new message, does it exist? I have no idea. I'll just add it and hope someone will use it later." And then down the road the same message, in the same context, might have five entries, each used in a different part of the UI, just because they were added by five different people.

So lastly, I propose the addition of an internationalization owner, who owns, manages, and gives out message ids. Tell the person the message you want, and the context, and the person decides if it should use an existing message id, or a new one.
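The owner's bookkeeping could be as simple as the sketch below. This is only my guess at what such a role might maintain: the `MessageRegistry` class, its `request` method, and the id scheme are all hypothetical, and a real owner would apply judgement that this exact-match lookup does not.

```python
class MessageRegistry:
    """Hands out message ids, reusing one only for the same text AND context."""

    def __init__(self):
        self._ids_by_key = {}  # (context, text) -> message id
        self.messages = {}     # message id -> text

    def request(self, context, text):
        key = (context, text)
        if key in self._ids_by_key:
            # Same message in the same context: reuse the existing id.
            return self._ids_by_key[key]
        # Otherwise issue a new id, even if the text matches another context.
        message_id = "{0}.{1}".format(context, len(self.messages) + 1)
        self._ids_by_key[key] = message_id
        self.messages[message_id] = text
        return message_id


registry = MessageRegistry()
first = registry.request("form.submit", "Approver name")
second = registry.request("form.submit", "Approver name")  # reuses first
third = registry.request("form.status", "Approver name")   # new id
```

Five developers asking for the same message in the same context would all get the same id back, instead of adding five entries.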

I have not seen it in practice, but it sounds feasible. Any comments?
