Showing posts with label technical. Show all posts

Monday, February 8, 2010

Beware of the datatype char

I recently encountered the most unexpected and weirdest bug, and it is probably the strongest argument for having a consistent testing (and sometimes even development) environment!

In our development environment, we used a database schema that had two fields, status and reason, defined as varchar of 2 characters. This means each field would use up to 2 characters of storage.

On test, staging, and production, the fields were defined as char of 2 characters. This means each field would always occupy 2 characters of storage, even for a single-character value.

They are different database products, and I, not being a database expert, declined to comment on the difference. Perhaps only one of them supported varchar. Or maybe the other was more efficient with char. In any case, it sounded like it would still work. After all, the difference ought to be just in storage.

And I was so wrong. We had weird cases where comparisons of returned values failed.

On closer inspection, a value of 'A' from the database was in fact 'A '! It was padded with trailing whitespace!

That took a while to find, though the fix was easy. I'm going to be careful with char from now on.
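In Java terms, the failure and the fix look roughly like this (a minimal sketch, not our actual code):

```java
public class CharPaddingDemo {
    // A CHAR(2) column pads 'A' to "A ", so a plain equals() fails:
    // "A ".equals("A") is false. Trimming before comparing fixes it.
    public static boolean statusEquals(String fromDb, String expected) {
        return fromDb != null && fromDb.trim().equals(expected);
    }
}
```

(An alternative is to fix it at the mapping layer, so trimmed values never leak into comparisons scattered around the code.)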

Friday, January 22, 2010

Code comments

I used to have a rather strong stance on comments in code, mostly in the form of avoiding them. I believe that the code IS the comment. If you have written clear code, you have no need to write comments. They would just be clutter in most cases.

I still believe in what I just said, but recently I have begun writing more and more comments in my code.

That's because, despite clear code, it is sometimes useful to explain, in the code, why some things are done a certain way.

To the coder (the person who wrote the code), everything is clear. He might write two loops over the same list of objects, modifying them in different ways. He might have done that to keep the objects in a consistent state, finishing one set of operations before starting another.

Now, sometime later (or maybe just days later), someone comes along and notices this code. Why did the coder not merge both operations into a single loop? That would help performance. He did not ask the coder, as the coder was unavailable. So he made the change, ran some basic tests, and checked the code in.

And then a whole set of bugs appeared in the next few days.

This could have been avoided if the coder had commented in the code on why he separated the operations into two loops.
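A hypothetical sketch of what such a comment might look like (the Order class and its methods are invented for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class OrderProcessor {
    static class Order {
        boolean reserved, shipped;
        void reserveStock() { reserved = true; }
        void ship() {
            // Shipping assumes the whole batch has already been reserved.
            if (!reserved) throw new IllegalStateException("not reserved");
            shipped = true;
        }
    }

    // NOTE: do not merge these two loops. Every order must be reserved
    // before any order ships, so that a failure midway through shipping
    // never leaves an order shipped without its stock reserved.
    public static void process(List<Order> orders) {
        for (Order o : orders) o.reserveStock();
        for (Order o : orders) o.ship();
    }

    // Tiny self-check: after processing, both orders end up shipped.
    public static boolean demo() {
        Order a = new Order();
        Order b = new Order();
        process(Arrays.asList(a, b));
        return a.shipped && b.shipped;
    }
}
```

Merging the loops would pass a basic test with one order, and fail subtly with many.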

It is the same with documentation. Rather than documenting just how things work, the documentation should explain why things work this way, why an approach was chosen, and why others were discarded.

Anyone can figure out how things work, or how the code flows. But it is the why that is many a time missing. And we definitely need to record it, both in code and in documentation.

Monday, October 12, 2009

Programming: Feeling burdened

Recently I have been giving much thought to programming languages like C++ and Java. I mention only these two because those are the ones I have been working in most of the time.

I felt somewhat burdened.

When a project grows beyond a certain number of classes and methods, it feels like a pain to even look at the source code. Even though the files were structured to be pleasant to read, there is this weird growing sense that something's not right.

I remember Uncle Bob mentioning in his book, 'Clean Code', that the sequence of methods in a source file matters. It has to flow right. Perhaps that is my main problem.

I tried keeping my public methods as one block and my private methods as another. It did not work, since you had to jump from the public block down to the private one when reading code. Then I put all related private methods right below each public method. However, now I cannot tell at a glance what all the public methods of a class are (except perhaps through a smart IDE filter).

This is just wrong! Why are we still concerned with the ordering of methods in a source file! We should be working at a higher level of abstraction! We should be looking at only the method blocks that matter! The IDE should be smart enough to filter them out!

(I still miss VisualAge for Java from IBM).

But that's not all that made me feel burdened. I feel slow when developing in these languages. Perhaps it's not the languages themselves. It could be my approach. I am very much a TDD guy. But recently, much affected by all the readings from 37signals on Getting Real, etc., I wonder. Perhaps I am overdoing TDD? Is it even necessary?

Even Kent Beck, in his recent post on his JUnit Max, confessed that he did not take a TDD approach. There was plenty of code that was not test-covered.

It seemed that, in a rush to deliver, we had to sacrifice test coverage.

But of course, this does not apply to all scenarios. The reason I am feeling this is my own hobby projects, where I feel I am slow in crunching out code and moving towards the goal.

Lastly, part of the burden is definitely due to the language. The syntax is really too verbose: lack of closures, the excessive need to declare private variables with accessors, constructors that simply set values, and lack of named parameters, which results in the need for parameter objects. There is just no lack of complaints.
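For instance, a trivial two-field value object (a made-up example, but representative) costs this much ceremony:

```java
// A two-field immutable value object: a constructor that only sets values,
// plus one accessor per field. The complaint above, in miniature.
public class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }
    public int getY() { return y; }
}
```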

Perhaps I should move to a lighter, more agile language. Maybe even a dynamic language.

Monday, July 13, 2009

Audit and Security are not meant to be interceptors

I was working on an administrative web interface for a user management product, and was trying to find out how best to do audit and security.

I should throw up the conclusion first: Audit and Security are not meant to be interceptors.

And now for the journey to that conclusion.

We are in the year 2009. Java annotations are rampant. Every problem looks like a nail to that hammer. And it is no surprise that there are many projects which implement audit and security annotations and run them through interceptors like AOP.

Which I tried as well. It worked great initially. Marking methods as auditable and securing methods with roles was simple, and did not obstruct the code logic.

Until the security got more 'realistic' and the audit was required to be 'useful'.

In the original concept, a user could be an administrator, able to modify any user. In the updated requirement, a user could manage multiple groups of users. It was no longer as simple as checking a single role a user has. The roles would be dynamic too: to support multiple dynamic groups after deployment, the role name would be built up from the name of the group of users. And annotations for role names support only strings that can be determined at compile time!
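The limitation can be shown with a made-up @Secured annotation: a constant role name compiles, but a role name built from a runtime group name cannot be expressed at all.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class RoleAnnotationDemo {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Secured { String role(); }

    // Fine: the role is a compile-time constant.
    @Secured(role = "ADMIN")
    public static class DeleteUserAction {}

    // Will not compile: annotation element values must be constant
    // expressions, so a role derived from a group created after
    // deployment cannot be written here.
    // @Secured(role = "GROUP_" + request.getGroupName())

    public static String roleOf(Class<?> type) {
        return type.getAnnotation(Secured.class).role();
    }
}
```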

Granted, a more complicated security annotation might work. Something along the line of annotating a method with a security policy instead of a role name...

But let's move on to audit first. The most common audit approach is probably marking methods as auditable, and then recording all input arguments and the output result of each method.

That seemed to me to be a rather... technical audit, rather than a business audit trail. Translating the audit information to business information is not so easy. There are definitely times when the required audit information is not part of the input arguments or the output result. And, though this might violate the SRP for some people, it is entirely possible that a single method is in fact multiple events and requires two audit entries instead of one.

My point is probably this: for a given business scenario, there are definitely clear requirements on what audit information to capture. And this information usually goes beyond a simple 'method invocation with arguments, results, or exceptions'.
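Sketching the contrast (all names invented): instead of recording "method X was called with arguments Y", the method records business events explicitly, and one invocation may legitimately produce two entries.

```java
import java.util.ArrayList;
import java.util.List;

public class BusinessAudit {
    public static final List<String> trail = new ArrayList<>();

    // One method invocation, two business events. A method-level audit
    // interceptor would record a single "moveUser(...)" entry instead.
    public static void moveUser(String userId, String fromGroup, String toGroup) {
        trail.add("USER_REMOVED user=" + userId + " group=" + fromGroup);
        trail.add("USER_ADDED user=" + userId + " group=" + toGroup);
        // ... the actual move would happen here ...
    }
}
```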

I'm definitely not against annotations and interceptors. They are excellent ideas, and work great for many cases. My gripe is that audit and security functional requirements are rarely trivial enough that out-of-the-box interceptors will fulfill them. And there is just too much hype around audit and security interceptors as the holy grail.

Monday, July 6, 2009

Danger of being overly focused (to a skillset)

I started playing with XCode, Objective-C and iPhone Application Development recently (which was about 3 days ago).

Some background on my skillset first.

I started out learning Pascal and FoxPro, then proceeded to C++ with Visual C++. Along the way I picked up VB, and then Java. I worked with Java professionally for about 3 years, during which C# came out. I decided to make a jump for it, but ended up with VB.Net and Visual Studio for about a year. Then I decided that Java had the better career opportunities back then, so it was back to Java and Eclipse for another 2+ years.

And this is where I am now. Which is to say that I am strongly comfortable with Java and Eclipse (but seriously, Eclipse is an excellent IDE).

During that time I read up on Ruby and Python, but resisted developing with them, blaming the lack of an IDE with the quality of Eclipse.

And now that I want to try out iPhone Application Development, I had to pick up Objective-C and XCode.

I started reading up on both topics. And with every read, Objective-C felt like an excellent language. Named parameters were a big plus for me. Categories were a good way to slice up code into separate files (much like .Net's extension methods). Other than that, much was the same as in the old times (to me). XCode itself was a good tool, with code completion support (though I have yet to read anything on refactoring, and not many books talk about unit testing, though material is available online).

And then came the time of actual usage. I was confident that with my experience in C++ I could tame that Objective-C beast. And the tool did seem good.

I was wrong. Not on account of Objective-C. That was pretty easy. I am a firm believer that language syntax is simply decoration, and so I had concentrated on honing the more important skills: writing clean code, design patterns, building a stronger foundation. But development still took a while, because I was so used to Java. Coding in Objective-C requires some mindset changes. Recognizing new symbols.

XCode was a much bigger killer. Code completion works, but code 'suggestion' was on a different key. I had to retrain my instincts. I have yet to figure out unit testing, but will probably complain when I get it working, citing it as different and weird compared to Eclipse (though that is definitely due to my being overly used to Eclipse, not because of XCode). Checking for compilation errors was different and took getting used to. And looking at the logs while debugging... well, I just wonder why I had to switch my view in XCode. Perhaps there is a different project layout view (I am sure I read something about it), and I should try that.

So why so much pain? I became overly reliant on a single tool and language. And that is definitely a very dangerous thing. No matter how much you read up on different tools and languages, if you do not actually try using them, you will not be able to operate in a different environment. And as developers, we MUST be as flexible as possible. One can never know what tools one might use in the future of one's career. (I am currently cursing at using Visual Studio in a Windows XP VM on my Mac... GRRRR.)

And after years of using Java, a garbage-collected language, I had to revert to a non-garbage-collected one. More care goes into each line, and there is always a nagging thought about whether I should release the memory of something returned from a system call, or not if I am returning it as a return value. It takes some getting used to.

I probably should have tried out Ruby and Python more often, and C++ too, with a variety of tools. Life would be better.

Monday, June 8, 2009

Towards Better Code: No side effects, shorter methods, exit early, and log every invocations

Recall that I mentioned earlier that I learn through the logs of existing programs/frameworks.

It's weird, but I cannot emphasize enough about the importance and need to log.

I have been using a rather new approach to logging. Well, not exactly to logging, but a new approach combining logging and coding style.

First off, I keep my methods short. For example, given the following algorithm:

public void login(final String username, final String password) {
  // authenticate
  // check if user container is active
  // check if user is active
}

It will most likely evolve into the following code:

public void login(final String username, final String password) {
  authenticate(username, password);
  verifyUserContainerIsActive(username);
  verifyUserIsActive(username);
}
void authenticate(final String username, final String password) {
  final boolean valid = checkUserCredentials(username, password);
  if (!valid) throw new AuthenticationException();
}
void verifyUserContainerIsActive(final String username) {
  final Container container = getUserContainer(username);
  if (container.isInactive()) throw new UserContainerInactiveException();
}
void verifyUserIsActive(final String username) {
  final User user = getUser(username);
  if (user.isInactive()) throw new UserInactiveException();
}

That's just a rough idea. Basically, each method is composed of many smaller methods, which serve to describe what each step does. The benefit? It eliminates the need to comment complicated algorithms. Comments run the highest risk of falling out of sync with the actual code intent.

Also, notice the use of exceptions in the inner methods. I had started out with an alternative idea: the inner methods would return true/false values, which the caller would check. It basically creates the following nasty code block:

public void login(final String username, final String password) {
  if (isValidCredentials(username, password)) {
    if (isUserContainerActive(username)) {
      if (isUserActive(username)) return;
      throw new UserInactiveException();
    }
    throw new UserContainerInactiveException();
  }
  throw new AuthenticationException();
}

Of course, the better way is as follows:

public void login(final String username, final String password) {
  if (isNotValidCredentials(username, password)) throw new AuthenticationException();
  if (isUserContainerNotActive(username)) throw new UserContainerInactiveException();
  if (isUserNotActive(username)) throw new UserInactiveException();
}

Personally, I still feel that my first example above produces a much more readable version of the algorithm, one which communicates the intent much better.

But the basic rule here is: exit early. What is most important is to communicate the basic intent of the code. The other possible cases are, well, exceptions (pun intended), and should not clutter the original intent of the code.

And finally, on to logging.

Now, I log every entry and exit of a method, as well as the method arguments and return values. Ideally this should be done with AOP, but I am not doing that yet.
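Done by hand, the pattern might look like this sketch (the service and its placeholder body are invented; java.util.logging happens to have entering/exiting helpers for exactly this):

```java
import java.util.logging.Logger;

public class LoggedService {
    private static final Logger LOG = Logger.getLogger(LoggedService.class.getName());

    public static boolean isUserActive(String username) {
        // Log entry with arguments...
        LOG.entering("LoggedService", "isUserActive", username);
        boolean active = !username.isEmpty(); // placeholder for the real lookup
        // ...and exit with the return value.
        LOG.exiting("LoggedService", "isUserActive", active);
        return active;
    }
}
```

(entering/exiting log at level FINER, which is why the log level needs to be turned up to see them.)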

What benefit does this logging bring me?

I can actually, by tracing the method entry logs, form a very clear view of how a request passes through the entire system. An example of the above log might be as follows:

enter: login {username:kentlai, password:xxx}
enter: authenticate {username:kentlai, password:xxx}
exit: authenticate
enter: verifyUserContainerIsActive {username: kentlai}
exit: verifyUserContainerIsActive
enter: verifyUserIsActive {username: kentlai}
exception caught: UserInactiveException

With shorter methods and logging of every method invocation, I can trace the flow very accurately. Sure, there is an increase in the size of the logs, but the return on that investment is easier debugging, without even hooking up a debugger.

With the logged method arguments, and knowing which method is causing a problem (from the logs), I can actually write a quick unit test to reproduce and fix the problem.

But there is one last magic elixir in this combination: no side effects.

Basically, each class should have no mutable state. The only mutable state should come from external storage (e.g. a database). And if a function needs any values from such storage, those reads should themselves be encapsulated into methods of their own, with the return values logged.

Why? This allows you to put together a complete picture of the context and environment under which a method was executed. When every variable becomes known and can be supplied once again to a method during unit testing, the method can be guaranteed to work consistently as expected, which allows for more robust code.
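A toy illustration of the payoff (names invented): when a method is a pure function of its arguments, the arguments captured in the logs are everything needed to replay a failing call in a test.

```java
public class UserChecks {
    // Pure function of its input: no hidden mutable state, so the value
    // recorded in the logs fully determines the result.
    public static boolean isActiveStatus(String status) {
        return "A".equals(status);
    }
}
```

A unit test simply feeds back the logged value and asserts on the outcome.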

Funny it took me so long to realize such principles.

Wednesday, May 6, 2009

Learning new frameworks through logs and reading source code

I was having lunch with a colleague, commenting on how slow another colleague was in picking up new frameworks. I wondered if it had to do with experience, since he was relatively new. But still, the rate at which he was picking things up was unimaginably slow (to me at least).

I cited how I had to help resolve a compilation and deployment problem. Just googling a similar pattern of the error message (the exact message, with our project names, user folders, etc. in it, would not have yielded a close match) gave me a hit on the first page (maybe not the first result, but at least on the first page).

I wondered if it had to do with the educational system, where students are more spoon-fed and focused on passing exams. Back in the older days we had a genuine desire to learn new stuff. We would borrow and devour books (figuratively speaking).

And then I mentioned how I went about learning Struts2 and Tiles for the current project.

Which led me to notice the approach I took to learning them (as well as the previous JAX-RS framework):

1. Reading official documents and tutorials. The framework usually has some form of documentation. Even if it is incomplete or sparse, it is still a source of learning.

2. Google. For anything you want to do, there is a high chance that someone else has done it as well.

Usually these two would be sufficient, but recently, I go a step further.

I just came back from an IDM project, based on Sun's Identity Management product. It was heavily customized. At times we had to deal with insufficient documentation, and find out whether something was a product bug or a customization done wrong.

And the way we did that was to turn on log tracing, trace the calls, and then check the product source code (decompiled).

After the project, I found myself unknowingly doing the same whenever I hit a problem using a framework.

I fire up the logging, find out which method calls are involved, and go through the source.

In fact, it helped me understand the framework much better.

Of course, some would argue that reading source is hard. But I did not find it that way. I could still find my way around it, even though it might take a while.

Let me share my latest experience.

I was using Struts2 with Tiles, and I ran into a problem with wildcard matching of definition names combined with cascading attributes: an attribute was not found in the nested definition.

First, I checked the documents. They cited examples of wildcards, and of cascading attributes, but not together. So I was not able to determine whether I was using them correctly.

Next I tried Google. It took a while, but I finally managed to find someone with the same problem. Apparently it was a bug the guy reported; he was asked to file a JIRA (not sure if he did), and it was dated April 2009.

But this did not solve my problem, and since this was a deliverable project, I had to fix it myself. I dug through the code and realized that the problem was in ResolvingLocaleUrlDefinitionDAO.replaceDefinition, where the definition was not duplicating the cascading attributes, only the local ones.

Now, to figure out how to fix it: I just needed a custom DAO. From my experience with various frameworks, a good framework usually allows you to provide custom components, defined in a property file or web.xml. There was no property for providing a custom DAO (I traced through the DAO creation path in the code), but there was a way to provide a custom DefinitionsFactory, which creates the DAO.

So I did that, and solved my bug.

If I had given this task to that new colleague, I doubt the bug would have been resolved. Somehow, decompiling might not have crossed his mind.

I am trying to find out the reason for the difference.

Was it because I was placed in a situation where I had to read the source and debug and fix?

Was it because I had more years of experience working? (approaching 6 this year)

Was it because of inadequate googling skills? (That proved to be a problem for another ex-colleague of ours, who was not that good in English.)

Was it because I am used to reading logs?

Was it because I am comfortable reading source code?

But in any case, learning new frameworks and reading source code does give me various new ideas on how to design and develop. A case in point: in a framework, most components should be definable and replaceable by the consumer of the framework, just like in the Tiles case above.

Friday, April 3, 2009

Digging into Jersey JAX-RS: 3. Request flow

Logging

Now it is time for a more complete feel of how a request flow through the system. And to do that, we will enable logging for Jersey.

Jersey uses java.util.logging as its logging framework, so enabling logging is easy. Just throw a logging.properties into WEB-INF/classes with a log level of ALL for com.sun.jersey.
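A minimal logging.properties along those lines might look like this (the exact handler setup is my assumption, not something the Jersey docs dictated):

```properties
# WEB-INF/classes/logging.properties
handlers=java.util.logging.ConsoleHandler
java.util.logging.ConsoleHandler.level=ALL
com.sun.jersey.level=ALL
```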

(However, I did not manage to view my log output in Tomcat launched via Eclipse. Somehow the console displays only levels up to FINE, and I needed levels up to FINEST... so I launched it via normal Tomcat outside of Eclipse.)

So I made a request for the xml content, and then viewed the log file.

... and there was nothing logged about the request.

Guess it's back to the code for manual mental tracing :(

Request Flow

The entry point of jersey is of course, the filter we set up, com.sun.jersey.spi.container.servlet.ServletContainer. 

I first looked at the code of com.sun.jersey.spi.container.servlet.ServletContainer#doFilter, and I went... ???

It is part of a filter chain, yet it does not invoke the next filter in the chain! It simply consumes the current request and is absolutely sure that it will handle it. That does not seem to play well with static content, unless my filter does not catch /*.

But let me put that aside for now. The request flows through to com.sun.jersey.spi.container.servlet.WebComponent#service, where the application context, user principal, and form input are set up, and the request/response are stored in thread-local storage (though why, I am unsure).

Next would be com.sun.jersey.server.impl.application.WebApplicationImpl#handleRequest, which applies the request filters. 

com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule#accept is triggered, which tries to match the input path and process it. This is where the com.sun.jersey.server.impl.template.ViewableRule mentioned in the previous post comes into play. But there are others as well. 

It is highly likely, in our example, that com.sun.jersey.server.impl.uri.rules.ResourceObjectRule/ResourceClassRule is matched. These are created in com.sun.jersey.server.impl.application.WebApplicationImpl#processRootResources, where all located resources/singletons (scanned by the Jersey runtime per the web.xml configuration) annotated with Path are attached to the root rule.

Eventually, com.sun.jersey.server.impl.uri.rules.HttpMethodRule comes into play, which dispatches the request to the Java method of the resource.

And after the com.sun.jersey.server.impl.uri.rules.RootResourceClassesRule#accept is completed, the response filters are applied to the response, and the output is written out.

javax.ws.rs.ext.MessageBodyWriter & javax.ws.rs.ext.MessageBodyReader

So where do these message body reader/writers come into play?

In com.sun.jersey.spi.container.ContainerResponse#write, the right message body writer is located via best media type match. 

And in com.sun.jersey.spi.container.ContainerRequest#getEntity, the right message body reader is located as well via best media type match.

It is interesting to note, however, that the result of com.sun.jersey.spi.container.ContainerRequest#getEntity is not cached internally, which means that multiple calls to it might result in an error. I would assume that it is currently carefully handled.

Summary

It was an interesting exploration, and there is a lot of magic involved (magic to me; I just did a touch-and-go on the purposes of most classes). I was, however, rather disappointed in the servlet/filter implementation, in that it does not give the next in the chain a chance to handle the request if Jersey itself does not handle it.

Now I am rather curious as to how the other implementations behave. I ought to take a look too. And what will I do after that look? Probably try to code up a website on pure JAX-RS if possible, reverting to Jersey's helpers where necessary.

Monday, March 30, 2009

Digging into Jersey JAX-RS: 2. custom message body writers

I decided to start with something simple: a service that simply returns a Java POJO.

So with that in mind, I created an EntryPoint that mapped to the root path of /.

Simple Service

@Path("/")
public class EntryPoint {
     @GET
     @Produces(MediaType.APPLICATION_XML)
     public Data getData() {
          final Data data = new Data();
          data.add("key1", "value1");
          data.add("key2", "value2");
          return data;
     }
}

And I tried to access the web application with my Safari, deployed to tomcat automatically with eclipse.

A message body writer for Java type, class me.kentlai.jaxrs.services.Data, and MIME media type, application/xml, was not found.

Hmm... I wondered what message body writers were registered in the system. Other than the com.sun.jersey.server.impl.template.ViewableMessageBodyWriter in jersey-server, the following were found in jersey-core:

com.sun.jersey.core.impl.provider.entity.StringProvider
com.sun.jersey.core.impl.provider.entity.ByteArrayProvider
com.sun.jersey.core.impl.provider.entity.FileProvider
com.sun.jersey.core.impl.provider.entity.InputStreamProvider
com.sun.jersey.core.impl.provider.entity.DataSourceProvider
com.sun.jersey.core.impl.provider.entity.RenderedImageProvider
com.sun.jersey.core.impl.provider.entity.MimeMultipartProvider
com.sun.jersey.core.impl.provider.entity.FormProvider
com.sun.jersey.core.impl.provider.entity.FormMultivaluedMapProvider
com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$App
com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$Text
com.sun.jersey.core.impl.provider.entity.XMLRootElementProvider$General
com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$App
com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$Text
com.sun.jersey.core.impl.provider.entity.XMLJAXBElementProvider$General
com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$App
com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$Text
com.sun.jersey.core.impl.provider.entity.XMLListElementProvider$General
com.sun.jersey.core.impl.provider.entity.ReaderProvider
com.sun.jersey.core.impl.provider.entity.StreamingOutputProvider
com.sun.jersey.core.impl.provider.entity.SourceProvider$SourceWriter

OK, to be fair, there are various ways to serialize XML, and various XML libraries. There are a few built-in XML providers, but they take in JAXB elements or DOM elements (as far as I can see; I am not an XML guru).

It is a good chance to experiment with my own provider: an XStream writer provider.

Custom XStream Message Body Writer

This is not an entry about how to use xstream. There are much better articles out there for xstream, so I will skim over the details.

I annotated my POJO with xstream annotations, to make the xml output prettier.

The XStream writer provider was rather simple to write. All it had to do was match the XML media type and write the object out with an XStream instance. (I am not using any XStream annotations in the writer itself.) Note also that I annotated it with javax.ws.rs.ext.Provider and com.sun.jersey.spi.resource.Singleton. The first allows the class to be picked up when Jersey scans the package for providers, and the second asks Jersey to create only a single instance of the writer for the entire web application. This is the better option, as readers/writers should be stateless and reusable.

@Provider
@Singleton
public class XStreamMessageBodyWriter implements MessageBodyWriter<Object> {
     public long getSize(final Object t, final Class<?> type, final Type genericType,
               final Annotation[] annotations, final MediaType mediaType) {
          return -1;
     }
     public boolean isWriteable(final Class<?> type, final Type genericType,
               final Annotation[] annotations, final MediaType mediaType) {
          return mediaType.isCompatible(MediaType.APPLICATION_XML_TYPE) ||
               mediaType.isCompatible(MediaType.TEXT_XML_TYPE);
     }
     public void writeTo(final Object t, final Class<?> type, final Type genericType,
               final Annotation[] annotations, final MediaType mediaType,
               final MultivaluedMap<String, Object> httpHeaders,
               final OutputStream entityStream) throws IOException, WebApplicationException {
          xstream.processAnnotations(type);
          xstream.toXML(t, entityStream);
     }
     private final XStream xstream = new XStream();
}

And accessing the redeployed page gives me the following:

<data>
     <entry key="service1">
          <value>path1</value>
     </entry>
     <entry key="service2">
          <value>path2</value>
     </entry>
</data>

Cool! Now how about json?

Custom JSON Message Body Writer

Now change the javax.ws.rs.Produces to produce MediaType.APPLICATION_JSON instead.

A message body writer for Java type, class me.kentlai.jaxrs.services.Data, and MIME media type, application/json, was not found

That was expected. Now, to handle JSON specifically with a JSON message body writer using json-lib.

@Provider
@Singleton
public class JSONMessageBodyWriter implements MessageBodyWriter<Object> {
     public long getSize(final Object t, final Class<?> type, final Type genericType,
               final Annotation[] annotations, final MediaType mediaType) {
          return -1;
     }
     public boolean isWriteable(final Class<?> type, final Type genericType,
               final Annotation[] annotations, final MediaType mediaType) {
          return mediaType.isCompatible(MediaType.APPLICATION_JSON_TYPE);
     }
     public void writeTo(final Object t, final Class<?> type, final Type genericType,
               final Annotation[] annotations, final MediaType mediaType,
               final MultivaluedMap<String, Object> httpHeaders,
               final OutputStream entityStream) throws IOException, WebApplicationException {
          entityStream.write(serializer.toJSON(t).toString().getBytes());
     }
     private final JSONSerializer serializer = new JSONSerializer();
}

Accessing the page gives me the following:

{"data":[{"key":"key1","value":"value1"},{"key":"key2","value":"value2"}]}

End of this post

This was a simple post, with nothing overly hard to try or implement.

Next up I will probably try to trace the flow of request through the components up to the response.

Update: My bad. I did not notice that if I annotate my POJO with JAXB's @XmlRootElement/@XmlType, it will trigger the JAXB writer.

Friday, March 27, 2009

Digging into Jersey JAX-RS: 1. setting up

So I started with a maven web application in Eclipse to do my test drive of jersey.

I added the repositories, and the jersey-server dependency (https://jersey.dev.java.net/source/browse/*checkout*/jersey/tags/jersey-1.0.2/jersey/dependencies.html)

I started out with a filter, instead of the servlet. I always prefer filters. I also added the following initialization parameters:

com.sun.jersey.config.property.packages: me.kentlai.jaxrs
com.sun.jersey.config.feature.Redirect: true
com.sun.jersey.config.feature.ImplicitViewables: true
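In web.xml terms, that registration looks something like this (the filter name and URL pattern here are my own choices, and I am assuming com.sun.jersey.spi.container.servlet.ServletContainer serves as the filter class, since it doubles as both servlet and filter in Jersey 1.x):

```xml
<filter>
    <filter-name>jersey</filter-name>
    <filter-class>com.sun.jersey.spi.container.servlet.ServletContainer</filter-class>
    <init-param>
        <param-name>com.sun.jersey.config.property.packages</param-name>
        <param-value>me.kentlai.jaxrs</param-value>
    </init-param>
    <init-param>
        <param-name>com.sun.jersey.config.feature.Redirect</param-name>
        <param-value>true</param-value>
    </init-param>
    <init-param>
        <param-name>com.sun.jersey.config.feature.ImplicitViewables</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>jersey</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
```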

com.sun.jersey.config.property.packages

I'm probably dumb, because I had a hard time finding documentation for what these properties actually do. I do know that com.sun.jersey.config.property.packages tells Jersey which package to scan for resource classes, but all the examples I saw specified only a single package.

Does it work for multiple packages? It did say packages..

The documentation did mention that the search is for the declared package and all sub-packages, so that would be fine. But how about two different packages?

I checked the source. The presence of com.sun.jersey.config.property.packages is what triggers package scanning. So how can I go about specifying multiple values?

So apparently, the documentation can be found in com.sun.jersey.spi.container.servlet.ServletContainer. Use ; as the separator.

I noticed two other properties mentioned:

com.sun.jersey.config.property.resourceConfigClass
javax.ws.rs.Application

I will probably look at them at a later date.

In conclusion: the value for com.sun.jersey.config.property.packages can be a ;-separated string listing the packages to scan. The scan covers sub-packages too.
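So a multi-package configuration would look like this (the second package name is made up for illustration):

```xml
<init-param>
    <param-name>com.sun.jersey.config.property.packages</param-name>
    <param-value>me.kentlai.jaxrs;me.kentlai.providers</param-value>
</init-param>
```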

com.sun.jersey.config.feature.Redirect

What is this? After doing a grep through the source, I found the documentation in com.sun.jersey.api.core.ResourceConfig. Apparently, what it does is this (when set to true):

request uri -> /test/abc
path declared -> /test/abc/

And the client will be redirected to /test/abc/. Hmm, there seems to be no particularly strong reason for me to turn that on. I'll keep it at false (the default value) then.

I saw two other properties of interest in com.sun.jersey.api.core.ResourceConfig

com.sun.jersey.config.feature.NormalizeURI & com.sun.jersey.config.feature.CanonicalizeURIPath

They have a default value of false, and they seem to be about normalizing URIs. For example, com.sun.jersey.config.feature.NormalizeURI would turn a request URI of a/b/../c into a/c. I'm not sure if it affects the routing, but I would not mind having it turned on. Maybe some people like to check for intrusion attempts?

I'm not too sure what the second one does, but from what I understand after digging into the code, it seems to turn a request URI of ///a/b/c into /a/b/c. I guess it can be turned on as well.
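The dot-segment removal that NormalizeURI performs is the same idea as the JDK's own URI normalization, which makes for an easy illustration (this is just java.net.URI, not Jersey's code):

```java
import java.net.URI;

public class UriNormalizeDemo {
    // Collapse dot-segments the way URI normalization does: a/b/../c -> a/c
    static String normalize(String path) {
        return URI.create(path).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("a/b/../c")); // a/c
    }
}
```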

I definitely need enlightenment on why some people might want this to be false.

It is also interesting to note that for com.sun.jersey.config.feature.CanonicalizeURIPath to take place, com.sun.jersey.config.feature.NormalizeURI must be true too.

The code to handle the normalization is found in com.sun.jersey.server.impl.container.filter.NormalizeFilter.

A filter?

Filters

Jersey supports filters! They come as two interfaces, com.sun.jersey.spi.container.ContainerRequestFilter and com.sun.jersey.spi.container.ContainerResponseFilter, so a custom filter can act before processing, after processing, or (if so inclined) both. There are quite a few filters already defined.

So where/how are they registered?

They seem to be located by com.sun.jersey.server.impl.container.filter.FilterFactory (and it turns out there are other filters, like resource filters, too), which com.sun.jersey.server.impl.application.WebApplicationImpl queries and executes in order of declaration. Additional filters can be declared in the following init-params, and they are executed before the built-in filters:

com.sun.jersey.spi.container.ContainerRequestFilters
com.sun.jersey.spi.container.ContainerResponseFilters
com.sun.jersey.spi.container.ResourceFilters

The documentation for these can once again be found in com.sun.jersey.api.core.ResourceConfig. In short, each is either a single string value or a list of string values, which are fully qualified class names. It is interesting to note that the documentation and code create these as singletons. That is, if you specify a single class in all three filter properties, it is created as a single instance.
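Registering an extra filter, then, is just another init-param on the Jersey filter/servlet. As an example, here is Jersey's own com.sun.jersey.api.container.filter.LoggingFilter hooked into both sides (any fully qualified filter class name would do):

```xml
<init-param>
    <param-name>com.sun.jersey.spi.container.ContainerRequestFilters</param-name>
    <param-value>com.sun.jersey.api.container.filter.LoggingFilter</param-value>
</init-param>
<init-param>
    <param-name>com.sun.jersey.spi.container.ContainerResponseFilters</param-name>
    <param-value>com.sun.jersey.api.container.filter.LoggingFilter</param-value>
</init-param>
```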

But where are the built-in filters defined? I had to dig deep, and found the com.sun.jersey.spi.service.ServiceFinder class. It is a way to locate services 'exported' by jars: each jar carries a META-INF/services folder containing files named after the service interface, each listing implementation class names. So, opening the jersey-server jar, I found a single file with this value:

file: com.sun.jersey.spi.container.ContainerRequestFilters
values: com.sun.jersey.server.impl.container.filter.NormalizeFilter

There are more files in there, but for my current purpose, knowing this is sufficient.
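The META-INF/services lookup mechanism itself is simple enough to sketch with plain JDK classes. This is my own illustration of the idea, not Jersey's actual ServiceFinder code (the JDK's java.util.ServiceLoader implements the same convention):

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

public class ServiceFinderSketch {
    // Collect the implementation class names listed in every
    // META-INF/services/<serviceName> file visible to the class loader.
    static List<String> findServiceClassNames(String serviceName) {
        List<String> names = new ArrayList<>();
        try {
            Enumeration<URL> resources = ServiceFinderSketch.class.getClassLoader()
                    .getResources("META-INF/services/" + serviceName);
            while (resources.hasMoreElements()) {
                URL url = resources.nextElement();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        line = line.trim();
                        if (!line.isEmpty() && !line.startsWith("#")) {
                            names.add(line); // one fully qualified class name per line
                        }
                    }
                }
            }
        } catch (java.io.IOException ignored) {
            // a missing or unreadable descriptor simply contributes no entries
        }
        return names;
    }
}
```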

There are more filters available too, like com.sun.jersey.api.container.filter.GZIPContentEncodingFilter, com.sun.jersey.api.container.filter.LoggingFilter, etc.

com.sun.jersey.config.feature.ImplicitViewables

This has to be on for Jersey to resolve your JSP views implicitly. I'm curious about a few things.

1. The example had the JSP files in the root of the webapp. Convention puts them inside WEB-INF. There should be a way to change that.

2. Is JSP the only option? Does it actually expose the ability to plug in other views like FreeMarker/Velocity? There is mention of a com.sun.jersey.spi.template.TemplateProcessor, but how is it hooked up?

The magic seems to be in two places: com.sun.jersey.server.impl.model.ResourceClass and com.sun.jersey.server.impl.template.ViewableMessageBodyWriter.

The com.sun.jersey.server.impl.template.ViewableRule seems to be responsible for matching additional subpaths to the original resource. There is an additional match of empty/null subpath, and a match of an additional segment.

E.g., take resource A with path /a/b. A request to /a/b triggers the implicit view. A request to /a/b/c will be processed, but a request to /a/b/c/d will not be handled by resource A.

It iterates through the template processors from com.sun.jersey.spi.template.TemplateContext (it is injected, but what are the values?). And it only accepts the HTTP GET method. What are the implications? I guess it means that if a request comes in without a properly matched resource/path, it should not be redispatched to an implicit view. I won't be too sure what it really means until I experiment with HTTP POST.

A subclass, com.sun.jersey.server.impl.template.TemplateFactory, is created in com.sun.jersey.server.impl.application.WebApplicationImpl and injected into com.sun.jersey.server.impl.template.ViewableRule. So there are two ways to have a com.sun.jersey.spi.template.TemplateProcessor hooked up: services (as mentioned above), and providers. Providers seem to be classes marked with the javax.ws.rs.ext.Provider annotation.

Except that com.sun.jersey.server.impl.container.servlet.JSPTemplateProcessor is not marked with it! Mysterious..

But it turns out I missed another way template processors can be injected: via singleton instances registered with com.sun.jersey.api.core.ResourceConfig. The creation and registration is done in com.sun.jersey.spi.container.servlet.WebComponent, which is the parent class of the Jersey filter/servlet we usually register.

Now on to com.sun.jersey.server.impl.container.servlet.JSPTemplateProcessor. The class actually mentions a configuration property used to look up JSP files: com.sun.jersey.config.property.JSPTemplatesBasePath. Woo hoo!

Now back to com.sun.jersey.server.impl.template.ViewableMessageBodyWriter. Now that I am used to how things work, I checked whether there is a javax.ws.rs.ext.MessageBodyWriter file in the META-INF/services of the jar. Sure enough, there is one, with the entry com.sun.jersey.server.impl.template.ViewableMessageBodyWriter. And this is how it is injected into the application.

Looking at the writeTo code of com.sun.jersey.server.impl.template.ViewableMessageBodyWriter: it iterates through the template processors from com.sun.jersey.spi.template.TemplateContext again, and asks each template processor to resolve a path (which is the fully qualified class name, with . replaced by /, appended with 'index' if the sub-path is empty).
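That path derivation is easy to sketch as a string manipulation (my own reconstruction of the rule just described, not Jersey's code):

```java
public class TemplatePathDemo {
    // Derive an implicit view path from a resource class name: dots become
    // slashes, and an empty sub-path falls back to "index".
    static String templatePath(String resourceClassName, String subPath) {
        String base = "/" + resourceClassName.replace('.', '/');
        return base + "/" + (subPath.isEmpty() ? "index" : subPath);
    }

    public static void main(String[] args) {
        System.out.println(templatePath("me.kentlai.jaxrs.services.Data", ""));
        // /me/kentlai/jaxrs/services/Data/index
    }
}
```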

But wait.

The iteration does not stop at the first resolved template! All resolved templates are written to! What could this mean? Potentially we could register two com.sun.jersey.spi.template.TemplateProcessor instances, and if both match, each writes its output to the output stream? It does seem to indicate that..

To clarify, we have to find out more about the com.sun.jersey.server.impl.template.ResolvedViewable mentioned in the class. It is the only exception to iterating over the available template processors. Who would set this?

Turns out we have to go back to com.sun.jersey.server.impl.template.ViewableRule. In its accept-request check, it actually stops at the first resolved template processor and sets the response.

Things might be getting more confusing than required.

I will just dig into com.sun.jersey.server.impl.container.servlet.JSPTemplateProcessor first.

When asked to resolve a template, it attempts to locate the resource via the servlet context. If there is an exact match, the template string is returned immediately. Otherwise it appends '.jsp' and tries again. If there is still no match, null is returned. Pretty straightforward.
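That resolution order can be sketched like so, with the servlet context lookup abstracted into a predicate (again my own sketch, not the actual JSPTemplateProcessor source):

```java
import java.util.function.Predicate;

public class JspResolveSketch {
    // Exact match wins; otherwise retry with a ".jsp" suffix; otherwise null.
    static String resolve(String path, Predicate<String> resourceExists) {
        if (resourceExists.test(path)) {
            return path;
        }
        String withSuffix = path + ".jsp";
        if (resourceExists.test(withSuffix)) {
            return withSuffix;
        }
        return null;
    }
}
```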

And writeTo basically commits the status/headers to the response first, and then forwards to the new JSP via the request dispatcher.

While I'm here, I might as well find out what variables are exposed to the dispatched JSP. These are what I found, and what I suspect they mean (I've got to test to find out):

1. _basePath: Original path to the resource

2. resource: Resource that handled the request

3. it: The resulting data

4. _request: Original request instance

5. _response: Original response instance
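If those suspicions hold, a minimal implicit view could be as small as this (a made-up example, untested):

```jsp
<%-- "it" should be the model returned by the resource method --%>
<html>
  <body>
    <p>Handled by resource: ${resource}</p>
    <p>Model: ${it}</p>
  </body>
</html>
```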

End of this post

This has been a long post (by my standards). I approached this as a development and exploration diary, to pen down my discoveries. All I have done so far is figure out initialization parameters, and how the implicit view might flow.

I should stress that this is not meant to be an introduction on how to use Jersey. This is a post on understanding how Jersey works (even internally), in the hope of being able to use it more effectively. This is an exploration process, and as such, some of what is mentioned might be wrong, different, or changed in future versions. Some of the exploration might be incomplete, but it is sufficient for the understanding I desired.

Coming up in the next post, I will follow up with a simple, barebones Jersey web application with implicit views.

Wednesday, March 25, 2009

Jersey JAX-RS RI: An amazement

I took a longer harder look at Jersey, a JAX-RS Reference Implementation, and I was totally blown away.

Past

A background on the path to amazement.

I had heard about JAX-RS a while back. I think it was when I was playing around with Restlets. But I dismissed it, as I thought it was only for web services.

I went around, playing with Wicket and Spring as well. I'll just round up my feelings. They might be outdated conclusions though.

Restlet: When I played with it, it was very much in the pre-annotations craze days. I had to add my restlets to an application router manually, and extend Resource and Finder classes. I did not find out how to integrate with JSP at the time, so I was not sure if it was supported (I did see FreeMarker/Velocity support). But somehow, I noticed myself repeating a lot of code for different resources.

Wicket: It was a pleasant experience. Testing was easy too, and I felt very confident when developing the web application. There was no doubt, as my HTML and code were tested together. It was quite heavily session-based though, and turning sessions off or reducing their usage was harder than it should have been. Just when you feel used to the desktop-oriented way of developing with Wicket, you have to unlearn it and code session-less pages.

Spring MVC: I tried it out with annotations, and it was truly wonderful. The resulting controller did not feel like a web-based Java artifact, other than a minor hint of request/response/session here and there. Testing was easy too. But somehow, I am quite unsure how content other than HTML can be generated for selected paths/resources. It feels like a tool for power users, actually.

Now

So fast forward to now. I actually took a look at Jersey again. This time I actually took a longer, harder look.

The code was really clean and simple. It was annotation-driven, with @GET, @POST, @Consumes and @Produces. I had just come from the Spring playground, and this felt really attractive. It was how Restlet should have been (and which it now is, as far as I know; I heard Restlet currently has a JAX-RS implementation).

I knew I had to give it a go.

So I pondered an implementation. I wondered if I had to go with HTML pages accessing JAX-RS via JavaScript. I wondered if I could get the resources to spit out HTML. I could, but making it annotation-driven would take quite a bit of effort. Then I wondered which JavaScript framework would be suitable.

And as I was just going through the links on the jersey site, I saw this post: http://macstrac.blogspot.com/2009/01/jax-rs-as-one-web-framework-to-rule.html

Hmm.. so apparently, Jersey has this feature known as Implicit/Explicit views. That was really fascinating! And then I followed the link: http://blogs.sun.com/sandoz/entry/mvcj

Really very neat, it integrates with JSP! Not that I am a JSP fanboy, though. I will get to that point later.

But anyway, all this is good, but I did notice the issue James Strachan brought up, regarding Safari preferring XML to HTML. I also noticed that the Jersey implementation was registered as a servlet on /*. That would not play well with static content. I thought to myself that I could probably hack a filter version later.

Today

And now, I saw on Paul Sandoz's blog that Jersey 1.0.2 is out. OK, I know it was released in February, but I only got to know about it now. It added filter support, as well as ImplicitProduce!

I got excited. I decided to take a closer look at the implicit view now.

The documentation was sparse. There was nothing much on it. I had to download the samples to look at it. I checked out the bookstore sample.

The code was confusing. There was nothing much mentioned. I decided to run it.

And then it all made sense. And I was blown away.

The code was clean. Even annotations were kept to a minimum. It introduced some interesting ways of resolving paths beyond the standard Path annotation.

I am really very impressed. I am going to give it a test drive now.

Summary

But to sum it up first, JAX-RS might have just become my preferred choice of web development framework (if it does not disappoint me in my test drive).

It is a very simple model. A very simple concept. Which means a very low entry barrier. I can probably introduce someone to JAX-RS development faster than to Spring MVC. It would be even faster if there were a flow diagram of the components a request goes through, and the actors involved.

And I will stick to JSP for a start, as it is also a low-entry-barrier choice. One might argue that Velocity and FreeMarker are easy to pick up too, but given a handful of Java web developers, there is a higher chance of finding someone who knows JSP than either of the other two.

I'll see if I can have a followup entry on my test-drive experience in the coming weeks.

Thursday, March 5, 2009

Spring TestContext Framework

It has been an exciting, and at the same time, frustrating time for the past few days.
For better or for worse, I decided to take a dive into Spring MVC and Spring Webflow, trying and playing around with it.

I walked away knowing more about Spring TestContext Framework instead.

Before I begin, I have to conclude that Spring MVC is really very neat. Webflow was interesting, and came with its own testing framework, which kinda irritates me... (they should all stick to the Spring TestContext Framework instead). However, the inability to test the resulting view properly (without firing up Selenium) was a sore point for me..

So anyway, back to Spring TestContext Framework. It really is extensible.
Almost.

I could create a custom TestExecutionListener, hook it up with the test class, and have it run before/after every test method!

So my first experiment was to write a MockTestExecutionListener. What it basically does is create mocks for fields marked with @Mock (much like MockitoAnnotations). Pretty simple and easy. Then I hooked it up to my test class.
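The field-scanning half of that listener is plain reflection. Here is a self-contained sketch that injects a dynamic-proxy stub instead of a real Mockito mock (the @Mock annotation below is my own stand-in, not Mockito's, and a real listener would delegate to Mockito rather than Proxy):

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Proxy;

public class MockInjector {
    // Stand-in for org.mockito.Mock, purely for this sketch
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface Mock {}

    // Find @Mock interface-typed fields on the test instance and inject a stub.
    static void injectMocks(Object testInstance) {
        for (Field field : testInstance.getClass().getDeclaredFields()) {
            if (field.isAnnotationPresent(Mock.class) && field.getType().isInterface()) {
                Object stub = Proxy.newProxyInstance(
                        testInstance.getClass().getClassLoader(),
                        new Class<?>[] { field.getType() },
                        (proxy, method, args) -> null); // every call answers null
                field.setAccessible(true);
                try {
                    field.set(testInstance, stub);
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
    }
}
```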

The fastest way was to use @TestExecutionListeners, but I would have to declare DependencyInjectionTestExecutionListener as well, so that Spring injection would still be performed. Whatever classes are declared in @TestExecutionListeners override the default TestExecutionListeners that Spring offers in SpringJUnit4ClassRunner.

So instead, I extended SpringJUnit4ClassRunner, and added my TestExecutionListener in TestContextManager#getDefaultTestExecutionListenerClasses.

And then I got a bit more adventurous.

I tried to reuse my mocks, and at the same time verify in afterTestMethod that there were no further interactions. So I kept track of my mocks in the given TestContext, and verified them.

The painful part was reusing the mocks. There is just no clean way to remove all interactions from mocks created by Mockito, short of duplicating code from Mockito internals, cleaning delegates, blah blah blah. I'm trying to forget the ugly part of it.

So now, we have auto-mocking for our Spring Test!

And the next step. Wouldn't it be nice if some of my mocks are autowired to my Spring objects?

There is just no way it could be done nicely, especially since mocks are recreated for every test method execution.

A custom context loader can be provided, which can customize the loading of an application context.

Singletons are created and cached. And the application contexts are cached as well. So if two classes use the same set of test configurations, the application context is shared across the two class executions.

So it probably is out of the question.

And then I started playing with DbUnit. It is an excellent candidate for a TestExecutionListener: it can auto-locate a dataset XML file, initialize the database for each test method invocation, and verify that the current state of the database matches an expected dataset.
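For reference, a DbUnit flat XML dataset names tables as elements and columns as attributes; something like this (the table and columns are invented for illustration):

```xml
<dataset>
    <users id="1" name="alice"/>
    <users id="2" name="bob"/>
</dataset>
```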

The only gripe I had was that in order to provide a DataSource to my TestExecutionListener, I had to locate it via the current test's application context.

The Spring TestContext Framework is a very interesting framework, but it would be even better if many methods were not made final, and if there were a cleaner way to declare custom classes in place of the built-in TestContext, TestContextManager, etc.