Saturday, May 30, 2009

New article: how to choose a Java collection

A new addition to the Java Collections section of the Javamex site looks at a question that, perhaps unsurprisingly, crops up fairly frequently: which collections class should you use for a given task? The various collection classes provide a powerful means of managing objects and data in memory, but they're so powerful that choosing between them can sometimes be a bit daunting.

The approach that I take is to split the question into two sub-questions. Firstly, what general type of structure do you need? That is, how is the data basically going to be organised? Usually, there is not too much confusion between a list and a map, especially if you consider whether or not you need to answer the question "for a given X, what is the Y?". But the difference between a set and a list is sometimes not well understood or, at least, not considered. With a little bit of thought, it is usually possible to decide in advance whether the purpose of your collection is to answer the question "is something there or not?". The trick is for programmers to remember to ask that question in the first place, and not simply plump for a list regardless.
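As a rough illustration of those two questions, here is a minimal sketch. The class name, the helper methods and the page-visit data are all made up for the example and are not taken from the article; the point is only that "is something there or not?" maps naturally onto a Set, and "for a given X, what is the Y?" onto a Map, while a List just records things in sequence.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StructureChoice {

    // "Is something there or not?" points to a set.
    static boolean hasVisited(Set<String> visited, String page) {
        return visited.contains(page);
    }

    // "For a given X, what is the Y?" points to a map.
    static Integer hitsFor(Map<String, Integer> hits, String page) {
        return hits.get(page); // null if the page was never seen
    }

    // Hypothetical helper: count how often each page occurs in the list.
    static Map<String, Integer> countHits(List<String> visits) {
        Map<String, Integer> hits = new HashMap<>();
        for (String page : visits) {
            Integer old = hits.get(page);
            hits.put(page, old == null ? 1 : old + 1);
        }
        return hits;
    }

    public static void main(String[] args) {
        // A list keeps every visit, in order, duplicates included.
        List<String> visits = new ArrayList<>();
        visits.add("home");
        visits.add("about");
        visits.add("home");

        // A set only records whether each page was seen at all.
        Set<String> visited = new HashSet<>(visits);
        Map<String, Integer> hits = countHits(visits);

        System.out.println(visits.size());                   // 3
        System.out.println(visited.size());                  // 2
        System.out.println(hasVisited(visited, "contact"));  // false
        System.out.println(hitsFor(hits, "home"));           // 2
    }
}
```

If the only thing the program ever does with the list is call contains() on it, that is usually a sign that a set was the structure being asked for.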

Then, having decided on a list, map, set or queue, the next sub-question is which particular flavour is required. In the case of a list, the cases where you wouldn't plump for an ArrayList are relatively uncommon, and if you do choose something else, it should be clear in your head that you have a "special case".

In the case of maps, sets and queues, there are definitely more "horses for courses". Still, especially for the first two, there are essentially two factors to consider: concurrency and ordering. When the various choices are presented in tabular form according to these criteria, as they are in the article, things become a little easier. As the article mentions, a key rule of thumb is to choose the class that provides the minimum features that you require. If you don't need ordering, don't pay for it.
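To make the two criteria concrete for maps, the sketch below picks a standard JDK map class from an ordering requirement and a concurrency flag. The MapChoice class, the Ordering enum and the newMap method are hypothetical names invented for this post, and the selection is my own summary of the standard classes rather than a copy of the table in the article. Note one assumption in particular: the JDK has no concurrent map that preserves insertion order, so that case falls back to a synchronized wrapper around a LinkedHashMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

public class MapChoice {

    // Hypothetical enum describing what ordering the caller needs.
    enum Ordering { NONE, INSERTION, SORTED }

    // Returns the "smallest" standard map that meets the requirements.
    static <K, V> Map<K, V> newMap(Ordering ordering, boolean concurrent) {
        if (concurrent) {
            if (ordering == Ordering.NONE) {
                return new ConcurrentHashMap<K, V>();
            }
            if (ordering == Ordering.SORTED) {
                // Keys must be mutually comparable at run time.
                return new ConcurrentSkipListMap<K, V>();
            }
            // Assumption: no concurrent insertion-ordered map exists in the
            // JDK, so wrap a LinkedHashMap in a synchronized view instead.
            return Collections.synchronizedMap(new LinkedHashMap<K, V>());
        }
        if (ordering == Ordering.SORTED) {
            return new TreeMap<K, V>();
        }
        if (ordering == Ordering.INSERTION) {
            return new LinkedHashMap<K, V>();
        }
        // No ordering, no concurrency: don't pay for either.
        return new HashMap<K, V>();
    }

    public static void main(String[] args) {
        Map<String, Integer> sorted = newMap(Ordering.SORTED, false);
        sorted.put("pear", 1);
        sorted.put("apple", 2);
        System.out.println(sorted.keySet());   // [apple, pear]

        Map<String, Integer> inserted = newMap(Ordering.INSERTION, false);
        inserted.put("pear", 1);
        inserted.put("apple", 2);
        System.out.println(inserted.keySet()); // [pear, apple]
    }
}
```

The same two questions apply to sets, where HashSet, LinkedHashSet, TreeSet and ConcurrentSkipListSet play the corresponding roles.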

Java queues present a slightly complex set of choices, but in at least some cases, especially a DelayQueue or SynchronousQueue, it should be really clear that you need that class for a special-case scenario. It should also be borne in mind that the executors framework means that in many common producer-consumer scenarios, you don't have to deal explicitly with the underlying job queue.
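To show what "not dealing with the job queue" looks like in practice, here is a minimal sketch using a fixed-size thread pool. The class name, the sumOfSquares method and the pool size of 4 are arbitrary choices for the example. The calling thread acts as the producer by submitting jobs, the pool's worker threads are the consumers, and the queue that sits between them is created and managed inside the executor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorSketch {

    // Squares 1..n on a thread pool and adds up the results.
    static int sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int i = 1; i <= n; i++) {
                final int x = i;
                Callable<Integer> job = () -> x * x;
                // submit() places the job on the pool's internal queue;
                // we never create or touch that queue ourselves.
                results.add(pool.submit(job));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get(); // blocks until that job has finished
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("job failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(4)); // 30
    }
}
```

An explicit BlockingQueue is still the right tool when the consumers need behaviour that an executor doesn't give you, but for plain "hand this job to a worker" cases the executor is usually less code.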

As usual, further questions and comments to any of the articles on Javamex are welcome on either this blog or the associated Java forum.
