OsterFelder 2007

Just like in 2004, I participated in the OsterFelder Berglauf again this year. This is such a great event that I'm really considering making it a yearly tradition with my good running buddy Thomas (he's the guy on the left, picture taken before the run - notice that I'm still smiling!).

For those of you who don't really know what a mountain run is, I can explain it really simply in 2 steps:

  1. Find a mountain similar to the one below (in Belgium this is actually the hardest part!)


  2. Run to the top as fast as you can, sort of like me here:
I had planned to take my new TZ3 with me during the run and shoot some pictures along the way, but decided against it shortly before the start. Carrying the camera in hand whilst running is quite a hassle, and I didn't have a special pouch for it. Also, people tend to throw away all ballast when low on energy and mental strength - didn't want to risk anything ;-)

I ended up finishing in 1h53m15s, a big improvement over my previous 2h16m25s! After the finish we still sat for an hour or so just to enjoy the view and superb weather.

You can find the complete Flickr photoset here.

The story of Mel


Pure nostalgia ahead!

Every developer will at some point in his career have to deal with an incarnation of Mel.
For those of you that haven't already: May The Force Be With You when you do finally meet him!

Optimizing your maven2 build time

Or perhaps I should say, "how to not slow your maven2 build process to a crawl".

There has been some discussion on the Hudson mailing list recently about the inordinate amount of time, 40+ minutes, it takes to build Hudson from scratch. In maven2 speak, "from scratch" means you have an empty local repository. Normally, the only time you have an empty local repository is when you've just installed maven and have not run any build yet. The first time you run a build, maven goes out to its central repository and pulls down the dependent libraries (also called artifacts) it needs during the build. These artifacts are then cached in the local repository.

Everyone is free to publish their own artifacts to maven's central repository. The process for doing this is documented here.

In addition to this, maven offers you the possibility to specify your 'own' repository, to be used during the build in addition to maven's central repository. You could store your company artifacts for example on a server that is visible on the intranet but invisible to the outside world. This custom repository is specified in your pom.xml, more specifically in the <repositories> element. There are a few caveats with this however, which is exactly what this post is all about.

Caveat 1
Following good inheritance principles, most people define their <repositories> in the root pom. This makes sense: you define them once and they are visible to all child modules. This way your custom artifacts become shared between all build instances - neat, and problem solved, right? Well ... sort of, but not quite. The thing is that maven2 will now search for its artifacts in your custom repositories before it attempts to contact its own central repository.

So imagine you have this in your pom:


<repositories>
  <repository>
    <id>java.net2</id>
    <url>http://download.java.net/maven/2/</url>
  </repository>
  <repository>
    <id>java.net1</id>
    <url>http://download.java.net/maven/1/</url>
  </repository>
</repositories>

You will most likely get this during your build:

Downloading: http://download.java.net/maven/2/commons-collections/commons-collections/2.0/commons-collections-2.0.pom
Downloading: http://download.java.net/maven/1/commons-collections/commons-collections/2.0/commons-collections-2.0.pom
Downloading: http://repo1.maven.org/maven2/commons-collections/commons-collections/2.0/commons-collections-2.0.pom
171b downloaded

You can clearly see that maven contacts your custom repositories before maven central. If you're doing a build from scratch, this becomes a costly operation and can easily double your build time. The reason is that something like 95% of your dependencies (direct and transitive) are located on central anyway, and perhaps 5% are custom artifacts for your project only.

The solution to this is simple: define maven central as a custom repository before your 'real' custom repositories:

<repositories>
  <repository>
    <id>central</id>
    <url>http://repo1.maven.org/maven2</url>
  </repository>
  <repository>
    <id>java.net2</id>
    <url>http://download.java.net/maven/2/</url>
  </repository>
  <repository>
    <id>java.net1</id>
    <url>http://download.java.net/maven/1/</url>
  </repository>
</repositories>

Now maven central is contacted first, and if the dependency is found there, maven doesn't bother contacting the other repositories. This is a good optimization for 95% of our dependencies, and the impact of the few artifacts we actually do pull from our custom repos is greatly reduced.
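To put some rough numbers on this, here is a toy model in Python. It is only a sketch under simple assumptions: repositories are tried strictly in the declared order, each try costs one request, and the 95/5 split is the guess from above. The artifact names and the `count_requests` helper are made up for illustration; real maven resolution also fetches poms, jars and metadata separately.

```python
def count_requests(repo_order, hosted_by, artifacts):
    """Count lookups when repositories are tried in order until one has the artifact."""
    total = 0
    for artifact in artifacts:
        for repo in repo_order:
            total += 1  # one request per repository we have to ask
            if artifact in hosted_by[repo]:
                break   # found it, no need to ask the remaining repositories
    return total


# Hypothetical build: 95 artifacts live on central, 5 on java.net2.
central_artifacts = {f"central-lib-{i}" for i in range(95)}
custom_artifacts = {f"custom-lib-{i}" for i in range(5)}
hosted_by = {
    "central": central_artifacts,
    "java.net2": custom_artifacts,
    "java.net1": set(),
}
artifacts = sorted(central_artifacts | custom_artifacts)

custom_first = count_requests(["java.net2", "java.net1", "central"], hosted_by, artifacts)
central_first = count_requests(["central", "java.net2", "java.net1"], hosted_by, artifacts)
print(custom_first, central_first)  # prints "290 105"
```

In this model, putting central first cuts the number of lookups by almost two thirds, simply because the common case is answered by the first repository asked.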

I will cover the other caveats and how to optimize them in a later post.

moving to blogspot

Every not-so-well-thought-out concept goes through a relaunch once its creator realizes his mistakes. It's not that I made many mistakes on my old blog, but there was one crucial element missing - C O N T E N T.

Speaking of which, I'm wondering if there is a way to migrate my wordpress drivel to blogger speak? If anyone has experience with this, please drop me a comment. It's only a dozen or so articles; if nothing else, I'll add them by hand.

Why is it that - SMS

Why is it that in times where gigabyte devices and high speed mobile networks are becoming more and more mainstream, we cannot send more than 160 characters in an SMS without getting billed extra?

(Wikipedia to the rescue)

It seems that technically, there is a limit of 140 bytes in the messaging protocol used between the SMSC and mobile device. However, that protocol also defines that larger messages can be segmented and that it’s up to the receiving device to reassemble the messages.
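To make the limit concrete, here is a small Python sketch that estimates how many segments a message of a given length needs. The assumptions are mine, not from the text above: 7-bit characters (so 140 bytes hold 160 of them) and a 6-byte concatenation header in every segment of a multi-part message, which leaves room for 153 characters per segment. The function name `sms_segments` is made up for illustration.

```python
SINGLE_LIMIT = 160   # 140 bytes * 8 bits / 7 bits per character
SEGMENT_LIMIT = 153  # assumption: 6 of the 140 bytes go to a concatenation header


def sms_segments(char_count):
    """Estimate the number of segments for a message of char_count 7-bit characters."""
    if char_count <= SINGLE_LIMIT:
        return 1
    # Ceiling division without floating point.
    return -(-char_count // SEGMENT_LIMIT)


for length in (160, 161, 306, 307):
    print(length, sms_segments(length))
```

So one character too many already doubles the number of segments for that message.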

Operators, however, insist on billing each segment separately. Why?

My guess is that they are still trying to compensate for their temporary loss of any common sense during the auctioning of the 3G network licenses :-P

maven 2.0.5 released !

On #maven (irc.codehaus.org) a few minutes ago …
————
[22:53] jvanzyl: done
[22:53] jvanzyl: released
[22:53] jvanzyl: 2.0.5 go away!
————

It’s been long in the making, but finally we’ve got a shiny new version of maven2. Loads of bugs fixed it seems, makes you wonder why they waited almost a year to do this patch release (not that we are any better though … ).

Oracle sequences

Fact: Our sun will run out of hydrogen in about 5 billion years (that’s 5 000 000 000). Remember this, and not only because at that point it starts expanding and will eventually envelop the earth.

Fact: The default maximum value for an oracle sequence is 1 octillion (that’s 1 000 000 000 000 000 000 000 000 000).

Fact: Often I see that people create multiple oracle sequences in their applications e.g. SEQ_PRODUCTS, SEQ_EVENTS, SEQ_REQUEST etc. I’m wondering if this is really necessary.

So: Let’s imagine that you build a really HIGH traffic system that processes about 100 million transactions per day, and let’s assume that this eventually leads to needing 10 billion sequence numbers per day. Every day. How many days will our sequence last?

The math for this is really simple: 10^27 / 10^10 = 10^17 days

10^17 days, that’s about 2.7 x 10^14 years = 270 000 000 000 000 years, or well over two hundred thousand billion years (still remember how long before the sun runs out of hydrogen?).
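If you don't trust my mental arithmetic, the same calculation fits in a few lines of Python. This is just a sketch: the constants are the numbers from this post, the 365.25 days per year is my own rounding, and the function names are made up.

```python
MAX_SEQUENCE_VALUE = 10**27     # default maximum quoted above
NUMBERS_PER_DAY = 10**10        # 10 billion sequence numbers per day
DAYS_PER_YEAR = 365.25          # assumption: average year length
SUN_YEARS_LEFT = 5 * 10**9      # hydrogen runs out in about 5 billion years


def sequence_lifetime_days(max_value, numbers_per_day):
    """Whole days until the sequence reaches max_value."""
    return max_value // numbers_per_day


def sequence_lifetime_years(max_value, numbers_per_day):
    """The same lifetime expressed in years."""
    return sequence_lifetime_days(max_value, numbers_per_day) / DAYS_PER_YEAR


years = sequence_lifetime_years(MAX_SEQUENCE_VALUE, NUMBERS_PER_DAY)
print(f"{years:.1e} years, {years / SUN_YEARS_LEFT:.0f} times the sun's remaining fuel time")
```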

So why ever bother creating multiple sequences? One argument could be performance: a heavily used sequence can suffer from performance problems due to contention. In this case however, Oracle allows you to cache a predefined number of sequence values upfront. The downside of this is that the unused numbers in the cache are lost when the database is restarted.

So, let’s assume we configure a sequence cache of 10 million (10 000 000) values, and we restart our database about COUGH 1 000 times per day. Do the math: the sun will still have run out of hydrogen before our sequence hits its limit.
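Here is that math as a Python sketch. It assumes the worst case, where every restart throws away a completely unused cache, on top of the 10 billion numbers we actually consume per day. The function name and the 365.25 days per year are my own choices.

```python
def lifetime_years_with_restarts(max_value, used_per_day, cache_size, restarts_per_day):
    """Years until max_value is reached if each restart wastes a full cache (worst case)."""
    wasted_per_day = cache_size * restarts_per_day
    consumed_per_day = used_per_day + wasted_per_day
    days = max_value / consumed_per_day
    return days / 365.25  # assumption: average year length


years_left = lifetime_years_with_restarts(
    max_value=10**27,
    used_per_day=10**10,      # 10 billion numbers actually used
    cache_size=10**7,         # 10 million cached values
    restarts_per_day=1000,    # cough
)
print(f"{years_left:.1e} years")
print(years_left > 5 * 10**9)  # compare with the sun's 5 billion years
```

With these numbers the restarts waste as many values as we use, so the lifetime is roughly halved, which still leaves it several orders of magnitude beyond the sun's 5 billion years.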

If anyone can give me a good reason why you should ever use more than 1 oracle sequence object in your application, I’d like to hear it.