Saturday, January 23, 2010

6 months in ING Direct Spain

Today marks six months of my tenure at ING Direct Spain, so it is a good time for some reflections.
1. My job is primarily administrative. During this time I was involved in a couple of troubleshooting cases, for which we were able to find the root cause and propose a conclusive solution.

2. The second thing is my effort to refactor our JBoss infrastructure and to use git to track and manage changes. One lesson I learned is that no matter how sophisticated the tools we use, the key to success is the ability a) to figure out what is specific and what is common and b) to reduce the number of configuration items.
At the beginning I tried to track the configuration of each machine as its own git branch, but that turned into a horror of hundreds of unmanageable branches. In the end I threw it away and started to divide the monolithic configuration into smaller parts and parametrize them for reuse. After that, using git or any other VCS to track them was a walk in a rose garden.

3. The third thing I started is documentation of our infrastructure, and I decided to use a wiki for it. It turned out to be a good choice. The documentation is up to date and practical because I use it during my daily work, so whenever I find a gap between the documentation and reality, I correct it. Updating documentation is not a separate task but part of my day-to-day activities.

4. And the last thing is my research into a central authentication solution for our Unix farm, a mix of Linux, AIX and Solaris. I tested the PAM and OpenLDAP combination for a while, but in the end I selected and recommended Likewise with the existing Active Directory as the authentication server, due to its lower cost of installation and operation.
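
To make the second point a bit more concrete, here is a minimal sketch of what "common versus specific" can look like. It is not our actual setup: the host names, addresses, parameter names and the template itself are all made up for illustration. The idea is only that every machine shares one template and one set of defaults, and each host lists just the few values that really differ.

```python
from string import Template

# Hypothetical settings shared by every JBoss instance.
COMMON = {"heap_max": "1024m", "http_port": "8080", "log_dir": "/var/log/jboss"}

# Hypothetical per-host overrides: only what really differs from COMMON.
HOST_SPECIFIC = {
    "app01": {"bind_address": "10.0.0.11"},
    "app02": {"bind_address": "10.0.0.12", "heap_max": "2048m"},
}

# One shared template instead of one full configuration copy per machine.
RUN_CONF = Template(
    'JAVA_OPTS="-Xmx$heap_max -Djboss.bind.address=$bind_address"\n'
    "HTTP_PORT=$http_port\n"
    "LOG_DIR=$log_dir\n"
)


def render(host):
    """Merge the common defaults with one host's overrides and fill the template."""
    params = {**COMMON, **HOST_SPECIFIC[host]}
    return RUN_CONF.substitute(params)


if __name__ == "__main__":
    for host in sorted(HOST_SPECIFIC):
        print("# " + host)
        print(render(host))
```

With this layout only the template and the two small dictionaries need to live in version control, on a single branch, instead of one branch per machine.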

Monday, January 18, 2010

java.lang.OutOfMemoryError Out of swap space?

I encountered this error in one of our HotSpot JVMs. After googling for a while without a definitive result, I decided to look at the source code of the HotSpot VM (happily, the source code is available on the Sun website).
Within a few minutes, with very little effort, I found that the JVM prints this message when it gets NULL back from a call to malloc. So the question becomes why malloc fails.
When the OS does not have enough RAM to satisfy malloc, it may try to swap unused memory out to disk, and if the swap space is configured too small, the error appears. But this is just one cause.
Another, more likely cause is that the JVM runs out of address space, and this was our case. It can happen when we run a 32-bit JVM and the total memory used by the JVM exceeds the magical limit of 3 GB (on a 32-bit Linux kernel).
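
A rough back-of-the-envelope sketch of that arithmetic follows. The breakdown into heap, permanent generation, thread stacks and "other" is a simplification I am assuming here, and the numbers in the example are invented, not measurements from our JVM. It only shows how quickly the big reservations eat into the 3 GB, leaving little room for malloc.

```python
MB = 1024 * 1024

# The 3 GB user address space limit of a 32-bit Linux kernel mentioned above.
USER_ADDRESS_SPACE = 3 * 1024 * MB


def native_headroom(heap_mb, perm_mb, threads, stack_kb, other_mb=0):
    """Estimate the bytes of address space left for malloc.

    Assumed (simplified) model: Java heap + permanent generation
    + one stack per thread + everything else lumped into other_mb.
    """
    reserved = (heap_mb + perm_mb + other_mb) * MB + threads * stack_kb * 1024
    return USER_ADDRESS_SPACE - reserved


if __name__ == "__main__":
    # Invented example values, for illustration only.
    left = native_headroom(heap_mb=2048, perm_mb=256, threads=500,
                           stack_kb=512, other_mb=300)
    print(left // MB, "MB left for native allocations")
```

When the estimate gets close to zero, adding swap will not help; the usual ways out are a smaller heap, fewer or smaller thread stacks, or a 64-bit JVM.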