Monthly Archives: October 2018

Be a Problem Solver Not a DBA #2

As a continuation of this series (first post here), I would like to discuss one of the processes that I use in an attempt to efficiently solve problems.  Recently, I was in the middle of a situation where a client resource came in pounding the table that they believed something needed to be disabled and it had to be done right now!  I didn’t dispute that there was a problem, but there was very little evidence of who was being impacted, what was being impacted and how much the problem was impacting these resources.  Part of being a good IT resource is not only identifying problems and how to fix them, but more importantly what are my alternatives and how is it effecting the system and if we fix it what else might happen.

Perhaps you have noticed, but from time to time I will make references to aviation and aviation related topics.  While it is one of my hobbies, aviation has also taught me to be a better problem solver.  This is because when you are up there and something happens, you need to have the tools to figure things out on your own as more often than not, there isn’t anyone else to help.  So this post will focus on a method of problem solving that I learned throughout my flight training; one that I rely heavily on today, the DECIDE model.

Aeronautical Decision Making (ADM) is a cornerstone of both being a good aviator and being able to solve problems.  So for the context of technology, lets just think of it as Decision Making.  If you ever care to read it, a full chapter on this is publicly available via the FAA website:

Effective Aeronautical Decision-Making


  • Define

Define the problem.  What is going on?  When did it start?  What is the impact?  Is there really a problem?  Is what needs to be happening still happening?

  • Establish the Criteria

Establish the criteria for evaluating the identified problem.  What do you need to achieve in this decision-making process?  What about the current situation do we need to make sure continues to happen?  What additional problems or side effects should we try to avoid inducing?

  • Consider Alternatives

What are all the possible choices which allow us to fulfill our previously defined criteria?  Do we do nothing?  Defer for a period of time?  Which alternative has the least assumptions?

  • Identify the BEST Alternative

Select the best alternative based upon experience, intuition and experimentation.  The important thing to remember here is to rely on all available resources to Identify the alternative.  While it’s important to be fast, do not sacrifice considering all corners of an alternative as part of identifying the BEST alternative.  Its OK to have an interim alternative before implementing the final alternative.

  • Do

Develop and Implement a plan of action.  When will we implement?  Is there a short-term and long-term implementation plan?  What people and other resources are needed?

  • Evaluate

Evaluate and monitor the solution.  Did the implementation go as expected?  What could go wrong?  How will it be handled?  What could be done next time to improve?


Hopefully implementing this into your routine and becoming methodical about solving problems will make you more efficient and more certain about the solutions you are proposing.  I know it works for me.

Additional References:
Problem Solving – Free Management Books


Series: Be a Problem Solver Not a DBA #1

I’m going to try from time to time to publish some scenarios I have been in throughout my career where being a DBA doesn’t mean being a DBA, it means being a problem solver.

In my opinion being a good problem solver requires following several basic tenets, one of which is:

“When presented with competing hypotheses to solve a problem, one should select the solution with the fewest assumptions.” – William of Ockham

Having done many migrations throughout my career, I have learned that performing database migrations is much like a “Reality TV” script. Everything starts out with a plan, the plan is executed and usually, with days to go, there is a big risk that jeopardizes the project. All to be figured out in the end with a successful migration. A recent migration was no different, however this time, it was a perfect example of how to be a Problem Solver not a DBA.

The purpose of this post is to not fully explain problem solving methods, it is more to discuss going outside the comfort zone as a DBA and look at items that you may not normally look at. In this case, I know enough about java and java coding to be dangerous, but knew that the other resources looking at this weren’t going to solve the problem (per the vendor, there was only one person on the planet who could solve this and they were on a bus in Italy) so I had to take things into my own hands.

A little background:

This particular migration was an upgrade from to on a SuperCluster. About a week or so out, I saw a very high spike in memory utilization to the point where the system was out of memory. Upon investigation, we found out that their scheduling agent was utilizing a newer version of Java and in turn, using 4x more heap space than the previous version of Java.

Upon investigation, I found that the process was not utilizing either the -Xms or -Xmx flags when invoking the process so what changed between Java versions to cause the increased utilization?

Since we did not own that portion of the application, the issue was transferred to the responsible party to troubleshoot. After several days of no movement, I decided to put my “Problem Solving” had on.

Using the your tenets of problem solving follow a logical path:
After lots of searching, I tried to check the defaults of what the java uses for min heap and max heap by default. There was a big change from the old and new version. For example, the old version used:

java -XX:+PrintFlagsFinal -version | grep HeapSize
    uintx ErgoHeapSizeLimit               = 0               {product}
    uintx InitialHeapSize                := 126556928       {product}
    uintx LargePageHeapSizeThreshold      = 134217728       {product}
    uintx MaxHeapSize                    := 2025848832      {product}
java version "1.6.0_45"
Java(TM) SE Runtime Environment (build 1.6.0_45-b06)
Java HotSpot(TM) 64-Bit Server VM (build 20.45-b01, mixed mode)

While the new version version used:

java -XX:+PrintFlagsFinal -version | grep HeapSize
    uintx ErgoHeapSizeLimit               = 0              {product}
    uintx HeapSizePerGCThread             = 87241520       {product}
    uintx InitialHeapSize                := 2147483648     {product}
    uintx LargePageHeapSizeThreshold      = 134217728      {product}
    uintx MaxHeapSize                    := 32210157568    {product}
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)


Ultimately, the solution was to add the “-Xms and -Xmx flags” to the program invoking the java process as to not utilize the environment defaults. In addition, this doesn’t waste infrastructure resources and also reduces time to invoke and close the java process by only assigning the memory that you need.

And as part of any problem solving exercise, focus from the bottom of the stack up, especially when multiple changes are in play.  In this case, the path with the least assumptions surrounded the changed java version so thats where I focused my effort.