Tuesday 26 July 2011

Troubleshooting WebLogic Server Hang


WebLogic Troubleshooting (Server hang)

• Server Hang

    A server is said to be hung when:

1) The process is still alive.
2) The server does not accept any requests because all the execute threads are busy or stuck for some reason.
3) No response is sent to clients.
4) The java weblogic.Admin PING command doesn't return a normal response.


Server Hang Analysis:

   The first step is to take multiple thread dumps.
• A thread dump is a snapshot of all the threads in the JVM at a particular instant.
• Multiple thread dumps, taken a few seconds apart, are necessary to conclude that the threads are stuck and not progressing.
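Once two dumps have been captured, the "stuck and not progressing" check can be partly automated. The sketch below assumes each dump has been saved to its own file in the usual HotSpot text layout (a quoted thread name, then "at ..." frames, then a blank line). The function name find_unchanged_threads is made up for illustration. It prints the weblogic.kernel.Default threads whose whole stack is identical in both files, skipping idle threads parked in waitForRequest.

```shell
#!/bin/sh
# Sketch: report execute threads whose stack is identical in two thread dump files.
# find_unchanged_threads is a hypothetical helper name, not a WebLogic tool.
find_unchanged_threads() {
    awk '
        FNR == 1 { file++ }                     # count which input file we are in
        /^[ \t]*"/ {                            # thread header line
            name = ""
            if ($0 ~ /weblogic\.kernel\.Default/) {
                name = $0
                sub(/^[ \t]*"/, "", name)       # strip the opening quote
                sub(/".*$/, "", name)           # strip the closing quote and the rest
            }
            next
        }
        /^[ \t]*$/ { name = "" }                # a blank line ends the current thread
        name != "" && /^[ \t]*at / {            # stack frame of a tracked thread
            frame = $0
            sub(/^[ \t]*at /, "", frame)
            stack[file, name] = stack[file, name] frame "\n"
        }
        END {
            for (key in stack) {
                split(key, part, SUBSEP)
                if (part[1] != "1") continue    # walk the threads of the first dump only
                s = stack[key]
                # Skip idle threads parked in waitForRequest.
                if (index(s, "ExecuteThread.waitForRequest") > 0) continue
                if (((2, part[2]) in stack) && stack[2, part[2]] == s)
                    print part[2] "\t" substr(s, 1, index(s, "\n") - 1)
            }
        }
    ' "$1" "$2"
}
```

Usage: find_unchanged_threads dump1.txt dump2.txt. A thread listed here is only a candidate: an identical stack in two dumps suggests it is stuck, and a third dump makes the conclusion stronger.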

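      Procedure to take thread dumps:

      Unix:
       Open a shell window and issue the command kill -3 <PID>, where PID is the Java process ID of WebLogic. The process keeps running, and the thread dump is written to the server's standard output (the stdout log file, if stdout is redirected).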
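Since several dumps are needed, the kill -3 step can be repeated in a small loop. This is only a sketch: the function name take_thread_dumps, the DRY_RUN switch and the default count and interval are illustrative choices, not part of WebLogic.

```shell
#!/bin/sh
# Sketch: send SIGQUIT (kill -3) to a WebLogic JVM several times, a few seconds apart.
# take_thread_dumps, DRY_RUN and the defaults below are illustrative, not WebLogic tooling.
take_thread_dumps() {
    pid=$1
    count=${2:-3}       # number of dumps to request (assumed default)
    interval=${3:-10}   # seconds between dumps (assumed default)

    i=1
    while [ "$i" -le "$count" ]; do
        if [ -n "${DRY_RUN:-}" ]; then
            # Print the command instead of signalling a real process.
            echo "kill -3 $pid"
        else
            # SIGQUIT makes the JVM write a thread dump to its stdout; the JVM keeps running.
            kill -3 "$pid" || return 1
        fi
        # Pause between dumps, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
}
```

Usage: take_thread_dumps <PID> 3 10 requests three dumps ten seconds apart. All of them land in the same stdout file, so they have to be split apart before being compared.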

      Windows:

    Press Ctrl-Break in the command window where WebLogic is running.
    The thread dump is printed in the same command window.

    Windows Service:

    Open a command prompt and issue the following command (make sure beasvc.exe is in the PATH):

    c:\> beasvc -dump -svcname:service-name

    The thread dump is written to the log file defined for the service.

    When creating the service, the log file can be specified in the installservice script with the -log option:

    -log:"d:\bea\domains\mydomain\myserver-stdout.txt"

• Before we analyze thread dumps, it is important to know the common thread states:

1) Runnable [marked as R in some VMs]:

    This state indicates that the thread is either currently running or ready to run the next time the OS thread scheduler schedules it.

2) Object.wait() [marked as CW in some VMs]:

    Indicates that the thread is waiting for some condition to be fulfilled.

3) Waiting for monitor entry [marked as MW in some VMs]:

    Indicates that the thread is waiting to enter a synchronized block.

    These are the threads to watch, because they indicate lock contention: the thread is waiting for a lock on an object while some other thread is holding that lock.
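To see which locks are contended, a dump can be scanned for the "- waiting to lock <address>" and "- locked <address>" lines that HotSpot-style dumps print under each stack frame. The sketch below assumes that layout and a dump saved to a file. list_contended_locks is a made-up name. For every lock that at least one thread is waiting to enter, it prints the number of waiters and the last thread seen holding it.

```shell
#!/bin/sh
# Sketch: list locks that threads are blocked on, with waiter count and apparent holder.
# list_contended_locks is a hypothetical helper; it assumes HotSpot-style dump text.
list_contended_locks() {
    awk '
        function lock_id(line) {
            # Extract the <0x...> lock address from a dump line.
            match(line, /<[^>]+>/)
            return substr(line, RSTART, RLENGTH)
        }
        /^[ \t]*"/ {
            # Thread header: remember the quoted thread name.
            thread = $0
            sub(/^[ \t]*"/, "", thread)
            sub(/".*$/, "", thread)
        }
        /- waiting to lock </ { waiters[lock_id($0)]++ }
        /- locked </          { holder[lock_id($0)] = thread }
        END {
            for (id in waiters)
                printf "%s waiters=%d holder=%s\n", id, waiters[id], ((id in holder) ? holder[id] : "unknown")
        }
    ' "$1"
}
```

The reported holder is a hint, not proof: a thread sitting in Object.wait() still shows a "locked" line for a monitor it has temporarily released, so check the holder's own stack before drawing conclusions.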
   

In WebLogic, the main worker threads belong to the group weblogic.kernel.Default:
    "ExecuteThread: '1' for queue: 'weblogic.kernel.Default'" ...
    This is the set of threads to examine for hang or slow-performance issues.
    On an idle system you would see a lot of these threads in the state below, which is a snapshot of an idle thread waiting for some work to be assigned:
     
       "ExecuteThread: '1' for queue: 'weblogic.kernel.Default'" daemon prio=5  tid=0x031a6308 nid=0x980 in Object.wait() [2dff000..2dffd8c]
        at java.lang.Object.wait(Native Method)
        - waiting on <0x112cf2c0> (a weblogic.kernel.ExecuteThread)
        at java.lang.Object.wait(Object.java:429)
        at weblogic.kernel.ExecuteThread.waitForRequest(ExecuteThread.java:153)
        - locked <0x112cf2c0> (a weblogic.kernel.ExecuteThread)
        at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:172)
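A quick way to tell an idle server from a busy one is to count the weblogic.kernel.Default threads per state. The sketch below reads a dump saved to a file and assumes header lines shaped like the one above, with the state text sitting between nid=... and the bracketed address range. The name count_default_threads_by_state is purely illustrative.

```shell
#!/bin/sh
# Sketch: count weblogic.kernel.Default execute threads per state in one dump file.
# count_default_threads_by_state is a hypothetical helper, not a WebLogic tool.
count_default_threads_by_state() {
    awk '
        /^[ \t]*"ExecuteThread: .* for queue: .weblogic\.kernel\.Default./ {
            state = $0
            sub(/^.* nid=[^ ]+ +/, "", state)   # drop everything up to and including nid=...
            sub(/ *\[.*$/, "", state)           # drop the trailing [address range]
            counts[state]++
        }
        END {
            # One line per state: count, a tab, then the state text.
            for (s in counts) printf "%d\t%s\n", counts[s], s
        }
    ' "$1"
}
```

If nearly every thread reports "in Object.wait()" with waitForRequest in its stack, the server is idle rather than hung. Many threads in "waiting for monitor entry" point to lock contention.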


• As for thread dump analysis and conclusions, let's look at a sample thread dump and drill into it further:
    Demo of RSD thread dump (thread stuck issue on UAT)

