Tuesday, September 21, 2010

Why does the ESB server hang after running for a while?

A frequently asked question on the FuseSource[1] forum and the Apache ServiceMix[2] mailing list is why the ESB server hangs after running for a while and can no longer handle incoming requests. This issue has two major causes:

1. The customer's workflow isn't correct: not all MEPs are handled correctly. This happens especially with sendSync, when some MessageExchange never ends up in the DONE or ERROR state. The calling thread then stays blocked, and eventually all threads are used up, so the server can't handle incoming requests anymore.
2. Too many concurrent client requests use up the thread pool. Customers usually face this problem when they run performance/load tests.

The solution is quite straightforward in each case:
for case 1, correct your workflow, ensure that all MEPs are handled correctly, and use send rather than sendSync as much as possible.
for case 2, configure the thread pool[3] for the component under performance test, and ensure the thread pool is big enough for your performance/load test.
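
To make the thread exhaustion concrete, here is a minimal sketch in plain Java. It does not use the JBI or ServiceMix API: the class PoolExhaustionDemo and both method names are made up for illustration, and a fixed java.util.concurrent pool stands in for a component thread pool. The "blocking" style mimics sendSync (the request thread waits for a reply that needs a thread from the same pool), and the "non-blocking" style mimics send (the request thread hands off the work and returns).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical simulation, not ServiceMix code: a fixed pool stands in for a component thread pool.
final class PoolExhaustionDemo {

    /** sendSync-like: each request holds a pool thread until a reply task on the SAME pool has run. */
    static boolean blockingStyleCompletes(int poolSize, int requests, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(pool.submit(() -> {
                    Future<?> reply = pool.submit(() -> { /* produce the response */ });
                    reply.get(); // blocks this pool thread until the reply task has run
                    return null;
                }));
            }
            for (Future<?> f : pending) {
                try {
                    f.get(timeoutMs, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                } catch (TimeoutException | ExecutionException e) {
                    return false; // every pool thread is waiting for a reply that is stuck in the queue
                }
            }
            return true;
        } finally {
            pool.shutdownNow(); // interrupt blocked workers so the JVM can exit
        }
    }

    /** send-like: each request hands off the reply work and frees its thread immediately. */
    static boolean nonBlockingStyleCompletes(int poolSize, int requests, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            CountDownLatch replies = new CountDownLatch(requests);
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> pool.execute(replies::countDown));
            }
            try {
                return replies.await(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("blocking,     pool=2, requests=4: " + blockingStyleCompletes(2, 4, 500));
        System.out.println("non-blocking, pool=2, requests=4: " + nonBlockingStyleCompletes(2, 4, 2000));
        System.out.println("blocking,     pool=8, requests=4: " + blockingStyleCompletes(8, 4, 2000));
    }
}
```

With two threads and four requests, the blocking style never finishes because both threads wait for replies that cannot be scheduled, which corresponds to case 1. The same blocking code finishes once the pool is larger than the number of concurrent requests, which is the case 2 workaround. The non-blocking style finishes even with the small pool.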

I have seen several cases where customers use a cxfbc:consumer endpoint as a facade to handle performance/load test requests. The typical test flow could be
cxfbc:consumer==>servicemix-camel
cxfbc:consumer==>cxfse
cxfbc:consumer==>servicemix-bean
cxfbc:consumer==>cxfbc:provider
Previously, the cxfbc:consumer endpoint used sendSync by default (a blocking method that waits for the response message) to send messages to the NMR. This can cause deadlock under concurrent requests with the default thread pool configuration, so if you want better support for heavy concurrent load, you need to add the synchronous="false" attribute to the cxfbc:consumer endpoint. This leverages the CXF continuation API (I will explain how in another blog post) and uses send (a non-blocking method) to send messages to the NMR. Since servicemix-cxf-bc-2010.02, synchronous="false" is the default value, which means the default behavior of cxfbc is asynchronous.
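
As a rough sketch, an xbean.xml fragment for a service unit on an older servicemix-cxf-bc version might set the attribute like this. Only synchronous="false" comes from the discussion above; the WSDL path, the demo namespace, and the service and endpoint names are placeholders invented for illustration.

```xml
<beans xmlns:cxfbc="http://servicemix.apache.org/cxfbc/1.0"
       xmlns:demo="http://example.org/demo">

  <!-- Placeholder endpoint: wsdl, targetService and targetEndpoint values are hypothetical. -->
  <cxfbc:consumer wsdl="classpath:demo.wsdl"
                  targetService="demo:DemoService"
                  targetEndpoint="DemoPort"
                  synchronous="false"/>

</beans>
```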

Btw, configuring the thread pool for the NMR and for each component in SMX4 is different from SMX3[3]. In SMX4, we leverage the OSGi Configuration Admin service to configure properties, so customers need to edit a file named org.apache.servicemix.nmr.cfg in the $SMX_HOME/etc folder to configure the thread pool for the NMR, and edit a file named org.apache.servicemix.components.component-name.cfg (for example, org.apache.servicemix.components.cxfbc.cfg) in the $SMX_HOME/etc folder for each component.
[1]http://fusesource.com/
[2]http://servicemix.apache.org/home.html
[3]http://servicemix.apache.org/thread-pools.html

2 comments:

  1. Hello,

    We're currently working with apache-servicemix-4.4.1-fuse-01-13 and we have some performance issues.

    We are using ServiceMix with Camel and CXF. When running a stress test on the server we got a lot of SocketTimeoutException errors. We profiled the server and found that Fuse can't start more than 35 threads (the Net I/O ones).

    We have googled how to increase that number, but without result!

    Can you help us with that?

    Thanks
