HttpClient - Best Practices




URLConnection -> HttpClient


Unless you need to handle schemes other than http / https (ftp, gopher, file ...), use HttpClient rather than HttpURLConnection, and avoid mixing the two in the same application. HttpClient is more mature and offers a richer feature set: multithreaded use, cookie handling, persistent connections, connection pooling ...

Instantiation


Instantiating an HttpClient is expensive, so favor reuse, up to a single shared instance (singleton), as the documentation recommends:
« Generally it is recommended to have a single instance of HttpClient per communication component or even per application. However, if the application makes use of HttpClient only very infrequently, and keeping an idle instance of HttpClient in memory is not warranted, it is highly recommended to explicitly shut down the multithreaded connection manager prior to disposing the HttpClient instance. This will ensure proper closure of all HTTP connections in the connection pool. »

Concurrent execution of HTTP methods

By default, an HttpClient instance uses a SimpleHttpConnectionManager, which manages a single connection and therefore serves only one request at a time. In a multithreaded environment where several resources are accessed concurrently, use an HttpConnectionManager implementation that manages a pool of connections, such as MultiThreadedHttpConnectionManager. This avoids the notorious message: "SimpleHttpConnectionManager being used incorrectly. Be sure that HttpMethod.releaseConnection() is always called and that only one thread and/or method is using this connection manager at a time."


From the documentation: « If the application logic allows for execution of multiple HTTP requests concurrently (e.g. multiple requests against various sites, or multiple requests representing different user identities), the use of a dedicated thread per HTTP session can result in a significant performance gain. HttpClient is fully thread-safe when used with a thread-safe connection manager such as MultiThreadedHttpConnectionManager. »

Streaming


When an HTTP response is retrieved as a byte array or a String, the entire body is loaded into memory. Reading it as a stream through the getResponseBodyAsStream() method (written getResponseAsStream in the Javadoc note quoted here) is more appropriate:
« Note: This will cause the entire response body to be buffered in memory. A malicious server may easily exhaust all the VM memory. It is strongly recommended, to use getResponseAsStream if the content length of the response is unknown or resonably large. »
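
To make the point concrete, here is a minimal sketch of consuming a body as a stream through a small fixed-size buffer, so that memory use does not grow with the size of the response. It uses only java.io. The class name StreamCopy, the helper copy and the in-memory streams are assumptions made for illustration and are not part of HttpClient; in real code the input would be the stream returned by getResponseBodyAsStream().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class StreamCopy {

    // Hypothetical helper: copies 'in' to 'out' through a fixed 4 KB buffer,
    // so only one chunk of the body is held in memory at any time.
    // Returns the number of bytes copied.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for httpget.getResponseBodyAsStream().
        InputStream body = new ByteArrayInputStream(new byte[10000]);
        // Stand-in sink; in practice this would be a file, a socket or a parser.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(body, sink);
        System.out.println(copied + " bytes copied");
    }
}
```

With HttpClient, the call to copy would sit inside the try block, before releaseConnection() is invoked in the finally clause, since the stream is only valid while the connection is held.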

Connection persistence

Once the request has completed, the connection manager can reuse the connection, provided releaseConnection() has been called on the method.

Example
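import java.io.InputStreamReader;
import java.io.Reader;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
import org.apache.commons.httpclient.methods.GetMethod;

// Pooled, thread-safe connection manager: the client can be shared across threads.
HttpClient httpclient = new HttpClient(new MultiThreadedHttpConnectionManager());
GetMethod httpget = new GetMethod("http://www.myhost.com/");
try {
  httpclient.executeMethod(httpget);
  // Read the body as a stream, decoded with the charset declared by the response.
  Reader reader = new InputStreamReader(
      httpget.getResponseBodyAsStream(), httpget.getResponseCharSet());
  // consume the response entity by reading from 'reader'
} finally {
  // Always release the connection so the manager can reuse it.
  httpget.releaseConnection();
}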

Source:
http://hc.apache.org/httpclient-3.x/threading.html
http://hc.apache.org/httpclient-3.x/performance.html
http://download.oracle.com/javase/1.4.2/docs/api/java/net/HttpURLConnection.html
