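(Note: this article was first written in 2016 and published on NetEase Blog. Because NetEase Blog has ceased operation, the article was moved to the Yunqi Community (云栖社区).
It was written in English at the time for work reasons; please leave a comment to point out anything that is expressed inaccurately.)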
Recently, I looked into "too many open files" issues on a Linux server and summarized the analysis in detail. If you have run into file descriptor leaks, or are interested in this kind of issue, it might be helpful to you.
What is a file descriptor?
On a Linux server, opening a file or a network connection consumes a file descriptor. The number of file descriptors an application may hold is limited: by default, each process can open at most 1024 file descriptors.
We can raise this limit, but in most cases we don't need to, and raising it will not help much if our code leaks file descriptors.
How to find file descriptor leaks?
When our application uses more file descriptors than the limit (1024 by default), we run into the "too many open files" error on Linux.
As far as I know, at least three kinds of exceptions can be caused by too many open files: FileNotFoundException, ClassNotFoundException, and UnknownHostException.
File descriptor leaks happen when we do not close a stream (such as an IO stream, a network connection, or a DB connection) after we have finished using it. An object becoming unreferenced does not imply that its resource has been released.
We can use the lsof command to list all the file descriptors opened by our application.
If many connections stay in CLOSE_WAIT or show "can't identify protocol" for a long time (say, several minutes), or the number of open files keeps going up, a file descriptor leak has probably occurred.
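The "number of open files keeps going up" check can also be approximated from inside the JVM. The following is a minimal sketch, not part of the original analysis: it assumes a HotSpot/OpenJDK-style JVM on a Unix-like system, where the operating system MXBean implements com.sun.management.UnixOperatingSystemMXBean. The class name FdUsageSketch and the method fdUsage are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class FdUsageSketch {

    // Returns {openCount, maxCount} for the current JVM process,
    // or null when the JVM does not expose Unix file descriptor counters.
    static long[] fdUsage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] usage = fdUsage();
        if (usage == null) {
            System.out.println("File descriptor counters are not available on this JVM");
        } else {
            System.out.println("open=" + usage[0] + " max=" + usage[1]);
        }
    }
}
```

Logging this value periodically gives a rough trend: a count that only ever grows under steady load is a hint to look at the lsof output more closely.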
Release file descriptor
It is always good practice to release resources in try…catch…finally blocks.
Release File Stream
For IO streams on local files, the file descriptor is released immediately after we close the stream. If we do not close it in our code, it is only released after GC, which can take much longer.
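As an illustration of closing a local file stream promptly, here is a small sketch. The class FileCloseSketch and its two methods are hypothetical names introduced for this example. The first method uses the try…finally form described above; the second uses try-with-resources (Java 7 and later), which should behave the same way with less code.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

class FileCloseSketch {

    // Reads the first byte of the file (or -1 if it is empty).
    // The finally block closes the stream even if read() throws.
    static int firstByte(String path) throws IOException {
        InputStream in = null;
        try {
            in = new FileInputStream(path);
            return in.read();
        } finally {
            if (in != null) {
                in.close();
            }
        }
    }

    // Same behavior with try-with-resources: the stream is closed
    // automatically when the block exits, normally or with an exception.
    static int firstByteWithResources(String path) throws IOException {
        try (InputStream in = new FileInputStream(path)) {
            return in.read();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 1) {
            System.out.println(firstByte(args[0]));
        }
    }
}
```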
Release Network Stream
For network connections, it is a little more complex. Any TCP-based protocol needs a four-way handshake to release a connection.
For more details, see http://en.wikipedia.org/wiki/Transmission_Control_Protocol
Take HTTP connections in Java as an example. We use HttpURLConnection as the client to connect to a servlet deployed on a Tomcat server.
Passive Close
If we do not close the connection actively, its status stays ESTABLISHED until the initiator of the close (here, the server) closes it, meaning it sends a FIN.
How long will it stay ESTABLISHED? That depends on the keep-alive timeout configuration on the server. For details on HTTP keep-alive,
see http://docs.oracle.com/javase/7/docs/technotes/guides/net/http-keepalive.html
After the server closes the connection, the connection status on our side changes to CLOSE_WAIT. This status means the server's FIN has been received and the connection is now waiting for the client to close it, meaning the client still has to send its own FIN.
If we close the connection now, it is released on our side, and the connection on the server side changes to TIME_WAIT, which takes 2MSL to be released.
For more details on 2MSL, see http://www.vorlesungen.uni-osnabrueck.de/informatik/networking-programming/notes/22Nov96/5.html .
If we do not, a file descriptor leak might happen. A connection in CLOSE_WAIT can only be released in three ways:
1. GC.
2. After (net.ipv4.tcp_fin_timeout + 2MSL), as configured on the remote server. This is usually more than 5 minutes.
3. After more than net.ipv4.tcp_keepalive_time (2 hours by default), as configured on the remote server.
CLOSE_WAIT connections may turn into "can't identify protocol" entries in the lsof output before we close them.
Active Close
If we close the connection actively, meaning we close it before the remote server does, the connection changes to TIME_WAIT before it is released.
It then waits 2MSL before the connection is released.
TIME_WAIT connections are outside the application's control and are not counted against the application's file descriptors.
We usually close an HTTP connection by closing its InputStream in a finally block.
This works fine in most cases.
But in the following case it does not close the connection as we want:
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class HttpConnectionTest {
    public static void main(String[] args) throws Exception {
        String urlString = "http://www.jupiteronline.com/~/media/Literature/Factsheets/SICAV/Jupiter%20USD%20Dynamic%20Bond%20Factsheet.pdf";
        InputStream in = null;
        try {
            URL url = new URL(urlString);
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            // If this call throws, "in" is never assigned.
            in = conn.getInputStream();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Only the input stream is closed here; nothing happens when it is null.
            if (null != in) {
                in.close();
            }
        }
    }
}
In this case, an exception is thrown when we try to get the InputStream, so in is still null and nothing is closed in the finally block, even though the connection has already been established.
To guarantee that the connection is closed, we change the code as follows.
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class HttpConnectionTest {
    public static void main(String[] args) throws Exception {
        String urlString = "http://www.jupiteronline.com/~/media/Literature/Factsheets/SICAV/Jupiter%20USD%20Dynamic%20Bond%20Factsheet.pdf";
        InputStream in = null;
        // Declared outside the try block so that it is visible in finally.
        HttpURLConnection conn = null;
        try {
            URL url = new URL(urlString);
            conn = (HttpURLConnection) url.openConnection();
            in = conn.getInputStream();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (null != in) {
                in.close();
            }
            // When getInputStream() failed, the response body (if any) is
            // available through the error stream, so close that as well.
            if (null != conn) {
                InputStream es = conn.getErrorStream();
                if (null != es) {
                    es.close();
                }
            }
        }
    }
}
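The same pattern can be pulled into a small reusable helper. This is only a sketch under stated assumptions, not the method used in the original investigation: the class HttpCloseSketch and the methods drainAndClose and fetchStatus are hypothetical names, and reading the remaining bytes before closing is an extra step based on the advice in the HTTP keep-alive guide linked above, so that the underlying socket has a chance to be reused.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

class HttpCloseSketch {

    // Reads and discards whatever is left in the stream, then closes it.
    // A null stream is silently ignored.
    static void drainAndClose(InputStream stream) {
        if (stream == null) {
            return;
        }
        try (InputStream s = stream) {
            byte[] buffer = new byte[4096];
            while (s.read(buffer) != -1) {
                // discard the data
            }
        } catch (IOException ignored) {
            // The stream is already broken; there is nothing more to release here.
        }
    }

    // Returns the HTTP status code, or -1 if the request failed
    // before a status line was received.
    static int fetchStatus(String urlString) {
        HttpURLConnection conn = null;
        InputStream in = null;
        int status = -1;
        try {
            conn = (HttpURLConnection) new URL(urlString).openConnection();
            status = conn.getResponseCode();
            in = conn.getInputStream(); // throws for error responses such as 404
        } catch (IOException e) {
            // Fall through to finally; both streams are handled there.
        } finally {
            drainAndClose(in);
            if (conn != null) {
                drainAndClose(conn.getErrorStream());
            }
        }
        return status;
    }

    public static void main(String[] args) {
        if (args.length == 1) {
            System.out.println(fetchStatus(args[0]));
        }
    }
}
```

Whichever path the request takes, both the input stream and the error stream are handled in the finally block, which is the property the corrected example above relies on.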