Concurrent HTTP connections in Node.js

Original article: https://fullstack-developer.academy/concurrent-http-connections-in-node-js/

------------------------------------------------------------------------------------------

Browsers, as well as Node.js, have limitations on concurrent HTTP connections. It is essential to understand these limitations, because otherwise we can run into situations where an application behaves incorrectly. In this article, we will review everything that you, as a developer, need to be familiar with regarding concurrent HTTP connections.

Browser

Browsers adhere to protocols, and the HTTP 1.1 specification states that a single-user client should not maintain more than two concurrent connections to a server. Some older browsers do enforce this; however, newer browsers - often referred to as "modern" browsers - allow a more generous limit. Here's a more precise list:

  • IE 7: 2 connections
  • IE 8 & 9: 6 connections
  • IE 10: 8 connections
  • IE 11: 13 connections
  • Firefox, Chrome (Mobile and Desktop), Safari (Mobile and Desktop), Opera: 6 connections

For the rest of this article, remember the number 6 - it will play a crucial part when we go through our example.

Node.js

If you have worked with, learned or just read about Node.js before, you know that it is a single-threaded, non-blocking runtime. This means that it can handle a significant number of concurrent connections - all of this is made possible by the JavaScript event loop.

The actual limit of connections in Node.js is determined by the available resources on the machine running the code and by the operating system settings as well.

Back in the early days of Node.js (think v0.10 and earlier), there was an imposed limit of 5 simultaneous connections to/from a single host. What does this mean? Under the hood, when you use the built-in Node.js HTTP module - or any module built on top of it, such as Express.js or Restify - you are in fact using a connection pool and HTTP keep-alive. This is great for performance: an HTTP request is processed, which opens a TCP connection, and for the next request an existing TCP connection can be reused. (Without keep-alive, the process would be less performant: it would have to create a TCP connection, serve a response, close the TCP connection and start over for the next request.)

In versions later than 0.10, the maxSockets value was changed to Infinity.
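To make this more concrete, here is a minimal sketch using only the built-in http module (the host, port and path are just placeholders): it inspects the global agent's maxSockets value and then makes a request through a custom agent with keep-alive enabled and an explicit socket cap.

const http = require('http');

// 5 in Node v0.10 and earlier, Infinity in later versions
console.log(http.globalAgent.maxSockets);

// A custom agent: keep connections alive and pool at most 6 sockets per host
const agent = new http.Agent({ keepAlive: true, maxSockets: 6 });

http.get({ host: 'localhost', port: 3000, path: '/api', agent }, (res) => {
  res.resume(); // drain the response so the socket can go back to the pool
});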

The keep-alive is sent by the browser and we can easily see this if we log the request object in Node.js in the appropriate location. It should yield something similar to this (example taken from a Restify server):

headers:
{ host: 'localhost:3000',
  'content-type': 'text/plain;charset=UTF-8',
  origin: 'http://127.0.0.1:8080',
  'accept-encoding': 'gzip, deflate',
  connection: 'keep-alive',
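For reference, one straightforward way to produce output like the above (a small sketch; where exactly you place the middleware is up to you) is to log the incoming request's headers from a Restify handler or middleware:

server.use((req, res, next) => {
  // req.headers contains the raw request headers, including 'connection'
  console.log('headers:', req.headers);
  return next();
});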

Example

Let's take a look at a very straightforward example. Let's assume that we have some sort of frontend from which we are sending data to a backend (this is usually how modern applications work: a frontend framework making requests to a backend API). For our example, the data that we are sending is less important - it could equally be a bulk file upload or anything else.

Trivia: I have, in fact, come across this issue while working on an application that did a bulk upload of images and sent them to a backend API for further processing.

Let's create a simple Restify API server:

const restify = require('restify');
const corsMiddleware = require('restify-cors-middleware');

const port = 3000;
const server = restify.createServer();
const cors = corsMiddleware({
  origins: ['*'],
});

server.use(restify.plugins.bodyParser());
server.pre(cors.preflight);
server.use(cors.actual);

server.post('/api', (req, res) => {
  const payload = req.body;
  console.log(`Processing: ${payload}`);
});

server.listen(port, () => console.info(`Server is up on ${port}.`));

This is very straightforward. Astute readers will already have noticed a somewhat crucial mistake in the code above, but don't worry; it was made deliberately. So this API receives data sent via an HTTP POST request and displays a log message stating that it is processing whatever was sent as part of the request. (Again, the processing could be whatever we wanted, but for this discussion it's just a simple console statement.)

Let's also create a simple frontend. Create a very simple index.html and add the following content in between <script> tags:

const array = Array.from(Array(10).keys());

array.forEach(arrayItem => {
  fetch('http://localhost:3000/api', {
    method: 'POST',
    mode: 'cors',
    body: JSON.stringify(`hello${arrayItem}`)
  })
    .then(response => response.json())
    .then(data => console.log(data))
    .catch(error => console.error(`Fetch Error: `, error));
});

Here, the Fetch API is used to iterate through 10 items (mimicking the upload of 10 files, for example) and send 10 HTTP POST requests to the Restify API discussed earlier.

Start up the API, load index.html via an HTTP server, and let's see the results.

Here are two easy ways of firing up an HTTP server: either use python -m SimpleHTTPServer 8000 (Python 2) or python -m http.server 8080 (Python 3), or do a global npm install of http-server and then simply run http-server from the folder where you have the index.html file.

[Screenshot: server log output]

It's fascinating what we see: even though we have made 10 HTTP POST requests, only six have arrived at the Restify API, as we can tell from the six log statements.

However, if you wait about 2 minutes, additional log statements will appear.

So what is going on here?

Remember what we said before - the browser (in this case Safari) can make six concurrent requests to the same host (in this case, the connection is between our browser and the API running on port 3000 on localhost).

The connection is kept alive because we are not returning anything from the Node.js API. This is the mistake I deliberately made to make a point. So the browser sends six requests, and Node.js receives them, but it never sends any information back, leaving the remaining requests blocked.

So why are the other log statements visible later? The answer is simple: there's also a server timeout, which is 2 minutes by default. After 2 minutes the pending request is cleared, so new requests can be processed.
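As a quick sanity check - a small sketch, and note that the exact default has changed across Node.js versions, but in the versions this article targets it is two minutes - the underlying Node.js server exposes this value:

// Restify wraps a plain Node.js http.Server, exposed as server.server
console.log(server.server.timeout); // 120000 ms by default, i.e. 2 minutes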

Let's update our code with these values:

server.server.maxConnections = 20;

function getConnections() {
  server.server.getConnections((error, count) => console.log(count));
}

// add getConnections() in the API call:
server.post('/api', (req, res) => {
  // ...
  getConnections();
});

The server.server.maxConnections = 20; line is there just to make the point that no matter how big this number is, it's not going to change the outcome, because we are still not returning anything (and remember, it is set to Infinity anyway):

[Screenshot: server log output]

However, add the following setting to change the behaviour:

server.server.setTimeout(500);

The result is going to be very different. Since we are overriding the timeout of the server, we only wait 500 ms before getting rid of a pending request, allowing new requests to come in.

Please note that this is not a real solution to this problem, it is just for demonstration purposes.

Solving the problem

The right way to solve this problem is of course to return a response from the API:

server.post('/api', (req, res) => {
  const payload = req.body;
  console.log(`Processing: ${payload}`);
  return res.json(`Done processing: ${payload}`);
});

Now all data is going to be processed just fine:

[Screenshot: server log output]

All uploads are now processed just fine.

Remember, res.json() under the hood uses res.send(), which in turn uses res.end() to send the response and end it. This is true for both Restify and Express.js.
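To illustrate what this boils down to, here is a rough sketch - using nothing but Node's built-in http module - of what ultimately happens when a response is written and ended:

const http = require('http');

http.createServer((req, res) => {
  // Write the response and, crucially, end it - this is what lets the browser
  // treat the request as complete and move on to the next pending one.
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify('Done processing'));
}).listen(3000);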

Conclusion

What is the moral of the story? Always close HTTP connections - no matter how, but close them. And if you're making API calls, consult the API documentation as well to make sure any active HTTP connection gets closed.
