Nginx thousands of server config files very slow to reload (nginx -s reload)

Ekaitz Hernandez Troyas asked:

I have one master nginx.conf where I include the rest of my servers (server blocks) with the include directive:

include myservers/*.conf;

My problem is that whenever I add a new configuration file in myservers/, I need to reload nginx with nginx -s reload.

The problem is that the reload takes a long time, about 1 minute, and this is going to grow, because I will have more and more upstream servers.

Do you see any technique to improve this?

The only solution I have found so far is the paid version of nginx, NGINX Plus, and its dynamic configuration API https://docs.nginx.com/nginx/admin-guide/load-balancer/dynamic-configuration-api/ where you can add new upstream servers dynamically through a REST API without any reload.

I was also thinking of a kind of sharding technique, with one master that hashes keys to slave servers (like Elasticsearch with the Raft algorithm to keep consensus state), so that when you need to update you only have to reload one slave server, which has fewer upstream servers.

My answer:


I fired up a fresh virtual machine (with SSD-backed storage) and installed nginx on it. Then I wrote a script to generate a huge number of files, each containing a single server block. They looked a lot like this:

# cat /etc/nginx/sites/server047393.conf
server {
    listen 80;
    listen [::]:80;
    server_name server047393;
}
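
The files came from a quick shell loop. A sketch of that kind of generator looks like this; the directory, count, and naming scheme are just what the test above implies, so adjust them to your layout:

# Sketch: generate many single-server config files to test reload time.
# The path and count are assumptions based on the test described above.
for i in $(seq -f "%06g" 1 100000); do
    cat > "/etc/nginx/sites/server${i}.conf" <<EOF
server {
    listen 80;
    listen [::]:80;
    server_name server${i};
}
EOF
done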

At first I made 50,000 of them, but that only took 9 seconds to reload nginx, so then I moved up to 100,000. With that, it was consistently taking 20 seconds to reload nginx. About the first half of that time was disk I/O wait, the latter half was CPU. With this number of server blocks, nginx is using nearly 1GiB of RAM.

This really doesn’t look like a problem, unless you have the nginx configuration on a really slow disk. The configuration does get re-read in its entirety when you reload or restart nginx, and with a rotational disk this could easily take a couple of minutes. Use an SSD or even a RAM disk to store the nginx configuration.
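
If you want to try the RAM disk route, a minimal sketch follows; the mount point, size, and use of the -c flag are illustrative, and since tmpfs contents disappear on reboot you would need to repopulate it at boot time:

# Illustrative only: stage the configuration on a tmpfs (RAM disk) and
# point nginx at it. Contents are lost on reboot, so script this at boot.
mount -t tmpfs -o size=256m tmpfs /mnt/nginx-conf
cp -a /etc/nginx/. /mnt/nginx-conf/
nginx -t -c /mnt/nginx-conf/nginx.conf    # test the relocated configuration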

Indeed, nginx’s own optimization advice for server names hardly mentions configuration parsing time. It’s not really something you should care about very much. What it does talk about a lot is the amount of time it takes to locate the correct server block to process an incoming request. By default nginx tries to optimize this to minimize CPU cache line misses.

To optimize this for your much larger number of server names, you might not need to do anything, but you will probably need to adjust the server_names_hash_max_size directive. Run nginx -t. If you see a message like this:

nginx: [warn] could not build optimal server_names_hash, you should increase either server_names_hash_max_size: 512 or server_names_hash_bucket_size: 64; ignoring server_names_hash_bucket_size

Then you should tune server_names_hash_max_size. Start by setting it to the next power of two larger than the number of server_name entries you have. If you have 30,000 server names, start with server_names_hash_max_size 32768.
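
For the 30,000-name example, that would look like this in the http block of nginx.conf (only the relevant directive shown):

http {
    # Next power of two above the number of server_name entries.
    server_names_hash_max_size 32768;
}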

The optimization document does mention that if nginx’s start time is unacceptably long, you can try increasing server_names_hash_bucket_size. I found in testing that this didn’t really help, but if you want to try it, increase it by powers of two each time; the value must be a power of two, or nginx will not start. The default is derived from the CPU cache line size, so if you’re on a virtual machine where the CPU properties are not correctly exposed to the VM, you might be able to safely double this number (or nginx may have refused to start in the first place, though that produces a slightly different error message). Don’t go overboard with it, or your incoming requests will be slowed down by CPU cache misses.
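
If you do experiment with it, on Linux you can usually check the cache line size the guest sees with getconf LEVEL1_DCACHE_LINESIZE, and then try the next power of two up from the default. The value 128 below is just the first doubling from the common default of 64:

http {
    # Must be a power of two; doubled from the usual default of 64.
    server_names_hash_bucket_size 128;
}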


View the full question and any other answers on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.