This document aims to help engineers understand performance issues related to OpenVPN. It is not a how-to but a benchmark guide to help developers, testers, engineers, and deployment technicians understand the typical advantages and performance considerations associated with OpenVPN.
OpenVPN is a fantastic technology for VPN traffic and compares well with IPsec and PPTP in many respects. It is more secure than PPTP and usually faster than IPsec. Several factors contribute to this. OpenVPN uses UDP by default rather than TCP for its transport, and it handles dropped packets at a higher level in the networking stack. On clean links with infrequent drops, OpenVPN can stream data up to twice as fast as PPTP or IPsec. Security can be made as strong as you like: OpenVPN takes a library approach to encryption, which is one reason it is far more secure than PPTP. OpenVPN is also resilient on networks. It doesn't need special packets (like GRE) to function, so router and port forwarding concerns are all but eliminated.
The downside to OpenVPN is that its current architecture does not scale across cores. It runs as a single monolithic process and is not multi-threaded. This means that if you have a beefy processor with 8 cores, OpenVPN will use only 1 of them.
This translates to a cap on the bandwidth, which must be shared between all hosts connecting to the ClearOS server running OpenVPN. Here are some typical examples of speed on a ClearBOX 100 when connected with OpenVPN to a local web server.
Users-MacBook-Pro:test user$ curl -O http://10.8.0.1/2GB
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2030M  100 2030M    0     0  8852k      0  0:03:54  0:03:54 --:--:-- 9594k
Users-MacBook-Pro:test user$ curl -O http://10.8.0.1/2GB
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 2030M  100 2030M    0     0  9380k      0  0:03:41  0:03:41 --:--:-- 9625k
The file used in this test was a blob constructed from /dev/urandom. The resulting download average is 9116 kBytes/second, or about 73 Mbps. Not too shabby for a single 1.66GHz core. By default on ClearOS, OpenVPN uses compression to improve performance. Random data is essentially incompressible, so this result reflects performance without any benefit from compression. The boost from compression can be seen in the following example, where a 2GB file that is all zeros is downloaded:
Users-MacBook-Pro:test user$ curl -O http://10.8.0.1/2GB.0
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1953M  100 1953M    0     0  13.1M      0  0:02:27  0:02:27 --:--:-- 13.2M
Users-MacBook-Pro:test user$ curl -O http://10.8.0.1/2GB.0
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 1953M  100 1953M    0     0  13.3M      0  0:02:26  0:02:26 --:--:-- 13.3M
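The Mbps figures in this document follow from the curl averages. As a quick check, this sketch averages the two runs of each test and converts bytes per second to bits per second. It treats curl's k and M suffixes as decimal multiples, which is the assumption behind the 73 Mbps figure; the function name is only for illustration.

```shell
# Average two curl "Average Dload" readings and convert to megabits per second.
# Assumes decimal units (1 MB = 1000 kB), as the 73 Mbps figure implies.
to_mbps() {
    # $1, $2: readings from two runs; $3: kilobytes per unit (1 for k, 1000 for M)
    awk -v a="$1" -v b="$2" -v unit="$3" \
        'BEGIN { printf "%.1f\n", (a + b) / 2 * unit * 8 / 1000 }'
}

to_mbps 8852 9380 1      # random data: prints 72.9
to_mbps 13.1 13.3 1000   # all-zeros data: prints 105.6
```

Rounded, these give 73 Mbps for the random file and 106 Mbps for the all-zeros file, a gain of about 45% from compression. curl's suffixes are in fact multiples of 1024, so the true rates are a few percent higher than these decimal figures.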
This bandwidth is CPU-bound, and because OpenVPN is single-threaded, it must be shared between all OpenVPN users. Faster processors will give greater performance, but additional processor cores will not, because the service is not multithreaded. Under heavy use you can expect to see one core maxed out and the total bandwidth divided among the active connections.
The following table contains benchmarks for various ClearBOX solutions under this test. The Compression Rate column describes how compressible the test data is: High is the all-zeros file and Super Low is the /dev/urandom file. The table is limited for now and will be populated as results come in:
|ClearBOX Model||Processor||Memory||Encryption||Compression Rate||Bandwidth|
|100||D510-Atom(TM) 1.66GHz||1GB||Standard||High||106 Mbps|
|100||D510-Atom(TM) 1.66GHz||1GB||Standard||Super Low||73 Mbps|
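To add a row to this table for other hardware, the two test files need to be recreated. The exact commands used for the original files are not recorded here; a minimal sketch, assuming GNU dd and reusing the file names and sizes seen in the curl runs (2030M and 1953M), might look like this. The function name is hypothetical.

```shell
# Create a test file of a given size from a given source device.
# Assumes GNU dd (bs=1M and iflag=fullblock are GNU options).
make_blob() {
    # $1: source (/dev/urandom or /dev/zero), $2: output file, $3: size in MiB
    dd if="$1" of="$2" bs=1M count="$3" iflag=fullblock 2>/dev/null
}

# Incompressible and highly compressible files, to be placed where the
# web server can serve them:
# make_blob /dev/urandom 2GB   2030
# make_blob /dev/zero    2GB.0 1953
```

Using one file of each kind brackets the result: the random file shows throughput with no help from compression, and the all-zeros file shows the best case.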
Each connection also creates additional overhead for the server, which is absorbed by the same single process. Whatever bandwidth is available is shared among all connections when using a single instance of OpenVPN.
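As a rough illustration of this sharing, the sketch below divides the measured 73 Mbps ceiling of the ClearBOX 100 evenly among simultaneous clients. It does not model per-connection overhead, so real per-client figures would be somewhat lower, and the client counts are arbitrary examples.

```shell
# Upper bound on per-client throughput when one OpenVPN process is shared.
# Even split of a measured ceiling; per-connection overhead is not modelled.
per_client_mbps() {
    # $1: total Mbps for the single process, $2: number of active clients
    awk -v total="$1" -v n="$2" 'BEGIN { printf "%.1f\n", total / n }'
}

for clients in 1 5 10 25; do
    echo "$clients clients: $(per_client_mbps 73 "$clients") Mbps each"
done
```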
Increasing performance strategy
Knowing the limitations, what options are available to us when we want to increase the capacity of OpenVPN for the tunnels?
Luckily, a method does exist, although the detailed configuration is beyond the scope of this document. Running OpenVPN as multiple processes allows your kernel to balance the traffic across multiple cores. To maximize efficiency, you can run 1 process per core. Your kernel will balance the other loads on your system efficiently, but since a single OpenVPN process can fully use a single core, running more than one process per core is not recommended. You will NOT have to pin a process to any particular core; ClearOS is capable of deciding where each process will run most efficiently.
To figure out how many cores your computer has available to it in ClearOS, run the following from the command line:
grep processor /proc/cpuinfo
A two core system might yield the following results:
processor : 0
processor : 1
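If only the number of cores is needed, for example to decide how many OpenVPN processes to plan for, the matching lines can be counted instead of listed. A small sketch (the function name is just for illustration):

```shell
# Count logical processors by counting "processor" lines in a cpuinfo file.
# Note: this counts logical CPUs, so hyperthreads are included.
count_cores() {
    # $1: file to read; defaults to the live /proc/cpuinfo
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Usage: count_cores
# On the two-core system shown above this would report 2.
```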
The problem with scaling road warriors is that, to keep things easy, you typically need all of them on the same OpenVPN process, which simplifies configuration distribution. It is possible, however, to run one set of road warriors on one OpenVPN daemon process and another set on a different process listening on a different port.
Site-to-site tunnels
Site-to-site tunnels are well suited for the multiple process paradigm. Once you configure the tunnel, you shouldn't have to mess much with it. Moreover, because you are running in a separate process, you don't have to take down ALL of the OpenVPN connections to restart the process associated with one tunnel.
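How each process is started and stopped depends on how the instances are set up, which this document does not cover. As a minimal sketch, assuming each instance was started with OpenVPN's --writepid option so that its process ID is stored in its own file (the path below is hypothetical), a single tunnel can be restarted by signalling only that process. OpenVPN treats SIGHUP as a request to restart.

```shell
# Restart one OpenVPN instance by sending SIGHUP to the PID in its pid file.
# Other OpenVPN processes are not touched.
restart_instance() {
    pidfile="$1"
    if [ ! -r "$pidfile" ]; then
        echo "pid file not found: $pidfile" >&2
        return 1
    fi
    kill -HUP "$(cat "$pidfile")"
}

# Hypothetical pid file for one site-to-site tunnel:
# restart_instance /var/run/openvpn/site1.pid
```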
If you are going to the trouble of allocating multiple processes, you can run your site-to-site tunnels on separate processes from your users. For example, if you had a quad core server, you might have 4 OpenVPN processes running:
- process 1: port 1194 Road warriors
- process 2: port 1195 VPN to site 1
- process 3: port 1196 VPN to site 2
- process 4: port 1197 VPN to site 3
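The layout above could be expressed as one configuration file per process. The sketch below writes only the settings that must differ between instances (port and tunnel device) into skeleton files. The directory, file names, and device numbering are assumptions for illustration, and a working configuration additionally needs the usual keys, certificates, and addressing, which are omitted here.

```shell
# Write a skeleton config for one OpenVPN instance.
# Only per-instance settings are emitted; keys, certs and addressing are omitted.
write_instance() {
    # $1: output directory, $2: instance name, $3: UDP port, $4: tun device index
    cat > "$1/$2.conf" <<EOF
# $2
port $3
proto udp
dev tun$4
EOF
}

conf_dir="${CONF_DIR:-./openvpn-instances}"   # hypothetical location
mkdir -p "$conf_dir"
write_instance "$conf_dir" roadwarriors 1194 0
write_instance "$conf_dir" site1        1195 1
write_instance "$conf_dir" site2        1196 2
write_instance "$conf_dir" site3        1197 3

# Each file would then be run as its own process, for example:
#   openvpn --config "$conf_dir/site1.conf" --daemon
```

Giving each instance its own tun device and port keeps the processes independent, so the kernel is free to schedule them on different cores.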