Fine-tuning the performance of NIC RX/TX rings

    This Knowledge Base article applies to:
      Amaranten Firewall 8.10.00 and up

As of version 8.10.00, RX/TX ring sizes of some NIC types can be fine-tuned to optimize performance for the
specific task at hand.
These settings can be found in the "HwPerformance" settings section.

Most people will never need to touch these settings; the defaults are good enough for the majority of
cases. In any case, the effects of changing RX/TX ring sizes should be fully understood before attempting
any changes.

The defaults

"e100" driver
»  Ringsize_e100_rx (default: 32)
The number of packet buffers always kept on the RX rings of each e100 NIC, ready to
receive packets from the NIC. Most of our drivers for 100 Mbps NICs use 32-slot RX
rings today.
»  Ringsize_e100_tx (default: 128)
The maximum number of packets that can be kept on the TX ring, waiting to be picked
up for transmission by the NIC.

"e1000" driver
»  Ringsize_e1000_rx (default: 64)
Gigabit NICs currently use twice as many packet buffers to support the greater link
speed. This default may be raised in the future.
»  Ringsize_e1000_tx (default: 256)
As for the RX ring: twice the e100 default, which may be raised in the future.

Things to keep in mind
»  Each NIC has a set of RX/TX rings
Multiply your ring sizes by the respective number of NICs to find the
total number of packet buffers.
»  Set the "HighBuffers" setting appropriately
At startup, ~200 buffers are always allocated.
The "HighBuffers" setting controls how many more buffers are allocated. The
"HighBuffers" setting defaults to 1024 buffers, for a total of ~1200.
Initial buffer usage should never exceed 50%, and should preferably stay below 20%.
»  RX rings are always "in use"
If you set the RX ring size to 64, and have eight NICs, 512 packet buffers
will be consumed immediately at start-up.
»  TX rings are not immediately used, but may be filled
In an idle system, there are no packet buffers on the TX rings. However, if a NIC is fed packets faster than it can send them, or faster than the system can push the packet buffers over the PCI bus to the NIC, the TX ring can fill up completely.
»  Different makes and models support different ring sizes
The default settings will work with all makes and models of our supported NICs. However, increasing them substantially may or may not work, depending on the make and model of NIC you have. Unfortunately, we cannot track all individual chipset differences within a NIC family.
»  Each packet buffer uses about 2.5 Kbytes of RAM
Use the 'memory' console command to see the current usage.
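
The arithmetic in the points above can be sketched in a few lines of Python. This is only a rough planning aid built on the figures quoted in this article (~200 base buffers, about 2.5 Kbytes per buffer). The function name is made up for illustration, and it assumes that only the RX rings count toward initial buffer usage.

```python
# Rough packet-buffer budget, using the approximate figures from this article.
BASE_BUFFERS = 200      # "~200 buffers are always allocated"
BUFFER_KBYTES = 2.5     # "about 2.5 Kbytes of RAM" per packet buffer


def buffer_budget(rx_ring_size, nic_count, high_buffers=1024):
    """Return (rx_in_use, total_buffers, initial_usage, ram_kbytes).

    Assumption: only the RX rings consume buffers at start-up.
    """
    rx_in_use = rx_ring_size * nic_count          # RX rings are always "in use"
    total_buffers = BASE_BUFFERS + high_buffers   # base allocation + HighBuffers
    initial_usage = rx_in_use / total_buffers
    ram_kbytes = total_buffers * BUFFER_KBYTES
    return rx_in_use, total_buffers, initial_usage, ram_kbytes


# Eight NICs with 64-slot RX rings and the default HighBuffers of 1024.
in_use, total, usage, ram_kb = buffer_budget(64, 8)
print(f"{in_use} of ~{total} buffers in use at start-up ({usage:.0%}), "
      f"~{ram_kb:.0f} Kbytes of RAM for packet buffers")
```

With these inputs, initial usage comes out at roughly 42%, which is under the 50% ceiling but well above the preferred 20%, so raising "HighBuffers" would be worth considering in that configuration.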
 
Optimizing for "Zero Loss" gigabit performance

A system running well within tolerance limits will not lose packets. However, in a "Zero Loss" performance
test, PCI and host buses may become choke points, and an occasional packet may be lost.

Optimizing for a gigabit zero-loss performance test will require ring sizes of at least 1024 packets, and possibly as many as 8192 packets. This is consistent with observations from other PCI-based network applications.

However, keep in mind that such excessive ring sizes come at a price. While they do enable wire-speed zero-loss throughput, they increase latency beyond what would normally be acceptable. This is also consistent
with observations from other PCI-based network applications.
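
To get a feel for the latency cost, the worst-case delay added by a completely full TX ring can be estimated as the time the NIC needs to drain it. The sketch below is a back-of-the-envelope calculation, not a measurement: the 1500-byte frame size is an assumption, Ethernet preamble and inter-frame gap are ignored, and the function name is illustrative only.

```python
def full_ring_delay_ms(ring_size, frame_bytes=1500, link_bps=1_000_000_000):
    """Time for the NIC to drain a completely full TX ring, in milliseconds.

    Assumptions: every slot holds a frame of frame_bytes, the link runs at
    link_bps, and per-frame wire overhead is ignored.
    """
    bits_queued = ring_size * frame_bytes * 8
    return bits_queued / link_bps * 1000


for ring_size in (256, 1024, 8192):
    delay = full_ring_delay_ms(ring_size)
    print(f"{ring_size:5d} slots: up to {delay:.1f} ms of queueing delay")
```

Under these assumptions, a full 256-slot ring adds about 3 ms, a 1024-slot ring about 12 ms, and an 8192-slot ring about 98 ms.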

As a general recommendation, we, and many others, advise against zero-loss performance testing
for anything but layer two equipment (e.g. switches). Instead, a "goodput" test with a low loss
tolerance, such as 0.1% or 0.01%, will produce much more meaningful values for real-world applications.
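
One common way to run such a test is to search for the highest offered rate at which the loss ratio stays within the chosen tolerance. The sketch below shows the idea with a binary search. It assumes loss grows with offered rate, and `simulated_loss` is a purely hypothetical stand-in for a real traffic generator; neither function is part of the firewall or of any test tool.

```python
def find_goodput(measure_loss, max_rate_mbps, tolerance, steps=20):
    """Binary-search the highest rate whose loss ratio is within tolerance.

    measure_loss(rate_mbps) must return the loss ratio (0.0 to 1.0) observed
    at that offered rate. Assumes loss does not decrease as the rate rises.
    """
    low, high = 0.0, float(max_rate_mbps)
    for _ in range(steps):
        mid = (low + high) / 2
        if measure_loss(mid) <= tolerance:
            low = mid       # acceptable loss: try a higher rate
        else:
            high = mid      # too much loss: back off
    return low


# Hypothetical device: no loss up to 900 Mbps, then loss rises linearly.
def simulated_loss(rate_mbps):
    return max(0.0, (rate_mbps - 900.0) / 1000.0)


zero_loss = find_goodput(simulated_loss, 1000, tolerance=0.0)
goodput = find_goodput(simulated_loss, 1000, tolerance=0.001)  # 0.1% tolerance
print(f"zero loss: ~{zero_loss:.0f} Mbps, 0.1% tolerance: ~{goodput:.0f} Mbps")
```

The same search serves both test styles; only the tolerance passed in changes.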