Linux Performance Tuning for Network Applications
Getting the best performance out of Linux network applications requires understanding and tuning various system parameters. Here’s a comprehensive guide to optimizing your Linux system for high-performance networking.
Network Stack Tuning
TCP Buffer Sizes
# Increase TCP buffer sizes
echo 'net.core.rmem_max = 134217728' >> /etc/sysctl.conf
echo 'net.core.wmem_max = 134217728' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_rmem = 4096 87380 134217728' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem = 4096 65536 134217728' >> /etc/sysctl.conf
# Apply changes
sysctl -p
TCP Congestion Control
# Check available congestion control algorithms
cat /proc/sys/net/ipv4/tcp_available_congestion_control
# Set BBR for better performance
echo 'net.ipv4.tcp_congestion_control = bbr' >> /etc/sysctl.conf
# On older kernels BBR relies on the fq qdisc for packet pacing
echo 'net.core.default_qdisc = fq' >> /etc/sysctl.conf
Connection Tracking
# Increase connection tracking table size
echo 'net.netfilter.nf_conntrack_max = 1048576' >> /etc/sysctl.conf
echo 'net.netfilter.nf_conntrack_tcp_timeout_established = 7200' >> /etc/sysctl.conf
CPU and IRQ Optimization
CPU Affinity
# Bind network interrupts to specific CPUs ("2" is a bitmask: CPU 1)
for irq in $(grep eth /proc/interrupts | cut -d: -f1); do
    echo "2" > /proc/irq/$irq/smp_affinity
done
# Set CPU affinity for your application
taskset -c 2,3 ./your-network-app
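taskset pins a process from the outside; a program can also pin its own threads at startup. A Linux-only sketch using the raw sched_setaffinity(2) syscall (`pinToCPU` is an illustrative helper, not a standard API):

```go
package main

import (
	"fmt"
	"runtime"
	"syscall"
	"unsafe"
)

// pinToCPU locks the calling goroutine to its OS thread, then binds
// that thread to the given CPU via sched_setaffinity(2). A pid of 0
// means "the calling thread".
func pinToCPU(cpu int) error {
	runtime.LockOSThread()
	var mask [16]uint64 // a 1024-bit CPU set, the kernel's usual maximum
	mask[cpu/64] |= 1 << (uint(cpu) % 64)
	_, _, errno := syscall.RawSyscall(syscall.SYS_SCHED_SETAFFINITY,
		0, uintptr(len(mask)*8), uintptr(unsafe.Pointer(&mask[0])))
	if errno != 0 {
		return errno
	}
	return nil
}

func main() {
	if err := pinToCPU(0); err != nil { // CPU 0 always exists
		panic(err)
	}
	fmt.Println("pinned to CPU 0")
}
```

Pairing this with the IRQ affinity above (interrupts on CPU 1, application on CPUs 2-3) keeps packet processing and application work on separate cores and preserves cache locality.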
RPS (Receive Packet Steering)
# Enable RPS for network interfaces ('7f' is a hex CPU bitmask: CPUs 0-6)
echo '7f' > /sys/class/net/eth0/queues/rx-0/rps_cpus
echo '32768' > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
File Descriptor Limits
# Increase file descriptor limits
echo '* soft nofile 1048576' >> /etc/security/limits.conf
echo '* hard nofile 1048576' >> /etc/security/limits.conf
# Set system-wide limits
echo 'fs.file-max = 2097152' >> /etc/sysctl.conf
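limits.conf only applies at login; a long-running server can also raise its own soft limit at startup (recent Go releases, 1.19 and later, already do this automatically). A sketch using the raw rlimit syscalls, where `raiseNoFile` is an illustrative name:

```go
package main

import (
	"fmt"
	"syscall"
)

// raiseNoFile lifts the soft open-file limit to the hard limit so the
// process can hold as many sockets as the administrator has allowed.
func raiseNoFile() (uint64, error) {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, err
	}
	rl.Cur = rl.Max // an unprivileged process may raise soft up to hard
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, err
	}
	return rl.Cur, nil
}

func main() {
	limit, err := raiseNoFile()
	if err != nil {
		panic(err)
	}
	fmt.Printf("soft open-file limit is now %d\n", limit)
}
```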
Memory Management
Huge Pages
# Configure huge pages
echo 'vm.nr_hugepages = 1024' >> /etc/sysctl.conf
echo 'vm.hugetlb_shm_group = 0' >> /etc/sysctl.conf
# Mount huge pages
mount -t hugetlbfs nodev /dev/hugepages
Swappiness
# Reduce swappiness for better performance
echo 'vm.swappiness = 1' >> /etc/sysctl.conf
Application-Level Optimizations
Non-blocking I/O
// Use non-blocking sockets in Go. Note: Go's runtime already runs its
// sockets in non-blocking mode internally; this is only needed when
// handing the descriptor to code that makes raw syscalls on it.
func setNonBlocking(conn net.Conn) error {
	tcpConn, ok := conn.(*net.TCPConn)
	if !ok {
		return errors.New("not a TCP connection")
	}
	// File() duplicates the descriptor; O_NONBLOCK is shared by both
	// copies because they refer to the same open file description.
	file, err := tcpConn.File()
	if err != nil {
		return err
	}
	defer file.Close()
	return syscall.SetNonblock(int(file.Fd()), true)
}
Zero-Copy Networking
// Stream a file to an HTTP client. io.Copy detects that the response
// writer implements io.ReaderFrom, so for plain HTTP over TCP (no TLS)
// the transfer is handed to sendfile(2): the kernel moves the bytes
// without copying them through user space.
func sendFile(w http.ResponseWriter, filePath string) error {
	file, err := os.Open(filePath)
	if err != nil {
		return err
	}
	defer file.Close()
	stat, err := file.Stat()
	if err != nil {
		return err
	}
	w.Header().Set("Content-Length", strconv.FormatInt(stat.Size(), 10))
	_, err = io.Copy(w, file)
	return err
}
Monitoring and Metrics
Network Statistics
# Monitor network interfaces
sar -n DEV 1 10
# Check TCP statistics
ss -s
# Monitor connection tracking usage (cheaper than counting /proc/net/nf_conntrack lines)
cat /proc/sys/net/netfilter/nf_conntrack_count
Performance Tools
# Use perf for detailed profiling
perf record -g ./your-app
perf report
# Network latency monitoring
ping -i 0.1 target-host
# Bandwidth testing
iperf3 -c target-host -t 60 -P 4
eBPF for Advanced Monitoring
// eBPF program to monitor TCP connections
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __type(key, __u32);
    __type(value, __u64);
    __uint(max_entries, 256);
} tcp_stats SEC(".maps");

SEC("kprobe/tcp_connect")
int trace_tcp_connect(struct pt_regs *ctx) {
    __u32 key = 0;
    __u64 *count = bpf_map_lookup_elem(&tcp_stats, &key);
    if (count)
        __sync_fetch_and_add(count, 1);
    return 0;
}

// kprobe programs must declare a GPL-compatible license or they will
// be rejected at load time
char LICENSE[] SEC("license") = "GPL";
Best Practices
- Profile before optimizing: always measure actual bottlenecks first
- Test incrementally: apply changes one at a time
- Monitor continuously: set up proper monitoring and alerting
- Document changes: keep track of all tuning parameters
- Test under load: verify optimizations under realistic conditions
Common Pitfalls
- Over-tuning: Don’t change parameters without understanding their impact
- Ignoring context: Different workloads need different optimizations
- Forgetting security: Performance changes shouldn’t compromise security
- Missing monitoring: You can’t optimize what you can’t measure
Conclusion
Linux performance tuning is both an art and a science. Start with the basics, measure carefully, and iterate based on your specific workload requirements. Remember that the best optimization is often algorithmic improvement rather than system tuning.
In future posts, we’ll explore specific case studies and advanced eBPF techniques for network monitoring.