Sulabh Biswas / Blog

Deep dives into Linux networking and Go

Linux Performance Tuning for Network Applications

Getting the best performance out of Linux network applications requires understanding and tuning various system parameters. Here’s a comprehensive guide to optimizing your Linux system for high-performance networking.

Network Stack Tuning

TCP Buffer Sizes

# Increase TCP buffer sizes
echo 'net.core.rmem_max = 134217728' >> /etc/sysctl.conf
echo 'net.core.wmem_max = 134217728' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_rmem = 4096 87380 134217728' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_wmem = 4096 65536 134217728' >> /etc/sysctl.conf

# Apply changes
sysctl -p

TCP Congestion Control

# Check available congestion control algorithms
cat /proc/sys/net/ipv4/tcp_available_congestion_control

# Set BBR for higher throughput on lossy or high-latency paths
# (kernels before 4.13 also need the fq qdisc for pacing)
echo 'net.core.default_qdisc = fq' >> /etc/sysctl.conf
echo 'net.ipv4.tcp_congestion_control = bbr' >> /etc/sysctl.conf

Connection Tracking

# Increase connection tracking table size (these sysctls exist only
# once the nf_conntrack module is loaded)
echo 'net.netfilter.nf_conntrack_max = 1048576' >> /etc/sysctl.conf
echo 'net.netfilter.nf_conntrack_tcp_timeout_established = 7200' >> /etc/sysctl.conf

CPU and IRQ Optimization

CPU Affinity

# Bind network interrupts to a specific CPU (bitmask 2 = CPU 1);
# stop irqbalance first, or it will rewrite these masks
for irq in $(grep eth /proc/interrupts | cut -d: -f1); do
    echo "2" > /proc/irq/$irq/smp_affinity
done

# Set CPU affinity for your application
taskset -c 2,3 ./your-network-app

RPS (Receive Packet Steering)

# Enable RPS: the bitmask selects which CPUs handle received packets
# (7f = CPUs 0-6)
echo '7f' > /sys/class/net/eth0/queues/rx-0/rps_cpus

# Per-queue flow table for RFS (Receive Flow Steering); also set
# net.core.rps_sock_flow_entries, or RFS stays inactive
echo '32768' > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt

File Descriptor Limits

# Increase file descriptor limits
echo '* soft nofile 1048576' >> /etc/security/limits.conf
echo '* hard nofile 1048576' >> /etc/security/limits.conf

# Set system-wide limits
echo 'fs.file-max = 2097152' >> /etc/sysctl.conf

Memory Management

Huge Pages

# Reserve 1024 huge pages (2 MiB each by default on x86-64)
echo 'vm.nr_hugepages = 1024' >> /etc/sysctl.conf
# Group allowed to use huge-page shared memory (0 = root group)
echo 'vm.hugetlb_shm_group = 0' >> /etc/sysctl.conf

# Mount huge pages
mount -t hugetlbfs nodev /dev/hugepages

Swappiness

# Swap only under severe memory pressure (1 is safer than 0,
# which disables proactive swapping entirely)
echo 'vm.swappiness = 1' >> /etc/sysctl.conf

Application-Level Optimizations

Non-blocking I/O

// Go's runtime already puts sockets in non-blocking mode and multiplexes
// them with epoll, so this is only needed for syscall-level control
func setNonBlocking(conn net.Conn) error {
    tcpConn, ok := conn.(*net.TCPConn)
    if !ok {
        return errors.New("not a TCP connection")
    }
    // File() duplicates the descriptor; the dup shares the same open file
    // description, so O_NONBLOCK set here applies to the socket as well
    file, err := tcpConn.File()
    if err != nil {
        return err
    }
    defer file.Close()
    
    return syscall.SetNonblock(int(file.Fd()), true)
}

Zero-Copy Networking

// io.Copy hands the transfer to the ResponseWriter's io.ReaderFrom
// implementation; when the source is an *os.File, net/http's server
// uses sendfile(2), so the data never bounces through user space
func sendFile(w http.ResponseWriter, filePath string) error {
    file, err := os.Open(filePath)
    if err != nil {
        return err
    }
    defer file.Close()
    
    stat, err := file.Stat()
    if err != nil {
        return err
    }
    
    w.Header().Set("Content-Length", strconv.FormatInt(stat.Size(), 10))
    _, err = io.Copy(w, file)
    return err
}

Monitoring and Metrics

Network Statistics

# Monitor network interfaces
sar -n DEV 1 10

# Check TCP statistics
ss -s

# Monitor connection tracking
cat /proc/net/nf_conntrack | wc -l

Performance Tools

# Use perf for detailed profiling
perf record -g ./your-app
perf report

# Network latency monitoring
ping -i 0.1 target-host

# Bandwidth testing
iperf3 -c target-host -t 60 -P 4

eBPF for Advanced Monitoring

// eBPF program to monitor TCP connections
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __type(key, __u32);
    __type(value, __u64);
    __uint(max_entries, 256);
} tcp_stats SEC(".maps");

SEC("kprobe/tcp_connect")
int trace_tcp_connect(struct pt_regs *ctx) {
    __u32 key = 0;
    __u64 *count = bpf_map_lookup_elem(&tcp_stats, &key);
    if (count) {
        __sync_fetch_and_add(count, 1);
    }
    return 0;
}

// kprobe programs must declare a GPL-compatible license to load
char LICENSE[] SEC("license") = "GPL";

Best Practices

  1. Profile before optimizing - Always measure actual bottlenecks
  2. Test incrementally - Apply changes one at a time
  3. Monitor continuously - Set up proper monitoring and alerting
  4. Document changes - Keep track of all tuning parameters
  5. Test under load - Verify optimizations under realistic conditions

Common Pitfalls

  • Over-tuning: Don’t change parameters without understanding their impact
  • Ignoring context: Different workloads need different optimizations
  • Forgetting security: Performance changes shouldn’t compromise security
  • Missing monitoring: You can’t optimize what you can’t measure

Conclusion

Linux performance tuning is both an art and a science. Start with the basics, measure carefully, and iterate based on your specific workload requirements. Remember that the best optimization is often algorithmic improvement rather than system tuning.

In future posts, we’ll explore specific case studies and advanced eBPF techniques for network monitoring.