When the service is slow and the network is suspect, these are the tools we reach for, in this order, with the exact flags that find the answer.

Linux Network Debugging — tcpdump, ss, and eBPF in Anger

When a service is slow and the network is the suspect, you reach for tools. The good news is there are only a few you really need. The bad news is the man pages are long and the right flags are buried. This is a field-notes guide to the commands we run, in the order we run them, when we're trying to figure out why something is broken.

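Tools covered: ss, tcpdump, bpftrace, iftop, mtr. Not netstat: it ships in the obsolete net-tools package, and ss is faster on hosts with many sockets because it queries the kernel over netlink instead of parsing /proc/net/tcp.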

Start with `ss`: the connections view

ss (socket statistics) replaces the old netstat. First questions on any latency issue: how many connections do we have, are any in a weird state, are queues backed up?

Connection counts by state.

ss -H -tan state established | wc -l
ss -H -tan state time-wait | wc -l
ss -tan | awk 'NR>1 {print $1}' | sort | uniq -c

A huge number of TIME-WAIT sockets (tens of thousands) usually points at short-lived connections: the side that closes first keeps each socket in TIME-WAIT for 60s on Linux. Solutions: connection reuse on the client (keepalive, pooling), or net.ipv4.tcp_tw_reuse = 1, which only helps the side making outbound connections and requires TCP timestamps.

A huge number of CLOSE-WAIT is worse — it means the local side hasn't closed sockets that the peer has closed. Usually an application bug (forgot to close the socket).
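To make the two checks above repeatable, the per-state counts can be wrapped in small helpers. This is a sketch, not a standard tool: the function names `count_states` and `flag_state` and the thresholds in the usage line are our own placeholders, and the helpers read `ss -H -tan` output on stdin so they can be tried against saved output.

```bash
# Summarise TCP socket states from `ss -H -tan` output on stdin.
# Prints "<count> <state>" lines, busiest state first.
count_states() {
  awk '{ n[$1]++ } END { for (s in n) print n[s], s }' | sort -rn
}

# Read the summary from count_states and warn when one state exceeds a limit.
# Both arguments are assumptions to tune per host.
flag_state() {
  awk -v state="$1" -v limit="$2" \
    '$2 == state && $1 > limit { print "WARN: " $1 " sockets in " state }'
}
```

On a live host this might be run as `ss -H -tan | count_states | flag_state CLOSE-WAIT 100`; any CLOSE-WAIT warning is worth chasing in the application before touching kernel settings.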

Listen queue depth.

ss -ltn

For a listening socket, the Recv-Q column is the current accept-queue depth (completed connections waiting for accept()). The Send-Q column is the queue limit: the backlog passed to listen(), capped by net.core.somaxconn. If Recv-Q is consistently near Send-Q, your application isn't calling accept() fast enough, usually because of a blocked event loop or thread starvation.
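The "Recv-Q near Send-Q" comparison is easy to automate. The helper below is a sketch: `full_listeners` is a name we made up, and the 80% default is an arbitrary starting point rather than anything the kernel defines. It reads `ss -H -ltn` output on stdin.

```bash
# Print listeners whose accept queue (Recv-Q, column 2) is at or above
# a percentage of its limit (Send-Q, column 3).
full_listeners() {
  awk -v pct="${1:-80}" '
    $1 == "LISTEN" && $3 > 0 && $2 * 100 >= $3 * pct {
      printf "%s backlog %d/%d\n", $4, $2, $3
    }'
}
```

Typical use would be `ss -H -ltn | full_listeners 80`, run a few times in a row, since a single full sample can be a momentary burst.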

Per-socket details.

ss -tin

The -i flag adds TCP-level info: RTT, congestion window, retransmits, RTO. A connection with high retrans is having packet loss; one with high RTT is geographically distant or the network is slow.
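With hundreds of sockets the -i output is hard to scan by eye. A sketch of a filter, assuming the usual layout where each socket is one summary line followed by an indented detail line containing `rtt:<srtt>/<var>` and, when retransmits have happened, `retrans:<current>/<total>`. The name `suspect_sockets` and the 100 ms default are our own choices.

```bash
# Read `ss -H -tin` output on stdin and print peers whose smoothed RTT (ms)
# is at or above the threshold, or that have retransmitted at least once.
suspect_sockets() {
  awk -v min_rtt="${1:-100}" '
    /^[^ \t]/ { peer = $5; next }     # summary line: remember the peer address
    {
      rtt = 0; retrans = 0
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^rtt:/)     { split(substr($i, 5), a, "/"); rtt = a[1] }
        if ($i ~ /^retrans:/) { split(substr($i, 9), b, "/"); retrans = b[2] }
      }
      if (rtt + 0 >= min_rtt + 0 || retrans + 0 > 0)
        printf "%s rtt=%sms retrans=%s\n", peer, rtt, retrans
    }'
}
```

For example, `ss -H -tin | suspect_sockets 50` lists every peer slower than 50 ms or with any retransmits, which is usually a short enough list to reason about.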

Then `tcpdump`: the packet view

When ss says something is wrong but doesn't tell you what, tcpdump does. Capture, analyze in Wireshark.

Capture to a file (don't try to read live — too noisy).

tcpdump -i any -w /tmp/cap.pcap -s 0 'host <peer-ip> and port <port>'

-s 0 = capture full packet (not just headers). -w writes binary, much faster than printing. Always filter by host + port; capturing everything makes the file unusable.

For TLS-encrypted traffic you can still see TCP-level details (handshake, retransmits, RTT) — you just can't read the payload.
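A capture that someone forgets to stop will fill the disk, so we tend to bound it. The wrapper below is a sketch under assumptions: `pcap_filter` and `capture` are hypothetical helper names, the 30-second and 200000-packet limits are placeholders, and `timeout` comes from coreutils.

```bash
# Build a pcap filter for one peer, optionally narrowed to a port.
pcap_filter() {
  if [ -n "$2" ]; then
    printf 'host %s and port %s\n' "$1" "$2"
  else
    printf 'host %s\n' "$1"
  fi
}

# Capture full packets to a file, stopping after a time limit or a packet
# count, whichever comes first. Needs root or CAP_NET_RAW.
capture() {
  peer=$1; port=$2; seconds=${3:-30}
  timeout "$seconds" tcpdump -i any -nn -s 0 -c 200000 \
    -w /tmp/cap.pcap "$(pcap_filter "$peer" "$port")"
}
```

Something like `capture 10.0.0.9 5432 60` would then record one minute of traffic to that peer and exit on its own.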

Reading live (for sanity checks).

tcpdump -i any -nn -A 'host <peer> and port 80'

-nn = no DNS/port resolution (faster). -A = ASCII payload (useful for HTTP).
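The same text output can be summarised when the question is simply whether handshakes complete. A sketch, assuming tcpdump's default one-line format with a `Flags [..]` field; `handshake_summary` is a name we made up.

```bash
# Read tcpdump text output on stdin and count handshake-relevant packets.
# Flags [S] is a SYN, [S.] a SYN-ACK, and any flag set containing R a reset.
handshake_summary() {
  awk '
    {
      for (i = 1; i < NF; i++) {
        if ($i == "Flags") {
          f = $(i + 1)
          gsub(/\[|\]|,/, "", f)      # "[S.]," becomes "S."
          if (f == "S")  syn++
          if (f == "S.") synack++
          if (f ~ /R/)   rst++
        }
      }
    }
    END { printf "syn=%d synack=%d rst=%d\n", syn, synack, rst }'
}
```

It works on a saved file too: `tcpdump -nn -r /tmp/cap.pcap 'tcp[tcpflags] & (tcp-syn|tcp-rst) != 0' | handshake_summary`. Many SYNs with no SYN-ACKs, or repeated SYNs from the same source port, are the pattern to look for.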

Common scenarios we capture for:

"The server isn't responding." Capture on the server, filter by client IP. Is the SYN arriving? Is the SYN-ACK going out? Did we send a RST? A RST in reply to the SYN means nothing is listening on that port. A SYN that arrives and gets no reply at all points at a firewall drop or a full listen queue. If the SYN never arrives, the problem is upstream of this host.
"Connection dies after 60 seconds." Capture; look for keepalive packets or the FIN/RST that closed it. Often a load balancer idle timeout.
"Some requests are slow." Capture for a few minutes. In Wireshark, sort by RTT or response time. Find the slow ones; look at the packet pattern.

When `tcpdump` is too coarse: `bpftrace`

tcpdump shows packets. Sometimes you need to know what the kernel is doing with those packets. bpftrace runs eBPF programs from one-liners.

Find which processes are opening connections to a target.

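bpftrace -e 'tracepoint:syscalls:sys_enter_connect /args->uservaddr->sa_family == 2/ {
  // sa_family 2 is AF_INET; reinterpret the user-space sockaddr as sockaddr_in
  $sa = (struct sockaddr_in *)args->uservaddr;
  printf("%s -> %s\n", comm, ntop($sa->sin_addr.s_addr));
}'

(The tracepoint argument is named uservaddr, matching the connect() syscall definition. Cleaner with the official tcpconnect tool from bpfcc-tools, but the inline version works without extra packages. On kernels without BTF, add #include <linux/in.h> at the start of the program so the struct definitions resolve.)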

Histogram of TCP round-trip times.

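bpftrace -e 'kprobe:tcp_rcv_established { $sk = (struct sock *)arg0; $tp = (struct tcp_sock *)arg0; @rtt_us[ntop($sk->__sk_common.skc_daddr)] = hist($tp->srtt_us >> 3); }'

This reads the kernel's own smoothed RTT estimate (srtt_us is stored shifted left by 3) and gives you a histogram in microseconds per remote IPv4 address. It is keyed by peer rather than by comm because tcp_rcv_established often runs in softirq context, where comm is whichever task happened to be on the CPU. Useful for "this service has weird latency to its dependencies." The struct casts need kernel BTF or the matching headers.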

Drop tracing.

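bpftrace -e 'tracepoint:skb:kfree_skb { @[kstack] = count(); }'

Counts dropped packets by kernel stack. Run for 30s, press Ctrl-C, and read the top drop sites. They usually point at netfilter rules or a queue overflow. We attach to the tracepoint rather than kprobe:kfree_skb because on recent kernels kfree_skb can be an inline wrapper that a kprobe cannot attach to.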

The BCC tools (packaged as bpfcc-tools on Debian/Ubuntu and bcc-tools on RHEL/Fedora) include dozens of pre-built tools for common needs: tcptracer, tcpretrans, tcptop, tcpsubnet. We start with these before writing custom bpftrace.

Network rate / who's sending what: `iftop`

When you want a live view of bandwidth by connection:

iftop -i eth0 -n

Interactive view, sorted by current bandwidth; -n skips DNS lookups, and eth0 should be replaced with your interface name. Useful for "who is saturating my NIC?" The usual answers: the egress on a streaming endpoint, the rsync that someone started during business hours, the runaway log shipper.

Path issues: `mtr`

When the question is "is this a us-or-them problem?", mtr shows the hop-by-hop path with packet loss and latency.

mtr -rwn -c 100 <target>

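-r = report mode (run, then exit). -w = wide output (full hostnames). -n = no DNS lookups. -c 100 = 100 probes per hop. The output shows each hop and what percentage of probes it lost. Loss that starts at a hop and persists through every later hop to the destination is real, and the first hop showing it is where to look. Loss at a middle hop that disappears at later hops is usually that router rate-limiting its ICMP replies and can be ignored. Loss that first appears at the destination itself points at your host or its last-hop link.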

We use this when escalating to a cloud provider — "your network is dropping at hop X" is much more actionable than "the network is slow."
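Because only loss that reaches the final hop reflects end-to-end loss, that is the number worth extracting before escalating. A sketch that assumes the standard report layout, where hop lines contain `|--` and the third column is Loss%; `final_hop_loss` is a hypothetical helper name.

```bash
# Read an `mtr -rwn` report on stdin and print the loss percentage
# reported for the last hop (the destination).
final_hop_loss() {
  awk '/\|--/ { loss = $3 } END { sub(/%/, "", loss); print loss + 0 }'
}
```

For instance, `mtr -rwn -c 100 <target> | final_hop_loss` prints 0 when a scary-looking middle hop turns out to be ICMP rate limiting.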

The order we run them

For a "service is slow" report:

1. ss -tin on the source. Are connections established? RTT high? Retransmits?
2. ss -ltn on the destination. Is the listen queue full?
3. tcpdump between source and destination. Capture for 30 seconds; look at the pattern.
4. mtr if cross-region. Is there path loss?
5. bpftrace / tcptop / tcptracer if the application-level symptoms don't match what packets show.
6. Read application logs in parallel through all of this: half the time it's an app bug masquerading as a network issue.

Things we wish we'd known earlier

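tcpdump on a busy server can drop packets. The capture process can't always keep up with kernel-level traffic. Use -B to enlarge the kernel capture buffer (the value is in KiB) and write to a fast disk. If it still drops, capture on a specific interface or filter more aggressively. tcpdump prints "N packets dropped by kernel" when it exits; inside the capture, drops look like "first packets are fine, then the trace looks broken" (missing segments, ACKs for data you never saw), which is confusing.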
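tcpdump writes its exit summary to stderr, so it is easy to lose. A sketch of a check that pulls out the drop count; `kernel_drops` is our own name, and the stats file path in the usage note is a placeholder.

```bash
# Read tcpdump's exit summary on stdin and print the number of packets
# the kernel dropped because the capture buffer was full.
kernel_drops() {
  awk '/packets dropped by kernel/ { print $1 }'
}
```

One way to use it: run the capture with `2>/tmp/cap.stats` appended, then `kernel_drops < /tmp/cap.stats`. Anything other than 0 means the trace has holes and should be read with that in mind.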

tcpdump with TLS reveals less than you think. You see the handshake, sizes, timing. You don't see payload. Use mitmproxy or capture on the unencrypted side (load balancer's internal hop) for content visibility.

Conntrack table fills up. On heavy-traffic gateways, the netfilter connection-tracking table can fill, and new connections get dropped (the kernel logs "nf_conntrack: table full, dropping packet"). Compare /proc/sys/net/netfilter/nf_conntrack_count against nf_conntrack_max in the same directory. If they are close, raise the max or exempt the relevant traffic from connection tracking.
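"Close" is easier to judge as a percentage. A minimal sketch, with helper names of our own; the /proc files only exist when the nf_conntrack module is loaded.

```bash
# Integer percentage of the conntrack table in use.
conntrack_pct() {
  echo $(( $1 * 100 / $2 ))
}

# Read the live counters and print a one-line summary.
conntrack_usage() {
  count=$(cat /proc/sys/net/netfilter/nf_conntrack_count)
  max=$(cat /proc/sys/net/netfilter/nf_conntrack_max)
  echo "conntrack: $count/$max ($(conntrack_pct "$count" "$max")%)"
}
```

Running `conntrack_usage` during peak traffic gives the number to alert on; the threshold itself is a judgment call for your workload.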

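Reverse path filter. rp_filter = 1 (strict mode, set by default on many distributions; the kernel's built-in default is 0) drops packets that arrive on an interface the kernel would not use to route back to the source. Bites in multi-NIC setups: an asymmetric route drops silently. Check /proc/sys/net/ipv4/conf/all/rp_filter and the per-interface file; the kernel applies the higher of the two values.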
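Since the kernel uses the maximum of the "all" value and the per-interface value, checking one file can mislead. A sketch that prints the effective setting per interface; both function names are hypothetical.

```bash
# The effective rp_filter mode is the larger of conf/all and conf/<iface>.
effective_rp_filter() {
  if [ "$1" -gt "$2" ]; then echo "$1"; else echo "$2"; fi
}

# Print "<iface> <effective mode>" for every interface (0 off, 1 strict, 2 loose).
rp_filter_report() {
  all=$(cat /proc/sys/net/ipv4/conf/all/rp_filter)
  for f in /proc/sys/net/ipv4/conf/*/rp_filter; do
    iface=${f#/proc/sys/net/ipv4/conf/}
    iface=${iface%/rp_filter}
    echo "$iface $(effective_rp_filter "$all" "$(cat "$f")")"
  done
}
```

On a multi-NIC host, `rp_filter_report` shows at a glance which interfaces are in strict mode and therefore at risk with asymmetric routes.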

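MTU mismatches. When packets of certain sizes vanish silently (large requests hang, small requests succeed), suspect MTU. The classic culprit is a VPN or tunnel whose MTU is lower than the interface MTU, combined with a firewall that blocks the ICMP "fragmentation needed" replies. Probe with ping -M do -s 1472 <target>: 1472 bytes of payload plus 28 bytes of ICMP and IPv4 headers is exactly 1500, and -M do forbids fragmentation.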
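When the 1472-byte probe fails, the next question is what size does pass. The sketch below binary-searches for it with the same ping flags (iputils ping on Linux). The helper names are ours, and it assumes a 576-byte packet always gets through and that failures are caused by size rather than random loss, so rerun it if the answer looks odd.

```bash
# IPv4 header (20 bytes) plus ICMP header (8 bytes) sit on top of the payload.
icmp_payload() {
  echo $(( $1 - 28 ))
}

# Succeeds if one unfragmentable ping with the given total size gets a reply.
probe_mtu() {
  ping -c 1 -W 1 -M do -s "$(icmp_payload "$2")" "$1" >/dev/null 2>&1
}

# Largest total packet size between 576 and the upper bound that passes.
find_path_mtu() {
  target=$1; lo=576; hi=${2:-1500}
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi + 1) / 2 ))
    if probe_mtu "$target" "$mid"; then lo=$mid; else hi=$(( mid - 1 )); fi
  done
  echo "$lo"
}
```

`find_path_mtu <target>` prints the largest size that passed; a result such as 1420 on a path you expected to carry 1500 is a strong hint that a tunnel is in the way.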

What to read next

eBPF tools for everyday ops — bpftrace patterns — the eBPF side expanded
Linux io_uring — async I/O patterns we use — adjacent kernel-side performance
Linux performance tuning for production servers — broader system-perf knobs
Container resource limits — what they actually do — kernel resource accounting under containers

Network debugging rewards calm and a systematic approach. The kernel has been doing this for decades and is usually right about what it's seeing. The tools above expose what the kernel knows; the discipline is reading the output without jumping to conclusions. Most of the time the answer is in ss or tcpdump; the eBPF tools are for the harder remainder.

About Kiril Urbonas
