Do software routers really fall apart on small packets? What is a small packet, and what is actually going on?

2023-06-19 23:49:48

In the last few articles there has been an ongoing argument about whether software routers fall apart on small packets.

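Today let's talk about what a small packet is and why small packets drag performance down. At the end of the article iN provides the source code of a test program, so you can measure for yourself how badly the router in your own home does.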

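The slang "小包拉垮" corresponds to what network engineers call small packet performance degradation; "degradation" is the part the slang word "拉垮" renders. The effect is well known and applies to network equipment in general.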

So why does small packet degradation happen at all?

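It happens because data travelling over Ethernet first has to be wrapped in Ethernet frames. A standard Ethernet frame (not counting the physical-layer preamble and the inter-frame gap) is at least 64 bytes and at most 1518 bytes long; the upper bound follows from the 1500-byte MTU plus 18 bytes of header and checksum. That length covers the destination MAC address, the source MAC address, the length/type field, the payload (the data) and the checksum field. The data rides inside the payload of the frame. If the data to be sent is larger than the MTU (maximum transmission unit), it is split into pieces along the way (at the IP layer, as fragments), and each piece is wrapped in its own Ethernet frame.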
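To make the splitting concrete, here is a rough sketch that counts the minimum number of IPv4 packets one UDP datagram needs. It assumes a 1500-byte MTU, an IPv4 header without options (20 bytes) and an 8-byte UDP header; the function name `ipv4_fragment_count` is made up for this illustration and real network stacks may split slightly differently.

```python
import math

IP_HEADER = 20   # IPv4 header without options, in bytes (assumption)
UDP_HEADER = 8   # UDP header, in bytes


def ipv4_fragment_count(udp_payload: int, mtu: int = 1500) -> int:
    """Minimum number of IPv4 packets needed to carry one UDP datagram."""
    datagram = udp_payload + UDP_HEADER      # what the IP layer has to carry
    last_capacity = mtu - IP_HEADER          # data bytes that fit in one packet
    if datagram <= last_capacity:
        return 1                             # fits without fragmentation
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = last_capacity // 8 * 8
    return 1 + math.ceil((datagram - last_capacity) / per_fragment)


if __name__ == "__main__":
    for size in (64, 1472, 1473, 6400):
        print(f"{size}-byte UDP payload -> {ipv4_fragment_count(size)} IPv4 packet(s)")
```

Under these assumptions a payload of up to 1472 bytes still fits in a single frame, while a 6400-byte payload is carried as 5 fragments.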

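All of this requires NICs, routers and switches to decapsulate, encapsulate and reassemble data. Switch hardware can usually process Ethernet frames directly in its circuitry. An Ethernet frame does have a minimum length, though: 64 bytes. Take away the 14-byte header and the 4-byte trailer and the minimum payload is 46 bytes. Any payload shorter than 46 bytes is padded and still uses up one whole minimum-size frame. That is the smallest packet there is.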
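The arithmetic in the paragraph above can be written down as a small sketch. The helper names `frame_length` and `payload_share` are invented for this example, and VLAN tags and jumbo frames are ignored.

```python
ETH_HEADER = 14     # destination MAC (6) + source MAC (6) + length/type (2)
ETH_FCS = 4         # checksum at the end of the frame
MIN_PAYLOAD = 46    # shorter payloads are padded up to this size
MAX_PAYLOAD = 1500  # standard MTU


def frame_length(payload: int) -> int:
    """Length in bytes of the Ethernet frame that carries `payload` bytes."""
    if not 0 <= payload <= MAX_PAYLOAD:
        raise ValueError("payload does not fit in one standard frame")
    return ETH_HEADER + max(payload, MIN_PAYLOAD) + ETH_FCS


def payload_share(payload: int) -> float:
    """Fraction of the frame that is actual payload."""
    return payload / frame_length(payload)


if __name__ == "__main__":
    for size in (1, 46, 100, 1500):
        print(f"{size:>4} B payload -> {frame_length(size):>4} B frame, "
              f"{payload_share(size):.1%} payload")
```

A 1-byte payload and a 46-byte payload both produce a 64-byte frame, which is why very small payloads are so expensive per useful byte.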

Why is it called an Ethernet frame?

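Because the frame is the smallest unit that is transmitted as a whole on Ethernet. At this level the raw bandwidth of your network interface matters less than how many Ethernet frames your device can process per unit of time. So small packet degradation exists on every kind of network equipment, hardware or software; it is not unique to software routers.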
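For a sense of scale, this sketch computes how many frames per second a link can carry at most for a given frame size. It counts the 8 bytes of preamble and start-of-frame delimiter and the 12-byte inter-frame gap that every frame costs on the wire; the function name `max_frames_per_second` is made up for this example.

```python
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter, in bytes
INTER_FRAME_GAP = 12  # mandatory idle time between frames, in byte times


def max_frames_per_second(link_bps: float, frame_bytes: int) -> float:
    """Upper bound on frames per second for a link speed and frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD + INTER_FRAME_GAP) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    gigabit = 1_000_000_000
    for frame in (64, 512, 1518):
        print(f"{frame:>4} B frames: {max_frames_per_second(gigabit, frame):,.0f} frames/s")
```

On gigabit Ethernet this works out to about 1.49 million frames per second for 64-byte frames but only about 81 thousand for 1518-byte frames, so a device that keeps up with large frames may still be far from keeping up with small ones.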

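On a software router, however, the source and destination addresses have to be parsed by the software router itself. It has no circuitry dedicated to processing network data, so every packet has to be decoded, processed a second time and re-encapsulated by the CPU. That is extra overhead per packet. If packets are large enough to be sent at the maximum transmission unit, the cost of this repeated decoding is spread over more bytes and weighs less. But when small packets are sent continuously and a given throughput has to be sustained, the processing overhead becomes very noticeable. The problem is even more pronounced on software routers that rely purely on the CPU and do not even include a dedicated switch chip.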
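A back-of-the-envelope way to see the pressure on a CPU is to ask how many clock cycles it has per packet. The sketch below assumes, purely for illustration, a single 2 GHz core that does all the forwarding on a saturated gigabit link; the clock speed and the name `cycles_per_packet` are assumptions, not measurements from this article.

```python
def cycles_per_packet(cpu_hz: float, packets_per_second: float) -> float:
    """Clock cycles available per packet if one core handles every packet."""
    return cpu_hz / packets_per_second


if __name__ == "__main__":
    cpu_hz = 2_000_000_000  # hypothetical 2 GHz core
    link_bps = 1_000_000_000
    for frame in (64, 1518):
        # Each frame costs 20 extra bytes on the wire (preamble + inter-frame gap).
        pps = link_bps / ((frame + 20) * 8)
        print(f"{frame:>4} B frames: {pps:,.0f} pps, "
              f"{cycles_per_packet(cpu_hz, pps):,.0f} cycles per packet")
```

With these numbers the core gets roughly 1,344 cycles per 64-byte frame, compared with about 24,600 cycles per 1518-byte frame, and parsing, lookup, rewriting and queueing all have to fit in that budget.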

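I wrote a test program and ran a few rounds of tests at home. With the packet size set to 64 bytes:

The throughput is only around 200 Mbps. The whole network path is busy handling small packets, and the processing overhead far outweighs the useful data actually being carried.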
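To relate a figure like this to packets per second, here is a small sketch. It assumes the roughly 200 Mbps is the payload rate that the test program prints (packets per second × packet size × 8), that the traffic is UDP over IPv4 without options, and that the payload fits in one frame; the helper names are made up for this example.

```python
UDP_HEADER = 8
IPV4_HEADER = 20        # no IP options (assumption)
ETH_HEADER = 14
ETH_FCS = 4
MIN_ETH_PAYLOAD = 46
PREAMBLE_AND_GAP = 20   # 8 B preamble/SFD + 12 B inter-frame gap


def packets_per_second(payload_bps: float, payload_bytes: int) -> float:
    """Packet rate implied by a payload throughput and a payload size."""
    return payload_bps / (payload_bytes * 8)


def wire_bytes(payload_bytes: int) -> int:
    """Bytes one unfragmented UDP packet occupies on the Ethernet wire."""
    ip_packet = payload_bytes + UDP_HEADER + IPV4_HEADER
    frame = ETH_HEADER + max(ip_packet, MIN_ETH_PAYLOAD) + ETH_FCS
    return frame + PREAMBLE_AND_GAP


if __name__ == "__main__":
    pps = packets_per_second(200e6, 64)
    wire_bps = pps * wire_bytes(64) * 8
    print(f"{pps:,.0f} packets/s, about {wire_bps / 1e6:.0f} Mbps on the wire")
```

Under these assumptions, 200 Mbps of 64-byte payloads is about 390,000 packets per second and occupies roughly 406 Mbps of the link, because each 64-byte payload takes 130 bytes on the wire.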

But once the packet size is increased, for example to 6400 bytes:

The network speed immediately jumps to nearly full line rate.

So where is the limit for small packets? With the packet length set to 400 bytes, the network still runs at almost full speed.

At 100 bytes, the speed drops sharply:

At this point the overhead of processing the packets far outweighs the amount of useful data they carry.

With the size set to 16 bytes, the measured network speed is down to a two-digit figure:

All of these tests were run through a dedicated hardware router.

What happens if we swap in a software router?

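As it happens, I still have a few software routers left over from earlier tinkering, so I connected one between the same two test computers: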


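With the server, the test machine and the cabling all unchanged, and running 16-byte packets, the software router manages roughly one third of the rate of the dedicated router.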

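This is the effect of the software router's own performance limits on the network. While testing the software router, we can also see the following: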

About 80 seconds into the test, the software router's buffers are exhausted and the packet forwarding rate drops sharply.
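If you would rather not spot such a drop by eye, a list of per-second packet rates like the `packet_rates` list collected by the test program can be scanned for it. This is only a sketch: the function name, the 10-second baseline, the 50% threshold and the 3-second hold are arbitrary choices for illustration, and the sample data in the example is synthetic, not a measurement.

```python
from statistics import mean


def first_sustained_drop(rates, baseline_seconds=10, threshold=0.5, hold=3):
    """Return the index (second) where the rate first stays below
    threshold * baseline for `hold` consecutive samples, or None."""
    if len(rates) < baseline_seconds + hold:
        return None
    baseline = mean(rates[:baseline_seconds])  # rate at the start of the test
    run = 0
    for i in range(baseline_seconds, len(rates)):
        if rates[i] < threshold * baseline:
            run += 1
            if run == hold:
                return i - hold + 1            # first sample of the low run
        else:
            run = 0                            # a single dip does not count
    return None


if __name__ == "__main__":
    synthetic = [100_000] * 80 + [30_000] * 20  # made-up data, not a measurement
    print(first_sustained_drop(synthetic))
```

Requiring several low samples in a row keeps a single slow second, for example one caused by the sending machine itself, from being reported as a collapse.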

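Problems like this mostly come down to the fact that a general-purpose CPU, as used in software routers, has its limits when it comes to processing network traffic.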

The program is attached below. After each test it generates a chart, so you can measure how your own network really behaves:

import argparse
import socket
import threading
import time
from datetime import datetime

import matplotlib.pyplot as plt
import numpy as np


def send_small_packet_data(host, port, data, packet_size, test_duration, packet_rates):
    """Send UDP datagrams as fast as possible and record the per-second rate."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = data.encode()  # encode once instead of on every send
    start_time = time.time()
    packet_count = 0
    second_count = 0
    while time.time() - start_time < test_duration:
        try:
            sock.sendto(payload, (host, port))
            packet_count += 1
            elapsed_time = time.time() - start_time
            if int(elapsed_time) > second_count:
                # A full second has passed: record and report the rate.
                packet_rate = packet_count
                second_count += 1
                packet_rates.append(packet_rate)
                human_readable_rate = get_human_readable_packet_rate(packet_rate)
                transmission_rate = packet_rate * packet_size * 8  # payload bits per second
                human_readable_transmission_rate = get_human_readable_transmission_rate(transmission_rate)
                print(f"Sent {packet_rate} packets in last second, "
                      f"current rate: {human_readable_rate}, "
                      f"packet size: {len(payload)} bytes, "
                      f"transmission rate: {human_readable_transmission_rate}")
                packet_count = 0
        except socket.error as e:
            print(f"Error sending data: {e}")
            break
    print(f"Finished sending packets for {second_count} seconds.")
    sock.close()


def get_human_readable_packet_rate(packet_rate):
    """Format a packet rate as pps, kpps, mpps or gpps."""
    if packet_rate >= 10**9:
        return f"{packet_rate / 10**9:.2f} gpps"
    elif packet_rate >= 10**6:
        return f"{packet_rate / 10**6:.2f} mpps"
    elif packet_rate >= 10**3:
        return f"{packet_rate / 10**3:.2f} kpps"
    else:
        return f"{packet_rate:.2f} pps"


def get_human_readable_transmission_rate(transmission_rate):
    """Format a bit rate as bps, Kbps, Mbps or Gbps."""
    units = ["bps", "Kbps", "Mbps", "Gbps"]
    unit_index = 0
    while transmission_rate >= 1000 and unit_index < 3:
        transmission_rate /= 1000
        unit_index += 1
    return f"{transmission_rate:.2f} {units[unit_index]}"


def receive_small_packet_data(host, port, buffer_size, packet_rates):
    """Receive UDP datagrams until Ctrl+C and record the per-second rate."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    start_time = time.time()
    packet_count = 0   # packets in the current second
    total_count = 0    # packets since the start
    second_count = 0
    try:
        while True:
            data, _ = sock.recvfrom(buffer_size)
            packet_count += 1
            total_count += 1
            elapsed_time = time.time() - start_time
            if int(elapsed_time) > second_count:
                packet_rate = packet_count
                second_count += 1
                packet_rates.append(packet_rate)
                human_readable_rate = get_human_readable_packet_rate(packet_rate)
                print(f"Received {packet_count} packets in last second, "
                      f"current rate: {human_readable_rate}, "
                      f"packet size: {len(data)} bytes")
                packet_count = 0
    except KeyboardInterrupt:
        print(f"Stopped receiving packets. Total received: {total_count} packets")
    sock.close()


def generate_packet_rate_chart(packet_rates, host, port, packet_size, test_duration):
    """Plot the recorded per-second packet rates and save the chart as a JPEG."""
    # Server mode passes test_duration=None; fall back to the number of samples.
    duration = test_duration if test_duration else len(packet_rates)
    plt.plot(packet_rates)
    plt.xlabel('Time (seconds)')
    plt.ylabel('Packet Rate (packets/second)')
    plt.xticks(np.arange(0, duration + 1, step=max(1, duration // 10)))
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    plt.title(f"Packet Rate Test - {timestamp}", fontsize=10)
    test_info = f"Test Parameters:\n Host: {host} Port: {port} Packet Size: {packet_size} bytes Test Duration: {duration} seconds"
    plt.figtext(0.5, -0.05, test_info, wrap=True, horizontalalignment='center', fontsize=10)
    plt.grid(True)
    plt.tight_layout()
    plt.savefig(f"packet_rate_{timestamp}.jpeg")
    plt.show()


def send_tcp_data(host, port, data, packet_size, test_duration, packet_rates):
    """Send data over TCP and record the per-second rate of send calls.

    TCP is a byte stream, so one send call is not necessarily one packet
    on the wire; the "packet" rate here counts send calls.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.connect((host, port))
    payload = data.encode()
    start_time = time.time()
    packet_count = 0
    second_count = 0
    while time.time() - start_time < test_duration:
        try:
            sock.sendall(payload)  # sendall avoids silently sending only part of the data
            packet_count += 1
            elapsed_time = time.time() - start_time
            if int(elapsed_time) > second_count:
                packet_rate = packet_count
                second_count += 1
                packet_rates.append(packet_rate)
                human_readable_rate = get_human_readable_packet_rate(packet_rate)
                transmission_rate = packet_rate * packet_size * 8  # payload bits per second
                human_readable_transmission_rate = get_human_readable_transmission_rate(transmission_rate)
                print(f"Sent {packet_rate} packets in last second, "
                      f"current rate: {human_readable_rate}, "
                      f"packet size: {len(payload)} bytes, "
                      f"transmission rate: {human_readable_transmission_rate}")
                packet_count = 0
        except socket.error as e:
            print(f"Error sending data: {e}")
            break
    print(f"Finished sending packets for {second_count} seconds.")
    sock.close()


def receive_tcp_data(host, port, buffer_size, packet_rates):
    """Accept one TCP connection and record the per-second rate of recv calls."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen(1)
    conn, addr = sock.accept()
    start_time = time.time()
    packet_count = 0   # recv calls in the current second
    total_count = 0    # recv calls since the start
    second_count = 0
    try:
        while True:
            data = conn.recv(buffer_size)
            if not data:
                break  # the sender closed the connection
            packet_count += 1
            total_count += 1
            elapsed_time = time.time() - start_time
            if int(elapsed_time) > second_count:
                packet_rate = packet_count
                second_count += 1
                packet_rates.append(packet_rate)
                human_readable_rate = get_human_readable_packet_rate(packet_rate)
                print(f"Received {packet_count} packets in last second, "
                      f"current rate: {human_readable_rate}, "
                      f"packet size: {len(data)} bytes")
                packet_count = 0
    except KeyboardInterrupt:
        print(f"Stopped receiving packets. Total received: {total_count} packets")
    conn.close()
    sock.close()


def run_test(server_mode, tcp_mode, host, port, data, packet_size, test_duration):
    """Run the receiver (server mode) or the sender (client mode), then draw the chart."""
    packet_rates = []
    if tcp_mode:
        send_func = send_tcp_data
        receive_func = receive_tcp_data
    else:
        send_func = send_small_packet_data
        receive_func = receive_small_packet_data
    if server_mode:
        print("Running in server mode...")
        receive_func(host, port, packet_size, packet_rates)
    else:
        print("Running in client mode...")
        start_time = time.time()
        thread = threading.Thread(target=send_func,
                                  args=(host, port, data, packet_size, test_duration, packet_rates))
        thread.start()
        thread.join()
        end_time = time.time()
        elapsed_time = end_time - start_time
        print(f"Sent packets for {elapsed_time:.4f} seconds.")
    generate_packet_rate_chart(packet_rates, host, port, packet_size, test_duration)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Small Packet Data Test")
    parser.add_argument("--server", action="store_true", help="Run in server mode")
    parser.add_argument("--client", action="store_true", help="Run in client mode")
    parser.add_argument("--tcp", action="store_true", help="Run in TCP mode")
    parser.add_argument("host", type=str, help="Server IP address")
    parser.add_argument("-z", "--packet-size", type=int, default=64,
                        help="Packet size in bytes (default: 64)")
    parser.add_argument("-d", "--test-duration", type=float, default=10.0,
                        help="Test duration in seconds (client mode only, default: 10.0)")
    args = parser.parse_args()

    host = args.host
    port = 12345
    data = 'X' * args.packet_size
    packet_size = args.packet_size
    test_duration = args.test_duration

    if args.server:
        run_test(True, args.tcp, host, port, data, packet_size, None)
    elif args.client:
        run_test(False, args.tcp, host, port, data, packet_size, test_duration)
    else:
        print("Please specify either server mode or client mode.")