The checksum calculation of the vhost user device is different from the f-stack #850

Open
zcjie1 opened this issue Dec 12, 2024 · 1 comment


zcjie1 commented Dec 12, 2024

I added extra parameters to support the vhost_user device and to enable UDP communication with the virtio_user device. However, traffic only flows one way: virtio can send data to vhost, but vhost cannot send data back to virtio.

I found that the vhost_user device calculates the UDP checksum differently from f-stack, and I am unsure which one is correct.

When I disabled checksum offload in vhost_user, all communication worked normally.

The UDP checksum calculation code of vhost_user device:

	if (rte_net_intel_cksum_prepare(mbuf) < 0)
		return;

	if (rte_raw_cksum_mbuf(mbuf, hdr_len, rte_pktmbuf_pkt_len(mbuf) - hdr_len, &csum) < 0)
		return;

	csum = ~csum;
	if (unlikely((mbuf->packet_type & RTE_PTYPE_L4_UDP) && csum == 0))
		csum = 0xffff;

	if (rte_pktmbuf_data_len(mbuf) >= csum_offset + 1)
		*rte_pktmbuf_mtod_offset(mbuf, uint16_t *, csum_offset) = csum;

static inline int
rte_raw_cksum_mbuf(const struct rte_mbuf *m, uint32_t off, uint32_t len,
	uint16_t *cksum)
{
	const struct rte_mbuf *seg;
	const char *buf;
	uint32_t sum, tmp;
	uint32_t seglen, done;

	/* easy case: all data in the first segment */
	if (off + len <= rte_pktmbuf_data_len(m)) {
		*cksum = rte_raw_cksum(rte_pktmbuf_mtod_offset(m,
				const char *, off), len);
		return 0;
	}

	if (unlikely(off + len > rte_pktmbuf_pkt_len(m)))
		return -1; /* invalid params, return a dummy value */

	/* else browse the segment to find offset */
	seglen = 0;
	for (seg = m; seg != NULL; seg = seg->next) {
		seglen = rte_pktmbuf_data_len(seg);
		if (off < seglen)
			break;
		off -= seglen;
	}
	RTE_ASSERT(seg != NULL);
	if (seg == NULL)
		return -1;
	seglen -= off;
	buf = rte_pktmbuf_mtod_offset(seg, const char *, off);
	if (seglen >= len) {
		/* all in one segment */
		*cksum = rte_raw_cksum(buf, len);
		return 0;
	}

	/* hard case: process checksum of several segments */
	sum = 0;
	done = 0;
	for (;;) {
		tmp = __rte_raw_cksum(buf, seglen, 0);
		if (done & 1)
			tmp = rte_bswap16((uint16_t)tmp);
		sum += tmp;
		done += seglen;
		if (done == len)
			break;
		seg = seg->next;
		buf = rte_pktmbuf_mtod(seg, const char *);
		seglen = rte_pktmbuf_data_len(seg);
		if (seglen > len - done)
			seglen = len - done;
	}

	*cksum = __rte_raw_cksum_reduce(sum);
	return 0;
}

static inline uint32_t
__rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
{
	const void *end;

	for (end = RTE_PTR_ADD(buf, RTE_ALIGN_FLOOR(len, sizeof(uint16_t)));
	     buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) {
		uint16_t v;

		memcpy(&v, buf, sizeof(uint16_t));
		sum += v;
	}

	/* if length is odd, keeping it byte order independent */
	if (unlikely(len % 2)) {
		uint16_t left = 0;

		memcpy(&left, end, 1);
		sum += left;
	}

	return sum;
}

The UDP checksum calculation code of f-stack:

} else {
	// printf("%s\n", __func__);
	char b[9];

	bcopy(((struct ipovly *)ip)->ih_x1, b, 9);
	bzero(((struct ipovly *)ip)->ih_x1, 9);
	((struct ipovly *)ip)->ih_len = (proto == IPPROTO_UDP) ?
			  uh->uh_ulen : htons(ip_len);
	uh_sum = in_cksum(m, len + sizeof (struct ip));
	bcopy(b, ((struct ipovly *)ip)->ih_x1, 9);
}
if (uh_sum) {
	printf("udp_cksum_result: %u\n", uh_sum);
	UDPSTAT_INC(udps_badsum);
	m_freem(m);
	return (IPPROTO_DONE);
}

I don't fully understand the implementation of the in_cksum function, but it does not appear to be performing a 16-bit checksum calculation.


zcjie1 commented Dec 12, 2024

I found the reason.

In vhost_user, the pseudo-header checksum of a UDP packet is calculated by the following code:

if ((ol_flags & RTE_MBUF_F_TX_L4_MASK) == RTE_MBUF_F_TX_UDP_CKSUM) {
	if (ol_flags & RTE_MBUF_F_TX_IPV4) {
		udp_hdr = (struct rte_udp_hdr *)((char *)ipv4_hdr +
				m->l3_len);
		udp_hdr->dgram_cksum = rte_ipv4_phdr_cksum(ipv4_hdr,
				ol_flags);
		// printf("ipv4\n");
	} else {
		ipv6_hdr = rte_pktmbuf_mtod_offset(m,
			struct rte_ipv6_hdr *, inner_l3_offset);
		/* non-TSO udp */
		udp_hdr = rte_pktmbuf_mtod_offset(m,
				struct rte_udp_hdr *,
				inner_l3_offset + m->l3_len);
		udp_hdr->dgram_cksum = rte_ipv6_phdr_cksum(ipv6_hdr,
				ol_flags);
		// printf("ipv6\n");
	}
}

with the definition:

#define RTE_MBUF_F_TX_IPV4 (1ULL << 55)

Packet is IPv4. This flag must be set when using any offload feature
(TSO, L3 or L4 checksum) to tell the NIC that the packet is an IPv4
packet. If the packet is a tunneled packet, this flag is related to
the inner headers.

However, the ff_dpdk_if_send function in f-stack sets only the RTE_MBUF_F_TX_UDP_CKSUM flag and does not specify the IP version of the packet, so an IPv4 packet falls through to the IPv6 branch of the code above:

if (offload.udp_csum) {
    head->ol_flags |= RTE_MBUF_F_TX_UDP_CKSUM;
    head->l2_len = RTE_ETHER_HDR_LEN;
    head->l3_len = iph_len;
}
