Skip to content

Commit

Permalink
linux-gen: fix unaligned access in checksum calculation
Browse files Browse the repository at this point in the history
Architectures that support unaligned access do not necessarily support
it for all instructions that a compiler might choose to use. Therefore
we should not mispresent the alignment of an object to the compiler even
when unaligned access is typically supported.

The current checksum code casts a potentially unaligned address to a
pointer to uint32_t when _ODP_UNALIGNED is set. This typically works but
may, in some cases, result in generation of invalid code. In particular,
when targeting AArch32 (which supports unaligned access for regular loads
and stores) gcc-12.2 generated LDM instructions which always reguire
4-byte alignment.

Fix the problem by informing the compiler about the non-standard alignment
by using a type attribute.

Signed-off-by: Janne Peltonen <janne.peltonen@nokia.com>
  • Loading branch information
JannePeltonen committed Nov 10, 2023
1 parent fd091a6 commit 49b2c62
Showing 1 changed file with 22 additions and 7 deletions.
29 changes: 22 additions & 7 deletions platform/linux-generic/include/odp_chksum_internal.h
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
/* Copyright (c) 2020, Nokia
/* Copyright (c) 2020, 2023, Nokia
* All rights reserved.
*
* SPDX-License-Identifier: BSD-3-Clause
Expand Down Expand Up @@ -62,8 +62,23 @@ static inline uint16_t chksum_finalize(uint64_t sum)
static uint64_t chksum_partial(const void *addr, uint32_t len, uint32_t offset)
{
const uint8_t *b;
const uint16_t *w;
const uint32_t *d;
#if _ODP_UNALIGNED
/*
* _ODP_UNALIGNED does not guarantee that all possible ways of
* accessing memory can be unaligned. Make the compiler aware
* of the possible unalignment so that it does not generate
* instructions (such as LDM of AArch32) that require higher
* alignment than one byte.
*/
typedef uint32_t x_uint32_t ODP_ALIGNED(1);
typedef uint16_t x_uint16_t ODP_ALIGNED(1);
#else
/* In this case we can use normal types as we align manually. */
typedef uint32_t x_uint32_t;
typedef uint16_t x_uint16_t;
#endif
const x_uint16_t *w;
const x_uint32_t *d;
uint64_t sum = 0;

/*
Expand All @@ -77,7 +92,7 @@ static uint64_t chksum_partial(const void *addr, uint32_t len, uint32_t offset)
* We have efficient unaligned access. Just read
* dwords starting at the given address.
*/
d = (const uint32_t *)addr;
d = (const x_uint32_t *)addr;
} else {
/*
* We must avoid unaligned access, so align to 4 bytes
Expand All @@ -102,7 +117,7 @@ static uint64_t chksum_partial(const void *addr, uint32_t len, uint32_t offset)
* This cast increases alignment, but it's OK, since
* we've made sure that the pointer value is aligned.
*/
w = (const uint16_t *)(uintptr_t)b;
w = (const x_uint16_t *)(uintptr_t)b;

if ((uintptr_t)w & 2 && len >= 2) {
/* Align bytes by handling an odd word. */
Expand All @@ -111,7 +126,7 @@ static uint64_t chksum_partial(const void *addr, uint32_t len, uint32_t offset)
}

/* Increases alignment. */
d = (const uint32_t *)(uintptr_t)w;
d = (const x_uint32_t *)(uintptr_t)w;
}

while (len >= 32) {
Expand Down Expand Up @@ -159,7 +174,7 @@ static uint64_t chksum_partial(const void *addr, uint32_t len, uint32_t offset)

len &= 3;

w = (const uint16_t *)d;
w = (const x_uint16_t *)d;
if (len > 1) {
/* Last word. */
sum += *w++;
Expand Down

0 comments on commit 49b2c62

Please sign in to comment.