Clamp every element of an array to the range [lo, hi] in place using AVX2 intrinsics.
Each element should be replaced with lo if it is below lo, hi if it is above hi, or left unchanged if it is already within the range.
#include <immintrin.h>
#include <cstdint>
void clamp_array(int32_t* arr, int n, int32_t lo, int32_t hi);
Parameters:
- arr: pointer to an array of n signed 32-bit integers, guaranteed 32-byte aligned (modified in place)
- n: number of elements, guaranteed to be a multiple of 8 and at least 8
- lo: lower bound of the clamp range
- hi: upper bound of the clamp range (guaranteed lo ≤ hi)

Returns: nothing (array is modified in place)
Input: arr = [-5, 3, 10, 0, -2, 7, 15, 1], lo = 0, hi = 10
Output: arr = [ 0, 3, 10, 0, 0, 7, 10, 1]
Constraints:
- 8 ≤ n ≤ 1,000,000
- n is always a multiple of 8
- Values lie in [-1,000,000, 1,000,000]
- lo ≤ hi
- arr is 32-byte aligned

Clamping is just max(lo, min(hi, x)) or, equivalently because lo ≤ hi, min(hi, max(lo, x)). With AVX2, min and max each map to a single intrinsic, so every 8-element vector needs only two arithmetic instructions between its load and its store.
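The two orderings of min and max can be checked against each other with a scalar reference. The following is a minimal sketch, not part of the original problem; the name `clamp_array_scalar` is hypothetical, and the internal assertions only hold under the stated guarantee lo ≤ hi.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical scalar reference: clamps each element to [lo, hi] in place.
// It computes both orderings and asserts that they agree, which relies on lo <= hi.
void clamp_array_scalar(int32_t* arr, int n, int32_t lo, int32_t hi) {
    assert(lo <= hi);  // precondition guaranteed by the problem statement
    for (int i = 0; i < n; ++i) {
        const int32_t a = std::max(lo, std::min(hi, arr[i]));  // max(lo, min(hi, x))
        const int32_t b = std::min(hi, std::max(lo, arr[i]));  // min(hi, max(lo, x))
        assert(a == b);  // the two forms coincide when lo <= hi
        arr[i] = a;
    }
}
```

A reference like this is also handy for comparing against a vectorized version on random inputs.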
| Intrinsic | Description |
|---|---|
| `_mm256_load_si256(ptr)` | Load 256 bits from aligned memory |
| `_mm256_store_si256(ptr, v)` | Store 256 bits to aligned memory |
| `_mm256_set1_epi32(x)` | Broadcast a 32-bit integer to all 8 lanes |
| `_mm256_min_epi32(a, b)` | Packed signed 32-bit integer minimum |
| `_mm256_max_epi32(a, b)` | Packed signed 32-bit integer maximum |
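One way the intrinsics in the table might be combined is sketched below. This is an illustrative solution under the stated guarantees (n a multiple of 8, arr 32-byte aligned, lo ≤ hi), not necessarily the intended one. The `__attribute__((target("avx2")))` line is an assumption about the toolchain: it is GCC/Clang-specific and lets the function build without `-mavx2`; with other compilers, drop it and enable AVX2 through the compiler's own flag.

```cpp
#include <immintrin.h>
#include <cassert>
#include <cstdint>

// GCC/Clang-specific assumption: enables AVX2 for this function only.
__attribute__((target("avx2")))
void clamp_array(int32_t* arr, int n, int32_t lo, int32_t hi) {
    assert(n % 8 == 0 && lo <= hi);  // preconditions guaranteed by the problem

    // Broadcast the bounds once, outside the loop.
    const __m256i vlo = _mm256_set1_epi32(lo);
    const __m256i vhi = _mm256_set1_epi32(hi);

    for (int i = 0; i < n; i += 8) {
        __m256i* p = reinterpret_cast<__m256i*>(arr + i);
        __m256i v = _mm256_load_si256(p);      // aligned load of 8 lanes
        v = _mm256_min_epi32(vhi, v);          // cap each lane at hi
        v = _mm256_max_epi32(vlo, v);          // raise each lane to at least lo
        _mm256_store_si256(p, v);              // aligned store back in place
    }
}
```

The aligned load and store are only safe because the problem guarantees 32-byte alignment; for arbitrary pointers, the unaligned variants `_mm256_loadu_si256` and `_mm256_storeu_si256` would be the safer choice.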