Closed
Labels
C-enhancement (Category: An issue proposing an enhancement or a PR with one.)
I-slow (Issue: Problems and improvements with respect to performance of generated code.)
T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
Description
Code taken from #79451.
#![feature(min_const_generics, array_value_iter)]

use std::array::IntoIter;
use std::mem::MaybeUninit;

pub fn zip<T, U, const N: usize>(lhs: [T; N], rhs: [U; N]) -> [(T, U); N] {
    let mut dst = MaybeUninit::<[(T, U); N]>::uninit();
    let ptr = dst.as_mut_ptr() as *mut (T, U);
    for (idx, (lhs, rhs)) in IntoIter::new(lhs).zip(IntoIter::new(rhs)).enumerate() {
        unsafe { ptr.add(idx).write((lhs, rhs)) }
    }
    unsafe { dst.assume_init() }
}

pub fn zip_8xu64(lhs: [u64; 8], rhs: [u64; 8]) -> [(u64, u64); 8] {
    zip(lhs, rhs)
}
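For anyone trying to reproduce this on a newer toolchain, here is a rough port of the snippet above. It assumes a stable compiler where const generics and by-value array iteration are available (Rust 1.53 or later), so the feature gates and `IntoIter::new` are no longer needed. The `SAFETY` comments are my own additions. I have not checked whether this variant produces the same memcpys and dead stores, so treat it as a starting point for comparison rather than a confirmed reproduction.

```rust
use std::mem::MaybeUninit;

// Sketch: the same zip as above, ported to stable Rust (assumes 1.53+).
pub fn zip<T, U, const N: usize>(lhs: [T; N], rhs: [U; N]) -> [(T, U); N] {
    let mut dst = MaybeUninit::<[(T, U); N]>::uninit();
    let ptr = dst.as_mut_ptr() as *mut (T, U);
    // Arrays implement `IntoIterator` by value, which replaces `IntoIter::new`.
    for (idx, (l, r)) in IntoIterator::into_iter(lhs).zip(rhs).enumerate() {
        // SAFETY: both arrays have length N, so idx < N and the write
        // stays inside the allocation backing `dst`.
        unsafe { ptr.add(idx).write((l, r)) }
    }
    // SAFETY: the loop above wrote all N elements.
    unsafe { dst.assume_init() }
}

pub fn zip_8xu64(lhs: [u64; 8], rhs: [u64; 8]) -> [(u64, u64); 8] {
    zip(lhs, rhs)
}
```

Compiling this with optimizations and inspecting the LLVM IR for `zip_8xu64` should make it possible to compare against the Godbolt output linked below.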
Godbolt (llvm-ir / asm): https://godbolt.org/z/Yq7W98
It seems that LLVM is unable to eliminate the memcpys, which results in suboptimal code.
There are also dead stores that have not been eliminated:
store i64 8, i64* %_7.sroa.0.sroa.0.i.sroa.5.0..sroa_idx33, align 8
store i64 8, i64* %_7.sroa.0.sroa.5.0._7.sroa.0.0..sroa_cast.sroa_idx106.i, align 8
store i64 8, i64* %_7.sroa.0.sroa.0.i.sroa.4.0..sroa_idx31, align 8
store i64 8, i64* %_7.sroa.0.sroa.4.0._7.sroa.0.0..sroa_cast.sroa_idx104.i, align 8
A not quite equivalent C++ example produces "optimal" code in which no memcpys or dead stores occur: https://godbolt.org/z/sdfa13
EDIT:
On second thought, I'd assume that LLVM's GVN pass should have eliminated the memcpys, but it seems that this isn't supported?