Become a leader in the IoT community!
Join our community of embedded and IoT practitioners to contribute experience, learn new skills and collaborate with other developers with complementary skillsets.
I am analyzing the performance of `memcpy` on an Intel Core i7-10700K CPU, using GCC 10.2 on Linux kernel 5.10. My assumption is that a copy should take roughly the time needed to transfer one `long`, multiplied by the number of `long`s being copied. Could `memcpy` be optimized to beat this estimate, for example by using SIMD instructions or other CPU-specific features?
Are there any compiler flags or hardware optimizations I should be aware of to get the best performance out of `memcpy`?
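One way to test the assumption is to time `memcpy` against a plain `long`-by-`long` loop over the same buffers. The following is a minimal sketch, not a rigorous benchmark: the helper names (`copy_longs`, `copy_with_memcpy`, `copy_with_loop`, `bench_copy`) are hypothetical and made up for illustration, and it assumes a POSIX system where `clock_gettime` is available. Note that at higher optimization levels the compiler may recognise the loop and turn it into a `memcpy` call itself, so the two results can end up nearly identical.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Baseline from the question: copy one long at a time. */
static void copy_longs(long *dst, const long *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Common signature so both variants can be timed by the same harness. */
typedef void (*copy_fn)(void *dst, const void *src, size_t bytes);

static void copy_with_memcpy(void *dst, const void *src, size_t bytes)
{
    memcpy(dst, src, bytes);
}

static void copy_with_loop(void *dst, const void *src, size_t bytes)
{
    copy_longs(dst, src, bytes / sizeof(long));
}

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

/*
 * Runs `fn` over a `bytes`-sized buffer `reps` times and stores the
 * fastest run in *best_ns. Returns 0 on success, -1 if allocation
 * fails or the destination does not match the source afterwards.
 */
static int bench_copy(copy_fn fn, size_t bytes, int reps, uint64_t *best_ns)
{
    assert(reps > 0);
    assert(bytes % sizeof(long) == 0);

    unsigned char *src = malloc(bytes);
    unsigned char *dst = malloc(bytes);
    if (src == NULL || dst == NULL) {
        free(src);
        free(dst);
        return -1;
    }

    /* Touch both buffers first so page faults are not part of the timing. */
    memset(src, 0xA5, bytes);
    memset(dst, 0, bytes);

    uint64_t best = UINT64_MAX;
    for (int r = 0; r < reps; r++) {
        uint64_t t0 = now_ns();
        fn(dst, src, bytes);
        uint64_t t1 = now_ns();
        if (t1 - t0 < best)
            best = t1 - t0;
    }

    /* Reading dst back keeps the copy observable and checks correctness. */
    int ok = memcmp(dst, src, bytes) == 0;
    free(src);
    free(dst);
    if (!ok)
        return -1;

    *best_ns = best;
    return 0;
}
```

Calling `bench_copy(copy_with_memcpy, 64u << 20, 10, &ns)` and then the same with `copy_with_loop` gives two timings to compare; bytes divided by nanoseconds is throughput in GB/s. Building with `gcc -O2 -march=native` lets GCC use the instruction set extensions of the build machine, and trying several buffer sizes (smaller than L1, around L2/L3, much larger than L3) should show how much of the result is bounded by cache and memory bandwidth rather than by the copy instructions themselves.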