Aaron Sawdey
2018-10-17 20:28:15 UTC
I've previously posted a patch to add vector/vsx inline expansion of
strcmp/strncmp for the power8/power9 processors. Here are some of the
other items I have in the pipeline that I hope to get into gcc9:
* vector/vsx support for inline expansion of memcmp to non-loop code.
This improves performance of small memcmp.
* vector/vsx support for inline expansion of memcmp to loop code. This
will close the performance gap for lengths of about 128-512 bytes
by making the loop code closer to the performance of the library
memcmp.
* generate inline expansion to a loop for strcmp/strncmp. This closes
another performance gap: the strcmp/strncmp vector/vsx code currently
generated is much faster than the library call, but we only generate
comparison of 64 bytes to avoid exploding code size. Similar code in
a loop would be compact and would allow inline comparison of maybe
the first 512 bytes before falling back to the library function.
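
To make the control flow of the strcmp loop item concrete, here is a
rough scalar model in plain C. It is only a sketch of the idea, not the
rs6000 expansion itself: the names strcmp_inline_model, INLINE_LIMIT and
CHUNK are made up for illustration, the 512 and 16 values are
assumptions taken from the numbers above and the vector register width,
and the inner byte loop stands in for what would really be vector/vsx
compare instructions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical limit: bytes compared inline before calling the library. */
#define INLINE_LIMIT 512
/* Hypothetical chunk size, standing in for one 16-byte vector register. */
#define CHUNK 16

/* Scalar model of "compact inline loop, then library fallback".
   Returns <0, 0 or >0 with the same meaning as strcmp. */
static int strcmp_inline_model(const char *a, const char *b)
{
    size_t off;

    for (off = 0; off < INLINE_LIMIT; off += CHUNK) {
        /* A real expansion would process this chunk with vector
           instructions. The model goes byte by byte so it never
           reads past a terminator. */
        for (size_t i = 0; i < CHUNK; i++) {
            unsigned char ca = (unsigned char)a[off + i];
            unsigned char cb = (unsigned char)b[off + i];

            if (ca != cb)
                return (ca > cb) - (ca < cb);
            if (ca == '\0')
                return 0;   /* both strings ended, all bytes equal */
        }
    }

    /* The first INLINE_LIMIT bytes matched and no terminator was
       seen, so hand the remainder to the library function. */
    return strcmp(a + off, b + off);
}
```

The point of the loop form is that code size stays constant as
INLINE_LIMIT grows, whereas a fully unrolled expansion grows with every
extra chunk compared.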
If anyone has any other input on the inline expansion work I've been
doing for the rs6000 target, please let me know.
Thanks!
Aaron
--
Aaron Sawdey, Ph.D. ***@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain