LP64, unsigned int, vectorization, and PR 61247
Steve Ellcey
2018-10-04 20:48:48 UTC
I was looking at PR tree-optimization/61247, where a loop with an unsigned
int index on an LP64 platform was not getting vectorized and I noticed an
odd thing.  In the function below, if I define N as 1000 or 10000, the
loop does get vectorized, even in LP64 mode.  But if I define N as 100000,
the loop does not get vectorized in LP64 mode.  I have not been able to
figure out why this is or where the decision to vectorize (or not) is
getting made.  Does anyone have an idea?  100000 is not a large enough value
to hit the limit of a 32 bit int or unsigned int value so why can't it be
vectorized like the other two cases?

In the original test case that I added to this PR, N is an argument and
we don't know what value it has.  It seems like this could be vectorized
by including a test to make sure that the value is not larger than MAXINT
and thus could not wrap when doing the array indexing.
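A minimal sketch of what such a guard could look like if written by hand, assuming a 32-bit unsigned int. The function name f_versioned and the parameter n are hypothetical and are not taken from the PR. The guard tests the largest index, which is below n*n, rather than n itself. This only illustrates the idea of versioning the loop on a no-wrap test; it is not what GCC does today, and whether the guarded branch actually gets vectorized would need to be checked, for example with -fopt-info-vec.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant of f() that takes the bound as an argument n.
   Assumes unsigned int is 32 bits wide, as on LP64 targets. */
void f_versioned(int *C, const int *A, int val, unsigned int n)
{
    if ((uint64_t)n * n <= (uint64_t)UINT_MAX) {
        /* Every index satisfies i*n+j < n*n <= UINT_MAX, so the unsigned
           computation could never wrap.  Indexing with size_t therefore
           touches exactly the same elements as the original loop. */
        for (size_t i = 0; i < n; i++)
            for (size_t j = 0; j < n; j++)
                C[i * n + j] = A[i * n + j] * val;
    } else {
        /* The index may wrap modulo 2^32; keep the original unsigned
           arithmetic so the behavior is unchanged. */
        for (unsigned int i = 0; i < n; i++)
            for (unsigned int j = 0; j < n; j++)
                C[i * n + j] = A[i * n + j] * val;
    }
}
```

With a 32-bit unsigned int the guard holds for every n up to 65535, which covers the 1000 and 10000 cases but not 100000.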

Steve Ellcey
***@cavium.com



/* define N as 1000 - gets vectorized .... */
/* define N as 10000 - gets vectorized .... */
/* define N as 100000 - does not get vectorized .... */

#define N 100000

typedef unsigned int TYPE;
void f(int *C, int *A, int val)
{
        TYPE i,j;
        for (i=0; i<N; i++) {
                for (j=0; j<N; j++) {
                        C[i*N+j]=A[i*N+j] * val;
                }
        }
}

Andrew Pinski
2018-10-04 20:54:33 UTC
Post by Steve Ellcey
I was looking at PR tree-optimization/61247, where a loop with an unsigned
int index on an LP64 platform was not getting vectorized and I noticed an
odd thing. In the function below, if I define N as 1000 or 10000, the
loop does get vectorized, even in LP64 mode. But if I define N as 100000,
the loop does not get vectorized in LP64 mode. I have not been able to
figure out why this is or where the decision to vectorize (or not) is
getting made. Does anyone have an idea? 100000 is not a large enough value
to hit the limit of a 32 bit int or unsigned int value so why can't it be
vectorized like the other two cases?
Doesn't i*N+j wrap?  For example, when i is 100000-1, the product
i*N is 100000*(100000-1), which needs 34 bits to be represented without
wrapping, so it does not fit in a 32-bit unsigned int.
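The arithmetic can be checked with a small sketch like the one below. The helper names index_wrapped, index_wide and bits_needed are made up for illustration, and uint32_t stands in for unsigned int on an LP64 target.

```c
#include <stdint.h>

/* Index as the loop computes it: 32-bit unsigned arithmetic, which is
   reduced modulo 2^32. */
uint32_t index_wrapped(uint32_t i, uint32_t n, uint32_t j)
{
    return (uint32_t)(i * n + j);
}

/* The same index computed in 64 bits, where it cannot wrap for these
   operand sizes. */
uint64_t index_wide(uint32_t i, uint32_t n, uint32_t j)
{
    return (uint64_t)i * n + j;
}

/* Number of bits needed to represent v (0 for v == 0). */
int bits_needed(uint64_t v)
{
    int bits = 0;
    while (v != 0) {
        v >>= 1;
        bits++;
    }
    return bits;
}
```

For i = 99999, N = 100000 and j = 0, index_wide gives 9999900000, which needs 34 bits, while index_wrapped gives 1409965408 (that is, 9999900000 - 2*2^32). For N = 10000 the largest index is 99999999, which fits in 32 bits, so the two functions agree.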

Thanks,
Andrew Pinski
Post by Steve Ellcey
In the original test case that I added to this PR, N is an argument and
we don't know what value it has. It seems like this could be vectorized
by including a test to make sure that the value is not larger than MAXINT
and thus could not wrap when doing the array indexing.
Steve Ellcey
/* define N as 1000 - gets vectorized .... */
/* define N as 10000 - gets vectorized .... */
/* define N as 100000 - does not get vectorized .... */
#define N 100000
typedef unsigned int TYPE;
void f(int *C, int *A, int val)
{
        TYPE i,j;
        for (i=0; i<N; i++) {
                for (j=0; j<N; j++) {
                        C[i*N+j]=A[i*N+j] * val;
                }
        }
}