The directives discussed in this topic support vectorization and are used for IA-32 applications only.
The compiler supports the IVDEP directive, which instructs the compiler to ignore assumed vector dependences. Use this directive when you know that the assumed loop dependences are safe to ignore.
For example, if the expression j >= 0 is always true in the code fragment below, the IVDEP directive can communicate this information to the compiler. The directive tells the compiler that the conservatively assumed loop-carried flow dependences for values j < 0 can be safely ignored:
!DIR$ IVDEP
do i = 1, 100
a(i) = a(i+j)
enddo
Note
Proven dependences that prevent vectorization are not ignored; only assumed dependences are ignored.
The syntax for the directive is:
CDIR$IVDEP
!DIR$IVDEP
The usage of the directive differs depending on the loop form; see the examples below.
Loop 1:
Do i
Loop 2:
Do i
For loops of form 1, old values of a are used, and it is assumed that there are no loop-carried flow dependencies from DEF to USE.
For loops of form 2, new values of a are used, and it is assumed that there are no loop-carried anti-dependencies from USE to DEF.
In both cases, it is valid to distribute the loop, and there is no loop-carried output dependency.
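The two loop forms can be sketched as follows. This is a minimal illustration inferred from the description above, not the original loops: the subroutine names form1 and form2, the arrays b and c, the offset k, and the padding pad are all hypothetical. Form 1 reads a before writing it (USE, then DEF), so IVDEP is only safe when k >= 0; form 2 writes a before reading it (DEF, then USE), so IVDEP is only safe when k <= 0.

```fortran
program ivdep_loop_forms
  implicit none
  integer, parameter :: n = 8, pad = 2      ! hypothetical sizes
  real :: a(1-pad:n+pad), b(n), c(n)
  integer :: i

  c = 100.0

  a = [(real(i), i = 1-pad, n+pad)]
  call form1(a, b, c, 2)     ! k >= 0: every USE reads an old value of a
  print *, 'form 1:', b

  a = [(real(i), i = 1-pad, n+pad)]
  call form2(a, b, c, -2)    ! k <= 0: every USE reads a value already written
  print *, 'form 2:', b

contains

  ! Form 1 (assumed shape): USE of a comes before DEF of a.
  subroutine form1(a, b, c, k)
    real, intent(inout) :: a(1-pad:n+pad)
    real, intent(out)   :: b(n)
    real, intent(in)    :: c(n)
    integer, intent(in) :: k             ! caller guarantees 0 <= k <= pad
    integer :: i
    !DIR$ IVDEP
    do i = 1, n
      b(i) = a(i+k) + 1.0   ! USE
      a(i) = c(i)           ! DEF
    enddo
  end subroutine form1

  ! Form 2 (assumed shape): DEF of a comes before USE of a.
  subroutine form2(a, b, c, k)
    real, intent(inout) :: a(1-pad:n+pad)
    real, intent(out)   :: b(n)
    real, intent(in)    :: c(n)
    integer, intent(in) :: k             ! caller guarantees -pad <= k <= 0
    integer :: i
    !DIR$ IVDEP
    do i = 1, n
      a(i) = c(i)           ! DEF
      b(i) = a(i+k) + 1.0   ! USE
    enddo
  end subroutine form2

end program ivdep_loop_forms
```

The offset k is passed as an argument so that the compiler cannot determine its sign inside the subroutine. Compilers that do not recognize !DIR$ treat the directive as an ordinary comment, so the program still builds and runs.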
Example 1:
CDIR$IVDEP
Example 2:
CDIR$IVDEP
In Example 1, the possible backward dependencies are ignored, which enables the loop to be software pipelined.
In Example 2, possible forward and backward dependencies involving array a create a dependency cycle in the loop. With IVDEP, the backward dependencies are ignored.
IVDEP has two options: IVDEP:LOOP and IVDEP:BACK. The IVDEP:LOOP option implies no loop-carried dependencies. The IVDEP:BACK option implies no backward dependencies.
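As a hedged illustration of the kind of loop Example 2 describes, the sketch below builds a possible dependency cycle through array a. It is an assumption, not the original example: the subroutine name update, the array b, the offset m, and the padding pad are hypothetical. The first statement reads b(j) and the second writes it (a forward dependence), while the second statement reads a(j+m), which for m > 0 would have to be read before a later iteration's first statement overwrites it (a backward dependence). IVDEP tells the compiler to ignore that assumed backward dependence, which is only safe if the caller never passes m > 0.

```fortran
program ivdep_cycle
  implicit none
  integer, parameter :: n = 8, pad = 2      ! hypothetical sizes
  real :: a(1-pad:n+pad), b(n)
  integer :: j

  a = 0.0
  b = [(real(j), j = 1, n)]
  call update(a, b, -1)     ! m <= 0: the backward dependence never occurs
  print *, b

contains

  subroutine update(a, b, m)
    real, intent(inout) :: a(1-pad:n+pad), b(n)
    integer, intent(in) :: m             ! caller guarantees -pad <= m <= 0
    integer :: j
    !DIR$ IVDEP
    do j = 1, n
      a(j) = b(j) + 1.0     ! writes a(j), reads b(j)
      b(j) = a(j+m) + 1.0   ! reads a(j+m), writes b(j)
    enddo
  end subroutine update

end program ivdep_cycle
```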
The IVDEP directive is also used for Itanium®-based applications.
For more details on the IVDEP directive, see Appendix A in the Intel® Fortran Programmer's Reference.
In addition to the IVDEP directive, the following directives can be used to override the efficiency heuristics of the vectorizer:
!DIR$VECTOR ALWAYS
!DIR$NOVECTOR
!DIR$VECTOR ALIGNED
!DIR$VECTOR UNALIGNED
The VECTOR ALWAYS directive overrides the efficiency heuristics of the vectorizer, but it takes effect only if the loop can actually be vectorized; that is, use IVDEP as well if assumed dependences would otherwise prevent vectorization.
The VECTOR ALWAYS directive can be used to override the default behavior of the compiler in the following situation. Vectorization of non-unit stride references usually does not exhibit any speedup, so the compiler defaults to not vectorizing loops that have a large number of non-unit stride references (compared to the number of unit stride references). The following loop has two references with stride 2. Vectorization would be disabled by default, but the directive overrides this behavior.
VECTOR ALWAYS:
!DIR$ VECTOR ALWAYS
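A loop matching the description above, with two stride-2 references, might look like the following sketch. The array names, the loop bounds, and the surrounding program are assumptions made for illustration.

```fortran
program vector_always_demo
  implicit none
  integer, parameter :: n = 100            ! hypothetical size
  real :: a(n), b(n)
  integer :: i

  a = 0.0
  b = [(real(i), i = 1, n)]

  ! Both a(i) and b(i) are accessed with stride 2, so the vectorizer
  ! would normally judge the loop unprofitable; the directive asks it
  ! to vectorize anyway.
  !DIR$ VECTOR ALWAYS
  do i = 1, n, 2
    a(i) = b(i)
  enddo

  print *, a(1:6)
end program vector_always_demo
```

Whether this is actually faster than the scalar loop depends on the target processor, so it is worth timing both variants.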
If, on the other hand, avoiding vectorization of a loop is desirable (if vectorization results in a performance regression rather than improvement), the NOVECTOR directive can be used in the source text to disable vectorization of a loop. For instance, the Intel® Compiler vectorizes the following example loop by default. If this behavior is not appropriate, the NOVECTOR directive can be used, as shown below.
NOVECTOR:
!DIR$ NOVECTOR
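A simple unit-stride loop of the kind the compiler would vectorize by default can serve as a sketch of NOVECTOR usage. The arrays and bounds here are assumptions chosen for illustration.

```fortran
program novector_demo
  implicit none
  integer, parameter :: n = 100            ! hypothetical size
  real :: a(n), b(n), c(n)
  integer :: i

  b = 1.0
  c = 2.0

  ! Unit-stride loop with no dependences: a natural vectorization
  ! candidate. The directive requests scalar code for this loop only.
  !DIR$ NOVECTOR
  do i = 1, n
    a(i) = b(i) + c(i)
  enddo

  print *, a(1), a(n)
end program novector_demo
```

The directive applies only to the loop that immediately follows it; other loops in the program are still subject to the normal heuristics.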
Like VECTOR ALWAYS, the VECTOR ALIGNED and VECTOR UNALIGNED directives also override the efficiency heuristics. The difference is that the qualifiers UNALIGNED and ALIGNED instruct the compiler to use, respectively, unaligned and aligned data movement instructions for all array references. This disables all the advanced alignment optimizations of the compiler, such as determining alignment properties from the program context or using dynamic loop peeling to make references aligned.
The directives VECTOR [ALWAYS, UNALIGNED, ALIGNED] should be used with care. Override the efficiency heuristics of the compiler only if you are absolutely sure that vectorization will improve performance. Furthermore, instructing the compiler to implement all array references with aligned data movement instructions causes a runtime exception if some of the access patterns are actually unaligned.
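The sketch below shows the conservative choice, VECTOR UNALIGNED, on a loop whose two references are offset by one element and therefore generally cannot both be aligned. The arrays and bounds are assumptions for illustration. Replacing UNALIGNED with ALIGNED in such a loop is exactly the situation in which a runtime exception could occur.

```fortran
program vector_unaligned_demo
  implicit none
  integer, parameter :: n = 100            ! hypothetical size
  real :: a(n), b(n+1)
  integer :: i

  b = [(real(i), i = 1, n+1)]

  ! a(i) and b(i+1) are shifted by one element relative to each other,
  ! so unaligned data movement is requested for all array references.
  !DIR$ VECTOR UNALIGNED
  do i = 1, n
    a(i) = b(i+1)
  enddo

  print *, a(1), a(n)
end program vector_unaligned_demo
```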