The -par_threshold{n} option sets a threshold for the auto-parallelization of loops, based on the probability that executing the loop in parallel will be profitable. The value of n can range from 0 to 100; if the option is given without a number, n defaults to 75. The option applies to loops whose amount of computational work cannot be determined at compile time, which is usually the case when the loop trip count is unknown at compile time.
The -par_threshold{n} option has the following versions and functionality:
Default: -par_threshold is not specified on the command line, which is the same as specifying -par_threshold0. Loops are auto-parallelized regardless of computation work volume; that is, they are always parallelized.
-par_threshold100 - loops get auto-parallelized only if profitable parallel execution is almost certain.
The intermediate 1 to 99 values represent the percentage probability for profitable speed-up. For example, n=50 would mean: parallelize only if there is a 50% probability of the code speeding up if executed in parallel.
The default value of n is n=75 (or -par_threshold75). When -par_threshold is used on the command line without a number, the default value passed is 75.
The compiler applies a heuristic that tries to balance the overhead of creating multiple threads versus the amount of work available to be shared amongst the threads.
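As an illustration of the kind of loop this option targets, the following sketch uses a trip count that is known only at run time. The program name threshold_demo, the arrays a and b, and the convention of taking n from the first command-line argument are assumptions made for this example; they do not come from the compiler documentation.

```fortran
program threshold_demo
  implicit none
  integer :: n, i
  character(len=32) :: arg
  real, allocatable :: a(:), b(:)

  ! Hypothetical input: the trip count n comes from the first
  ! command-line argument and falls back to 100000 if none is given.
  n = 100000
  if (command_argument_count() >= 1) then
    call get_command_argument(1, arg)
    read (arg, *) n
  end if

  allocate (a(n), b(n))
  b = 1.0

  ! The iterations are independent, but the amount of work depends on n,
  ! which the compiler cannot see at compile time. This is the situation
  ! in which the -par_threshold setting influences the decision.
  do i = 1, n
    a(i) = 2.0 * b(i) + 1.0
  end do

  print '(a,i0,a,f3.1)', 'n = ', n, ', a(n) = ', a(n)
end program threshold_demo
```

Compiling such a program with different -par_threshold values together with -par_report3 is one way to observe how the threshold changes the compiler's decision for this loop.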
The -par_report{0|1|2|3} option controls the auto-parallelizer's diagnostic levels 0, 1, 2, or 3 as follows:
-par_report0 = no diagnostic information is displayed.
-par_report1 = indicates loops successfully auto-parallelized (default). Issues a "LOOP AUTO-PARALLELIZED" message for parallel loops.
-par_report2 = indicates successfully auto-parallelized loops as well as unsuccessful loops.
-par_report3 = same as 2 plus additional information about any proven or assumed dependences inhibiting auto-parallelization (reasons for not parallelizing).
The example below shows the output generated by -par_report3 for the following command:
prompt>ifl -c /Qparallel /Qpar_report3 myprog.f90
where the program myprog.f90 is as follows:
    program myprog
    integer a(10000), q

    ! Assumed side effects
    do i=1,10000
      a(i) = foo(i)
    enddo
    ! Actual dependence
    do i=1,10000
      a(i) = a(i-1) + i
    enddo
    end
Example of -par_report Output
    program myprog
    procedure: myprog
    serial loop: line 5: not a parallel candidate
    serial loop: line 9
       flow data dependence from line 10 to line 12
    Lines Compiled
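The second loop in myprog.f90 carries a genuine flow dependence, because each iteration reads the element written by the previous one. Options and directives that remove assumed dependences do not help in that case; the loop has to be restructured. The sketch below shows one possible rewrite, assuming the recurrence starts from a known base value stored in element 0. The arrays are declared as (0:n) here so that element 0 exists, and the program name and the array b are introduced only for this illustration.

```fortran
program remove_dependence
  implicit none
  integer, parameter :: n = 10000
  integer :: a(0:n), b(0:n), i

  ! Assumed base value of the recurrence.
  a(0) = 0
  b(0) = 0

  ! Original form: iteration i reads a(i-1), which iteration i-1 wrote.
  do i = 1, n
    a(i) = a(i-1) + i
  end do

  ! Rewritten form: a(i) = a(0) + 1 + 2 + ... + i = a(0) + i*(i+1)/2,
  ! so every element depends only on i and the base value.
  do i = 1, n
    b(i) = b(0) + i * (i + 1) / 2
  end do

  ! Self-check: both loops must produce the same values.
  if (any(a /= b)) error stop 'results differ'
  print '(a,i0)', 'a(n) = b(n) = ', b(n)
end program remove_dependence
```

The closed form works only because this particular recurrence is a simple running sum; other recurrences may need a different restructuring or may have to stay serial.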
Use -par_threshold0 to check whether the compiler declined to parallelize a loop because it assumed there was not enough computational work.
Use -par_report3 to view diagnostics.
Use the !DIR$ PARALLEL directive to eliminate assumed data dependencies.
Use -ipo to eliminate assumed side effects due to function calls.
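As a sketch of the directive tip above, the following program places !DIR$ PARALLEL in front of a loop that uses indirect indexing. The program name, the index array idx, and the choice of a reversed permutation are assumptions made for this example. In real code idx would typically come from input, so the compiler could not prove that its values are distinct.

```fortran
program dir_parallel_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: idx(n), a(n), i

  ! idx is a permutation of 1..n (here simply reversed order), so no
  ! two iterations of the loop below touch the same element of a.
  do i = 1, n
    idx(i) = n + 1 - i
  end do
  a = 0

  ! Without further information the compiler has to assume that idx may
  ! contain repeated values. The directive asserts that no such
  ! dependence exists; the programmer is responsible for that claim.
  ! Compilers that do not recognize the directive treat it as a comment.
!DIR$ PARALLEL
  do i = 1, n
    a(idx(i)) = a(idx(i)) + i
  end do

  ! Self-check: the elements of a are 1..n in some order.
  if (sum(a) /= n * (n + 1) / 2) error stop 'unexpected sum'
  print '(a,i0)', 'sum(a) = ', sum(a)
end program dir_parallel_demo
```

If idx could contain duplicates, the assertion made by the directive would be false and the parallel result could differ from the serial one, so the directive should be used only where the independence is guaranteed by construction.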