Auto-parallelization Threshold Control and Diagnostics

Threshold Control

The -par_threshold{n} option sets a threshold for the auto-parallelization of loops, based on the probability that executing the loop in parallel will be profitable. The value of n can be from 0 to 100. If the option is given without a value, n is 75. The option applies to loops whose computational work volume cannot be determined at compile time, which is usually the case when the loop trip count is unknown at compile time.

The -par_threshold{n} option has the following versions and functionality:

Default: -par_threshold is not specified on the command line, which is the same as specifying -par_threshold0. Loops are auto-parallelized regardless of computation work volume; that is, always parallelize.
-par_threshold100 - loops are auto-parallelized only if profitable parallel execution is almost certain.
The intermediate values 1 to 99 represent the percentage probability of a profitable speed-up. For example, n=50 means: parallelize only if there is at least a 50% probability that the code will run faster in parallel.
The default value of n is 75 (-par_threshold75). That is, when -par_threshold is used on the command line without a number, the value passed is 75.

The compiler applies a heuristic that tries to balance the overhead of creating multiple threads versus the amount of work available to be shared amongst the threads.
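As an illustration of the kind of loop the threshold applies to, the following is a minimal sketch of a loop whose trip count is only known at run time. The names scale_add, alpha, x, and y are hypothetical and chosen for this example only. The loop has no loop-carried dependence, so it is a parallel candidate, but the compiler cannot tell from the source whether n is large enough to outweigh the cost of creating threads.

```fortran
! Hypothetical example: a loop whose work volume is unknown at compile time.
subroutine scale_add(n, alpha, x, y)
  implicit none
  integer, intent(in) :: n          ! trip count, known only at run time
  real, intent(in)    :: alpha
  real, intent(in)    :: x(n)
  real, intent(inout) :: y(n)
  integer :: i

  ! Each iteration touches only element i, so iterations are independent.
  ! Whether threading pays off depends on n, which the compiler cannot see.
  do i = 1, n
     y(i) = y(i) + alpha * x(i)
  end do
end subroutine scale_add

program threshold_demo
  implicit none
  real :: x(8), y(8)

  x = 1.0
  y = 2.0
  call scale_add(8, 3.0, x, y)
  print *, y(1), y(8)
end program threshold_demo
```

With -par_threshold0 a loop like this would be expected to be parallelized regardless of n, while higher values of n in -par_threshold{n} make the compiler more conservative about it.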

Diagnostics

The -par_report{0|1|2|3} option controls the auto-parallelizer's diagnostic levels 0, 1, 2, or 3 as follows:

-par_report0 = no diagnostic information is displayed.

-par_report1 = indicates loops successfully auto-parallelized (default). Issues a "LOOP AUTO-PARALLELIZED" message for parallel loops.

-par_report2 = indicates successfully auto-parallelized loops as well as unsuccessful loops.

-par_report3 = same as 2 plus additional information about any proven or assumed dependences inhibiting auto-parallelization (reasons for not parallelizing).

Example of Parallelization Diagnostics Report

The example below shows the output generated by -par_report3 for the following command:

prompt>ifl -c /Qparallel /Qpar_report3 myprog.f90

where the program myprog.f90 is as follows:

program myprog
integer a(10000), q

! Assumed side effects
do i=1,10000
   a(i) = foo(i)
enddo
! Actual dependence
do i=2,10000
   a(i) = a(i-1) + i
enddo
end

Example of -par_report Output

program myprog
procedure: myprog
serial loop: line 5: not a parallel candidate due to statement at line 6
serial loop: line 9
   flow data dependence from line 10 to line 10, due to "a"
12 Lines Compiled

Troubleshooting Tips

Use -par_threshold0 to check whether the compiler assumed there was not enough computational work.
Use -par_report3 to view diagnostics.
Use the !DIR$ PARALLEL directive to eliminate assumed data dependencies.
Use -ipo to eliminate assumed side effects due to function calls.
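The directive tip can be sketched as follows. This is an illustrative example only: the program name dir_parallel_demo and the arrays idx, a, and b are hypothetical. The loop updates a through an index array, so the compiler generally has to assume that two iterations might write the same element. Here idx is built as a permutation, so the programmer knows no such dependence exists and asserts it with the directive.

```fortran
! Hypothetical example: asserting independence with !DIR$ PARALLEL.
program dir_parallel_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: idx(n), i
  real    :: a(n), b(n)

  ! Build a permutation (a reversal), so every idx(i) is distinct.
  do i = 1, n
     idx(i) = n + 1 - i
     a(i)   = 0.0
     b(i)   = real(i)
  end do

  ! Without extra information the compiler must assume that idx may
  ! contain repeated values, which would be a dependence on "a".
  ! The directive asserts that there is none for the loop that follows.
  ! Compilers that do not recognize it treat the line as a comment.
!DIR$ PARALLEL
  do i = 1, n
     a(idx(i)) = a(idx(i)) + b(i)
  end do

  print *, a(1), a(n)
end program dir_parallel_demo
```

The directive is a promise from the programmer, not a check: if idx did contain duplicates, parallel execution could produce wrong results. Comparing the -par_report3 output with and without the directive is one way to see whether it changed the compiler's decision.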