Best coding and benchmarking practices for Fortran

aerosayan · November 19, 2020, 07:49

What are the best coding and benchmarking practices for Fortran?

I was surprised to see that x(i) = x(i)**4 was way faster (roughly 2.5 times in my test) than x(i) = x(i)**4.0.

I thought the compiler would know how to optimize this, since the exponent 4.0 is a compile-time constant and it could simply have converted 4.0 to 4. But for some reason it didn't.

Kindly share some things you know.

Thanks

Here's my benchmark code and compiler flags:

Code:

! COMPILE : gfortran -g -O3 -mavx2  power.F90
! DISASM  : objdump -d -SC -Mintel  a.out

program main
implicit none
integer *4 :: n, r, rmax
real       :: time_start, time_end

n = 8**8
rmax = 10

print *, "N            : ", n
print *, "REPITIONS    : ", rmax
print *, "--------------------------------------------"

print *, "BENCHMARK x(i) = x(i)**4"
print *, ""

do r=1,rmax
    call cpu_time(time_start)
    call power_integer(n)
    call cpu_time(time_end)

    print *, " - TIME ELAPSED      ", time_end - time_start
end do
print *, "--------------------------------------------"

print *, "BENCHMARK x(i) = x(i)**4.0"
print *, ""

do r=1,rmax
    call cpu_time(time_start)
    call power_real(n)
    call cpu_time(time_end)

    print *, " - TIME ELAPSED      ", time_end - time_start
end do

contains

subroutine power_integer(n)
implicit none
real    *4 , dimension(:), allocatable :: x
integer *4 :: n, i

allocate(x(n))

do i=1,AND(n,-8)
    x(i) = 2.0
end do

do i=1,AND(n,-8)
    x(i) = x(i)**4
end do

deallocate(x)
end subroutine

subroutine power_real(n)
implicit none
real    *4 , dimension(:), allocatable :: x
integer *4 :: n, i

allocate(x(n))

do i=1,AND(n,-8)
    x(i) = 2.0
end do

do i=1,AND(n,-8)
    x(i) = x(i)**4.0
end do

deallocate(x)
end subroutine

end program

Here's my result :

Code:

 N            :     16777216
 REPITIONS    :           10
 --------------------------------------------
 BENCHMARK x(i) = x(i)**4
 
  - TIME ELAPSED        0.105583996    
  - TIME ELAPSED         6.43480048E-02
  - TIME ELAPSED         4.91680056E-02
  - TIME ELAPSED         4.82190102E-02
  - TIME ELAPSED         4.89619970E-02
  - TIME ELAPSED         4.88489866E-02
  - TIME ELAPSED         4.95190024E-02
  - TIME ELAPSED         4.94549870E-02
  - TIME ELAPSED         4.96839881E-02
  - TIME ELAPSED         4.93779778E-02
 --------------------------------------------
 BENCHMARK x(i) = x(i)**4.0
 
  - TIME ELAPSED        0.121948004    
  - TIME ELAPSED        0.121810973    
  - TIME ELAPSED        0.121183991    
  - TIME ELAPSED        0.120921016    
  - TIME ELAPSED        0.120067954    
  - TIME ELAPSED        0.121345043    
  - TIME ELAPSED        0.120756984    
  - TIME ELAPSED        0.120249987    
  - TIME ELAPSED        0.120095015    
  - TIME ELAPSED        0.120734930

sbaffini · November 19, 2020, 10:14

The general reasoning is that you are in charge of what you write. Should the compiler analyze every real constant in your code to figure out what you actually meant? Intel would probably do it (see here https://community.intel.com/t5/Intel...on/td-p/924287), but your mileage with other compilers may vary. So it is certainly good practice to use an integer exponent in this case, and whenever the result translates to a very different operation. Here, integer exponentiation is just repeated multiplication. But this might actually be an extreme case.
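To make the "just multiplication" point concrete, here is a minimal sketch (not taken from the benchmark above; the program and variable names are made up for illustration). It writes the fourth power three ways: with an integer exponent, as two explicit multiplications, and with a real exponent. Whether a given compiler turns the real-exponent form into multiplications or into a call to a general power routine depends on the compiler and its flags, so treat the comments as an assumption to check in the disassembly, not as a guarantee.

```fortran
program power_forms
    implicit none
    real    :: x, sq, by_int, by_mul, by_real
    logical :: same

    x = 2.0

    ! Integer exponent: a compiler can expand this into multiplications.
    by_int = x**4

    ! The same thing written out by hand: two multiplications.
    sq     = x*x
    by_mul = sq*sq

    ! Real exponent: may be lowered to a general power routine instead
    ! (assumption; depends on compiler and optimization flags).
    by_real = x**4.0

    ! Compare with a small relative tolerance rather than exact equality.
    same = abs(by_real - by_mul) <= 4.0*epsilon(by_mul)*abs(by_mul)

    print *, "x**4        = ", by_int
    print *, "(x*x)*(x*x) = ", by_mul
    print *, "x**4.0      = ", by_real
    print *, "forms agree: ", same
end program power_forms
```

If the exponent is a small whole number, writing it as an integer literal keeps the intent explicit and does not rely on the optimizer recognizing it.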

Tricks really depend on what you do or don't know. Also, many things in Fortran are purposely not standardized. For example, did you know that, although arguments are nominally passed by reference, most compilers (probably all of them) will likely make a copy of your input to a subroutine if it is a non-contiguous section of a larger array? This means that it is sometimes better to pass the whole array and do the indexing inside the subroutine, if that makes sense.
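A small sketch of the two calling styles, under the assumption that the dummy argument is an explicit-shape array (the function names sum_explicit and sum_strided are hypothetical). In the first call the strided section a(1:n:2) is handed to a routine that expects contiguous storage, so the compiler may pack it into a temporary first. In the second call the whole array is passed and the stride is applied inside the routine. With gfortran, the run-time option -fcheck=array-temps should report when such a temporary is created for an actual argument.

```fortran
program stride_demo
    implicit none
    integer, parameter :: n = 10
    real    :: a(n), s_section, s_whole
    integer :: i

    a = [(real(i), i = 1, n)]

    ! Non-contiguous section passed to an explicit-shape dummy:
    ! the compiler may copy it into a contiguous temporary.
    s_section = sum_explicit(a(1:n:2), n/2)

    ! Whole array passed; the stride is handled inside the routine.
    s_whole = sum_strided(a, n, 2)

    ! Both sums cover elements 1, 3, 5, 7, 9 and so are equal (25.0).
    print *, "section: ", s_section
    print *, "whole  : ", s_whole

contains

    function sum_explicit(v, m) result(s)
        integer, intent(in) :: m
        real,    intent(in) :: v(m)      ! explicit shape: contiguous storage expected
        real    :: s
        integer :: k
        s = 0.0
        do k = 1, m
            s = s + v(k)
        end do
    end function sum_explicit

    function sum_strided(v, m, step) result(s)
        integer, intent(in) :: m, step
        real,    intent(in) :: v(m)
        real    :: s
        integer :: k
        s = 0.0
        do k = 1, m, step                ! indexing done inside the subroutine
            s = s + v(k)
        end do
    end function sum_strided

end program stride_demo
```

Whether the copy actually matters depends on the array size and how often the routine is called, so it is worth measuring both variants before restructuring code around it.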
