
PWR012: Pass only required fields from derived type as arguments to minimize data movements

Issue

Passing a whole derived data type to a function that uses only some of its fields can cause unnecessary data movements.

Actions

Pass the used fields as separate arguments instead of the whole derived type.

Relevance

Derived data types (such as structs in C or derived types in Fortran) are convenient constructs for grouping related variables and passing them around. While this is often an effective way to organize data, compilers can have a hard time optimizing such code, because the increased visibility of data makes optimizations more complex.

Functions that take derived data types as arguments should use most, if not all, of their fields. Ensuring that every field of a derived type passed as a function argument is used in the function body benefits optimization by making inputs and outputs easier to reason about, thus improving compiler and static analyzer code coverage.

In parallel programming, derived data types are often discouraged when offloading to the GPU because pointer aliasing may inhibit compiler analyses and optimizations. They can also cause unnecessary data movements that hurt performance, or incorrect data movements that compromise correctness and may even lead to crashes.
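As a minimal sketch of the data-movement point, assume OpenMP target offloading; the directive and the names `sum_field`, `bytes_whole_struct` and `bytes_one_field` are illustrative and not part of this check's examples. Mapping only the array that the kernel reads transfers half as many bytes as mapping the whole struct. When compiled without OpenMP support, the pragma is ignored and the loop simply runs on the host.

```c
// offload_sketch.c (hypothetical; assumes OpenMP target offloading)
#include <stddef.h>

#define N 1000

typedef struct {
  int A[N];
  int B[N];
} data;

// Sums one field. Only A is mapped to the device; B is never transferred.
int sum_field(const int *A, int n) {
  int result = 0;
  #pragma omp target teams distribute parallel for \
      map(to: A[0:n]) map(tofrom: result) reduction(+: result)
  for (int i = 0; i < n; i++) {
    result += A[i];
  }
  return result;
}

// Bytes that would be mapped when passing the whole struct vs. one field.
size_t bytes_whole_struct(void) { return sizeof(data); }
size_t bytes_one_field(void) { return sizeof(((data *)0)->A); }
```

A caller holding a `data d` would invoke `sum_field(d.A, N)`; the unused `d.B` never needs to leave host memory.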

Note

This issue can also hurt code clarity. See check PWR074 for more details.

Code example

C

In the following example, a struct containing two arrays is passed to the foo function, which only uses one of the arrays:

// example.c
#include <stdlib.h>

typedef struct {
  int A[1000];
  int B[1000];
} data;

__attribute__((pure)) int foo(const data *d) {
  int result = 0;
  for (int i = 0; i < 1000; i++) {
    result += d->A[i];
  }
  return result;
}

void example() {
  data *d = (data *)malloc(sizeof(data));
  for (int i = 0; i < 1000; i++) {
    d->A[i] = d->B[i] = 1;
  }
  int result = foo(d);
  free(d);
}

This can be easily addressed by only passing the required array and rewriting the function body accordingly:

// solution.c
#include <stdlib.h>

typedef struct {
  int A[1000];
  int B[1000];
} data;

__attribute__((pure)) int foo(const int *A) {
  int result = 0;
  for (int i = 0; i < 1000; i++) {
    result += A[i];
  }
  return result;
}

void solution() {
  data *d = (data *)malloc(sizeof(data));
  for (int i = 0; i < 1000; i++) {
    d->A[i] = d->B[i] = 1;
  }
  int result = foo(d->A);
  free(d);
}
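The solution above still hardcodes the trip count inside `foo`. One possible variant, not part of the original check, passes the element count alongside the pointer so the function no longer depends on the struct layout at all. The names `sum_array` and `solution_with_length` are hypothetical, the allocation check is an addition, and `__attribute__((pure))` is assumed to be available (GCC and Clang).

```c
// solution_with_length.c (hypothetical variant)
#include <stdlib.h>

typedef struct {
  int A[1000];
  int B[1000];
} data;

// Receives only the field it reads, plus that field's element count.
__attribute__((pure)) int sum_array(const int *A, size_t n) {
  int result = 0;
  for (size_t i = 0; i < n; i++) {
    result += A[i];
  }
  return result;
}

// Returns the sum of field A, or -1 if the allocation fails.
int solution_with_length(void) {
  data *d = malloc(sizeof *d);
  if (d == NULL) {
    return -1;
  }
  // Derive the element count from the field itself.
  const size_t n = sizeof d->A / sizeof d->A[0];
  for (size_t i = 0; i < n; i++) {
    d->A[i] = d->B[i] = 1;
  }
  int result = sum_array(d->A, n);
  free(d);
  return result;
}
```

With this signature the same function can sum `d->B`, or any other `int` array, without modification.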

Fortran

In the following example, a derived type containing two arrays is passed to the foo function, which only uses one of the arrays:

! example.f90
program example

  implicit none

  type data
    integer :: a(10)
    integer :: b(10)
  end type data

contains

  pure subroutine foo(d)
    implicit none
    type(data), intent(in) :: d
    integer :: i, sum

    sum = 0
    do i = 1, 10
      sum = sum + d%a(i)
    end do
  end subroutine foo

  pure subroutine bar()
    implicit none
    type(data) :: d
    integer :: i

    do i = 1, 10
      d%a(i) = 1
      d%b(i) = 1
    end do

    call foo(d)
  end subroutine bar

end program example

This can be easily addressed by only passing the required array and rewriting the procedure body accordingly:

! solution.f90
program solution

  implicit none

  type data
    integer :: a(10)
    integer :: b(10)
  end type data

contains

  pure subroutine foo(a)
    implicit none
    integer, intent(in) :: a(:)
    integer :: i, sum

    sum = 0
    do i = 1, size(a, 1)
      sum = sum + a(i)
    end do
  end subroutine foo

  pure subroutine bar()
    implicit none
    type(data) :: d
    integer :: i

    do i = 1, 10
      d%a(i) = 1
      d%b(i) = 1
    end do

    call foo(d%a)
  end subroutine bar

end program solution

Related resources