January 9, 2023, 06:58:01

Passing arrays to subroutines in Fortran: Assumed shape vs explicit shape

Question

When passing arrays to procedures, what is best in terms of (1) speed and (2) memory, assumed-shape or explicit shape? A similar question was asked some time ago in this forum but not in these terms:
https://stackoverflow.com/questions/47913804/passing-size-as-argument-vs-assuming-shape-in-fortran-procedures

I provide a simple program to show what I mean:

! Compile with
! ifort /O3 main.f90 -o run_win.exe
module mymod
    use iso_fortran_env, only: dp => real64
    implicit none
    private
    public :: dp, sub_trace, sub_trace_es

contains

    subroutine sub_trace(mat, trace)
        ! Assumed shape: the shape arrives with the array itself
        implicit none
        real(dp), intent(in) :: mat(:,:)
        real(dp), intent(out) :: trace
        real(dp) :: V(size(mat, dim=1))
        integer :: i, N

        if (size(mat, dim=1) /= size(mat, dim=2)) then
            error stop "Input matrix is not square!"
        endif

        N = size(mat, dim=1)
        do i = 1, N
            V(i) = mat(i, i)
        enddo
        trace = sum(V)
    end subroutine sub_trace

    subroutine sub_trace_es(n, mat, trace)
        ! Explicit shape: the caller must pass the size n
        implicit none
        integer, intent(in) :: n
        real(dp), intent(in) :: mat(n, n)
        real(dp), intent(out) :: trace
        real(dp) :: V(n)
        integer :: i

        do i = 1, n
            V(i) = mat(i, i)
        enddo
        trace = sum(V)
    end subroutine sub_trace_es

end module mymod

program main
    use mymod, only: dp, sub_trace, sub_trace_es
    implicit none
    integer, parameter :: nn = 2
    real(dp) :: mat(nn, nn)
    real(dp), allocatable :: mat4(:,:)
    real(dp) :: trace1, trace2, trace3, trace4

    write(*,*) "Passing arrays to subroutines:"
    write(*,*) "Assumed-shape vs explicit shape"

    ! 2.0_dp is a real literal; 2_dp would be an integer literal of kind dp
    mat(1,:) = [2.0_dp, 3.0_dp]
    mat(2,:) = [4.0_dp, 5.0_dp]

    call sub_trace(mat, trace1)
    write(*,*) "trace1 = ", trace1

    call sub_trace_es(nn, mat, trace2)
    write(*,*) "trace2 = ", trace2

    ! First example offered by francescalus:
    call sub_trace_es(2, real([1,2,3,4,5,6,7,8,9], dp), trace3)
    write(*,*) "trace3 = ", trace3

    ! Second example
    mat4 = reshape(real([1,2,3,4,5,6,7,8,9], dp), [3,3])
    call sub_trace(mat4, trace4)
    write(*,*) "trace4 = ", trace4

    ! Wait for Enter; replaces PAUSE, which was deleted from the standard
    read(*,*)

end program main

Answer 1

Score: 4

With assumed shape you can pass non-contiguous arrays or array sections without temporary copies. The receiving subroutine knows where the individual parts are in memory and can step between them using the strides stored in the array descriptor (the dope vector). That avoids a temporary copy, but iterating through the array is more complicated and may be slower.
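As a minimal sketch of the paragraph above, the program below passes every second element of an array to an assumed-shape dummy. The module and procedure names (strided_demo, sum_assumed) are made up for illustration. The is_contiguous intrinsic (Fortran 2008) reports what the callee sees; with common compilers it is expected to report F here, since no copy is needed, but strictly speaking that is left to the compiler.

```fortran
module strided_demo
    use iso_fortran_env, only: dp => real64
    implicit none
contains
    ! Assumed-shape dummy: bounds and strides arrive in the descriptor,
    ! so a strided section can be read in place.
    subroutine sum_assumed(x, s, contig)
        real(dp), intent(in) :: x(:)
        real(dp), intent(out) :: s
        logical, intent(out) :: contig
        contig = is_contiguous(x)
        s = sum(x)
    end subroutine sum_assumed
end module strided_demo

program main
    use strided_demo
    implicit none
    real(dp) :: a(10), s
    logical :: contig
    integer :: i

    a = [(real(i, dp), i = 1, 10)]

    ! Elements 1, 3, 5, 7, 9 of a: a non-contiguous section
    call sum_assumed(a(1:9:2), s, contig)
    if (abs(s - 25.0_dp) > 1.0e-12_dp) error stop "unexpected sum"

    print '(a,f6.1)', 'sum = ', s
    print '(a,l1)', 'dummy contiguous: ', contig
end program main
```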

If an assumed shape array has the contiguous attribute, the compiler can generate simpler and faster code, but if the actual argument is not contiguous, a temporary copy must be made.

For explicit-shape arrays, the dummy argument is always contiguous. However, a temporary copy will be necessary if the actual argument is not contiguous.
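The two cases above can be sketched as follows, with hypothetical names (contiguous_demo, sum_contig, sum_explicit). The same strided section is passed to a dummy with the contiguous attribute and to an explicit-shape dummy. Inside both procedures the data is contiguous, which means the compiler has to arrange a packed copy somewhere along the way.

```fortran
module contiguous_demo
    use iso_fortran_env, only: dp => real64
    implicit none
contains
    ! Assumed shape plus CONTIGUOUS: simple unit-stride code inside,
    ! at the price of a copy when the actual argument is strided.
    subroutine sum_contig(x, s, contig)
        real(dp), contiguous, intent(in) :: x(:)
        real(dp), intent(out) :: s
        logical, intent(out) :: contig
        contig = is_contiguous(x)
        s = sum(x)
    end subroutine sum_contig

    ! Explicit shape: always contiguous, size supplied by the caller.
    subroutine sum_explicit(n, x, s, contig)
        integer, intent(in) :: n
        real(dp), intent(in) :: x(n)
        real(dp), intent(out) :: s
        logical, intent(out) :: contig
        contig = is_contiguous(x)
        s = sum(x)
    end subroutine sum_explicit
end module contiguous_demo

program main
    use contiguous_demo
    implicit none
    real(dp) :: a(10), s1, s2
    logical :: c1, c2
    integer :: i

    a = [(real(i, dp), i = 1, 10)]

    ! The actual argument is the strided section 1, 3, 5, 7, 9
    call sum_contig(a(1:9:2), s1, c1)
    call sum_explicit(5, a(1:9:2), s2, c2)

    if (.not. (c1 .and. c2)) error stop "dummy should be contiguous"
    if (abs(s1 - 25.0_dp) > 1.0e-12_dp) error stop "unexpected sum"
    if (abs(s2 - 25.0_dp) > 1.0e-12_dp) error stop "unexpected sum"

    print '(a,f6.1,a,l1)', 'contiguous dummy: sum = ', s1, ', contiguous = ', c1
    print '(a,f6.1,a,l1)', 'explicit shape:   sum = ', s2, ', contiguous = ', c2
end program main
```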

With assumed shape arrays you get the benefit of better argument checking at compile time, because an explicit interface is required and is therefore always available. Some checking is possible even for explicit-shape arrays if the explicit interface is available, and sometimes even when it is not, but the possibilities are more limited.

One reason is that, because of the storage association rules, it is possible to pass an array with a different rank, and with a total size (number of elements) larger than or equal to the size declared for the explicit-shape dummy argument.

For an assumed shape array the shape is passed automatically with the array descriptor. Passing an array smaller or larger than a declared size is therefore a concept that does not exist for them; they simply work differently.

In many ways these two kinds of array dummy arguments are simply too different, and they differ in what you can do with them. The choice is not just about speed: they strongly differ in the way they are used. For explicit-shape arrays you have to provide the size somehow.

These differences can be illustrated by the examples offered by francescalus:

call sub_trace_es(2, real([1,2,3,4,5,6,7,8,9], dp), trace)

This asks for the trace of a 2x2 array. The argument being passed is a 1D array containing 9 elements, but only the first four are considered. The matrix the subroutine will look at is

1 3
2 4

(column-major order) and the trace will be 5.

For

call sub_trace( reshape(real([1,2,3,4,5,6,7,8,9], dp), [3,3]), trace)

the same 9-element numeric sequence is reshaped into a 3x3 array. Hence the matrix the subroutine will look at is

1   4   7
2   5   8
3   6   9

and the trace will be 15.
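A related pitfall follows from the same rules. The sketch below uses hypothetical names (seq_assoc_demo, trace_es) and passes a whole 3x3 array to an explicit-shape dummy with n = 2. The dummy maps onto the first four elements in storage order, not onto the top-left 2x2 block. To get the block, the section a(1:2,1:2) has to be passed, which is not contiguous and therefore needs a temporary.

```fortran
module seq_assoc_demo
    use iso_fortran_env, only: dp => real64
    implicit none
contains
    subroutine trace_es(n, mat, trace)
        integer, intent(in) :: n
        real(dp), intent(in) :: mat(n, n)
        real(dp), intent(out) :: trace
        integer :: i
        trace = 0.0_dp
        do i = 1, n
            trace = trace + mat(i, i)
        end do
    end subroutine trace_es
end module seq_assoc_demo

program main
    use seq_assoc_demo
    implicit none
    real(dp) :: a(3, 3), t_seq, t_block
    integer :: i

    ! Column-major fill: the columns are (1,2,3), (4,5,6), (7,8,9)
    a = reshape([(real(i, dp), i = 1, 9)], [3, 3])

    ! Whole array with n = 2: the dummy sees storage elements 1,2,3,4,
    ! i.e. the matrix [1 3; 2 4], so the trace is 1 + 4 = 5
    call trace_es(2, a, t_seq)

    ! Top-left block [1 4; 2 5]: the trace is 1 + 5 = 6
    call trace_es(2, a(1:2, 1:2), t_block)

    if (abs(t_seq - 5.0_dp) > 1.0e-12_dp) error stop "unexpected t_seq"
    if (abs(t_block - 6.0_dp) > 1.0e-12_dp) error stop "unexpected t_block"

    print '(a,f4.1)', 'whole array, n = 2: trace = ', t_seq
    print '(a,f4.1)', 'section a(1:2,1:2): trace = ', t_block
end program main
```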

I personally use assumed shape with the explicit contiguous attribute in several places of my production code for supercomputers, where large arrays are passed around. However, be careful to enable warnings about temporary copies: it is easy to forget this in one location, and then you spoil everything with unnecessary temporaries.
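One way to reason about where such temporaries can appear is to check contiguity at the call site. The sketch below (the program name check_sections is made up) uses the Fortran 2008 is_contiguous intrinsic on two sections of the same array. The compiler options named in the comments are, to the best of my knowledge, the relevant ones for gfortran and Intel Fortran; check your compiler's documentation before relying on them.

```fortran
! Options that report argument temporaries (assumed, verify for your version):
!   gfortran: -Warray-temporaries      (warning at compile time)
!   Intel:    -check arg_temp_created  (message at run time)
program check_sections
    implicit none
    real :: a(3, 3)
    logical :: col_contig, row_contig

    a = 0.0

    ! Fortran stores arrays column by column, so a column is one
    ! contiguous run of memory while a row is strided.
    col_contig = is_contiguous(a(:, 2))
    row_contig = is_contiguous(a(2, :))

    print '(a,l1)', 'column a(:,2) contiguous: ', col_contig   ! T
    print '(a,l1)', 'row    a(2,:) contiguous: ', row_contig   ! F
end program check_sections
```

Passing a(:,2) to a contiguous or explicit-shape dummy needs no copy, whereas passing a(2,:) does.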

In most parts of my code, which are not so performance-critical, I just use assumed shape without further attributes.
