Preamble
This article only covers the use of permute on 2D and 3D tensors.
A permute call puzzled me while I was studying Attention recently.
It is too abstract to explain in words alone,
so I'll combine code with pictures to explain it.
First, create a small three-dimensional tensor:
import torch
x = torch.linspace(1, 30, steps=30).view(3, 2, 5)  # Set up a three-dimensional tensor
print(x)
print(x.size())  # View the dimensions of the tensor
The three sizes (3, 2, 5) are deliberately all different. This avoids coincidences where two dimensions have the same size (e.g. shapes such as (3,3,3) or (2,4,4)), which would make it hard to see which dimension has moved.
The output is shown below
Reading (3,2,5) as "3 dimensions, 2 rows and 5 columns" is very confusing here.
It is easier to think in terms of blocks, rows and columns.
For example, (3,2,5) is a tensor of 3 blocks, each a 2*5 matrix.
Below, to keep the drawing simple, I use 3 blocks of 3*3 as the example.
Stacked together, they form what we know as a three-dimensional matrix.
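To make the block, row and column reading concrete, here is a minimal sketch that indexes the same tensor x as above. The names block0, row and value are only illustrative and do not come from the original code.

```python
import torch

# Same tensor as above: the values 1..30 arranged as 3 blocks of 2 rows x 5 columns
x = torch.linspace(1, 30, steps=30).view(3, 2, 5)

block0 = x[0]        # first block: a 2 x 5 matrix
row = x[0, 1]        # second row of the first block: 5 values
value = x[2, 1, 4]   # last column of the last row of the last block

print(tuple(block0.shape))  # (2, 5)
print(row.tolist())         # [6.0, 7.0, 8.0, 9.0, 10.0]
print(value.item())         # 30.0
```

The first index picks the block, the second the row inside that block, and the third the column inside that row.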
Let's start with a brief introduction to the permute() function.
permute(dims)
The dims argument lists the indices of the tensor's dimensions in the order you want them. Indices start from 0: dimension 0, dimension 1, and so on.
Each index simply names one position in the shape. Of course, the tensor needs at least two dimensions for permute to have any effect.
For a two-dimensional matrix, the dims are 0 and 1.
Writing permute(0,1) changes nothing: the dimensions stay in their original order.
If you write permute(1,0) you get the transpose of the matrix
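The two-dimensional case can be checked with a short sketch. The matrix m below is a made-up example, not one from the original post.

```python
import torch

m = torch.arange(6).view(2, 3)   # 2 rows, 3 columns: [[0, 1, 2], [3, 4, 5]]

same = m.permute(0, 1)           # dimensions kept in their original order
flipped = m.permute(1, 0)        # dimensions exchanged: 3 rows, 2 columns

print(tuple(same.shape))             # (2, 3)
print(tuple(flipped.shape))          # (3, 2)
print(torch.equal(same, m))          # True: nothing changed
print(torch.equal(flipped, m.t()))   # True: same result as the 2D transpose
```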
For a 3D tensor, the unchanged order is permute(0,1,2)
0 refers to the block dimension: in this example it has size 3 (3 blocks)
1 refers to the row dimension: in this example each block has 2 rows
2 refers to the column dimension: in this example each block has 5 columns
So this is a 3D tensor of 3 blocks, each with 2 rows and 5 columns.
The numbers 0, 1, 2 are not values stored in the tensor; they are only labels that tell the dimensions apart, much like x, y, z for three coordinate axes, which are a matter of convention.
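One way to see that 0, 1, 2 are only labels is to give them names. In the sketch below, BLOCK, ROW and COL are hypothetical constants introduced purely for readability; PyTorch itself only sees the integers.

```python
import torch

# Hypothetical names for the three positions, purely for readability
BLOCK, ROW, COL = 0, 1, 2

x = torch.linspace(1, 30, steps=30).view(3, 2, 5)

sizes = (x.size(BLOCK), x.size(ROW), x.size(COL))
print(sizes)                     # (3, 2, 5)

# Passing the labels in a new order moves the sizes along with them
b = x.permute(COL, BLOCK, ROW)
print(tuple(b.shape))            # (5, 3, 2)
```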
The three-dimensional case is given directly in the following code
Three-dimensional situation
Variation 1: no parameters exchanged
b = x.permute(0, 1, 2)  # No change in dimension order
print(b)
print(b.size())
The tensor is unchanged and still arranged exactly as before.
Variation 2: exchange 1 and 2
b = x.permute(0, 2, 1)  # Rows and columns of each block are swapped, i.e., each block is transposed
print(b)
print(b.size())
Two pictures to compare
The rows and columns of each block are swapped (i.e., each 2D block is transposed), while the number of blocks stays the same.
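Since the pictures may not be visible, the per-block transpose can also be verified in code. This is a small sketch on the same tensor x; x.transpose(1, 2) is used only as a cross-check.

```python
import torch

x = torch.linspace(1, 30, steps=30).view(3, 2, 5)
b = x.permute(0, 2, 1)           # exchange rows and columns inside every block

print(tuple(b.shape))            # (3, 5, 2)

# Every block of b is the 2D transpose of the matching block of x
blocks_match = all(torch.equal(b[i], x[i].t()) for i in range(x.size(0)))
print(blocks_match)              # True

# Exchanging two dimensions can also be written with transpose
same_as_transpose = torch.equal(b, x.transpose(1, 2))
print(same_as_transpose)         # True
```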
Variation 3: exchange 0 and 1
b = x.permute(1, 0, 2)  # Swap blocks and rows
print(b)
print(b.size())
Comparing the two shows that the number of blocks and the number of rows per block have changed.
The 3 blocks corresponding to parameter 0 become 2 blocks.
The 2 rows corresponding to parameter 1 become 3 rows.
This matches the exchanged positions of 0 and 1: the sizes travel with their parameters.
The result is 2 blocks * 3 rows * 5 columns (originally 3 blocks * 2 rows * 5 columns).
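To see where the values actually go when blocks and rows are exchanged, the following sketch rebuilds the first new block by hand. The helper name first_rows is an assumption made for this illustration.

```python
import torch

x = torch.linspace(1, 30, steps=30).view(3, 2, 5)
b = x.permute(1, 0, 2)           # exchange the block and row dimensions

print(tuple(b.shape))            # (2, 3, 5)

# Block 0 of b gathers row 0 from each of the 3 original blocks
first_rows = torch.stack([x[i, 0] for i in range(x.size(0))])
print(torch.equal(b[0], first_rows))   # True
print(b[0, :, 0].tolist())             # [1.0, 11.0, 21.0]
```

In other words, b[j, i] is the same row as x[i, j].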
Variation 4: exchange 0 and 2
b = x.permute(2, 1, 0)  # Swap blocks and columns
print(b)
print(b.size())
At this point, the 3 blocks corresponding to parameter 0 have become 5 blocks.
The 5 columns corresponding to parameter 2 have become 3 columns.
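A brute-force check makes the block and column exchange explicit. This sketch compares every element, assuming the same tensor x as before.

```python
import torch

x = torch.linspace(1, 30, steps=30).view(3, 2, 5)
b = x.permute(2, 1, 0)           # exchange the block and column dimensions

print(tuple(b.shape))            # (5, 2, 3)

# b[k, j, i] holds the element that x stores at [i, j, k]
ok = all(
    b[k, j, i].item() == x[i, j, k].item()
    for i in range(3) for j in range(2) for k in range(5)
)
print(ok)                        # True
print(b[0].tolist())             # [[1.0, 11.0, 21.0], [6.0, 16.0, 26.0]]
```

The first new block collects column 0 of every original block, one original block per new column.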
Variation 5: exchange 0 with 1, then 1 with 2
b = x.permute(2, 0, 1)  # Blocks, rows and columns all move
print(b)
print(b.size())
At this point, the 3 blocks corresponding to parameter 0 become 5 blocks.
The 2 rows corresponding to parameter 1 become 3 rows.
The 5 columns corresponding to parameter 2 become 2 columns.
Variation 6: exchange 0 with 1, then 0 with 2
b = x.permute(1, 2, 0)  # Blocks, rows and columns all move
print(b)
print(b.size())
At this point, the 3 blocks corresponding to parameter 0 become 2 blocks.
The 2 rows corresponding to parameter 1 become 5 rows.
The 5 columns corresponding to parameter 2 become 3 columns.
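Variations 5 and 6 move all three dimensions at once, so it helps to follow a single element. The sketch below tracks one arbitrarily chosen position (i, j, k); the names b5 and b6 are illustrative only.

```python
import torch

x = torch.linspace(1, 30, steps=30).view(3, 2, 5)

b5 = x.permute(2, 0, 1)          # Variation 5: shape (5, 3, 2)
b6 = x.permute(1, 2, 0)          # Variation 6: shape (2, 5, 3)

i, j, k = 2, 1, 3                # block 2, row 1, column 3 in the original
print(x[i, j, k].item())         # 29.0
print(b5[k, i, j].item())        # 29.0: indices follow the order (2, 0, 1)
print(b6[j, k, i].item())        # 29.0: indices follow the order (1, 2, 0)

# Applying the Variation 6 order to the Variation 5 result restores x
restored = b5.permute(1, 2, 0)
print(torch.equal(restored, x))  # True
```

The element itself never changes; only the order in which its three indices are written does.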
Summary
Based on the 2D and 3D examples above, you can see that permute() reorders the block, row and column dimensions of the tensor.
The parameters passed to it are not specific values.
Rather, they are labels standing for the block, row and column dimensions.
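As a recap, all six orderings of a 3D tensor can be listed in one loop. This is a sketch using itertools.permutations from the standard library; the dictionary shapes is an illustrative name, not part of any API.

```python
import itertools
import torch

x = torch.linspace(1, 30, steps=30).view(3, 2, 5)

shapes = {}
for dims in itertools.permutations(range(3)):
    b = x.permute(*dims)
    shapes[dims] = tuple(b.shape)
    # The output size at position p is the input size of dimension dims[p]
    assert shapes[dims] == tuple(x.shape[d] for d in dims)
    print(dims, "->", shapes[dims])

# (0, 1, 2) -> (3, 2, 5)
# (0, 2, 1) -> (3, 5, 2)
# (1, 0, 2) -> (2, 3, 5)
# (1, 2, 0) -> (2, 5, 3)
# (2, 0, 1) -> (5, 3, 2)
# (2, 1, 0) -> (5, 2, 3)
```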
Closing note
I didn't expect a piece I wrote so casually to get this much attention from readers.
I have since built on this post with a follow-up that explains the dimension changes in more detail,
which should help you understand the use of the permute function better.
For the advanced version, see that follow-up article.