MOD: optimize DGEMM of large matrices on cortex A53 & A55 #3451

Merged
martin-frbg merged 1 commit into OpenMathLib:develop from wjc404:optimize-A53-dgemm
Nov 18, 2021

@wjc404
Contributor

@wjc404 wjc404 commented Nov 17, 2021

No description provided.

@wjc404
Contributor Author

wjc404 commented Nov 17, 2021

image
Test programs: OpenBLAS-test.tar.gz
(src/test_kernel.cpp checks the correctness of the kernel function; src/test_dgemm.cpp checks the results and performance of DGEMM)

@wjc404
Contributor Author

wjc404 commented Nov 17, 2021

I need some help figuring out the syntax error in the inline asm :) The asm block compiles with NDK-r23b and gcc-7, but fails in the CI job.

@martin-frbg
Collaborator

Sorry, I do not see why it does not like the local labels here. As I read the error messages, it seems to think that jumping forward to 3: in the "blt 3f; beq 2f; .align 4; 1:\n\t" of line 84 would cross into a different loop section, and anything after the "blt 3f" on that line seems to get ignored, so label 1: does not get defined at all(?). The only thing that stands out to me is the ".align 4" (and I remember earlier Mac assemblers being peculiar in their handling of .align, making it necessary to use .p2align instead).

@martin-frbg
Collaborator

Looks like the Apple Clang assembler simply wants to see the labels on a separate line (cf. my test in #3452) - go figure...
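
For readers who hit the same error, here is a minimal sketch of the layout the thread converged on: one instruction or label per line, each terminated with `\n\t`, and `.p2align` instead of `.align`. The function `sum_to_n` and its loop are hypothetical and are not taken from the PR's kernel; they only illustrate the label placement. A possible explanation, not confirmed in this thread, is that Apple's AArch64 assembler treats `;` as the start of a comment rather than as a statement separator, which would match the observation that everything after `blt 3f` was ignored. The plain C fallback is only there so the sketch builds on non-AArch64 hosts.

```c
#include <stdint.h>

/* Hypothetical helper (not from the PR): returns 1 + 2 + ... + n,
 * or 0 when n < 1. Written to show label placement in inline asm. */
static int64_t sum_to_n(int64_t n) {
    int64_t acc = 0;
#if defined(__aarch64__)
    __asm__ __volatile__(
        "cmp %[n], #1\n\t"             /* nothing to do if n < 1 */
        "blt 3f\n\t"                   /* forward branch on its own line */
        ".p2align 4\n\t"               /* 2^4 = 16-byte alignment */
        "1:\n\t"                       /* local label on its own line */
        "add %[acc], %[acc], %[n]\n\t" /* acc += n */
        "subs %[n], %[n], #1\n\t"      /* n -= 1, sets condition flags */
        "bgt 1b\n\t"                   /* loop while n > 0 */
        "3:\n\t"                       /* exit label on its own line */
        : [acc] "+r"(acc), [n] "+r"(n)
        :
        : "cc");
#else
    /* Portable fallback with the same result on other architectures. */
    for (; n >= 1; n--) {
        acc += n;
    }
#endif
    return acc;
}
```

Using `.p2align 4` states the alignment as a power of two explicitly, which sidesteps any difference in how assemblers interpret the argument of `.align`.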

@wjc404 wjc404 force-pushed the optimize-A53-dgemm branch from c2c3fa8 to 302f226 Compare November 18, 2021 13:32
@wjc404
Contributor Author

wjc404 commented Nov 18, 2021

Thank you very much @martin-frbg :)

@martin-frbg martin-frbg added this to the 0.3.19 milestone Nov 18, 2021
@martin-frbg martin-frbg merged commit ec4daf4 into OpenMathLib:develop Nov 18, 2021