Partly vectorize CompactProtocol list read #9606

Open
wants to merge 2 commits into base: master

Conversation

Nicoshev
Contributor

Summary:
Partly vectorize CompactProtocol's list reading, mainly on aarch64.

Performance gains vary by type:

before:
CompactProtocol_read_SmallListInt       36.10ns    27.70M
CompactProtocol_read_BigListByte        18.32us    54.57K    10005
CompactProtocol_read_BigListShort       27.57us    36.27K    27489
CompactProtocol_read_BigListInt         22.74us    43.97K    49370
CompactProtocol_read_BigListBigInt      25.26us    39.59K    49696
CompactProtocol_read_BigListFloat       18.62us    53.69K    40005
CompactProtocol_read_BigListDouble      18.81us    53.16K    80005

after:
CompactProtocol_read_SmallListInt       27.07ns    36.94M       52
CompactProtocol_read_BigListByte       185.48ns     5.39M    10005
CompactProtocol_read_BigListShort        5.97us   167.42K    27489
CompactProtocol_read_BigListInt          8.67us   115.37K    49370
CompactProtocol_read_BigListBigInt      13.01us    76.87K    49696
CompactProtocol_read_BigListFloat      827.75ns     1.21M    40005
CompactProtocol_read_BigListDouble       1.67us   600.49K    80005
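
For readers wondering why the gains differ so much by type, here is a minimal sketch of one plausible reason. It is not the code from this diff, and the helper names (`readByteList`, `readZigzagVarint32`, `readInt32List`) are hypothetical. In the compact protocol a byte element is a single raw byte, so a byte list can in principle be copied in one bulk operation, while i16/i32/i64 elements are zigzag varints of variable length that have to be decoded one at a time before any vector work can happen.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not part of the PR: a byte list is n raw bytes,
// so the whole payload can be copied with a single memcpy.
inline std::vector<std::int8_t> readByteList(const std::uint8_t* p, std::size_t n) {
  assert(p != nullptr || n == 0);
  std::vector<std::int8_t> out(n);
  if (n != 0) {
    std::memcpy(out.data(), p, n);
  }
  return out;
}

// Hypothetical helper: decode one varint (7 payload bits per byte, high bit
// set means "more bytes follow"), then undo the zigzag mapping.
// Returns the number of bytes consumed, or 0 on truncated/overlong input.
inline std::size_t readZigzagVarint32(
    const std::uint8_t* p, std::size_t avail, std::int32_t& out) {
  std::uint32_t v = 0;
  for (std::size_t i = 0; i < avail && i < 5; ++i) {
    v |= static_cast<std::uint32_t>(p[i] & 0x7f) << (7 * i);
    if ((p[i] & 0x80) == 0) {
      // Zigzag decode: even values map to v/2, odd values to -(v+1)/2.
      out = static_cast<std::int32_t>((v >> 1) ^ (0u - (v & 1u)));
      return i + 1;
    }
  }
  return 0;
}

// Hypothetical helper: an i32 list needs a data-dependent loop, because each
// element's length is only known after inspecting its bytes.
inline bool readInt32List(
    const std::uint8_t* p, std::size_t avail, std::size_t n,
    std::vector<std::int32_t>& out) {
  out.clear();
  out.reserve(n);
  std::size_t pos = 0;
  for (std::size_t i = 0; i < n; ++i) {
    std::int32_t value = 0;
    const std::size_t used = readZigzagVarint32(p + pos, avail - pos, value);
    if (used == 0) {
      return false;
    }
    out.push_back(value);
    pos += used;
  }
  return true;
}
```

Under that reading, the fixed-width benchmarks (Byte, Float, Double) would benefit the most from bulk loads, and the varint-encoded ones (Short, Int, BigInt) less so, which is at least consistent with the numbers above.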

Differential Revision: D73063243

@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D73063243

Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 24, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 24, 2025
@Nicoshev Nicoshev force-pushed the export-D73063243 branch 2 times, most recently from 6a149e8 to f04b1fc Compare April 24, 2025 15:08
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 24, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 24, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 24, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 24, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 24, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 24, 2025
Summary:

Partly vectorize CompactProtocol's list reading, mainly on aarch64.

Performance gains vary by type:

before:
CompactProtocol_read_SmallListInt                                           36.10ns    27.70M
CompactProtocol_read_BigListByte                                            18.32us    54.57K            10005
CompactProtocol_read_BigListShort                                           27.57us    36.27K            27489
CompactProtocol_read_BigListInt                                             22.74us    43.97K            49370
CompactProtocol_read_BigListBigInt                                          25.26us    39.59K            49696
CompactProtocol_read_BigListFloat                                           18.62us    53.69K            40005
CompactProtocol_read_BigListDouble                                          18.81us    53.16K            80005

after:

CompactProtocol_read_SmallListInt                                           27.07ns    36.94M               52
CompactProtocol_read_BigListByte                                           185.48ns     5.39M            10005
CompactProtocol_read_BigListShort                                            6.01us   166.50K            27489
CompactProtocol_read_BigListInt                                              8.67us   115.37K            49370
CompactProtocol_read_BigListBigInt                                          11.33us    88.26K            49696
CompactProtocol_read_BigListFloat                                          827.75ns     1.21M            40005
CompactProtocol_read_BigListDouble                                           1.67us   600.49K            80005

Differential Revision: D73063243
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 30, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request Apr 30, 2025
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from 1ab0c76 to 3aa6e1e Compare May 5, 2025 16:21
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 5, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 5, 2025
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from 3aa6e1e to 5687efb Compare May 5, 2025 20:58
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 5, 2025
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from 5687efb to eabcad3 Compare May 5, 2025 21:05
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 5, 2025
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from eabcad3 to bd62f96 Compare May 5, 2025 21:21
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from bd62f96 to c388da1 Compare May 9, 2025 05:59
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 9, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 9, 2025
Summary:

Partly vectorize CompactProtocol's list reading, mainly on aarch64.

Performance gains varies by type:

before:
CompactProtocol_read_SmallListInt                                           36.10ns    27.70M
CompactProtocol_read_BigListByte                                            18.32us    54.57K            10005
CompactProtocol_read_BigListShort                                           27.57us    36.27K            27489
CompactProtocol_read_BigListInt                                             22.74us    43.97K            49370
CompactProtocol_read_BigListBigInt                                          25.26us    39.59K            49696
CompactProtocol_read_BigListFloat                                           18.62us    53.69K            40005
CompactProtocol_read_BigListDouble                                          18.81us    53.16K            80005

after:

CompactProtocol_read_SmallListInt                                           27.07ns    36.94M               52
CompactProtocol_read_BigListByte                                           185.48ns     5.39M            10005
CompactProtocol_read_BigListShort                                            6.01us   166.50K            27489
CompactProtocol_read_BigListInt                                              8.67us   115.37K            49370
CompactProtocol_read_BigListBigInt                                          11.33us    88.26K            49696
CompactProtocol_read_BigListFloat                                          827.75ns     1.21M            40005
CompactProtocol_read_BigListDouble                                           1.67us   600.49K            80005

Differential Revision: D73063243
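
For context on why the gains differ by type: in the Thrift compact protocol, short, int and bigint elements are written as zigzag-encoded base-128 varints, while byte and floating-point elements are fixed width. That is consistent with the fixed-width lists showing the largest speedups above, although the PR does not state this explicitly. Below is a minimal scalar sketch of reading a varint-encoded list of 32-bit ints, i.e. the kind of per-element loop a vectorized reader would replace. It is not the code from this PR: the names `zigzagDecode32`, `readVarint32` and `decodeInt32List` are hypothetical, and the list header (element type and count) is assumed to have been parsed already.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not from the PR): map a zigzag-encoded value back to
// a signed integer (0 -> 0, 1 -> -1, 2 -> 1, 3 -> -2, ...).
inline int32_t zigzagDecode32(uint32_t v) {
  return static_cast<int32_t>(v >> 1) ^ -static_cast<int32_t>(v & 1);
}

// Hypothetical helper: read one base-128 varint starting at data[pos].
// Advances pos past the varint. Returns nullopt if the input is truncated
// or the varint is longer than the 5 bytes a 32-bit value can need.
inline std::optional<uint32_t> readVarint32(
    const uint8_t* data, size_t size, size_t& pos) {
  uint32_t result = 0;
  for (int shift = 0; shift < 35; shift += 7) {
    if (pos >= size) {
      return std::nullopt; // ran out of input mid-varint
    }
    const uint8_t byte = data[pos++];
    result |= static_cast<uint32_t>(byte & 0x7f) << shift;
    if ((byte & 0x80) == 0) {
      return result; // continuation bit clear: last byte of this varint
    }
  }
  return std::nullopt; // continuation bit still set after 5 bytes
}

// Scalar baseline: decode `count` zigzag varints, one element at a time.
inline std::optional<std::vector<int32_t>> decodeInt32List(
    const uint8_t* data, size_t size, size_t count) {
  std::vector<int32_t> out;
  out.reserve(count);
  size_t pos = 0;
  for (size_t i = 0; i < count; ++i) {
    const auto raw = readVarint32(data, size, pos);
    if (!raw) {
      return std::nullopt;
    }
    out.push_back(zigzagDecode32(*raw));
  }
  return out;
}
```

Each element's length depends on its own bytes, so the loop carries a data dependency from one element to the next. A fixed-width list has no such dependency, which makes it a much easier target for bulk copies or SIMD.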
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from c388da1 to 7ce970c Compare May 9, 2025 06:01
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from 7ce970c to cf10c0b Compare May 9, 2025 06:04
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 9, 2025
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from cf10c0b to 1c15fb5 Compare May 12, 2025 23:30
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 12, 2025
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 12, 2025
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from 1c15fb5 to 06821f7 Compare May 12, 2025 23:33
Summary:
Pull Request resolved: facebook#9605

Vectorize CompactProtocol's list writing on aarch64.

Maybe new code should be placed in a different file instead.

Performance gains vary by type:

before:

CompactProtocol_write_SmallListInt                         38.38ns    26.05M
CompactProtocol_write_BigListByte                          18.40us    54.33K
CompactProtocol_write_BigListShort                         19.30us    51.82K
CompactProtocol_write_BigListInt                           19.96us    50.11K
CompactProtocol_write_BigListBigInt                        26.54us    37.68K
CompactProtocol_write_BigListFloat                         18.54us    53.92K
CompactProtocol_write_BigListDouble                        18.79us    53.22K

after:

CompactProtocol_write_SmallListInt                         31.65ns    31.60M
CompactProtocol_write_BigListByte                         223.77ns     4.47M
CompactProtocol_write_BigListShort                          6.58us   152.07K
CompactProtocol_write_BigListInt                            8.26us   121.06K
CompactProtocol_write_BigListBigInt                        11.40us    87.73K
CompactProtocol_write_BigListFloat                        830.74ns     1.20M
CompactProtocol_write_BigListDouble                         1.55us   645.79K

Differential Revision: D72810122

Reviewed By: vitaut
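
As a reference point for the write-side numbers, here is a minimal scalar sketch of how a list of 32-bit ints can be serialized as zigzag varints, the encoding the Thrift compact protocol uses for integer types. It is an illustration under stated assumptions, not the code from facebook#9605: `zigzagEncode32`, `appendVarint32` and `encodeInt32List` are hypothetical names, and the list header is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): zigzag-encode a signed value so
// that small magnitudes map to small unsigned values
// (0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...).
inline uint32_t zigzagEncode32(int32_t n) {
  const uint32_t signMask = n < 0 ? 0xffffffffu : 0u;
  return (static_cast<uint32_t>(n) << 1) ^ signMask;
}

// Hypothetical helper: append v as a base-128 varint, low 7 bits first,
// with the high bit of each byte marking "more bytes follow".
inline void appendVarint32(std::vector<uint8_t>& out, uint32_t v) {
  while (v >= 0x80) {
    out.push_back(static_cast<uint8_t>((v & 0x7f) | 0x80));
    v >>= 7;
  }
  out.push_back(static_cast<uint8_t>(v));
}

// Scalar baseline: encode every element one at a time.
inline std::vector<uint8_t> encodeInt32List(const std::vector<int32_t>& in) {
  std::vector<uint8_t> out;
  out.reserve(in.size()); // lower bound: at least one byte per element
  for (const int32_t n : in) {
    appendVarint32(out, zigzagEncode32(n));
  }
  return out;
}
```

The output length of each element depends on its value, so the write position for element i+1 is only known after element i has been encoded. Byte, float and double lists have a fixed element size, which is consistent with them showing the largest improvements in the table above.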
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from 06821f7 to 4645951 Compare May 28, 2025 05:16
Nicoshev added a commit to Nicoshev/hhvm that referenced this pull request May 28, 2025
@Nicoshev Nicoshev force-pushed the export-D73063243 branch from 4645951 to 0062b58 Compare May 28, 2025 05:21