anonrig (Member) commented Feb 23, 2023

Thanks to @miguelteixeiraa and @lemire

Before

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations        GHz cycle/byte cycles/url instructions/byte instructions/cycle instructions/ns instructions/url     ns/url      speed  time/byte   time/url      url/s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BasicBench_AdaURL_With_Copy       2342 ns         2341 ns       298997    4.81719    13.6646    1003.73            64.271            4.70347         22.6575           4.721k    208.364 345.132M/s  2.89744ns   212.83ns 4.69858M/s
BasicBench_AdaURL_With_Move       2028 ns         2027 ns       346784      5.063    12.5322    920.545           58.1597            4.64083         23.4965         4.27209k    181.818 398.596M/s  2.50881ns  184.283ns 5.42643M/s

After

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations        GHz cycle/byte cycles/url instructions/byte instructions/cycle instructions/ns instructions/url     ns/url      speed  time/byte   time/url      url/s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BasicBench_AdaURL_With_Copy       2351 ns         2351 ns       300044    4.77282    13.7809    1012.27           64.1646            4.65604         22.2225         4.71318k    212.091 343.716M/s  2.90938ns  213.707ns 4.67931M/s
BasicBench_AdaURL_With_Move       2003 ns         2003 ns       349664     5.0045    12.3874    909.909           58.1361            4.69318          23.487         4.27036k    181.818 403.434M/s  2.47872ns  182.073ns  5.4923M/s

anonrig requested a review from lemire February 23, 2023 15:58
anonrig force-pushed the improve-authority-performance branch from c12fa66 to 8218997 February 23, 2023 16:00
anonrig force-pushed the improve-authority-performance branch 2 times, most recently from e76bd37 to 98763f8 February 24, 2023 00:34
lemire (Member) commented Feb 24, 2023

@anonrig This now looks very good.

Can you have a look at what I did in PR #232? You might be able to gain a little extra performance by mimicking my crazy code. At the very least, you might enjoy it.

anonrig force-pushed the improve-authority-performance branch 2 times, most recently from 7997753 to 19c944d February 25, 2023 17:07
anonrig (Member, Author) commented Feb 25, 2023

The benchmark results after adding the SWAR routine are as follows.

Before

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations        GHz cycle/byte cycles/url instructions/byte instructions/cycle instructions/ns instructions/url     ns/url      speed  time/byte   time/url      url/s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BasicBench_AdaURL_With_Copy       2356 ns         2356 ns       295865    4.81269     13.896    1020.73           64.1745            4.61819         22.2259         4.71391k    212.091 342.922M/s  2.91611ns  214.202ns 4.66849M/s
BasicBench_AdaURL_With_Move       2053 ns         2050 ns       347598      5.026    12.4406    913.818           58.1733            4.67608          23.502         4.27309k    181.818 394.192M/s  2.53684ns  186.342ns 5.36647M/s

After

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                            Time             CPU   Iterations        GHz cycle/byte cycles/url instructions/byte instructions/cycle instructions/ns instructions/url     ns/url      speed  time/byte   time/url      url/s
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BasicBench_AdaURL_With_Copy       2320 ns         2320 ns       301131    4.82366     13.677    1004.64           64.1993            4.69396         22.6421         4.71573k    208.273 348.337M/s  2.87078ns  210.872ns 4.74221M/s
BasicBench_AdaURL_With_Move       1998 ns         1998 ns       350259    5.09857    12.3552    907.545           58.1881            4.70961         24.0123         4.27418k        178 404.501M/s  2.47218ns  181.593ns 5.50682M/s
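
For context on the SWAR (SIMD within a register) idea mentioned above, here is a minimal sketch of the general technique. It is not the routine merged in this PR: the names `broadcast`, `zero_byte_mask`, and `find_byte_swar` are hypothetical, and searching for a single delimiter such as `@` is only an assumed use case. The sketch tests eight bytes at a time for a match and falls back to a scalar scan only inside a block that is known to contain one, which keeps it independent of byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Copy one byte value into all eight lanes of a 64-bit word.
constexpr uint64_t broadcast(uint8_t c) {
  return 0x0101010101010101ULL * c;
}

// Return a word whose lane has 0x80 where the matching lane of v is zero,
// and 0x00 elsewhere. Adding 0x7f to the low 7 bits of a lane never carries
// into the next lane, so the result is exact for every lane.
constexpr uint64_t zero_byte_mask(uint64_t v) {
  constexpr uint64_t low7 = 0x7f7f7f7f7f7f7f7fULL;
  return ~(((v & low7) + low7) | v | low7);
}

// Hypothetical helper: index of the first occurrence of target in s,
// or std::string_view::npos if it does not occur.
inline size_t find_byte_swar(std::string_view s, char target) {
  const uint64_t pattern = broadcast(static_cast<uint8_t>(target));
  size_t i = 0;
  // Main loop: examine 8 bytes per iteration.
  for (; i + 8 <= s.size(); i += 8) {
    uint64_t word;
    std::memcpy(&word, s.data() + i, sizeof(word));  // safe unaligned load
    // XOR turns matching lanes into zero bytes.
    if (zero_byte_mask(word ^ pattern) != 0) {
      // A match is somewhere in this block; locate it with a scalar scan
      // so the result does not depend on endianness.
      for (size_t j = i; j < i + 8; j++) {
        if (s[j] == target) { return j; }
      }
    }
  }
  // Tail: fewer than 8 bytes remain.
  for (; i < s.size(); i++) {
    if (s[i] == target) { return i; }
  }
  return std::string_view::npos;
}
```

Calling `find_byte_swar("user:pass@example.com", '@')` returns 9. For the short inputs typical of URLs the gain from this kind of routine is modest, which is consistent with the small differences in the tables above.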

anonrig force-pushed the improve-authority-performance branch from 19c944d to 1c9a86f February 25, 2023 17:17
lemire self-requested a review February 25, 2023 19:47
anonrig merged commit cff8745 into main Feb 25, 2023
anonrig deleted the improve-authority-performance branch February 25, 2023 20:06