Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issues #27

Open
mingodad opened this issue Jul 16, 2022 · 4 comments
Open

Performance issues #27

mingodad opened this issue Jul 16, 2022 · 4 comments

Comments

@mingodad
Copy link
Contributor

Trying to build the grammar from postgresql mentioned here #22 (comment) it takes a long time to generate the parser and looking at the profile it seems that most of the time is spent on size_t plist_union( plist* all, plist* from ) (see bellow) that is doing a naive exponential comparison/insertion between two lists.

Looking through the code I think that creating a sorted list first and then doing a binary search on it to decide to push/add one list element could improve things (not ideal but it's something), but I'm not sure about it because I don't know if the lists can have duplicates.

Any thought about this ?

/usr/bin/time unicc-nb postgresql-13.3.y.par 
104.40user 0.10system 1:44.57elapsed 99%CPU (0avgtext+0avgdata 135872maxresident)k
152inputs+11096outputs (0major+34371minor)pagefaults 0swaps
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 10.95      8.67     8.67 1398180161     0.00     0.00  plist_compare
 10.50     16.98     8.31 12371648     0.00     0.00  plist_union
  9.83     24.76     7.78 3259644116     0.00     0.00  plist_next
  9.61     32.37     7.61  2294100     0.00     0.00  close_item
  7.97     38.68     6.31        1     6.31     6.38  detect_default_productions
  7.06     44.27     5.59     8438     0.00     0.01  lalr1_closure
  5.59     48.69     4.43 2272182630     0.00     0.00  plist_access
  5.48     53.03     4.34 1343718440     0.00     0.00  pccl_count
  5.00     56.99     3.96  6007214     0.00     0.00  list_push
  3.27     59.58     2.59 209994052     0.00     0.00  list_count
  2.69     61.71     2.13 113372032     0.00     0.00  plistel_drop
  2.30     63.53     1.82                             plist_key
  2.08     65.18     1.65 582855415     0.00     0.00  pccl_intersect
  1.91     66.69     1.51     3306     0.00     0.00  pregex_dfa_from_nfa
  1.63     67.98     1.29  1223780     0.00     0.00  list_find
  1.48     69.15     1.18 78921959     0.00     0.00  test_same_kernel
  1.26     70.15     1.00 586092893     0.00     0.00  pccl_compat
  1.12     71.04     0.89        1     0.89     3.53  check_regex_anomalies
  0.86     71.72     0.68 11977201     0.00     0.00  plist_erase
  0.84     72.38     0.67 113382156     0.00     0.00  plist_insert
  0.78     73.00     0.62 27035168     0.00     0.00  pccl_testrange
  0.75     73.59     0.59 179834200     0.00     0.00  sort_symbols
  0.57     74.04     0.45   855253     0.00     0.00  pregex_ptn_to_nfa
  0.44     74.39     0.35                             plist_set_printfn
  0.40     74.71     0.32    11081     0.00     0.00  plist_subsort
  0.37     75.00     0.29 35355186     0.00     0.00  plist_remove
  0.35     75.28     0.28 130690767     0.00     0.00  pmalloc
  0.32     75.53     0.25 247164803     0.00     0.00  pfree
  0.30     75.77     0.24 10357446     0.00     0.00  plist_get_by_ptr
  0.26     75.97     0.21 154645260     0.00     0.00  plist_first
  0.24     76.16     0.19  4939448     0.00     0.00  pccl_create
  0.24     76.35     0.19  1447245     0.00     0.00  pccl_compare
  0.20     76.51     0.16 124362026     0.00     0.00  plist_last
@mingodad
Copy link
Contributor Author

Again looking through the code I can see that instead of using a BitArray for computing first/follow sets it's using plist.
Probably plist is fine to use when parsing the grammar but then converting then to arrays and doing all processing on then would improve the performance.

@phorward
Copy link
Owner

Hello @mingodad, thank you very much for your detailed report on this problem.

You're totally right, the current implementation is a naive exponential comparison of lists, mostly because the lists are self-implemented in libphorward. There was no focus on performance yet, but on usability.

There are couple of things inside unicc that could be improved. But due its very rare use, lack of time and priority on other projects, I currently don't have the time on dealing with this issues, because it might require a deeper examination to resolve the problem properly.

Looking through the code I think that creating a sorted list first and then doing a binary search on it to decide to push/add one list element could improve things (not ideal but it's something), but I'm not sure about it because I don't know if the lists can have duplicates.

Normally, there won't be any duplicates, so your attempt sounds useful. There's also the parray structure which might be useful as well for specific use-cases (this was used in the unfinished unicc2 code base in several situations to increase speed and decrease memory usage), but it requires a lot of effort to rewrite the current implementation from plist to parray.

If you have any proposal, or a quick working solution in form of a pull request, this is very welcome!

@mingodad
Copy link
Contributor Author

mingodad commented Jul 22, 2022

Adding a cached sorted list array the total time to parse was reduced around 40% and the profile output is now:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 14.87      7.97     7.97  2368251     0.00     0.00  close_item
 13.43     15.17     7.20     8735     0.00     0.00  lalr1_closure
 11.77     21.48     6.31        1     6.31     6.37  detect_default_productions
  9.86     26.77     5.29  6222279     0.00     0.00  list_push
  6.39     30.19     3.43 208112882     0.00     0.00  list_count
  6.32     33.58     3.39 2069372468     0.00     0.00  plist_access
  4.91     36.21     2.63 1860430152     0.00     0.00  plist_next
  3.40     38.03     1.82 65828038     0.00     0.00  bsearch_r
  3.02     39.65     1.62 865039536     0.00     0.00  plist_compare
  2.97     41.24     1.59 1444959779     0.00     0.00  plist_prev
  2.95     42.82     1.58  1206508     0.00     0.00  list_find
  2.72     44.28     1.46 34071855     0.00     0.00  plist_get_by_ptr
  2.19     45.45     1.18 77750733     0.00     0.00  test_same_kernel
  1.72     46.37     0.92  2109590     0.00     0.00  plist_offset
  1.66     47.26     0.89 234329705     0.00     0.00  sort_symbols
  1.65     48.15     0.89 390224369     0.00     0.00  cmp_ordered_list_bsearch
  1.51     48.96     0.81                             ordered_list_compare
  0.95     49.47     0.51 73347230     0.00     0.00  plist_insert
  0.76     49.88     0.41                             plist_set_printfn
  0.76     50.28     0.41                             plist_key
  0.60     50.60     0.32 73337106     0.00     0.00  plistel_drop
  0.52     50.88     0.28     6008     0.00     0.00  plist_subsort
  0.50     51.15     0.27 12956766     0.00     0.00  plist_union
  0.49     51.41     0.26 10123214     0.00     0.00  plist_erase
  0.40     51.63     0.22  1657945     0.00     0.00  recreate_ordered_list
  0.34     51.81     0.18 13145292     0.00     0.00  plist_get
  0.28     51.96     0.15 166448742     0.00     0.00  pfree
  0.28     52.11     0.15 85400518     0.00     0.00  pmalloc
  0.24     52.24     0.13 78722663     0.00     0.00  plist_first
  0.22     52.36     0.12                             plist_riter_access
  0.21     52.47     0.11 26824786     0.00     0.00  list_free
  0.20     52.57     0.11                             list_dup
/usr/bin/time unicc-nb postgresql-13.3.y.par 
60.34user 0.09system 1:00.44elapsed 99%CPU (0avgtext+0avgdata 140336maxresident)k
0inputs+11096outputs (0major+60347minor)pagefaults 0swaps

@mingodad
Copy link
Contributor Author

Profiling with with optimization now I'm getting this:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 38.37     17.40    17.40        1    17.40    18.36  pregex_dfa_from_nfa
 25.76     29.08    11.68                             generate_tables
  9.13     33.22     4.14                             detect_default_productions
  6.68     36.25     3.03  4922934     0.00     0.00  list_push
  4.07     38.10     1.85 12956766     0.00     0.00  plist_union
  3.37     39.63     1.53  1206508     0.00     0.00  list_find
  2.94     40.96     1.34 442540698     0.00     0.00  plist_compare
  2.40     42.05     1.09        1     1.09     1.11  pregex_dfa_minimize
  1.55     42.76     0.71 234330865     0.00     0.00  sort_symbols
  1.03     43.22     0.47 489312439     0.00     0.00  plist_access
  0.82     43.59     0.37 73347229     0.00     0.00  plist_insert
  0.79     43.95     0.36    11971     0.00     0.00  ordered_list_compare
  0.71     44.27     0.32 10123214     0.00     0.00  plist_erase
  0.64     44.56     0.29     6008     0.00     0.00  plist_subsort
  0.33     44.71     0.15  1901053     0.00     0.00  plist_concat
  0.13     44.77     0.06 84101173     0.00     0.00  pmalloc
  0.11     44.82     0.05  8899391     0.00     0.00  seek_rhs_first
  0.11     44.87     0.05      486     0.00     0.00  pregex_ptn_to_NFA
  0.10     44.92     0.05 14962100     0.00     0.00  plist_count
  0.10     44.96     0.05                             pregex_qreplace
  0.09     45.00     0.04 13080956     0.00     0.00  plist_get
  0.09     45.04     0.04                             build_code
  0.08     45.08     0.04                             plist_set_printfn
  0.07     45.11     0.03  8212527     0.00     0.00  plist_get_by_ptr
  0.07     45.14     0.03  2106378     0.00     0.00  plist_diff
  0.07     45.17     0.03    24343     0.00     0.00  list_count
  0.06     45.19     0.03  1208019     0.00     0.00  create_item
  0.04     45.21     0.02  1423413     0.00     0.00  plist_init
  0.04     45.23     0.02  1244475     0.00     0.00  plist_first
  0.04     45.25     0.02   458244     0.00     0.00  pstrrender
  0.04     45.27     0.02    57916     0.00     0.00  find_tabcol
  0.03     45.29     0.02    66068     0.00     0.00  parray_compare
  0.02     45.30     0.01  1037143     0.00     0.00  list_free
  0.02     45.31     0.01     3306     0.00     0.00  free_state
  0.02     45.32     0.01        1     0.01     0.16  _parse
  0.02     45.33     0.01                             pdbl_to_wcs
  0.02     45.34     0.01                             plist_dbgstats

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants