subscription api timing out on large dataset #1079
Comments
The Oracle DBAs reported a query plan instability issue for the subscription query, which may hinder performance in unpredictable ways.
Hi Jean-Roch, Meanwhile, I will follow up with Kate on the performance issue. She mentioned at yesterday's compops meeting that she was seeing about 100 concurrent sessions for the subscriptions query. If these are initiated by the Unified scripts, could you point me to the corresponding code? I'd like to see if there is a way this could be optimized.
Hi @nataliaratnikova, are you suggesting that instead of the wildcard search I first list the blocks and then make one phedex call per block? I can do that of course, no problem, but I am unsure how much load this will put on datasvc. Unified does not make concurrent calls to the subscription API. @sidnarayanan might be able to say more about the transfer team, and @yiiyama about dynamo. Is there a way you can trace the IP from which the numerous concurrent calls are coming?
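For illustration, the per-block alternative to the wildcard query could be sketched roughly as follows. This is a minimal sketch, not Unified's actual code: the helper name `per_block_subscription_urls` is hypothetical, the list of block names is assumed to have been obtained beforehand, and only the endpoint and the `block`, `node` and `collapse` parameters are taken from the request quoted in this issue.

```python
from urllib.parse import urlencode

# Endpoint taken from the request quoted in this issue.
SUBSCRIPTIONS_URL = "https://cmsweb.cern.ch/phedex/datasvc/json/prod/subscriptions"


def per_block_subscription_urls(block_names, node, collapse="n"):
    """Build one subscriptions URL per fully qualified block name.

    Hypothetical helper: block_names is assumed to be a list of
    'dataset#block' strings that the caller has already retrieved.
    """
    urls = []
    for block in block_names:
        # safe="/" keeps the slashes readable; '#' is escaped as %23,
        # matching the form of the wildcard URL in this issue.
        query = urlencode(
            {"block": block, "node": node, "collapse": collapse}, safe="/"
        )
        urls.append(f"{SUBSCRIPTIONS_URL}?{query}")
    return urls
```

Each returned URL targets a single block, so the calls can be issued sequentially to keep the load on datasvc bounded.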
AFAIK the transfer team should not be making 100 concurrent calls to the subscriptions (or any) API.
Dynamo can issue up to 64 concurrent blockreplicas queries, but it shouldn't be using subscriptions. I will double check, but indeed it would be great if the IP could be identified.
I found ~40K hits coming from the MIT server, which constitute the majority of all calls to the subscription API. However, this is not necessarily the reason for the problem with the large datasets that Jean-Roch reported here. Will investigate further.
The MIT server is dynamo indeed, @yiiyama.
I cannot get https://cmsweb.cern.ch/phedex/datasvc/json/prod/subscriptions?block=/Neutrino_E-10_gun/RunIISpring15PrePremix-PUMoriond17_80X_mcRun2_asymptotic_2016_TrancheIV_v2-v2/GEN-SIM-DIGI-RAW%23*&node=T2_DE_DESY&collapse=n to complete without timing out, and therefore unified cannot programmatically identify the location of the pileup.
Is there a way to break the request down further so that it completes?
FYI @areinsvo @sidnarayanan