Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RESP3 causes RedisServerException only (P|S)SUBSCRIBE / (P|S)UNSUBSCRIBE / PING / QUIT are allowed in this context #2783

Open
shwallmic opened this issue Aug 30, 2024 · 10 comments

Comments

@shwallmic
Copy link

I have an application that is communicating with Azure Redis Enterprise (running version 6.0). I wanted to reduce the number of connections opened to the Redis Server, so I switched to using RESP3 protocol. However, now the application gets into a state that all commands result in error:

"Error while incrementing key TROR5646ed02-5ddf-45f8-92d0-6a2709b83b98|Exception:StackExchange.Redis.RedisServerException: ERR Can't execute 'evalsha': only (P|S)SUBSCRIBE / (P|S)UNSUBSCRIBE / PING / QUIT are allowed in this context\r\n at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()\r\n at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n at StackExchange.Redis.RedisDatabase.d__223.MoveNext()"

My application never calls Subscribe, so it's unclear to me how the client / server gets into this bad state.

  1. Is this some sort of bug on the client/server side?
  2. Is there a way to get out of this state?
  3. With RESP2 protocol, is there a way to prevent the redis client from opening a separate connection for Pub/Sub, given that I never intend to use that functionality?
@mgravell
Copy link
Collaborator

mgravell commented Sep 1, 2024

I'll investigate the failure - no idea what broke there! To disable subscription in RESP2, put ,$SUBSCRIBE= on the end of your config string (which disables that command); however, this will disable auto discovery of server uptime related events.

@mgravell
Copy link
Collaborator

mgravell commented Sep 3, 2024

(update: working with @philon-msft to see if there's an Azure Redis Enterprise thing going on here)

@slorello89
Copy link
Collaborator

RESP3 support was introduced to Redis Enterprise in 7.2, so that would be the issue here.

You can upgrade your instance to 7.2 (which is currently in Preview), or revert to using the RESP2 connection, I'd recommend disabling subscribe if you aren't using any subscribe channels and are ok missing those uptime events from the multiplexer. It will cut the number of connections to the proxy in half (and could reduce the number of connections to your shards by more than that depending on your deployment size).

@shwallmic
Copy link
Author

Thanks for investigating. I can look into implementing these solutions.

However, even with the version mismatch, does it make sense that the server could get into a state where it only accepts PSUBSCRIBE etc if I never call subscribe?

Also, if the server is not on a version that accepts RESP3, shouldn't the client be falling back to RESP2?

@mgravell
Copy link
Collaborator

mgravell commented Sep 3, 2024

@slorello89 that still sounds like a library bug, though; we're meant to be detecting the outcome of the HELLO attempt. I need to investigate this and make it not break so spectacularly.

@slorello89
Copy link
Collaborator

@mgravell aye, this is odd, the HELLO should fail and (in my testing) it looks like it opens up a RESP 2 connection, (at least it sends the AUTH command after the HELLO fails), interestingly though it does not seem like the client attempts to subscribe to the normal channels, nor does it appear to open a subscriber connection. Also I'm not able to reproduce @shwallmic's issue, any time I try to subscribe after trying to connect with RESP 3 the subscription fails because the Subscriber connection was never opened.

@shwallmic - are you doing something odd like trying to subscribe over the ad-hoc interface? e.g.

db.Execute("SUBSCRIBE", "test"); // subscribing over the interactive connection is connection altering
db.StringSet("foo", "bar"); // now the connection is unusable so we get your error.

that's the only way I can reproduce this error.

@mgravell
Copy link
Collaborator

mgravell commented Sep 4, 2024

(at least it sends the AUTH command after the HELLO fails)

(we actually send AUTH before the HELLO has failed: https://github.com/StackExchange/StackExchange.Redis/blob/main/src/StackExchange.Redis/ServerEndPoint.cs#L943-L964)

@slorello89
Copy link
Collaborator

slorello89 commented Sep 4, 2024

Got it, regardless, I can see that the connection is RESP2:

id=11615003000 addr=x.x.x.x:58624 laddr=10.0.0.6:10000 fd=138 name=machine-name(SE.Redis-v2.8.12.45748) age=4 idle=3 flags=N db=0 sub=0 psub=0 ssub=0 multi=-1 obl=0 events=r cmd=get user=default resp=2 lib-name=SE.Redis lib-ver=2.8.12.45748

@shwallmic
Copy link
Author

We don't call SUBSCRIBE in our code. I've connected to the server and run INFO commandstats. it shows that SUBSCRIBE has never been called.

INFO commandstats

Commandstats

cmdstat_psync:calls=9,usec=5088,usec_per_call=565.33
cmdstat_expire:calls=1894160611,usec=33737120714,usec_per_call=17.81
cmdstat_replconf:calls=1497934,usec=2028041,usec_per_call=1.35
cmdstat_evalsha:calls=1194244389,usec=59167244130,usec_per_call=49.54
cmdstat_ping:calls=953430,usec=232276,usec_per_call=0.24
cmdstat_hello:calls=1618093,usec=3527337,usec_per_call=2.18
cmdstat_client:calls=24499224,usec=98611057,usec_per_call=4.03
cmdstat_hdel:calls=87410294,usec=910576785,usec_per_call=10.42
cmdstat_time:calls=1196364535,usec=8783809366,usec_per_call=7.34
cmdstat_del:calls=1578607,usec=5243121,usec_per_call=3.32
cmdstat_echo:calls=4005252,usec=2313563,usec_per_call=0.58
cmdstat_eval:calls=2120146,usec=150306675,usec_per_call=70.89
cmdstat_exec:calls=1325069336,usec=8681112978,usec_per_call=6.55
cmdstat_hkeys:calls=1196364535,usec=21961135355,usec_per_call=18.36
cmdstat_hincrby:calls=1894160611,usec=30776527192,usec_per_call=16.25
cmdstat_hvals:calls=581450130,usec=20381002643,usec_per_call=35.05
cmdstat_select:calls=9,usec=6,usec_per_call=0.67
cmdstat_multi:calls=1325069336,usec=253850681,usec_per_call=0.19
cmdstat_get:calls=3170980,usec=5975640,usec_per_call=1.88

Maybe something else is going on. Could something happen to the connection (too many timeouts for example?) that could cause it to get into a state that requires the connection to QUIT / reconnect?

@mgravell
Copy link
Collaborator

mgravell commented Sep 5, 2024

I will need to investigate. However, for now the workaround is simple: don't try to enable RESP3 on this setup, since the server doesn't support it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants