Redis 3.0 and Zabbix monitoring #9

Open
msims-okta opened this issue Apr 16, 2015 · 3 comments

Comments

@msims-okta

We're using the recently released Redis 3.0 with clustering enabled.

I have Zabbix monitoring configured via cron, pushing data to our Zabbix server.

Every so often, a key that the Zabbix Python script queries on the localhost Redis node returns an error:

#  /etc/zabbix/zabbix_agentd.d/zbx_redis_stats.py localhost -p 6379

Traceback (most recent call last):
  File "/etc/zabbix/zabbix_agentd.d/zbx_redis_stats.py", line 145, in <module>
    main()
  File "/etc/zabbix/zabbix_agentd.d/zbx_redis_stats.py", line 137, in main
    if client.type(key) == 'list':
  File "/usr/lib/python2.6/site-packages/redis/client.py", line 1112, in type
    return self.execute_command('TYPE', name)
  File "/usr/lib/python2.6/site-packages/redis/client.py", line 565, in execute_command
    return self.parse_response(connection, command_name, **options)
  File "/usr/lib/python2.6/site-packages/redis/client.py", line 577, in parse_response
    response = connection.read_response()
  File "/usr/lib/python2.6/site-packages/redis/connection.py", line 574, in read_response
    raise response
redis.exceptions.ResponseError: MOVED 8833 10.139.103.247:6379

This is on a cluster slave. The IP in the error is the cluster master's.

@msims-okta

To fix this, I had to perform a FLUSHDB on the cluster master. This is less than ideal.

I'll look through the code to find what keys this Zabbix python script is using and see if I can narrow it down.

@msims-okta

After some additional use, I found that the error occurs when a cluster slave fails over and becomes the new master. The error then occurs on both slaves. Even after performing a 'cluster failover' back to the original master, the two slaves continue to produce this error while the master is fine.

We are no longer able to monitor the clustered slaves as no new data is able to make it back to the Zabbix server.

@msims-okta

OK I may have found the culprit.

The client.keys('*') call appears to be the issue. In a clustered (sharded) setup, a key can belong to another node, so the per-key TYPE call that follows gets redirected with MOVED.

I commented out the following:

134         #keys = client.keys('*')
135         #llensum = 0
136         #for key in keys:
137         #    if client.type(key) == 'list':
138         #        llensum += client.llen(key)
139         #a.append(Metric(redis_hostname, 'redis[llenall]', llensum))

With that block commented out, I'm no longer getting these errors, and data is making its way back to the Zabbix server.
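
For anyone who wants to keep the redis[llenall] metric on cluster nodes rather than drop it, here is a minimal sketch of one possible workaround: skip any key whose TYPE/LLEN call is answered with a cluster redirection instead of letting the exception kill the script. The function name local_llen_sum is hypothetical (it is not in zbx_redis_stats.py), and the fallback ResponseError class only exists so the snippet can be read without redis-py installed. It assumes redis-py reports the redirection as a ResponseError whose message starts with MOVED (as in the traceback above) or ASK.

```python
try:
    from redis.exceptions import ResponseError
except ImportError:
    # Assumption: stand-in so the sketch loads without redis-py installed.
    class ResponseError(Exception):
        pass


def local_llen_sum(client):
    """Sum LLEN over list keys, skipping keys the node redirects elsewhere.

    `client` is expected to behave like a redis-py client
    (keys, type and llen methods).
    """
    total = 0
    for key in client.keys('*'):
        try:
            # TYPE returns bytes on Python 3 unless decode_responses is set.
            if client.type(key) in ('list', b'list'):
                total += client.llen(key)
        except ResponseError as exc:
            # Cluster redirections look like "MOVED <slot> <host:port>"
            # or "ASK <slot> <host:port>"; skip those keys.
            if str(exc).startswith(('MOVED', 'ASK')):
                continue
            raise  # anything else is a real error
    return total
```

In the script, the commented-out block could then become something like a.append(Metric(redis_hostname, 'redis[llenall]', local_llen_sum(client))). Note that skipping redirected keys means the value may undercount on nodes that redirect, so treat it as a per-node approximation.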
