UnavailableDataError for Key "COUNTERS_PORT_NAME_MAP" in COUNTERS_DB #84

Open
sripalnati opened this issue Aug 31, 2018 · 3 comments

@sripalnati

Aug 31 20:27:22.308214 switch75 ALERT pmon/sensord: Sensor alarm: Chip max1617a-i2c-40-2a: ASIC: -5.0 C (min = -5.0 C, max = -5.0 C) [ALARM]
Aug 31 20:27:44.565279 switch75 ERR snmp/snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()#012Traceback (most recent call last):#012 File "/usr/local/lib/python3.6/dist-packages/ax_interface/mib.py", line 40, in start#012 self.reinit_data()#012 File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/vendor/cisco/ciscoPfcExtMIB.py", line 38, in reinit_data#012 self.oid_name_map = mibs.init_sync_d_interface_tables(self.db_conn)#012 File "/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/__init__.py", line 88, in init_sync_d_interface_tables#012 if_name_map, if_id_map = port_util.get_interface_oid_map(db_conn)#012 File "/usr/local/lib/python3.6/dist-packages/swsssdk/port_util.py", line 40, in get_interface_oid_map#012 if_name_map = db.get_all('COUNTERS_DB', 'COUNTERS_PORT_NAME_MAP', blocking=True)#012 File "/usr/local/lib/python3.6/dist-packages/swsssdk/interface.py", line 38, in wrapped#012 ret_data = f(inst, db_name, *args, **kwargs)#012 File "/usr/local/lib/python3.6/dist-packages/swsssdk/interface.py", line 310, in get_all#012 raise UnavailableDataError(message, _hash)#012swsssdk.exceptions.UnavailableDataError: Key 'COUNTERS_PORT_NAME_MAP' unavailable in database 'COUNTERS_DB'

@sripalnati
Author

Hi, the UnavailableDataError above always appears in the syslog when swss is restarted. However, when I connect with redis-cli, I am able to dump COUNTERS_PORT_NAME_MAP as follows:

127.0.0.1:6379[2]> hgetall COUNTERS_PORT_NAME_MAP

  1. "Ethernet0"
  2. "oid:0x100000000049a"
  3. "Ethernet4"
  4. "oid:0x100000000049b"
  5. "Ethernet8"
  6. "oid:0x100000000049c"
  7. "Ethernet12"
  8. "oid:0x100000000049d"
  9. "Ethernet16"
  10. "oid:0x100000000049e"
  11. "Ethernet20"
  12. "oid:0x100000000049f"
  13. "Ethernet24"
  14. "oid:0x10000000004a0"
  15. "Ethernet28"
  16. "oid:0x10000000004a1"
  17. "Ethernet32"
  18. "oid:0x10000000004a2"
  19. "Ethernet36"
  20. "oid:0x10000000004a3"
  21. "Ethernet40"
  22. "oid:0x10000000004a4"
  23. "Ethernet44"
  24. "oid:0x10000000004a5"
  25. "Ethernet48"
  26. "oid:0x10000000004a6"
  27. "Ethernet52"
  28. "oid:0x10000000004a7"
  29. "Ethernet56"
  30. "oid:0x10000000004a8"
  31. "Ethernet60"
  32. "oid:0x10000000004a9"
  33. "Ethernet64"
  34. "oid:0x10000000004aa"
  35. "Ethernet68"
  36. "oid:0x10000000004ab"
  37. "Ethernet72"
  38. "oid:0x1000000000488"
  39. "Ethernet76"
  40. "oid:0x1000000000489"
  41. "Ethernet80"
  42. "oid:0x100000000048a"
  43. "Ethernet84"
  44. "oid:0x100000000048b"
  45. "Ethernet88"
  46. "oid:0x100000000048c"
  47. "Ethernet92"
  48. "oid:0x100000000048d"
  49. "Ethernet96"
  50. "oid:0x100000000048e"
  51. "Ethernet100"
  52. "oid:0x100000000048f"
  53. "Ethernet104"
  54. "oid:0x1000000000490"
  55. "Ethernet108"
  56. "oid:0x1000000000491"
  57. "Ethernet112"
  58. "oid:0x1000000000492"
  59. "Ethernet116"
  60. "oid:0x1000000000493"
  61. "Ethernet120"
  62. "oid:0x1000000000494"
  63. "Ethernet124"
  64. "oid:0x1000000000495"
  65. "Ethernet128"
  66. "oid:0x1000000000496"
  67. "Ethernet132"
  68. "oid:0x1000000000497"
  69. "Ethernet136"
  70. "oid:0x1000000000498"
  71. "Ethernet140"
  72. "oid:0x1000000000499"
    127.0.0.1:6379[2]>

@jerry-chang3300

We see an error log similar to the one above. It seems that SNMP tries to read queue_stat_map before it has been created. Because SNMP retries the read, this should not be a problem, so we lowered the log level from error to warning.

"Oct 5 06:26:20.005020 as7816-64x ERR snmp#snmp-subagent [sonic_ax_impl] ERROR: No queue stat counters found in the Counter DB. SyncD database is incoherent.",
"Oct 5 06:26:20.006722 as7816-64x ERR snmp#snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()#012Traceback (most recent call last):#012 File \"/usr/local/lib/python3.6/dist-packages/ax_interface/mib.py\", line 40, in start#012 self.reinit_data()#012 File \"/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/vendor/cisco/ciscoSwitchQosMIB.py\", line 79, in reinit_data#012 mibs.init_sync_d_queue_tables(self.db_conn)#012 File \"/usr/local/lib/python3.6/dist-packages/sonic_ax_impl/mibs/_init_.py\", line 332, in init_sync_d_queue_tables#012 raise RuntimeError('The queue_stat_map is not defined')#012RuntimeError: The queue_stat_map is not defined",

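The retry-and-downgrade idea described above could be sketched roughly as follows. This is only an illustration, not the actual sonic-snmpagent change: `CountersNotReady`, `reinit_with_retry`, and the `sleep` parameter are hypothetical names introduced here, and the real agent's retry loop may be structured differently.

```python
import logging
import time

log = logging.getLogger("snmp_retry_sketch")  # hypothetical logger name


class CountersNotReady(Exception):
    """Hypothetical stand-in for 'data not yet in COUNTERS_DB' errors."""


def reinit_with_retry(reinit, attempts=5, delay=1.0, sleep=time.sleep):
    """Call reinit() until it succeeds or the attempts run out.

    A not-ready failure is logged as a warning while retries remain
    and is escalated to an error only on the final attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            return reinit()
        except CountersNotReady as exc:
            if attempt == attempts:
                log.error("giving up after %d attempts: %s", attempts, exc)
                raise
            log.warning("attempt %d/%d not ready: %s", attempt, attempts, exc)
            sleep(delay)
```

With this shape, a transient miss during an swss restart would show up as a warning, and an error would be logged only if the data never appears.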
@jerry-chang3300

The root cause appears to be a timing difference between containers.
PortOrch writes QUEUE_COUNTER_ID_LIST to Flex_Counter_DB when it initializes, and SyncD_Flex_Counter writes counters to Counter_DB only after it has collected them.
SNMP may therefore query Counter_DB using QUEUE_COUNTER_ID_LIST before SyncD_Flex_Counter has written any counters; the lookup then fails and SNMP writes this log.
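The ordering problem can be illustrated with a toy model. It uses plain dictionaries in place of the Redis databases, and everything in it (the `StartupModel` class, its method names, and the key layout inside `counters_db`) is an assumption made for illustration rather than the real SONiC schema or code.

```python
class StartupModel:
    """Toy model of the two databases; key layout is illustrative only."""

    def __init__(self):
        self.flex_counter_db = {}
        self.counters_db = {}

    def portorch_init(self, queue_ids):
        # Step 1: the ID list is published as soon as PortOrch initializes.
        self.flex_counter_db["QUEUE_COUNTER_ID_LIST"] = list(queue_ids)

    def syncd_flex_counter_poll(self):
        # Step 2: counters appear only after the first collection cycle.
        for qid in self.flex_counter_db.get("QUEUE_COUNTER_ID_LIST", []):
            self.counters_db[qid] = {"packets": 0}

    def snmp_query(self):
        # Step 3: the reader looks up counters for every published ID.
        ids = self.flex_counter_db.get("QUEUE_COUNTER_ID_LIST", [])
        missing = [q for q in ids if q not in self.counters_db]
        if not ids or missing:
            raise RuntimeError("queue counters not available yet")
        return {q: self.counters_db[q] for q in ids}
```

Calling `snmp_query()` between `portorch_init()` and `syncd_flex_counter_poll()` raises, which mirrors the window in which the log message is produced. Once the poll has run, the same query succeeds.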

jerry-chang3300 added a commit to jerry-chang3300/sonic-snmpagent that referenced this issue Oct 15, 2019