Feature request - Bitmap/Bitfield extraction support #1310

Open
lod opened this issue Dec 7, 2024 · 3 comments

lod commented Dec 7, 2024

A bunch of devices provide status or alarm bytes, where each bit corresponds to a different system or element. I'm currently playing with ViaLite products but I expect that it is a common problem. The desired data here is actually the bit values, such as bit 3, not the byte.

The standard stack handles this poorly.

  • The snmp device provides the byte when queried
  • snmp_exporter fetches the byte and provides it as a metric gauge
  • Prometheus stores it and provides the byte on request
  • Grafana and Alertmanager use PromQL, which provides the byte

There is discussion about adding bitwise operators to Prometheus (prometheus/prometheus#14493), which would allow retrieving specific bits using PromQL, and VictoriaMetrics has basic bitwise operators. While using bitwise_and() would allow extracting a specific bit, it is an ugly solution and requires a query for each bit. Other existing PromQL query options are even worse.
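
For reference, the arithmetic that any per-bit query would have to reproduce is a shift and a mask. A minimal Go sketch, assuming bit 0 is the least significant bit of the status byte (the helper name bitAt is made up for illustration):

```go
package main

import "fmt"

// bitAt returns the value (0 or 1) of bit n of a status byte,
// counting from the least significant bit as bit 0.
func bitAt(status uint8, n uint) uint8 {
	return (status >> n) & 1
}

func main() {
	status := uint8(0b0000_1000) // only bit 3 is set
	for n := uint(0); n < 8; n++ {
		fmt.Printf("bit %d = %d\n", n, bitAt(status, n))
	}
}
```

Doing this once per bit in the query layer means eight separate expressions for a single status byte.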

I believe a much better solution is to solve it in snmp_exporter, much like the existing enum system. Each bit would be broken out into a different label. The resulting metric would be something like monStatus{bit="LaserStatus"} 1

This would require some work on the snmp_exporter code, and support in the generator. I believe it is in line with the design philosophy of snmp_exporter; the functionality is conceptually similar to the existing scale and offset options. As it would be a new configuration option, there shouldn't be any backwards compatibility concerns.

I am happy to prepare a design proposal and pull request, but before I do the work I would like some input on whether this is desired and on any issues which are obvious to the maintainers but probably not to me.

@lod lod changed the title Feature request - Bitmap extraction support Feature request - Bitmap/Bitfield extraction support Dec 7, 2024
Member

SuperQ commented Dec 7, 2024

Related: #855.

I think it would be great to have this be an option in the generator. We can extract bits similarly to how we do EnumAsStateSet.

Author

lod commented Dec 10, 2024

Pulling draft requirements/design together.

I am using upsHighPrecBatteryPackStatus and lgpPduEntrySysStatus as well-documented example fields.

The first is of type OCTET STRING and the second is Unsigned32, so a variety of types must be supported. Bits are in big-endian format; this is explicit in the lgpPduEntrySysStatus description and the SNMP standard. If a bit-field is little-endian, the user can define the bit field in reverse to obtain the desired functionality.

   upsHighPrecBatteryPackStatus OBJECT-TYPE
       SYNTAX     OCTET STRING
       ACCESS     read-only
       STATUS     mandatory
       DESCRIPTION
               "The battery status for the pack only.
                bit 0 Disconnected
                bit 1 Overvoltage
                bit 2 NeedsReplacement
                bit 3 OvertemperatureCritical
                bit 4 Charger
                bit 5 TemperatureSensor
                bit 6 BusSoftStart
                bit 7 OvertemperatureWarning
                bit 8 GeneralError
                bit 9 Communication
                bit 10 DisconnectedFrame
                bit 11 FirmwareMismatch

    lgpPduEntrySysStatus OBJECT-TYPE
        SYNTAX      Unsigned32
        MAX-ACCESS  read-only
        STATUS      current
        DESCRIPTION
            "This value represents a bit-field of the various operational
             states of the PDU. The value is a logical OR of all of the
             following potential states of the PDU.  Note the bit-position
             is given parenthetically next to the operational state in the
             description below.  The bit position is assumed to be a big-endian
             format (least significant digit is the right-most digit).  The
             state is present in the PDU when the bit is on (value = 1).

             normalOperation(1)
                 The PDU is operating normally with no active warnings or alarms.
             startUp(2)
                 The PDU is in the startup state (initializing).  Control
                 and monitoring operations maybe inhibited or unavailable
                 while the PDU is in this state.  This state will clear
                 automatically when the PDU(s) are fully initialized and
                 ready to accept control and monitoring commands.
             normalWithWarning(8)
                 The PDU is operating normally with one or more active
                 warnings.  Appropriate personnel should investigate the
                 warning(s) as soon as possible and take appropriate action.
             normalWithAlarm(16)
                 The PDU is operating normally with one or more active
                 alarms.  Appropriate personnel should investigate the alarm(s)
                 as soon as possible and take appropriate action.
             abnormalOperation(32)
                The PDU is operating abnormally.  That is there is some
                failure within the system that is unexpected under normal
                operating conditions.  Appropriate personnel should investigate
                the cause as soon as possible.  The normal functioning of
                the system is likely inhibited.
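
The two example fields number their bits differently in practice, which a short Go sketch can make concrete. It assumes that an OCTET STRING follows the SNMP BITS layout (bit 0 is the high-order bit of the first octet) and that an integer follows the lgpPduEntrySysStatus description (bit n has the value 2^n). Whether a given device's OCTET STRING really uses the BITS layout is an assumption, and the function names are hypothetical:

```go
package main

import "fmt"

// octetBit reports whether bit n is set in an OCTET STRING value,
// using the SNMP BITS layout: bit 0 is the high-order bit of the first octet.
// Bits beyond the end of the value read as unset.
func octetBit(value []byte, n int) bool {
	if n < 0 || n/8 >= len(value) {
		return false
	}
	return value[n/8]&(0x80>>uint(n%8)) != 0
}

// intBit reports whether bit n is set in an integer value,
// where bit 0 is the least significant bit (value 1).
func intBit(value uint64, n int) bool {
	if n < 0 || n > 63 {
		return false
	}
	return value&(1<<uint(n)) != 0
}

func main() {
	octets := []byte{0x80, 0x10} // bits 0 and 11 set in BITS numbering
	fmt.Println(octetBit(octets, 0), octetBit(octets, 11), octetBit(octets, 1))

	// normalWithWarning(8) is bit 3 of the integer value.
	fmt.Println(intBit(8, 3), intBit(8, 0))
}
```

If a device numbers the bits of an octet the other way round, listing the names in reverse order in the configuration has the same effect as changing the index calculation.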

  • The desired output is a label for each bit, much like the output produced by EnumAsStateSet
  • There will be a new type option for snmp.yml, BitsAsStateSet
  • Labels must be specified for each bit in the new bit_values field
  • Not all bits are mandatory: if a bit is not defined, its value is ignored and no label is created
  • The first bit is specified as bit zero
  • Label names must meet the Prometheus rules but need not follow the best practices
  • There will be no facility for detailed bit descriptions, as Prometheus provides no label description option

Example SNMP description

    - name: upsHighPrecBatteryPackStatus
      oid: 1.3.6.1.4.1.318.1.1.1.2.3.10.2.1.6
      type: BitsAsStateSet
      help: The battery status for the pack only - 1.3.6.1.4.1.318.1.1.1.2.3.10.2.1.6
      indexes:
      - labelname: upsHighPrecBatteryPackIndex
        type: gauge
      - labelname: upsHighPrecBatteryCartridgeIndex
        type: gauge
      bit_values:
        0: Disconnected
        1: Overvoltage
        2: NeedsReplacement
        3: OvertemperatureCritical
        4: Charger
        5: TemperatureSensor
        6: BusSoftStart
        7: OvertemperatureWarning
        8: GeneralError
        9: Communication
        10: DisconnectedFrame
        11: FirmwareMismatch

    - name: lgpPduEntrySysStatus
      oid: 1.3.6.1.4.1.476.1.42.3.8.20.1.25
      type: BitsAsStateSet
      help: This value represents a bit-field of the various operational states of
        the PDU - 1.3.6.1.4.1.476.1.42.3.8.20.1.25
      indexes:
      - labelname: lgpPduEntryIndex
        type: gauge
      lookups:
      - labels:
        - lgpPduEntryIndex
        labelname: lgpPduEntrySysAssignLabel
        oid: 1.3.6.1.4.1.476.1.42.3.8.20.1.15
        type: DisplayString
      bit_values:
        0: normalOperation
        1: startUp
        3: normalWithWarning
        4: normalWithAlarm
        5: abnormalOperation
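
To illustrate what the second entry might produce, here is a rough Go sketch of the expansion step. It is not snmp_exporter code: the function name expandBits, the "bit" label name (borrowed from the monStatus example earlier in the thread) and the text output are all assumptions made for illustration. The map mirrors the proposed bit_values field, and bits missing from it are skipped.

```go
package main

import (
	"fmt"
	"sort"
)

// expandBits turns an integer status value into one 0/1 sample per configured bit.
// Bit 0 is the least significant bit; bits absent from bitValues are ignored.
func expandBits(metric string, value uint64, bitValues map[int]string) []string {
	positions := make([]int, 0, len(bitValues))
	for pos := range bitValues {
		positions = append(positions, pos)
	}
	sort.Ints(positions) // deterministic output order

	lines := make([]string, 0, len(positions))
	for _, pos := range positions {
		state := 0
		if pos >= 0 && pos < 64 && value&(1<<uint(pos)) != 0 {
			state = 1
		}
		lines = append(lines, fmt.Sprintf("%s{bit=%q} %d", metric, bitValues[pos], state))
	}
	return lines
}

func main() {
	bitValues := map[int]string{
		0: "normalOperation",
		1: "startUp",
		3: "normalWithWarning",
		4: "normalWithAlarm",
		5: "abnormalOperation",
	}
	// 12 = 4 + 8: bit 2 (not configured, so ignored) and bit 3 are set.
	for _, line := range expandBits("lgpPduEntrySysStatus", 12, bitValues) {
		fmt.Println(line)
	}
}
```

With the value 12, only the normalWithWarning sample is 1; the set but unconfigured bit 2 produces no series at all.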

Author

lod commented Dec 10, 2024

@SuperQ is it possible to reuse the Bits type for this?

The snmp Bits type is handled by the bits function at https://github.com/prometheus/snmp_exporter/blob/main/collector/collector.go#L734

This function does almost exactly what I would like to do.

However, I believe it is constrained to the OctetString SNMP type. Specifically, I believe it only handles SnmpPDU objects whose value is a byte array, which means only the SNMP BER tag 0x04, used by the OCTET STRING and BITS types. Invoking the function by overriding a gauge type to Bits does not work, I believe because my underlying type was INTEGER, which causes the array length check to fail.

I could be very wrong, I'm new to SNMP and golang.

Modifying the bits function to support direct values as well as arrays will probably achieve the desired outcome.

It does confuse things with the underlying SNMP BITS type though.
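
The idea of accepting direct values as well as arrays could look roughly like the Go sketch below. It is not the existing bits function from collector.go. The helper name bitSet is hypothetical, and the set of integer types handled is a guess at what a decoded PDU value might hold. Byte slices are read in the SNMP BITS layout and integers with bit 0 as the least significant bit, which is exactly where the confusion with the real BITS type comes from.

```go
package main

import "fmt"

// bitSet reports whether bit n is set in value, and whether value had a supported type.
// []byte uses the SNMP BITS layout (bit 0 is the high-order bit of the first octet);
// integers treat bit 0 as the least significant bit.
func bitSet(value interface{}, n int) (set, ok bool) {
	if n < 0 {
		return false, false
	}
	var word uint64
	switch v := value.(type) {
	case []byte:
		if n/8 >= len(v) {
			return false, true // past the end of the array: treat as unset
		}
		return v[n/8]&(0x80>>uint(n%8)) != 0, true
	case int:
		word = uint64(v)
	case uint:
		word = uint64(v)
	case uint32:
		word = uint64(v)
	case uint64:
		word = v
	default:
		return false, false // unsupported value type
	}
	return n < 64 && word&(1<<uint(n)) != 0, true
}

func main() {
	fmt.Println(bitSet([]byte{0x10}, 3)) // array value, BITS numbering
	fmt.Println(bitSet(8, 3))            // direct integer value
	fmt.Println(bitSet("text", 3))       // unsupported type
}
```

Keeping the two numbering schemes behind separate type names in the configuration, rather than overloading Bits, would avoid a silent change in meaning for existing BITS objects.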
