
Add endpoint rules engine #681

Open: mtdowling wants to merge 3 commits into main from rules-engine

mtdowling (Member) commented May 7, 2025

This commit implements the Smithy rules engine. Its primary use is to let clients resolve endpoints.

Note: this PR requires a few unreleased changes from smithy/smithy. Until a Smithy release is made, you'll need to build Smithy locally and publish it to your local Maven repository (Gradle's publishToMavenLocal task, abbreviated pTML) to test this.

Overview

The rules engine is implemented as a stack-based VM. First it compiles rules engine expressions into bytecode, then evaluates it. These are the main building blocks of the VM:

  1. Opcodes: a single byte that tells the VM what to do.
  2. Operands: zero or more bytes that follow an opcode to do things like load a specific variable or jump to a bytecode position.
  3. Stack: the stack of variables that are pushed, popped, and evaluated by opcodes.
  4. Registers: variables that can change during execution. This includes input parameters, named variables captured by a function, and any generated registers (like when we eliminate common subexpressions). Registers are stored in an array and referenced throughout the bytecode using a byte. There can be up to 256 registers.
  5. Constant pool: a pool of variables that can be referenced by opcodes. There can be up to MAX_SHORT constants. Constants are referenced by their array index position. When compiling, duplicate constants referenced in rules resolve to the same constant.
  6. RulesProgram: the compiled bytecode together with its constant pool, registers, and functions.
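
To make the building blocks above concrete, here is a minimal sketch of a stack-based evaluation loop. It is not the code in this PR: the class name MiniVm, the opcode values, the RETURN_VALUE opcode, the fixed stack size, and the big-endian 2-byte jump operand are all assumptions made for illustration.

```java
/** Illustrative only: a tiny stack VM with opcodes, operands, registers, and a constant pool. */
final class MiniVm {
    // Hypothetical opcode values; each opcode is a single byte.
    static final byte LOAD_CONST = 0;     // operand: 1-byte constant pool index
    static final byte LOAD_REGISTER = 1;  // operand: 1-byte register index
    static final byte JUMP_IF_FALSEY = 2; // operand: 2-byte jump target (big-endian is an assumption)
    static final byte RETURN_VALUE = 3;   // no operand (hypothetical opcode)

    // Demo program: return register 0 if it is set, otherwise return constant 0.
    static final byte[] DEMO = {
        LOAD_REGISTER, 0,     // offset 0
        JUMP_IF_FALSEY, 0, 8, // offset 2: jump to offset 8 when the popped value is falsey
        LOAD_REGISTER, 0,     // offset 5
        RETURN_VALUE,         // offset 7
        LOAD_CONST, 0,        // offset 8
        RETURN_VALUE          // offset 10
    };

    static Object run(byte[] code, Object[] constants, Object[] registers) {
        Object[] stack = new Object[16]; // fixed size keeps the sketch short
        int sp = 0;                      // index of the next free stack slot
        int pc = 0;                      // position in the bytecode
        while (pc < code.length) {
            byte opcode = code[pc++];
            switch (opcode) {
                case LOAD_CONST:
                    stack[sp++] = constants[code[pc++] & 0xFF];
                    break;
                case LOAD_REGISTER:
                    stack[sp++] = registers[code[pc++] & 0xFF];
                    break;
                case JUMP_IF_FALSEY: {
                    int target = ((code[pc] & 0xFF) << 8) | (code[pc + 1] & 0xFF);
                    pc += 2;
                    Object value = stack[--sp];
                    // Treat null and false as falsey.
                    if (value == null || Boolean.FALSE.equals(value)) {
                        pc = target;
                    }
                    break;
                }
                case RETURN_VALUE:
                    return stack[--sp];
                default:
                    throw new IllegalStateException("Unknown opcode: " + opcode);
            }
        }
        throw new IllegalStateException("Program ended without returning a value");
    }
}
```

Calling MiniVm.run(MiniVm.DEMO, new Object[] {"default"}, new Object[] {"custom"}) takes the first branch and returns the register value; passing a null register falls through to the constant.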

Bytecode layout

  • Byte 0: the version number, represented as a negative byte. The version is decremented each time a change is made to the bytecode or related functionality, and it is validated when bytecode is loaded.
  • Byte 1: the number of required input parameters.
  • Byte 2: the number of synthetic registers required during evaluation (e.g., r0, r1).
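
As a rough sketch of how a loader might read this 3-byte header, the example below assumes that version N is stored as the byte -N and that both counts are unsigned bytes. The BytecodeHeader name and the exact validation rules are hypothetical, not taken from this PR.

```java
/** Illustrative only: parses the 3-byte header described above. */
record BytecodeHeader(int version, int parameterCount, int syntheticRegisterCount) {

    static BytecodeHeader parse(byte[] bytecode, int supportedVersion) {
        if (bytecode.length < 3) {
            throw new IllegalArgumentException("Bytecode is shorter than the 3-byte header");
        }
        byte rawVersion = bytecode[0];
        if (rawVersion >= 0) {
            throw new IllegalArgumentException("Expected a negative version byte, found " + rawVersion);
        }
        // Assumption: version N is stored as the negative byte -N.
        int version = -rawVersion;
        if (version != supportedVersion) {
            throw new IllegalArgumentException("Unsupported bytecode version: " + version);
        }
        // Assumption: counts are unsigned bytes, so mask them to the range 0..255.
        return new BytecodeHeader(version, bytecode[1] & 0xFF, bytecode[2] & 0xFF);
    }
}
```

With this layout the first instruction starts at offset 3, which matches the first offset in the example listing.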

Example program

Registers:
  0: ParamDefinition[name=Endpoint, required=false, defaultValue=null, builtin=null]
  1: ParamDefinition[name=r0, required=false, defaultValue=null, builtin=null]

Constants:
  0: AttrExpression: isIp
  1: AttrExpression: normalizedPath
  2: AttrExpression: authority
  3: AttrExpression: scheme
  4: Template: StringTemplate[template="{url#scheme}://{url#authority}{url#normalizedPath}is-ip-addr"]
  5: AttrExpression: path
  6: String: /port
  7: Template: StringTemplate[template="{url#scheme}://{url#authority}/uri-with-port"]
  8: String: /
  9: Template: StringTemplate[template="https://{url#scheme}-{url#authority}-nopath.example.com"]
  10: Template: StringTemplate[template="https://{url#scheme}-{url#authority}.example.com/path-is{url#path}"]
  11: String: endpoint was invalid

Functions:
  0: parseURL
  1: stringEquals

Instructions: (version=1)
  003: TEST_REGISTER_SET       0
  005: JUMP_IF_FALSEY          120
  008: LOAD_REGISTER           0
  010: FN                      0
  012: SET_REGISTER            1
  014: JUMP_IF_FALSEY          120
  017: LOAD_REGISTER           1
  019: GET_ATTR                0
  022: IS_TRUE
  023: JUMP_IF_FALSEY          46
  026: LOAD_REGISTER           1
  028: GET_ATTR                1
  031: LOAD_REGISTER           1
  033: GET_ATTR                2
  036: LOAD_REGISTER           1
  038: GET_ATTR                3
  041: RESOLVE_TEMPLATE        4
  044: RETURN_ENDPOINT         0
  046: LOAD_REGISTER           1
  048: GET_ATTR                5
  051: LOAD_CONST              6
  053: FN                      1
  055: JUMP_IF_FALSEY          73
  058: LOAD_REGISTER           1
  060: GET_ATTR                2
  063: LOAD_REGISTER           1
  065: GET_ATTR                3
  068: RESOLVE_TEMPLATE        7
  071: RETURN_ENDPOINT         0
  073: LOAD_REGISTER           1
  075: GET_ATTR                1
  078: LOAD_CONST              8
  080: FN                      1
  082: JUMP_IF_FALSEY          100
  085: LOAD_REGISTER           1
  087: GET_ATTR                2
  090: LOAD_REGISTER           1
  092: GET_ATTR                3
  095: RESOLVE_TEMPLATE        9
  098: RETURN_ENDPOINT         0
  100: LOAD_REGISTER           1
  102: GET_ATTR                5
  105: LOAD_REGISTER           1
  107: GET_ATTR                2
  110: LOAD_REGISTER           1
  112: GET_ATTR                3
  115: RESOLVE_TEMPLATE        10
  118: RETURN_ENDPOINT         0
  120: LOAD_CONST              11
  122: RETURN_ERROR
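
The offsets in the listing above follow from each instruction taking one opcode byte plus its operand bytes. The sketch below recomputes them. The operand widths are inferred from the gaps between offsets in this listing, not taken from the implementation, and the zero width for RETURN_ERROR is an assumption because it is the last instruction. The OffsetCheck class is hypothetical.

```java
/** Illustrative only: recomputes instruction offsets from inferred operand widths. */
final class OffsetCheck {

    /** Operand width in bytes, inferred from the gaps between offsets in the listing. */
    static int operandWidth(String opcode) {
        switch (opcode) {
            case "IS_TRUE":
            case "RETURN_ERROR": // assumption: shown without an operand
                return 0;
            case "TEST_REGISTER_SET":
            case "LOAD_REGISTER":
            case "SET_REGISTER":
            case "FN":
            case "LOAD_CONST":
            case "RETURN_ENDPOINT":
                return 1;
            case "JUMP_IF_FALSEY":
            case "GET_ATTR":
            case "RESOLVE_TEMPLATE":
                return 2;
            default:
                throw new IllegalArgumentException("Unknown opcode: " + opcode);
        }
    }

    /** Returns the offset of each instruction, starting at the given position. */
    static int[] offsets(int start, String... opcodes) {
        int[] result = new int[opcodes.length];
        int pc = start;
        for (int i = 0; i < opcodes.length; i++) {
            result[i] = pc;
            pc += 1 + operandWidth(opcodes[i]); // one opcode byte plus its operands
        }
        return result;
    }
}
```

Starting at offset 3 (after the header bytes), the first ten opcodes of the listing yield 3, 5, 8, 10, 12, 14, 17, 19, 22, 23.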

Original endpoint rules definition:

@endpointRuleSet({
  "version": "1.3",
  "parameters": {
    "Endpoint": {
      "type": "string",
      "documentation": "docs"
    }
  },
  "rules": [
    {
      "documentation": "endpoint is set and is a valid URL",
      "conditions": [
        {
          "fn": "isSet",
          "argv": [
            {
              "ref": "Endpoint"
            }
          ]
        },
        {
          "fn": "parseURL",
          "argv": [
            "{Endpoint}"
          ],
          "assign": "url"
        }
      ],
      "rules": [
        {
          "conditions": [
            {
              "fn": "booleanEquals",
              "argv": [
                {
                  "fn": "getAttr",
                  "argv": [
                    {
                      "ref": "url"
                    },
                    "isIp"
                  ]
                },
                true
              ]
            }
          ],
          "endpoint": {
            "url": "{url#scheme}://{url#authority}{url#normalizedPath}is-ip-addr"
          },
          "type": "endpoint"
        },
        {
          "conditions": [
            {
              "fn": "stringEquals",
              "argv": [
                "{url#path}",
                "/port"
              ]
            }
          ],
          "endpoint": {
            "url": "{url#scheme}://{url#authority}/uri-with-port"
          },
          "type": "endpoint"
        },
        {
          "conditions": [
            {
              "fn": "stringEquals",
              "argv": [
                "{url#normalizedPath}",
                "/"
              ]
            }
          ],
          "endpoint": {
            "url": "https://{url#scheme}-{url#authority}-nopath.example.com"
          },
          "type": "endpoint"
        },
        {
          "conditions": [],
          "endpoint": {
            "url": "https://{url#scheme}-{url#authority}.example.com/path-is{url#path}"
          },
          "type": "endpoint"
        }
      ],
      "type": "tree"
    },
    {
      "error": "endpoint was invalid",
      "conditions": [],
      "type": "error"
    }
  ]
})
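
For readers who find the JSON hard to follow, this is roughly the decision tree that the rule set (and the bytecode above) encodes, written as plain Java. It is only a sketch: the Url record stands in for the result of parseURL, a null Url stands in for "Endpoint is unset or could not be parsed", and the ExampleRules class is hypothetical.

```java
/** Illustrative only: the example rule set expressed as ordinary Java control flow. */
final class ExampleRules {

    /** Hypothetical stand-in for the value that parseURL assigns to "url". */
    record Url(String scheme, String authority, String path, String normalizedPath, boolean isIp) {}

    static String resolve(Url url) {
        // Top-level rule: Endpoint must be set and parse as a URL, otherwise the error rule fires.
        if (url == null) {
            throw new IllegalArgumentException("endpoint was invalid");
        }
        // Nested rules are tried in order; the first one whose conditions match wins.
        if (url.isIp()) {
            return url.scheme() + "://" + url.authority() + url.normalizedPath() + "is-ip-addr";
        }
        if (url.path().equals("/port")) {
            return url.scheme() + "://" + url.authority() + "/uri-with-port";
        }
        if (url.normalizedPath().equals("/")) {
            return "https://" + url.scheme() + "-" + url.authority() + "-nopath.example.com";
        }
        // Final rule has no conditions, so it always matches.
        return "https://" + url.scheme() + "-" + url.authority() + ".example.com/path-is" + url.path();
    }
}
```

Each if corresponds to one JUMP_IF_FALSEY in the instruction listing, and each return corresponds to a RESOLVE_TEMPLATE followed by RETURN_ENDPOINT.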

Using the rules engine with clients

Code-generated clients:

A code-generated client will (eventually) use a precompiled rules engine program and apply the EndpointRulesPlugin to the client in its constructor. We will eventually generate the code needed to call RulesEngine#fromPrecompiled. This avoids having to load or compile the complex rules engine traits at runtime.

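// bytecode is assumed to hold the precompiled program bytes.
var program = new RulesEngine()
    .precompiledBuilder()
    .bytecode(bytecode)
    .constantPool("a", 1)
    .parameters(new ParameterDefinition("foo"))
    .functionNames("parseURL")
    .build();

// Create the plugin from the precompiled program built above.
var plugin = EndpointRulesPlugin.from(program);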

Dynamic clients:

Dynamic clients can apply the EndpointRulesPlugin using EndpointRulesPlugin#create. This method will look for the rules engine traits on the service's schema, and if found, compile them and apply an endpoint resolver. If the config already has an endpoint resolver or if no traits are found, the plugin does nothing.

myBuilder.applyPlugin(EndpointRulesPlugin.create());
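
The "does nothing" behavior described above amounts to a guard at the top of the plugin. A minimal sketch of that guard follows. Every type here (EndpointPluginSketch, Config, Resolver) is a hypothetical stand-in rather than the API in this PR, and the rules trait is reduced to a nullable String.

```java
/** Illustrative only: the no-op guard described for dynamic clients. */
final class EndpointPluginSketch {

    /** Hypothetical endpoint resolver. */
    interface Resolver {
        String resolve(String input);
    }

    /** Hypothetical client configuration; endpointResolver is null when none is configured. */
    static final class Config {
        Resolver endpointResolver;
    }

    /** Returns true when a resolver was installed, false when the plugin did nothing. */
    static boolean apply(Config config, String rulesTrait) {
        // Do nothing if a resolver is already configured or the service has no rules trait.
        if (config.endpointResolver != null || rulesTrait == null) {
            return false;
        }
        // Stand-in for compiling the trait into a program and wrapping it in a resolver.
        String compiled = "compiled(" + rulesTrait + ")";
        config.endpointResolver = input -> compiled + ":" + input;
        return true;
    }
}
```

Checking for an existing resolver first means an explicitly configured endpoint resolver always takes precedence over the rules found on the schema.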

Benchmarks

On my M1 Mac:

Benchmark         (optimize)                      (testName)  Mode  Cnt    Score     Error  Units
VmBench.evaluate         yes  example-complex-ruleset.json-1  avgt    5   60.173 ±   0.937  ns/op
VmBench.evaluate         yes          minimal-ruleset.json-1  avgt    5  224.231 ±   7.958  ns/op
VmBench.evaluate          no  example-complex-ruleset.json-1  avgt    5   61.350 ±   4.003  ns/op
VmBench.evaluate          no          minimal-ruleset.json-1  avgt    5  225.001 ±  20.896  ns/op

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

mtdowling force-pushed the rules-engine branch 3 times, most recently from 9816154 to d32e674 (May 8, 2025 04:40)
mtdowling force-pushed the rules-engine branch 3 times, most recently from a31dfa0 to f907ec4 (May 9, 2025 16:05)
mtdowling force-pushed the rules-engine branch 12 times, most recently from 43abf17 to 7aec981 (May 14, 2025 21:01)