[FFL-1319] Add feature flags events exposure #5024

Strech · 2025-11-05T16:26:49Z

What does this PR do?

Adds a new component under the lib/datadog/open_feature/ folder that lets customers configure the Datadog feature flags provider. This provider relies on Remote Configuration to deliver feature flag configurations (aka UFC, Universal Flag Configuration).

The provider goes into customer code as part of the OpenFeature configuration:

require 'open_feature/sdk'
require 'datadog/open_feature/provider'

Datadog.configure do |config|
  config.open_feature.enabled = true
end

OpenFeature::SDK.configure do |config|
  config.set_provider(Datadog::OpenFeature::Provider.new)
end

client = OpenFeature::SDK.build_client
client.fetch_string_value(flag_key: 'banner', default_value: 'default')
# => 'something-for-that-context'

Motivation:

This is part of upcoming work across all libraries.

Important

This code doesn't contain the actual evaluation logic; it establishes everything needed for that logic to be added in the next PR.

Change log entry

Yes. OpenFeature: Add experimental OpenFeature component.

Additional Notes:

This PR was already reviewed as part of #5022, but was accidentally marked as ready for review. This code is fresh and already applies some suggestions from the previous review, but not all of them were possible because of existing established conventions or time constraints.

Structure:

lib/datadog/open_feature
├── binding <------- A pre-made place for upcoming binding (written in Ruby)
├── exposures <----- Every time we fetch the flag value we are sending exposure event
├── provider.rb <--- A public API class we expose (see example above)
└── transport <----- Implementation that goes along with existing components

I would like to ask for an extra pair of eyes on thread-safety and potential issues in the new component. I've already run the code through certain checks, but that is never enough.

How to test the change?

CI, and we will have a new set of ST enabled in #5022 to prove that everything works.

github-actions · 2025-11-05T16:27:03Z

Thank you for updating Change log entry section 👏

^{Visited at: 2025-11-11 09:56:41 UTC}

pr-commenter · 2025-11-05T16:58:25Z

Benchmarks

Benchmark execution time: 2025-11-11 15:59:53

Comparing candidate commit 760f2c5 in PR branch ffl-1319-add-agent-communication-for-openfeature with baseline commit c560834 in branch master.

Found 1 performance improvements and 0 performance regressions! Performance is the same for 43 metrics, 2 unstable metrics.

scenario:tracing - Tracing.log_correlation

🟩 throughput [+9417.989op/s; +9696.611op/s] or [+9.509%; +9.791%]

github-actions · 2025-11-10T13:10:22Z

Typing analysis

Note: Ignored files are excluded from the next sections.

Untyped methods

This PR introduces 2 untyped methods and 10 partially typed methods. It increases the percentage of typed methods from 53.12% to 54.68% (+1.56%).

Untyped methods (+2-0)

❌ Introduced:

sig/datadog/open_feature/provider.rbs:8
└── def initialize: () -> void
sig/datadog/open_feature/transport/http/api.rbs:8
└── def self?.defaults: () -> untyped

Partially typed methods (+10-0)

❌ Introduced:

sig/datadog/open_feature/binding/evaluator.rbs:17
└── def generate: (::Symbol expected_type) -> untyped
sig/datadog/open_feature/configuration/settings.rbs:12
└── def self.add_settings!: (untyped base) -> void
sig/datadog/open_feature/exposures/batch_builder.rbs:9
└── def payload_for: (::Array[Models::Event] events) -> ::Hash[::Symbol, untyped]
sig/datadog/open_feature/exposures/models/event.rbs:18
└── def initialize: (::Hash[::Symbol, untyped]) -> void
sig/datadog/open_feature/exposures/models/event.rbs:28
└── def to_h: () -> ::Hash[::Symbol, untyped]
sig/datadog/open_feature/exposures/models/event.rbs:32
└── def self.extract_attributes: (::OpenFeature::SDK::EvaluationContext) -> ::Hash[::String, untyped]
sig/datadog/open_feature/exposures/worker.rbs:41
└── def perform: (*untyped) -> void
sig/datadog/open_feature/exposures/worker.rbs:45
└── def send_payload: (::Hash[::Symbol, untyped]) -> Core::Transport::Response?
sig/datadog/open_feature/transport/exposures.rbs:14
└── def initialize: (OpenFeature::Transport::Exposures::EncodedParcel, ?::Hash[::Symbol, untyped]?) -> void
sig/datadog/open_feature/transport/exposures.rbs:28
└── def send_exposures: (::Hash[::Symbol, untyped], ?headers: ::Hash[::Symbol, untyped]?) -> Core::Transport::Response

Untyped other declarations

This PR introduces 1 untyped other declaration and 4 partially typed other declarations. It increases the percentage of typed other declarations from 66.92% to 67.95% (+1.03%).

Untyped other declarations (+1-0)

❌ Introduced:

sig/datadog/open_feature/binding/resolution_details.rbs:5
└── attr_accessor value: untyped

Partially typed other declarations (+4-0)

❌ Introduced:

sig/datadog/open_feature/binding/resolution_details.rbs:15
└── attr_accessor flag_metadata: ::Hash[::String, untyped]?
sig/datadog/open_feature/binding/resolution_details.rbs:21
└── attr_accessor extra_logging: ::Hash[::String, untyped]?
sig/datadog/open_feature/exposures/models/event.rbs:10
└── @payload: ::Hash[::Symbol, untyped]
sig/datadog/open_feature/transport/exposures.rbs:12
└── attr_reader headers: ::Hash[::Symbol, untyped]

If you believe a method or an attribute is rightfully untyped or partially typed, you can add # untyped:accept to the end of the line to remove it from the stats.

lib/datadog/core/configuration/components.rb

sameerank · 2025-11-10T23:12:44Z

lib/datadog/open_feature/provider.rb

+          evaluation_context: evaluation_context
+        )
+
+        if result.error_code


Sorry I didn't have a quick answer earlier on whether all results with an error code should return the default_value. I'm still not sure if this works with libdatadog, but I wrote the code in #5022 so that it works with the Ruby evaluator by setting error_code: nil when there is a successful match in the UFC.

dd-trace-rb/lib/datadog/open_feature/binding/internal_evaluator.rb

Lines 147 to 163 in 8352517

def create_success_result(value, variant, allocation_key, variation_type, do_log, reason)
  ResolutionDetails.new(
    value: value,
    variant: variant,
    error_code: nil, # nil for successful cases (so "if result.error_code" works correctly)
    error_message: '', # Empty string for Ok cases (matches libdatadog FFI)
    reason: reason,
    allocation_key: allocation_key,
    do_log: do_log,
    flag_metadata: {
      "allocation_key" => allocation_key,
      "variation_type" => variation_type,
      "do_log" => do_log
    },
    extra_logging: {}
  )
end

I don't think this is blocking our ability to merge #5022 though because this interface is internal and we can adjust it as needed with no external impact.

I see two cases in libdatadog where there would be an error code Ok. For these two cases, this check works as intended. Since the error code is not null, we will return default_value.

EvaluationError::FlagDisabled => ErrorCode::Ok,
EvaluationError::DefaultAllocationNull => ErrorCode::Ok,

https://github.com/DataDog/libdatadog/blob/3d9e641016f3a6c05bfdb1786c4b1a84342d4bbf/datadog-ffe-ffi/src/assignment.rs#L417-L418

The part where I'm confused --> is the error code also "Ok" if the evaluation is successful? Specifically, I'm looking at

Ok(_) => ErrorCode::Ok, // Always returns ErrorCode::Ok for success?
Err(err) => ErrorCode::from(err),

https://github.com/DataDog/libdatadog/blob/3d9e641016f3a6c05bfdb1786c4b1a84342d4bbf/datadog-ffe-ffi/src/assignment.rs#L402-L409

If the error code is never nil, then this if result.error_code check won't work with libdatadog. I'd imagine we need to check if result.value is nil or if result.flag_metadata is an empty hash or check some other attribute to differentiate a disabled flag from a successful match

Maybe this is a question for @dd-oleksii: how do we differentiate a successful evaluation (with an allocation match) from the FlagDisabled/DefaultAllocationNull cases if both have a "Ok" error code?

The best option is to define the interface we want rather than have me guess, and to do it via

if result.error? ...

A simple error? method would solve the issue and could be replicated later in the C extension.

Based on the Rust code I've looked at, yes, in a few places it returns Ok when it actually should return Err. I'm not sure what the intent was, but it should not swallow errors; it should return things as they are, and it's our job to represent them to the customer properly.

Yeah a method or an attribute that gives a simple answer to that question could be part of the binding code eventually
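
A rough sketch of the error? idea from this thread. It is not the code from this PR or from #5022: the Struct below is a trimmed, hypothetical stand-in for ResolutionDetails, the resolve helper is invented for illustration, and the rule "nil error_code means success" is only the convention the Ruby evaluator uses above. A libdatadog-backed binding would presumably implement error? differently.

```ruby
# Hypothetical, trimmed-down stand-in for the binding's ResolutionDetails.
ResolutionDetailsSketch = Struct.new(:value, :error_code, :error_message, keyword_init: true) do
  # The provider asks this question instead of inspecting error_code itself,
  # so each binding is free to decide how it encodes success.
  def error?
    !error_code.nil?
  end
end

# Hypothetical provider-side helper: fall back to the default on any error.
def resolve(result, default_value)
  result.error? ? default_value : result.value
end

ok = ResolutionDetailsSketch.new(value: 'blue', error_code: nil, error_message: '')
failed = ResolutionDetailsSketch.new(value: nil, error_code: 'FLAG_NOT_FOUND', error_message: 'flag not found')
```

With this shape, the provider code keeps reading `if result.error?` no matter which binding produced the result.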

appraisal/ruby-3.4.rb

lib/datadog/open_feature/binding/evaluator.rb

lib/datadog/open_feature/exposures/models/event.rb

lib/datadog/open_feature/exposures/deduplicator.rb

y9v · 2025-11-11T10:48:06Z

lib/datadog/open_feature/binding/resolution_details.rb

+        :error_message,
+        :flag_metadata,
+        :allocation_key,
+        :do_log,


I think a method named log? would be more usual for Ruby

Yes, but that's an interface I would get from the binding 🤷🏼
To be honest I'm a bit desperate with the binding.

But I think that makes sense to do @sameerank
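
One way to keep the binding's do_log field while offering the more Ruby-like predicate is a thin wrapper method. This is a hypothetical sketch with a trimmed Struct, not the actual ResolutionDetails class:

```ruby
# Hypothetical, trimmed-down stand-in that keeps the binding's field name.
LoggableDetails = Struct.new(:value, :do_log, keyword_init: true) do
  # Predicate-style reader on top of the raw do_log attribute.
  def log?
    do_log == true
  end
end

logged = LoggableDetails.new(value: 'blue', do_log: true)
silent = LoggableDetails.new(value: 'blue', do_log: nil)
```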

y9v · 2025-11-11T10:49:56Z

lib/datadog/open_feature/exposures/reporter.rb

+
+          @worker.enqueue(event)
+        rescue => e
+          @logger.debug { "OpenFeature: Reporter failed to enqueue exposure: #{e.class}: #{e.message}" }


this log message could be a bit more user-friendly. How about: "OpenFeature: Failed to report evaluation result: ..."?

y9v · 2025-11-11T10:51:12Z

lib/datadog/open_feature/exposures/worker.rb

+        end
+
+        def enqueue(event)
+          return false if forked?


is this correct? It would prevent event reporting for all forks

This one I'm not sure about. I think I will allow it in forks and then see whether we need to restrict it. So far I did it this way to limit the things I have to care about.
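
If reporting is later allowed in forks, one option is to detect the fork by comparing PIDs and reset per-process state instead of rejecting the event. The class below is a hypothetical sketch: ForkAwareWorker and after_fork! are not names from this PR, and a real worker would also need to restart its background thread.

```ruby
# Hypothetical worker that resets its state in a forked child
# instead of refusing to enqueue events there.
class ForkAwareWorker
  def initialize
    @pid = Process.pid
    @events = []
  end

  # True when the current process is not the one that created the worker.
  def forked?
    Process.pid != @pid
  end

  def enqueue(event)
    after_fork! if forked?
    @events << event
    true
  end

  def size
    @events.size
  end

  private

  # Events buffered before the fork belong to the parent process,
  # so the child starts with an empty buffer.
  def after_fork!
    @pid = Process.pid
    @events = []
  end
end
```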

y9v · 2025-11-11T10:52:25Z

lib/datadog/open_feature/exposures/worker.rb

+
+        def flush
+          events, dropped = dequeue
+          send_events(events || [], dropped.to_i)


you can use Array(events) instead of events || []

and could dropped be anything other than an integer?

Yes, both could come as nil because of the call in start. I would wrap events with array 👍🏼
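
A small sketch of how the two nil cases behave. The normalize helper is hypothetical and exists only to show the conversions:

```ruby
# Hypothetical helper showing how nil inputs are normalized.
def normalize(events, dropped)
  # Array(nil) is [], and an existing array is returned unchanged.
  # nil.to_i is 0, so a missing dropped count becomes zero.
  [Array(events), dropped.to_i]
end

normalize(nil, nil)     # both values missing
normalize([:event], 2)  # both values present
```

One caveat: Array() converts a Hash into an array of key/value pairs, so it is only a safe replacement for `events || []` when events is nil or already an array.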

y9v · 2025-11-11T10:54:42Z

lib/datadog/open_feature/exposures/worker.rb

+        def send_payload(payload)
+          @flush_mutex.synchronize do
+            response = @transport.send_exposures(payload)
+            logger.debug { "OpenFeature: Send exposures response was not OK: #{response.inspect}" } unless response&.ok?


This log is visible for our customers - is "exposures" a term that we are using a lot in our public documentation and UI?

I don't think so, that's an internal terminology (not publicly mentioned, so no alternative). We probably could rephrase it as "unable to send evaluation details metric ..." (that's what we use in the interface)

lib/datadog/open_feature/transport/http/api.rb

y9v · 2025-11-11T10:57:14Z

lib/datadog/open_feature/transport/http/client.rb

+
+            yield(api, env)
+          rescue => e
+            message = "Internal error during #{self.class.name} request. Cause: #{e.class.name} #{e.message} " \


Do we need to include self.class.name here in the message, as we are already reporting the partial backtrace?

It's a copy-paste, I will try to rework the message 👍🏼

lib/datadog/open_feature/transport/http/exposures.rb

y9v · 2025-11-11T11:00:38Z

lib/datadog/open_feature/binding.rb

+  end
+end
+
+require_relative 'binding/evaluator'


could we move requires to the top of the file please?

y9v · 2025-11-11T11:01:34Z

lib/datadog/open_feature/component.rb

+        return unless settings.open_feature.enabled
+
+        unless settings.remote.enabled
+          logger.warn('OpenFeature: Could not be enabled without Remote Configuration Management available. To enable Remote Configuration, see https://docs.datadoghq.com/agent/remote_config')


How about:

"OpenFeature could not be enabled without Remote Configuration. To enable Remote Configuration, see https://docs.datadoghq.com/agent/remote_config"

y9v · 2025-11-11T11:06:31Z

lib/datadog/open_feature/evaluation_engine.rb

+        # In the example from the OpenFeature there is zero trust to the result of the evaluation
+        # do we want to go that way?
+
+        result = @evaluator.get_assignment(flag_key, evaluation_context, expected_type, Time.now.utc.to_i)


Do we need to call utc on Time.now, considering that we are converting it to int anyway?

This is a mock call and it will change in the other PR, so, we don't need it.

y9v · 2025-11-11T11:06:54Z

lib/datadog/open_feature/evaluation_engine.rb

+
+        result
+      rescue => e
+        @telemetry.report(e, description: 'OpenFeature: Failed to fetch value for flag')


Should we add the flag key to this message?

Nope, because it's private and we are not allowed to report it

P.S. A flag name might reveal business-sensitive information, so it is considered private.

y9v · 2025-11-11T11:07:44Z

lib/datadog/open_feature/exposures.rb

+require_relative 'exposures/buffer'
+require_relative 'exposures/worker'
+require_relative 'exposures/deduplicator'
+require_relative 'exposures/reporter'


Could we place those at the beginning of the file for consistency with our code base?

Actually, it's different in our codebase, but will do. What I don't like is that logically the required files will define the module, when the definition actually lives here. But I don't mind changing it.

lib/datadog/open_feature/provider.rb

y9v · 2025-11-11T11:10:53Z

lib/datadog/open_feature/remote.rb

+          data = content.data.read
+          content.data.rewind
+
+          raise ReadError, 'EOF reached' if data.nil?


do we want to raise here, or log a warning?

This is a part I have no answer for, but that's what all the components have in their remote 🤷🏼
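
A side note on the nil check itself. In Ruby, read without a length argument returns an empty string at end of file rather than nil, so a nil check alone does not catch empty content. The sketch below assumes content.data behaves like a StringIO, which is not confirmed in this thread; read_all and the ArgumentError are hypothetical.

```ruby
require 'stringio'

# Hypothetical helper: read everything, rewind, and reject empty content.
def read_all(io)
  data = io.read # "" at EOF when no length is given, never nil
  io.rewind
  raise ArgumentError, 'empty content' if data.nil? || data.empty?

  data
end

read_all(StringIO.new('{"flags":{}}'))
```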

Strech · 2025-11-11T13:12:19Z

lib/datadog/core/remote/client/capabilities.rb

              register_receivers(Datadog::DI::Remote.receivers(@telemetry))
            end

+            if settings.respond_to?(:open_feature) && settings.open_feature.enabled


Unfortunately, I have to keep this check because some tests are impossible to rewrite

Are they mocking the API? :( Thanks for trying anyway.

I was able to remove one check though, so a 50% success rate. The test is called loading_spec.rb; it checks the bare minimum configuration without any components loaded, and then loads them one by one.

ivoanjo

Left a few notes!

ivoanjo · 2025-11-11T14:33:45Z

lib/datadog/core/remote/client/capabilities.rb

              register_receivers(Datadog::DI::Remote.receivers(@telemetry))
            end

+            if settings.respond_to?(:open_feature) && settings.open_feature.enabled


Are they mocking the API? :( Thanks for trying anyway.

lib/datadog/open_feature/component.rb

ivoanjo · 2025-11-11T14:59:09Z

lib/datadog/open_feature/component.rb

+      def shutdown!
+        @worker&.flush
+        @worker&.stop(true)
+      end


Note that calling flush on shutdown will hold off stopping the customer's application. So... we should probably have a good timeout here (it's not clear to me that we do?)
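
One possible way to bound the shutdown flush, sketched with Thread#join and a timeout. The flush_with_timeout helper is hypothetical and not part of this PR; killing the helper thread on timeout is a simplification that a real transport may not tolerate.

```ruby
# Hypothetical helper: runs the flush on a helper thread and waits
# at most `timeout` seconds. Returns true if the flush finished in time.
def flush_with_timeout(timeout, &flush)
  thread = Thread.new do
    begin
      flush.call
    rescue StandardError
      nil # a failed flush must not raise during shutdown
    end
  end

  # Thread#join returns nil when the timeout expires first.
  finished = !thread.join(timeout).nil?
  thread.kill unless finished
  finished
end
```

In shutdown!, this would look like `flush_with_timeout(1) { @worker&.flush }`, with the timeout value left as an open question.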

ivoanjo · 2025-11-11T15:01:01Z

lib/datadog/open_feature/configuration/settings.rb

Minor: Consider maybe moving this one folder up? At least to me it seems a lot of boilerplate to have an empty configuration.rb + a whole extra folder just to support "enabled".

Alright, I will reduce the unnecessary files 👍🏼

ivoanjo · 2025-11-11T15:04:17Z

lib/datadog/open_feature/binding.rb

Minor: This is actually not needed? Consider moving the require slightly elsewhere and removing this; we already have so much complexity, I think if we avoid a few dummy files, that's a win! (See: This PR having 72 files!)

Same applies to other similar files -- I think we should avoid them as much as possible.

ivoanjo · 2025-11-11T15:07:49Z

lib/datadog/open_feature/exposures/batch_builder.rb

+        def build_context(settings)
+          env = extract_env(settings)
+          service = extract_service(settings)
+          version = extract_version(settings)
+
+          context = {}
+          context[:env] = env if env
+          context[:service] = service if service
+          context[:version] = version if version
+
+          context
+        end
+
+        def extract_env(settings)
+          return settings.env if settings.respond_to?(:env)
+          return settings.tags['env'] if settings.respond_to?(:tags)
+
+          nil
+        end
+
+        def extract_service(settings)
+          return settings.service if settings.respond_to?(:service)
+          return settings.tags['service'] if settings.respond_to?(:tags)
+
+          nil
+        end
+
+        def extract_version(settings)
+          return settings.version if settings.respond_to?(:version)
+          return settings.tags['version'] if settings.respond_to?(:tags)
+
+          nil
+        end


So in core/settings.rb we have a bunch of code to match the service/env/version with the tags. This seems to somewhat repeat that logic... I'm curious why it's needed?
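
If the settings object already resolves env, service and version from tags, as this comment suggests, the three extract_* helpers could collapse into a single hash. A hypothetical sketch, with a Struct standing in for the real settings:

```ruby
# Hypothetical stand-in for the settings object; the real settings are
# assumed to have already reconciled these values with the tags.
SettingsSketch = Struct.new(:env, :service, :version, keyword_init: true)

def build_context(settings)
  {
    env: settings.env,
    service: settings.service,
    version: settings.version
  }.compact # drops keys whose value is nil
end

build_context(SettingsSketch.new(env: 'prod', service: nil, version: '1.0'))
```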

ivoanjo · 2025-11-11T15:08:49Z

lib/datadog/open_feature/exposures/buffer.rb

+module Datadog
+  module OpenFeature
+    module Exposures
+      BufferBaseClass =
+        (Core::Environment::Ext::RUBY_ENGINE == 'ruby') ? Core::Buffer::CRuby : Core::Buffer::ThreadSafe
+
+      class Buffer < BufferBaseClass


Minor: From what I understood, this feature will require libdatadog and thus not work on JRuby. So... should we raise if we're not on CRuby instead?

I didn't know that libdatadog is not working on JRuby, noted 👍🏼

@ivoanjo Do you think I can just use CRuby, since we will not announce JRuby support for this feature?

Yes! Maybe keep the branch and set it to nil on JRuby? Or :unsupported? Just so that if we ever want to support JRuby, the code "reminds" us to update itself.

(Also on libdatadog + JRuby it indeed doesn't work at the moment and I have not seen any plans to change that)
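
The "keep the branch" suggestion could look roughly like this. The symbols stand in for the real buffer classes, and buffer_kind_for is a hypothetical helper used only to make the branch testable:

```ruby
# Hypothetical selector: the symbols stand in for the real buffer classes.
def buffer_kind_for(engine)
  case engine
  when 'ruby' then :cruby
  else :unsupported # reminder to revisit this if JRuby support is ever added
  end
end

# RUBY_ENGINE is the built-in constant naming the running interpreter.
BUFFER_KIND = buffer_kind_for(RUBY_ENGINE)
```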

ivoanjo · 2025-11-11T15:13:27Z

lib/datadog/open_feature/exposures/worker.rb

+        def enqueue(event)
+          buffer.push(event)
+
+          flush if buffer.length >= @buffer_limit


This flush is slightly... suspicious to me. Won't it run on the same thread as the caller, not on the background worker? Is that the intended semantics here?

Going to double check it, good point 👍🏼
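
To illustrate the concern: when enqueue calls flush directly, the flush runs on the caller's thread. A sketch of the alternative, where enqueue only hands the event to a queue and the background thread decides when to flush. BackgroundFlusher is a hypothetical class, not the worker from this PR, and it records batches instead of sending them.

```ruby
# Hypothetical worker: enqueue never flushes, the background thread does.
class BackgroundFlusher
  STOP = Object.new # sentinel telling the thread to finish

  def initialize(buffer_limit:)
    @buffer_limit = buffer_limit
    @queue = Queue.new
    @batches = Queue.new
    @thread = Thread.new { run }
  end

  # Cheap for the caller: only pushes onto a thread-safe queue.
  def enqueue(event)
    @queue << event
  end

  def stop
    @queue << STOP
    @thread.join
  end

  # Returns [thread, batch] pairs recorded by flush.
  def batches
    result = []
    result << @batches.pop until @batches.empty?
    result
  end

  private

  def run
    batch = []
    loop do
      event = @queue.pop
      break if event.equal?(STOP)

      batch << event
      next if batch.size < @buffer_limit

      flush(batch)
      batch = []
    end
    flush(batch) unless batch.empty?
  end

  # Stand-in for sending: remembers which thread did the flush.
  def flush(batch)
    @batches << [Thread.current, batch]
  end
end
```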

Co-authored-by: Ivo Anjo <[email protected]>

github-actions bot added core Involves Datadog core libraries appsec Application Security monitoring product labels Nov 5, 2025

Strech force-pushed the ffl-1319-add-agent-communication-for-openfeature branch from dbc899c to 0a25c37 Compare November 6, 2025 11:44

sameerank mentioned this pull request Nov 7, 2025

[FFL-1361] Evaluation in binding in ruby #5022

Draft

Strech force-pushed the ffl-1319-add-agent-communication-for-openfeature branch 3 times, most recently from b1a978f to 9e7efb0 Compare November 10, 2025 10:02

github-actions bot added integrations Involves tracing integrations profiling Involves Datadog profiling tracing labels Nov 10, 2025

Strech changed the base branch from add-openfeature-component to master November 10, 2025 12:42

Strech added 15 commits November 10, 2025 15:52

Add OpenFeature component

3b09d45

Fir require of the OpenFeature component

71f8a44

Add skeleton for evaluation API

a4ee9f9

Add guard-clause to the Provider and Evaluator

c37d46a

Add comment regarding GC

c403b37

Update settings and component interface impl.

ca55e31

Add RBS files for OpenFeature component

f1e9e65

Fix typing for Remote module

fa7e1f9

Add openfeature-sdk stub and fix component typing

2f6d276

Fix settings RBS

388c32b

Wrap FFI code into a separate module

7e21f87

Add tests for remote config

f6de27e

Add component tests

5c74657

Add new environment configuration definition

d3a464e

Add component specs

2de1cec

Strech added 6 commits November 10, 2025 15:56

Fix tests after rebase

dc5425b

Adjust code to be thread-safe

c77ba19

Update RBS definitions

c9e09b9

Address PR feedback

5f4a6c2

Change RC and reconfiguration flow

87393dd

Fix standardrb complaints

2221c34

Strech force-pushed the ffl-1319-add-agent-communication-for-openfeature branch from d7424f9 to 2221c34 Compare November 10, 2025 15:04

sameerank reviewed Nov 10, 2025

lib/datadog/core/configuration/components.rb Outdated

sameerank reviewed Nov 11, 2025

Strech added 5 commits November 11, 2025 10:22

Fix duplication issue in components

2d964ac

Fix loading specs

6db64b0

Remove stale types

dc4c235

Add additional test case for RC being disabled

783d2a6

Add missing RBS definitions

01ec1bb

Strech marked this pull request as ready for review November 11, 2025 10:09

Strech requested a review from a team as a code owner November 11, 2025 10:09

Strech added the openfeature A new component that provider an ability to configure feature flags label Nov 11, 2025

y9v reviewed Nov 11, 2025

lib/datadog/open_feature/transport/http/api.rb

y9v reviewed Nov 11, 2025

Strech commented Nov 11, 2025

y9v approved these changes Nov 11, 2025

Apply suggested changes from the review

64b1636

ivoanjo removed the profiling Involves Datadog profiling label Nov 11, 2025

y9v removed the appsec Application Security monitoring product label Nov 11, 2025

ivoanjo reviewed Nov 11, 2025

Strech and others added 3 commits November 11, 2025 16:29

Update lib/datadog/open_feature/component.rb

760f2c5

Co-authored-by: Ivo Anjo <[email protected]>

Reduce boilerplate and unnecessary files

77bdd8d

Fix worker flushing on demand and shutdown

4fde21f
