Set unique flow id & timestamp to each flow #343

sananguliyev · 2025-05-27T08:45:08Z

In the current build of the traced stream, tracing events are grouped only by section. This makes it difficult to link events together and understand a message's complete journey through the stream flow. The new implementation addresses this by:

Assigning a unique flow ID to each message journey
Adding a timestamp to each tracing event to provide a clearer order of the steps the message went through
Introducing EventsByFlowID, which organizes events by flow and arranges them in chronological order based on timestamp.
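
The grouping and ordering described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the Event type, the GroupByFlowID and sampleEvents names, and the event type strings are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Event is a hypothetical, trimmed-down stand-in for a tracing event.
type Event struct {
	Type      string
	FlowID    string
	Timestamp time.Time
}

// GroupByFlowID buckets events by flow ID and orders each bucket by timestamp.
func GroupByFlowID(events []Event) map[string][]Event {
	byFlow := make(map[string][]Event)
	for _, ev := range events {
		byFlow[ev.FlowID] = append(byFlow[ev.FlowID], ev)
	}
	for _, evs := range byFlow {
		// A stable sort keeps insertion order for events with equal timestamps.
		sort.SliceStable(evs, func(i, j int) bool {
			return evs[i].Timestamp.Before(evs[j].Timestamp)
		})
	}
	return byFlow
}

// sampleEvents returns out-of-order events belonging to two flows.
func sampleEvents() []Event {
	t0 := time.Date(2025, 5, 27, 8, 0, 0, 0, time.UTC)
	return []Event{
		{Type: "CONSUME", FlowID: "a", Timestamp: t0.Add(2 * time.Second)},
		{Type: "PRODUCE", FlowID: "b", Timestamp: t0.Add(1 * time.Second)},
		{Type: "PRODUCE", FlowID: "a", Timestamp: t0},
	}
}

func main() {
	for id, evs := range GroupByFlowID(sampleEvents()) {
		for _, ev := range evs {
			fmt.Println(id, ev.Type, ev.Timestamp.Format(time.RFC3339))
		}
	}
}
```

With this shape, a consumer of the tracing summary can look up one flow ID and read that message's events in the order they happened.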

Additionally, if no tags exist, git describe --tags fails. The error message is now suppressed with 2>/dev/null and the version falls back to "v0.0.0". Currently, the Makefile prints fatal: No names found, cannot describe anything. when it tries to generate the version.

Fixes #344

sananguliyev · 2025-05-30T12:53:45Z

@jem-davies @gregfurman
Could you please take a look when you have time?

gregfurman · 2025-06-01T19:02:11Z

@sananguliyev Hey! Thanks for the contribution and apologies for the delay. Will take a look this eve 😄

gregfurman

Some questions relating to the metadata approach. Think better internal tracing is a brilliant idea so I'm keen to hear your thoughts on my comments 😄

gregfurman · 2025-06-01T23:23:53Z

internal/bundle/tracing/events.go

+	}
+
+	// Try to use OpenTelemetry trace ID if available
+	if traceID := tracing.GetTraceID(part); traceID != "" && traceID != "00000000000000000000000000000000" {


Can we not rely solely on the TraceID as the FlowID? Or are you trying to account for the case where a span wasn't initialised (I'm no expert in OTEL so pardon my ignorance)?

The purpose here is to reuse an ID if one already exists. If the user has enabled tracing, the message already has its own ID, and I wanted to use it instead of generating a new one. It also makes it easier to link traces and internal events if someone wants to use the flow ID for other purposes.

@gregfurman I'm not sure whether you agree to keep it like this.

Yeah I suppose this makes sense. Do you imagine a situation where the flowID and traceID should ever differ?

Also, perhaps we set 00000000000000000000000000000000 to a const so it's not a magic string?

I do not think they should ever differ if tracing is enabled. The whole point here is to always have some ID, rather than an optional one like the trace ID. Also, even if there is a trace ID, I do not see a problem with having different IDs.
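
The suggestion to replace the magic string with a constant could look something like the sketch below. The names zeroTraceID and flowIDFromTrace are hypothetical and chosen only for illustration; the check itself mirrors the condition in the diff above.

```go
package main

import "fmt"

// zeroTraceID is the all-zero value the diff compares the trace ID against.
const zeroTraceID = "00000000000000000000000000000000"

// flowIDFromTrace (a hypothetical helper) reports whether a trace ID is
// usable as a flow ID, returning it unchanged when it is.
func flowIDFromTrace(traceID string) (string, bool) {
	if traceID == "" || traceID == zeroTraceID {
		return "", false
	}
	return traceID, true
}

func main() {
	for _, id := range []string{"", zeroTraceID, "4bf92f3577b34da6a3ce929d0e0e4736"} {
		flowID, ok := flowIDFromTrace(id)
		fmt.Printf("%q -> %q (usable: %v)\n", id, flowID, ok)
	}
}
```

Naming the constant also gives the two call sites that currently repeat the literal a single definition to share.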

gregfurman · 2025-06-01T23:23:55Z

internal/bundle/tracing/events.go

+	if flowID, exists := part.MetaGetMut("_bento_flow_id"); exists {
+		if flowIDStr, ok := flowID.(string); ok && flowIDStr != "" {
+			return flowIDStr
+		}
+	}


I'm not sure storing this in the message metadata is the best approach. For example, there are instances where metadata is dumped to a JSON body, sent as headers, and sometimes even fully reset/cleared. Also, the fact that metadata is mutable means a user could hypothetically overwrite this data and lose it.

If we are going to be storing this info, what're your thoughts on perhaps rather using the message context as a store (similar to how OTEL does it)?

i.e.

// Store with an internal struct as a key to avoid collisions/overwrites.
type flowTraceKey struct{}

func WithFlowID(ctx context.Context, flowID string) context.Context {
	return context.WithValue(ctx, flowTraceKey{}, flowID)
}

func SetFlowID(p *message.Part) context.Context {
	ctx := message.GetContext(p)
	// V7 UUIDs are sortable since they're time ordered,
	// hence no real need for timestamps as well
	// (see https://pkg.go.dev/github.com/google/uuid#NewV7)
	flowID, _ := uuid.NewV7()
	return WithFlowID(ctx, flowID.String())
}

func GetFlowID(ctx context.Context) string {
	if id, ok := ctx.Value(flowTraceKey{}).(string); ok {
		return id
	}
	return ""
}

TBH, initially I did not plan to add it as a bloblang method and wanted to make it accessible only from the tracing summary API, but then decided to add it so that it is accessible from bloblang, too. I did not investigate much before adding the bloblang method. Your suggestion sounds good, but I first need to check it.

Not exposing this as a bloblang method is also an option, and I can remove this method implementation.

I'd prefer the context based approach. Else, the more we can rely on open telemetry tracing context the better!
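
To illustrate why the context-based store avoids the overwrite problem raised for metadata, here is a small standalone sketch using only the standard library. The names flowIDKey, metadataKey, FlowIDFrom, and lookupAfterStringKeyWrite are assumptions for the example and are not taken from the PR.

```go
package main

import (
	"context"
	"fmt"
)

// flowIDKey is an unexported struct type, so no other package can build
// an equal key and overwrite or read the value by accident.
type flowIDKey struct{}

// metadataKey stands in for a plain string key such as a metadata name.
type metadataKey string

// WithFlowID returns a child context carrying flowID.
func WithFlowID(ctx context.Context, flowID string) context.Context {
	return context.WithValue(ctx, flowIDKey{}, flowID)
}

// FlowIDFrom returns the stored flow ID and whether one was set.
func FlowIDFrom(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(flowIDKey{}).(string)
	return id, ok && id != ""
}

// lookupAfterStringKeyWrite stores a flow ID, then writes a value under a
// string-based key, and finally reads the flow ID back.
func lookupAfterStringKeyWrite() (string, bool) {
	ctx := WithFlowID(context.Background(), "flow-123")
	ctx = context.WithValue(ctx, metadataKey("_bento_flow_id"), "overwritten")
	return FlowIDFrom(ctx)
}

func main() {
	id, ok := lookupAfterStringKeyWrite()
	fmt.Println(id, ok) // prints "flow-123 true"
}
```

Because context keys are compared by both type and value, the write under the string-based key leaves the struct-keyed flow ID untouched.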

Hey @sananguliyev 👋 Do you have capacity to finish up this PR? I'm happy to take it over otherwise 😄 Lmk!

Hey @gregfurman, I might have some time next week. I will try to apply what you suggested by next weekend. If you want to release in the next couple of days then feel free to take it over :)

No pressure 🙂 Let me move it back into draft then

I found some time today and fixed the issue you pointed out here. You're absolutely right that metadata is mutable and users can mess with it, which should not be possible since a flow ID is immutable by nature. So please take another look. And feel free to make changes here if you see something small, in order to move faster.

@gregfurman do you have capacity to review this? I am planning to make another contribution to enable CDC source, but first want to finish this and then contribute the component.

Hey! Sorry I must've missed the notification since I placed the PR into draft 🤦

I'll give the PR another look this evening 🙂

gregfurman

Thanks for the follow ups on this!

I'm wondering whether we ever want to set the flowID values outside of the traceInput (i.e. when a message part is first created), although I suppose there's the case where we want to create child-spans.

bento/internal/bundle/tracing/input.go

Lines 52 to 57 in 852a365

    
if t.e.IsEnabled() {
	_ = tran.Payload.Iter(func(i int, part *message.Part) error {
		_ = atomic.AddUint64(t.ctr, 1)
		t.e.Add(EventProduceOf(part))
		return nil
	})

Let me know!

gregfurman · 2025-10-30T20:08:12Z

internal/bundle/tracing/processor.go

-		// TODO: Find a better way of locating deletes (using batch index tracking).
-		t.e.Add(EventDeleteOf())


Nice! Should we deprecate EventDeleteOf() in favour of this EventDeleteOfPart function?

gregfurman · 2025-10-30T20:09:27Z

internal/impl/pure/processor_workflow_test.go


 	assert.Equal(t, `{"content":"waddup","id":"HELLO WORLD","meta":{"workflow":{"succeeded":["fooproc"]}}}`, outValue)
+
+	// Normalize events for testing by removing FlowID, Timestamp, and _bento_flow_id metadata


Are we still using this _bento_flow_id approach?

Yes, we need some cleanup.

I thought we were using the struct{} approach instead of the string for specifying the key?

gregfurman · 2025-10-30T20:31:14Z

internal/bloblang/query/functions.go

+			`meta flow_id = flow_id()`,
+		),
+	).Experimental(),
+	func(fCtx FunctionContext) (any, error) {


Does it make sense to be setting the flowID from within this bloblang function?

Should it not rather just retrieve the flowID/traceID?

Since we have this, I thought it made sense to expose the value in case users want to add it to logs or use it for other purposes, since it's the same as the trace ID, which we can already access.

I think this function should change to only retrieve the flowID, and not set it if not found. Wdyt?

The main idea here is that it should always be available. Regardless of where it's first called, it should either return the existing ID or create a new one and set it on the ctx. E.g. we have steps a and b. It's always created at a, and when b reads it, it gets the already-created one. If for whatever reason we add one more step that runs before a, then that step creates the flow ID and a just gets what is there instead of creating it.

This should always be created anyway if tracing is enabled since the input component is wrapped in a tracedInput component.

Imagine this case:

input:
  resource: A
pipeline:
  processors:
    - resource: B
    - mapping: "meta flow_id = flow_id()" # <-- set and get here
output:
  resource: C

If not set at the input layer, setting a flow_id at the mapping step does not help us at all with knowing the flow of a message through a pipeline -- only that it was sent on to a resource C

Since all messages (in a normal pipeline) are initially created at the input layer, EventProduce (which is at the input layer) means there should be an attached flowID -- making the create call redundant.
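
The "create once at the input layer, read-only everywhere else" semantics argued for here can be sketched as below. This is an illustration under assumptions, not the PR's code: atInput, flowID, newFlowID, and pipeline are hypothetical names, and a random hex string stands in for whatever ID the real implementation generates.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

type flowIDKey struct{}

// newFlowID returns a random 128-bit hex string (a stand-in for a real ID).
func newFlowID() string {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b[:])
}

// atInput is the only place where an ID is created.
func atInput(ctx context.Context) context.Context {
	return context.WithValue(ctx, flowIDKey{}, newFlowID())
}

// flowID is read-only: later steps never mint a new ID.
func flowID(ctx context.Context) (string, bool) {
	id, ok := ctx.Value(flowIDKey{}).(string)
	return id, ok
}

// pipeline simulates input -> processor -> mapping and returns the ID that
// each of the two later steps reads.
func pipeline() (processorID, mappingID string) {
	ctx := atInput(context.Background())
	processorID, _ = flowID(ctx)
	mappingID, _ = flowID(ctx)
	return processorID, mappingID
}

func main() {
	p, m := pipeline()
	fmt.Println(p == m) // prints "true": both steps see the input's ID

	_, ok := flowID(context.Background())
	fmt.Println(ok) // prints "false": no input ran, so there is nothing to read
}
```

In this model a missing ID at a later step is visible to the caller rather than being papered over by a freshly minted value that no earlier event shares.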

bento/internal/bundle/tracing/events.go

Lines 42 to 48 in b90d55a

return NodeEvent{
	Type:      EventProduce,
	Content:   string(part.AsBytes()),
	Meta:      meta,
	FlowID:    getOrCreateFlowID(part),
	Timestamp: time.Now(),
}

Does this make sense? Or perhaps I'm not conceptually understanding the inclusion of flowID and when we expect it to be called.

gregfurman · 2025-10-30T20:32:50Z

internal/bloblang/query/functions.go

+			return flowID, nil
+		}
+
+		// Try to use OpenTelemetry trace ID if available


nit: Can we remove some of these comments? Also this looks largely identical to the getOrCreateFlowID

We definitely need some general cleanup.

gregfurman · 2025-10-30T20:36:36Z

internal/bundle/tracing/events.go

+	}
+
+	// Try to use OpenTelemetry trace ID if available
+	if traceID := tracing.GetTraceID(part); traceID != "" && traceID != "00000000000000000000000000000000" {


Yeah I suppose this makes sense. Do you imagine a situation where the flowID and traceID should ever differ?

Also, perhaps we set 00000000000000000000000000000000 to a const so it's not a magic string?

gregfurman · 2025-10-30T20:38:16Z

public/service/tracing_test.go

+						}
+						meta := make(map[string]any)
+						for mk, mv := range ev.Meta {
+							if mk != "_bento_flow_id" {


Already mentioned. Do we still want this metadata?

sananguliyev added 2 commits May 27, 2025 10:18

Set unique id & timestamp to each flow

17b45cf

Simplify sorting

744227b

sananguliyev requested review from gregfurman and jem-davies as code owners May 27, 2025 08:45

Generated docs for flow id

b50515b

sananguliyev mentioned this pull request May 27, 2025

Assign unique flow ID to each message #344

Open

Fix tests by removing _bento_flow_id from event meta data

f2e9295

sananguliyev changed the title ~~Set unique id & timestamp to each flow~~ Set unique flow id & timestamp to each flow May 27, 2025

sananguliyev added 3 commits May 27, 2025 22:15

Fix rest of the failing tests due to new metadata

cef67c8

suppress error messages if no git tags exists

cbe4bc7

Cleaned-up unnecessary comments

09903b5

gregfurman reviewed Jun 1, 2025

gregfurman marked this pull request as draft October 18, 2025 17:13

Refactor flow ID handling to use context instead of metadata

b90d55a

sananguliyev marked this pull request as ready for review October 18, 2025 22:08

gregfurman reviewed Oct 30, 2025
