Skip to content

Conversation

@SimonHeybrock
Copy link
Member

Summary

  • Add logs: StreamLUT parameter to StreamMapping for mapping Kafka f144 source names to internal names (consistent with detectors/monitors)
  • Extend nexus_helpers.py with units extraction and code generation for f144_log_streams
  • Add Bifrost f144 stream mappings for all motion and sample environment streams (24 streams total)

StreamMapping changes

Previously, f144 log data used Kafka source names directly (e.g., BIFRO-DtCar:MC-RotZ-01:Mtr.RBV). Now supports mapping to friendlier internal names (e.g., detector_rotation).

nexus_helpers.py enhancements

New CLI tool for extracting f144 stream info from NeXus files:

# List all f144 streams
python -m ess.livedata.nexus_helpers file.hdf --f144

# Generate code for a topic
python -m ess.livedata.nexus_helpers file.hdf --generate --topic bifrost_motion

Bifrost configuration

Combined f144_log_streams dict keeps source names, internal names, units, and topics co-located:

f144_log_streams = {
    'detector_rotation': {
        'source': 'BIFRO-DtCar:MC-RotZ-01:Mtr.RBV',
        'units': 'deg',
        'topic': 'bifrost_motion',
    },
    ...
}

Both f144_attribute_registry and the StreamLUT are derived from this single source of truth.

Test plan

  • All existing tests pass
  • New tests for StreamMapping.log_topics property
  • New tests for suggest_internal_name, filter_f144_streams, generate_f144_log_streams_code

🤖 Generated with Claude Code

SimonHeybrock and others added 2 commits December 8, 2025 09:35
Previously, f144 log data used Kafka source names directly as stream names
(e.g., 'BIFRO-DtCar:MC-RotZ-01:Mtr.RBV'). This change adds support for
mapping these source names to friendlier internal names (e.g., 'detector_rotation'),
consistent with how detectors and monitors are handled.

Changes:
- Replace `log_topics` parameter with `logs: StreamLUT` in StreamMapping
- Add `_make_dev_logs` helper in _ess.py for dev mode
- Update `make_dev_stream_mapping` to accept `log_names` parameter
- Add Bifrost log mappings for detector_rotation and sample_rotation
- Update dummy instrument to include log mappings for timeseries tests
- Add tests for StreamMapping.log_topics property

Original prompt: I am trying to update bifrost/specs.py to use the real values
in f144_attribute. The real source_name we need is something found in the file
(extracted info) - for example, BIFRO-DtCar:MC-RotZ-01:Mtr.RBV. Does this
support also name mapping for NXlog (f144)? OR is the f144_attribute_registry
setup insufficient? Or is it independent, applied after stream-mapping?

Follow-up: I think we should take option B. Please implement this in a new branch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Add tooling to help generate f144_log_streams configuration from NeXus files:

- Add `units` field to StreamInfo dataclass, extracted from 'value' datasets
- Add `suggest_internal_name()` to derive internal names from group paths
- Add `filter_f144_streams()` to filter by writer module, topic, and patterns
- Add `generate_f144_log_streams_code()` to generate Python dict code
- Add CLI options: --f144, --topic, --exclude, --generate, --var-name

Refactor bifrost configuration to use combined f144_log_streams dict:
- Define `f144_log_streams` mapping internal name -> {source, units}
- Derive `f144_attribute_registry` from `f144_log_streams`
- Derive StreamLUT in `_make_bifrost_logs()` from `f144_log_streams`

This keeps source names, internal names, and units co-located in one place.

Example usage:
  python -m ess.livedata.nexus_helpers file.hdf --generate --topic bifrost_motion

Original prompt: Can you extend the nexus_helpers.py script to also extract
and list units? Or, going beyond that, it would be good if we could turn this
into a better tool to generate code for attr registry and log_source_mapping.
Is having the two in two parts convenient?

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for mapping Kafka f144 log stream source names (e.g., EPICS PV names) to internal stream names in ESSlivedata, making the system consistent with how detectors and monitors are already handled.

Key changes:

  • Modified StreamMapping to accept a logs: StreamLUT parameter instead of log_topics: set, with log_topics becoming a computed property
  • Enhanced nexus_helpers.py with units extraction from NeXus files and CLI tools for generating f144 stream configuration code
  • Added comprehensive Bifrost f144 log stream mappings (24 streams total) for motion and sample environment data

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/ess/livedata/kafka/stream_mapping.py Replaced log_topics parameter with logs: StreamLUT, making log_topics a computed property consistent with other stream types
src/ess/livedata/nexus_helpers.py Added units extraction to StreamInfo, new helper functions (suggest_internal_name, filter_f144_streams, generate_f144_log_streams_code), and enhanced CLI with argparse for code generation
src/ess/livedata/config/instruments/bifrost/specs.py Added comprehensive f144_log_streams dictionary with 24 entries for motion and sample environment streams, deriving f144_attribute_registry from it
src/ess/livedata/config/instruments/bifrost/streams.py Added _make_bifrost_logs() function to create StreamLUT from f144_log_streams, integrated logs into both DEV and PROD configurations
src/ess/livedata/config/instruments/dummy/streams.py Updated DEV configuration to include log_names parameter using f144_attribute_registry keys
src/ess/livedata/config/instruments/_ess.py Added _make_dev_logs() helper function and updated make_dev_stream_mapping() to accept log_names parameter
tests/config/streams_test.py Added tests for StreamMapping.log_topics property behavior with and without logs
tests/nexus_helpers_test.py Added comprehensive tests for new functions including suggest_internal_name, filter_f144_streams, and generate_f144_log_streams_code
Comments suppressed due to low confidence (1)

src/ess/livedata/config/instruments/dummy/streams.py:44

  • The dummy instrument PROD configuration should include a logs parameter in the StreamMapping to be consistent with the DEV configuration and the pattern used in Bifrost. The dummy instrument has f144_attribute_registry defined in specs.py with 'motion1', and the DEV configuration uses log_names=list(instrument.f144_attribute_registry.keys()), but the PROD configuration (lines 40-44) doesn't include a logs parameter. Consider adding a _make_dummy_logs() function similar to _make_bifrost_logs() and including it in the PROD StreamMapping.
    StreamingEnv.PROD: StreamMapping(
        **make_common_stream_mapping_inputs(instrument='dummy'),
        detectors=_make_dummy_detectors(),
        area_detectors=_make_dummy_area_detectors(),
    ),

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if i > 0:
name = parts[i - 1]
# Remove common suffixes like '_r0', '_01' etc.
import re
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import re statement should be moved to the top of the file with other imports rather than being imported inside the function. This follows Python best practices and improves code readability. The same issue exists in filter_f144_streams (line 162).

Copilot uses AI. Check for mistakes.
exclude_patterns:
List of regex patterns to exclude from results (matched against group_path).
"""
import re
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The import re statement should be moved to the top of the file with other imports rather than being imported inside the function. This follows Python best practices and improves code readability.

Copilot uses AI. Check for mistakes.
Comment on lines 187 to 188
The generated dictionary maps internal names to source and units info,
which can be used to derive both f144_attribute_registry and StreamLUT.
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring should mention that the generated dictionary includes 'topic' in addition to 'source' and 'units'. Currently it says "maps internal names to source and units info" but should say "maps internal names to source, units, and topic info" to accurately reflect the intended output structure.

Copilot uses AI. Check for mistakes.
info = by_name[name]
units = info.units or 'dimensionless'
lines.append(
f" '{name}': {{'source': '{info.source}', 'units': '{units}'}},"
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The generate_f144_log_streams_code function does not include the 'topic' key in the generated dictionary entries. However, the actual f144_log_streams in bifrost/specs.py includes 'topic', and _make_bifrost_logs() in bifrost/streams.py (line 72) expects it to be present. The generated code should include 'topic': '{topic}' in each dictionary entry to match the actual usage pattern.

Suggested change
f" '{name}': {{'source': '{info.source}', 'units': '{units}'}},"
f" '{name}': {{'source': '{info.source}', 'units': '{units}', 'topic': '{topic}'}},"

Copilot uses AI. Check for mistakes.
]
code = generate_f144_log_streams_code(infos, topic='motion')
assert "f144_log_streams: dict[str, dict[str, str]] = {" in code
assert "'motor': {'source': 'MOTOR:PV:RBV', 'units': 'degrees'}" in code
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test expectation should be updated to include the 'topic' key in the generated code output. The actual f144_log_streams dictionary used in production (e.g., bifrost/specs.py) includes the 'topic' key, and the _make_bifrost_logs() function expects it. The assertion should check for something like "'motor': {'source': 'MOTOR:PV:RBV', 'units': 'degrees', 'topic': 'motion'}" instead.

Suggested change
assert "'motor': {'source': 'MOTOR:PV:RBV', 'units': 'degrees'}" in code
assert "'motor': {'source': 'MOTOR:PV:RBV', 'units': 'degrees', 'topic': 'motion'}" in code

Copilot uses AI. Check for mistakes.
- Move `import re` to module-level imports instead of inside functions
- Update docstring to mention that generated dict includes topic info
- Include 'topic' key in generated f144_log_streams dictionary entries
- Update test assertions to match new output format with topic

Addresses review comments from PR #589.

Original prompt: Please use a new worktree to address comments in a #589. Commit and push, then cleanup worktree.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
},
# Sample environment streams (topic: bifrost_sample_env)
'heater_1': {
'source': 'YMIR-SEE:SE-LS336-004:HTR1',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange to see Ymir stuff in here? (or maybe not?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comes from the latest coda file.

# Combined f144 log stream configuration.
# Maps internal name -> {source: Kafka source name, units: unit string, topic: topic}
# Generated using: python -m ess.livedata.nexus_helpers <file> --generate --topic <t>
f144_log_streams: dict[str, dict[str, str]] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a mapping only applicable to Bifrost, or will other instruments follow later?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, others will follow later, but every instrument will have its own unique mapping (or maybe with some overlap, for shared equipment).

if i > 0:
name = parts[i - 1]
# Remove common suffixes like '_r0', '_01' etc.
name = re.sub(r'_[rt]\d+$', '', name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So above, in the f144_log_streams mapping, there are entries like temperature_1, temperature_2.
Can removing the suffixes lead to duplicate suggested names?
Or did I not follow the logic correctly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it could, yes. But probably only if ECDC decided to store logs with identical names in different NeXus groups.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say that's not impossible? Something like 'temperature' is quite a common name?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I think you are right, the main issue is actually transformation chains, which those suffixes represent!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, I forgot about transformation chains!

"Usage: python -m ess.livedata.nexus_helpers <nexus_file.hdf>\n"
)
sys.exit(1)
parser = argparse.ArgumentParser(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Suffixes like '_r0', '_t0' etc. represent transformation chains in NeXus,
not arbitrary naming conventions. These should be preserved in the suggested
internal names.

Changes:
- Remove regex-based suffix stripping from suggest_internal_name()
- Update docstring to remove misleading example about suffix removal
- Rename tests test_removes_r0_suffix → test_preserves_r0_suffix
- Rename tests test_removes_t0_suffix → test_preserves_t0_suffix
- Update test expectations to preserve suffixes

Note: The bifrost f144_log_streams configuration was generated with the old
code that stripped suffixes. However, examining the current stream names
(attenuator_1, temperature_0, etc.), none appear to match the pattern
'_[rt]\d+$' that was being stripped, so no regeneration is needed at this
time. If the original NeXus file contained streams with '_r0' or '_t0'
suffixes, they would need to be regenerated to ensure accuracy.

Original prompt: Please use a worktree and address the suffix-removal problem
discussed in #589 (do not strip _r0 and the like, since it refers to "rotation 0",
in fact do not strip any suffixes). Do we need to rerun the scripts that were used
to generate the dict of names for bifrost?
@SimonHeybrock
Copy link
Member Author

@nvaytet I think I addressed the comments. We need this now for deployment so I will merge.

@SimonHeybrock SimonHeybrock merged commit 403db19 into main Dec 15, 2025
4 checks passed
@SimonHeybrock SimonHeybrock deleted the add-log-stream-mapping branch December 15, 2025 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants