Describe the new feature or enhancement
Nihon Kohden EEG files contain comments/annotations with textual and image content, but currently this information is not accessible when reading the data with MNE.
I previously raised this question on the MNE Discourse forum, where the limitation was discussed:
https://mne.discourse.group/t/how-to-read-the-nihon-kohden-eeg-files-comments-content-of-the-comment/11680
At the moment, MNE can detect the presence/timing of comments but does not expose the actual text content of those comments to the user.
Why this matters
The comment text in Nihon Kohden recordings often contains clinically and experimentally relevant metadata, such as:
- Seizure/event types and full descriptions added by experts
- Manual annotations added by nurses during the recording
Losing this information during import makes downstream analysis incomplete, especially for machine learning use cases, and forces users to rely on external vendor software.
Current behavior
Given code like this:
import mne
print(mne.__version__)
raw = mne.io.read_raw_nihon("FJ00231Z.EEG", preload=False) # OR preload=True
for ann in raw.annotations:
    print(ann)

This will output the annotations as expected, but it does not expose the comments' content. Here is a sample output of the code:
1.11.0 # MNE version
Loading FJ00231Z.EEG
Found 21E file, reading channel names.
Reading header from Path\To\File\EEG2100\FJ00231Z.EEG # EEG 2100 Device
Found PNT file, reading metadata.
Found LOG file, reading events.
OrderedDict({'onset': np.float64(11768.0), 'duration': np.float64(0.0), 'description': np.str_('eye close'), 'orig_time': datetime.datetime(2025, 12, 27, 9, 49, 31, tzinfo=datetime.timezone.utc), 'extras': {}})
OrderedDict({'onset': np.float64(13307.568), 'duration': np.float64(0.0), 'description': np.str_('P_COMMENT'), 'orig_time': datetime.datetime(2025, 12, 27, 9, 49, 31, tzinfo=datetime.timezone.utc), 'extras': {}})

The annotations with 'description': np.str_('P_COMMENT') contain comments such as the one in the attached screenshot, and their content is not currently readable:
- Nihon Kohden EEG files can be read
- Annotations can be read
- Comment timing is available
- Comments are all shown as P_COMMENT and their text/content is not accessible
Expected behavior
At least the comment text should be parsed and exposed, ideally as:
- Annotations.description, or
- the extras field of the annotation, ideally as a dict
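Purely as an illustration of the desired user-facing behavior (not the current MNE API), a minimal sketch; the extras key name comment_text is a hypothetical choice, not an existing field:

import mne

raw = mne.io.read_raw_nihon("FJ00231Z.EEG", preload=False)
for ann in raw.annotations:
    if ann["description"] == "P_COMMENT":
        # Hypothetical: the parsed comment text could be exposed in the
        # per-annotation extras dict under a key such as "comment_text".
        print(ann["onset"], ann["extras"].get("comment_text"))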
Describe your proposed implementation
Basic code snippet
Here is the code I currently use to read the comments from a given .CMT file. It should eventually be integrated into mne.io.read_raw_nihon, but since the code is not complete, that can wait.
import re
import string
from dataclasses import dataclass


@dataclass
class Comment:
    timestamp: int
    text: str


TS_RE = re.compile(rb"(\d{20})")  # the timing of each annotation is 20 digits long
PRINTABLE = set(bytes(string.printable, "ascii"))  # the .CMT file contains many NULL and control characters


def clean_bytes(b: bytes) -> str:
    # Keep printable ASCII, including space and newline.
    # NOT TESTED with comments that include image links.
    cleaned_bytes = bytes(c if c in PRINTABLE else ord(" ") for c in b)
    cleaned_str = cleaned_bytes.decode("ascii", errors="ignore").strip()
    # At least 10 control characters are included before the actual text.
    cleaned_str = cleaned_str[10:].lstrip().lstrip("\t\n\x0b\r\x0c")
    return cleaned_str


def parse_cmt(path: str):
    data = open(path, "rb").read()
    matches = list(TS_RE.finditer(data))
    records = []
    for i, m in enumerate(matches):
        ts = int(m.group(1).decode("ascii"))
        start = m.end()
        end = matches[i + 1].start() if i + 1 < len(matches) else len(data)
        raw_text = data[start:end]
        text = clean_bytes(raw_text)
        if text:
            records.append(Comment(timestamp=ts, text=text))
    return records


records = parse_cmt("ROOT_PATH/NKT/EEG2100/FJ00231Z.CMT")
for r in records:
    print("📝")
    print(r.timestamp)
    print(r.text)

Limitations:
- Tested only on EEG 2100 devices
- It is just a workaround to start from
- Nihon Kohden comments can contain color codes, background transparency, and even a reference image. None of these are read here, as I've never seen experts actually use them.
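As a rough sketch of how the parsed comments could eventually be attached to a Raw object, assuming parse_cmt above; attach_cmt_comments and timestamp_to_onset are placeholder names, not existing MNE API, and the mapping from the 20-digit .CMT timestamp to an onset in seconds is not worked out here:

import mne


def attach_cmt_comments(raw, cmt_path, timestamp_to_onset):
    # Hypothetical glue code (not part of MNE): append the parsed .CMT comments
    # to the annotations already read by mne.io.read_raw_nihon.
    records = parse_cmt(cmt_path)
    comments = mne.Annotations(
        onset=[timestamp_to_onset(r.timestamp) for r in records],
        duration=[0.0] * len(records),
        description=[r.text for r in records],
        # Reuse the same reference time so the two sets can be concatenated.
        orig_time=raw.annotations.orig_time,
    )
    raw.set_annotations(raw.annotations + comments)
    return raw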
Describe possible alternatives
No alternatives currently
Additional context
No response