Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected Indexing Error Due to Low-Precision sampling_rate Calculation #13117

Open
SYXiao2002 opened this issue Feb 18, 2025 · 3 comments · May be fixed by #13120
Open

Unexpected Indexing Error Due to Low-Precision sampling_rate Calculation #13117

SYXiao2002 opened this issue Feb 18, 2025 · 3 comments · May be fixed by #13120
Labels

Comments

@SYXiao2002
Copy link

Description of the problem

The precision of info["sfreq"] calculation in read_raw_snirf remains too low, leading to cumulative errors when converting time to indices.

Instrument

Model: NirSmart II-3000 Series
Sampling Frequency: 11Hz

Current Calculation of sampling_rate(info["sfreq"]) in read_raw_snirf:

def _extract_sampling_rate(dat):
    """Extract the sample rate from the time field."""
    # This is a workaround to provide support for Artinis data.
    # It allows for a 1% variation in the sampling times relative
    # to the average sampling rate of the file.
    MAXIMUM_ALLOWED_SAMPLING_JITTER_PERCENTAGE = 1.0

    time_data = np.array(dat.get("nirs/data1/time"))
    sampling_rate = 0
    if len(time_data) == 2:
        # specified as onset, samplerate
        sampling_rate = 1.0 / (time_data[1] - time_data[0])
    else:
        # specified as time points
        periods = np.diff(time_data)
        uniq_periods = np.unique(periods.round(decimals=4))
        if uniq_periods.size == 1:
            # Uniformly sampled data
            sampling_rate = 1.0 / uniq_periods.item()
        else:
            # Hopefully uniformly sampled data with some precision issues.
            # This is a workaround to provide support for Artinis data.
            mean_period = np.mean(periods)
            sampling_rate = 1.0 / mean_period
            ideal_times = np.linspace(time_data[0], time_data[-1], time_data.size)
            max_jitter = np.max(np.abs(time_data - ideal_times))
            percent_jitter = 100.0 * max_jitter / mean_period
            msg = (
                f"Found jitter of {percent_jitter:3f}% in sample times. Sampling "
                f"rate has been set to {sampling_rate:1f}."
            )
            if percent_jitter > MAXIMUM_ALLOWED_SAMPLING_JITTER_PERCENTAGE:
                warn(
                    f"{msg} Note that MNE-Python does not currently support SNIRF "
                    "files with non-uniformly-sampled data."
                )
            else:
                logger.info(msg)
    time_unit = _get_metadata_str(dat, "TimeUnit")
    time_unit_scaling = _get_timeunit_scaling(time_unit)
    sampling_rate *= time_unit_scaling

    return sampling_rate

The rounding to four decimal places helps find a unique period more easily, but it also increases the risk of cumulative indexing errors.

For example, as shown in the figure below, rounding to four decimal places results in an unexpected indexing error:

Image

Steps to reproduce

import mne

filepath = 'p1-0509.snirf'
raw = mne.io.read_raw_snirf(filepath)

calculated_sfreq = raw.info['sfreq']

print(f'sfreq: {calculated_sfreq} Hz')

events, event_id = mne.events_from_annotations(raw)

idx=6
print(f'The {idx}th annotation in time: {raw.annotations.onset[idx]}')
print(f'The {idx}th annotation in index: {events[idx][0]}')

Link to data

p1-0509.snirf

Expected results

The 6th annotation in time: 460.272727
The 6th annotation in index: 5063

Actual results

The 6th annotation in time: 460.272727
The 6th annotation in index: 5064

Additional information

Platform macOS-14.7.2-x86_64-i386-64bit-Mach-O
Python 3.13.1 (main, Dec 3 2024, 17:59:52) [Clang 16.0.0 (clang-1600.0.26.4)]
CPU Intel(R) Core(TM) i5-8257U CPU @ 1.40GHz
machdep.cpu.family: 6
machdep.cpu.model: 142
machdep.cpu.extmodel: 8
machdep.cpu.extfamily: 0
machdep.cpu.stepping: 10
machdep.cpu.feature_bits: 9221959987971750911
machdep.cpu.leaf7_feature_bits: 43804591 0
machdep.cpu.leaf7_feature_bits_edx: 3154128384
machdep.cpu.extfeature_bits: 1241984796928
machdep.cpu.signature: 526058
machdep.cpu.brand: 0
machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C
machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SGX BMI1 AVX2 SMEP BMI2 ERMS INVPCID FPU_CSDS MPX RDSEED ADX SMAP CLFSOPT IPT MDCLEAR TSXFA IBRS STIBP L1DF ACAPMSR SSBD
machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI
machdep.cpu.logical_per_package: 16
machdep.cpu.cores_per_package: 8
machdep.cpu.microcode_version: 246
machdep.cpu.processor_flag: 6
machdep.cpu.core_count: 4
machdep.cpu.thread_count: 8 (8 cores)
Memory 8.0 GiB

Core
├☑ mne 1.9.0 (latest release)
├☑ numpy 2.2.3 (unknown linalg bindings)
├☑ scipy 1.15.2
└☑ matplotlib 3.10.0 (backend=macosx)

Numerical (optional)
├☑ sklearn 1.6.1
├☑ nibabel 5.3.2
├☑ nilearn 0.11.1
├☑ pandas 2.2.3
├☑ h5io 0.2.4
├☑ h5py 3.12.1
└☐ unavailable numba, dipy, openmeeg, cupy

Visualization (optional)
└☐ unavailable pyvista, pyvistaqt, vtk, qtpy, ipympl, pyqtgraph, mne-qt-browser, ipywidgets, trame_client, trame_server, trame_vtk, trame_vuetify

Ecosystem (optional)
├☑ mne-nirs 0.7.1
└☐ unavailable mne-bids, mne-features, mne-connectivity, mne-icalabel, mne-bids-pipeline, neo, eeglabio, edfio, mffpy, pybv
None

@SYXiao2002 SYXiao2002 added the BUG label Feb 18, 2025
Copy link

welcome bot commented Feb 18, 2025

Hello! 👋 Thanks for opening your first issue here! ❤️ We will try to get back to you soon. 🚴

@larsoner
Copy link
Member

Agreed that rounding to 4 places is not good enough. We should probably use something like:

mean_period = np.mean(periods)
if not np.allclose(periods, mean_period, rtol=1e-6):
    # do something like warn or whatever

this should be much more robust I think. @SYXiao2002 would you be up for trying this and seeing if it works and opening a PR to fix our code?

@SYXiao2002
Copy link
Author

Sure, but I'm unsure whether rtol should be an input parameter for this function—it would require additional work to refactor properly.

@SYXiao2002 SYXiao2002 linked a pull request Feb 19, 2025 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants