Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import datetime
import pandas as pd
idx = [datetime.datetime(year=2025, month=10, day=17, hour=17, minute=ii, second=10, microsecond=0) for ii in range(15, 18, 1)]
# (17:15:10, 0), (17:16:10, 1), (17:17:10, 2)
df = pd.DataFrame({'value': range(len(idx))}, index=idx)
# Expected: (17:15:00, NaN), (17:16:00, NaN), (17:17:00, NaN)
# Result: (17:15:00, 0), (17:16:00, 1), (17:17:00, 2)
df_resample = df.resample('1min', origin='start_day').asfreq()
Issue Description
resample(..., origin='start_day') anchors the resampling bins at midnight of the first day in the index. Therefore, if the resampling frequency is 1min and every timestamp in the index has a nonzero seconds value, no timestamp falls on a bin label, and asfreq() should return NaN for every row.
This expected behavior does occur when the seconds values of the index timestamps are not all the same:
idx = [datetime.datetime(year=2025, month=10, day=17, hour=17, minute=ii, second=10 if ii < 17 else 0, microsecond=0) for ii in range(15, 18, 1)]
# (17:15:10, 0), (17:16:10, 1), (17:17:00, 2)
df = pd.DataFrame({'value': range(len(idx))}, index=idx)
# Result: (17:15:00, NaN), (17:16:00, NaN), (17:17:00, 2)
df_resample = df.resample('1min', origin='start_day').asfreq()
However, if every timestamp in the index has the same seconds value, resampling ignores the seconds and simply relabels the index, keeping the original values:
idx = [datetime.datetime(year=2025, month=10, day=17, hour=17, minute=ii, second=10, microsecond=0) for ii in range(15, 18, 1)]
# (17:15:10, 0), (17:16:10, 1), (17:17:10, 2)
df = pd.DataFrame({'value': range(len(idx))}, index=idx)
# Expected: (17:15:00, NaN), (17:16:00, NaN), (17:17:00, NaN)
# Produced: (17:15:00, 0), (17:16:00, 1), (17:17:00, 2)
df_resample = df.resample('1min', origin='start_day').asfreq()
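One observable difference between the two cases above is whether the index itself is regularly spaced at one minute. The sketch below only inspects DatetimeIndex.inferred_freq for both indexes. Whether pandas uses this property to choose between relabeling and reindexing is an assumption here, not something confirmed in this report, and make_index is a hypothetical helper introduced for illustration.

```python
import datetime

import pandas as pd
from pandas.tseries.frequencies import to_offset


def make_index(seconds):
    """Hypothetical helper: 17:15 to 17:17 index with the given per-row seconds."""
    return pd.DatetimeIndex(
        [
            datetime.datetime(2025, 10, 17, 17, minute, second)
            for minute, second in zip(range(15, 18), seconds)
        ]
    )


uniform = make_index([10, 10, 10])  # case where asfreq() keeps the values
mixed = make_index([10, 10, 0])     # case where asfreq() returns NaN as expected

for name, idx in [("uniform", uniform), ("mixed", mixed)]:
    inferred = idx.inferred_freq  # None when the spacing is irregular
    # Assumption: a match with the resample frequency may be what triggers
    # the relabeling behavior described above.
    matches = inferred is not None and to_offset(inferred) == to_offset("1min")
    print(name, inferred, matches)
```

If the uniform index reports a one-minute inferred frequency while the mixed index reports none, that would be consistent with the two results shown above, but it does not by itself prove the cause.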
Expected Behavior
Expected result: (17:15:00, NaN), (17:16:00, NaN), (17:17:00, NaN)
Produced result: (17:15:00, 0), (17:16:00, 1), (17:17:00, 2)
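For comparison, the expected result can be built explicitly without resample(). This is a minimal sketch, assuming the expected semantics are "keep a value only where an original timestamp coincides with a minute-aligned label"; the names grid and expected are introduced here for illustration.

```python
import datetime

import pandas as pd

idx = [datetime.datetime(2025, 10, 17, 17, minute, 10) for minute in range(15, 18)]
df = pd.DataFrame({"value": range(len(idx))}, index=idx)

# Minute-aligned labels spanning the data. For a 1min frequency these coincide
# with bins anchored at midnight of the first day.
grid = pd.date_range(
    df.index.min().floor("1min"), df.index.max().floor("1min"), freq="1min"
)

# reindex keeps a value only where an original timestamp equals a grid label,
# so timestamps at hh:mm:10 leave every row empty.
expected = df.reindex(grid)
print(expected)
```

This may also serve as a workaround until the behavior of resample(...).asfreq() is clarified.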
Installed Versions
commit : 9c8bc3e
python : 3.14.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19045
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : Korean_Korea.949
pandas : 2.3.3
numpy : 2.3.3
pytz : 2025.2
dateutil : 2.9.0.post0
pip : 25.2
Cython : None
sphinx : None
IPython : None
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : None
blosc : None
bottleneck : None
dataframe-api-compat : None
fastparquet : None
fsspec : None
html5lib : None
hypothesis : None
gcsfs : None
jinja2 : None
lxml.etree : None
matplotlib : None
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
psycopg2 : None
pymysql : None
pyarrow : None
pyreadstat : None
pytest : None
python-calamine : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlsxwriter : None
zstandard : None
tzdata : 2025.2
qtpy : None
pyqt5 : None