How to preprocess the Time-MMD dataset? #2
Comments
I'm also wondering what does
After reading the paper Time-MMD: Multi-Domain Multimodal Dataset for Time Series Analysis and analyzing the original data from https://gis.cdc.gov/grasp/fluview/fluportaldashboard.html, I believe that the In which the Accordingly, I wrote some code to get
import pandas as pd
def seasonal_group_average(df=pd.DataFrame(), seasonal_period=51, group_window_size=1, target='%UNWEIGHTED ILI'):
    """
    Compute the `prior_history_avg` based on seasonal grouped average.

    Args:
        df (pd.DataFrame): DataFrame containing the time series data.
        seasonal_period (int): Seasonal period, e.g., 51 weeks.
        group_window_size (int): Size of the group window for averaging.
        target (str): Column name of the target time series.

    Returns:
        pd.DataFrame: DataFrame with a new column `prior_history_avg`.
    """
    # If the target column is missing, the DataFrame is returned unchanged.
    if target in df.columns:
        df['prior_history_avg'] = [
            (
                # Average the values 1..group_window_size seasons before t;
                # lags that reach before the start are clamped to row 0.
                sum(
                    df[target].iloc[max(0, t - i * seasonal_period)]
                    for i in range(1, group_window_size + 1)
                ) / group_window_size
                # Rows without a full season of history get 0.0.
                if t >= seasonal_period else 0.0
            )
            for t in range(len(df))
        ]
    return df


# header=1 uses the second line of ILINet.csv as the column header row.
ili_data_df = pd.read_csv('ILINet.csv', header=1)
seasonal_grouped_df = seasonal_group_average(df=ili_data_df)
seasonal_grouped_df.to_csv('test.csv')
seasonal_grouped_df
But in the author's data
Thank you for your great work.
I am currently confused about how to preprocess the Time-MMD dataset.
In your provided data, data/Public_Health/US_FLURATIO_Week.csv, I do not know how to obtain six of the columns: 'prior_history_avg', 'prior_history_std', 'Final_Search_2', 'Final_Search_4', 'Final_Search_6', and 'Final_Output'.
With data/DataPre_ClosedSourceLLM/Prepare.ipynb we can obtain Final_Output, but how can we get the other five columns?
Any suggestions would help me a lot, thank you!
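For the two statistical columns, one possible reading (an assumption, not confirmed by the authors) is that prior_history_avg and prior_history_std are the mean and standard deviation of the target at the same seasonal position in earlier years. A minimal sketch of that idea with pandas is below; the function name seasonal_lag_stats, the n_seasons parameter, the 52-week default period, and the use of the population standard deviation are all hypothetical choices for illustration.

```python
import numpy as np
import pandas as pd


def seasonal_lag_stats(series: pd.Series, seasonal_period: int = 52,
                       n_seasons: int = 3) -> pd.DataFrame:
    """Mean and std of the values seen 1..n_seasons seasonal periods earlier.

    Assumption: this is only a guess at how such columns could be built,
    not the dataset authors' documented procedure.
    """
    # Column k holds the value from k seasons ago (NaN where no history exists).
    lagged = pd.concat(
        {k: series.shift(k * seasonal_period) for k in range(1, n_seasons + 1)},
        axis=1,
    )
    return pd.DataFrame(
        {
            # Row-wise statistics; missing lags are skipped by default.
            "prior_history_avg": lagged.mean(axis=1),
            "prior_history_std": lagged.std(axis=1, ddof=0),
        }
    )


# Toy series with a hypothetical period of 4, so the lags are easy to check by hand.
toy = pd.Series(np.arange(12, dtype=float))
stats = seasonal_lag_stats(toy, seasonal_period=4, n_seasons=2)
```

Rows without any earlier season come out as NaN rather than 0.0, which makes it easier to see where history is missing when comparing against the published CSV. The Final_Search_* columns are not covered by this sketch.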