Merged
257 commits
8240fc6
refactor: add PAMAP activity list to pre-prompt in PAMAP2AccQADataset
masquare Jul 18, 2025
3ce8414
Merge branch 'masquare/baseline' of https://github.com/StanfordBDHG/E…
RealLast Jul 20, 2025
a742388
Added evaluation folder.
RealLast Jul 20, 2025
a4e133a
Added embedhealth folder.
RealLast Jul 20, 2025
346f326
Finished baseline implementation.
RealLast Jul 20, 2025
2f967b0
Removed normalization from simple AccQADataset.
RealLast Jul 20, 2025
800b5e9
Changed eval scripts to all write into a single evaluation_results.cs…
RealLast Jul 20, 2025
935c1fb
Set max_samples to None.
RealLast Jul 20, 2025
631f0a8
Changed to not include error message.
RealLast Jul 21, 2025
7746cf4
Added support for OpenAI for evaluation. Introduced OpenAIPipeline as…
RealLast Jul 21, 2025
f966acc
Updated serializer settings for gruver formatter to match original va…
RealLast Jul 21, 2025
002c699
Added MPS warnings.
RealLast Jul 21, 2025
354a650
Add plot data option
max-rosenblattl Jul 21, 2025
d97eda3
Merge branch 'preprint/evaluation' of github.com:StanfordBDHG/EmbedHe…
max-rosenblattl Jul 21, 2025
ba34434
Fix plotting
max-rosenblattl Jul 21, 2025
b7cab07
Fix PAMAP2AccQA classes
max-rosenblattl Jul 21, 2025
590bbde
Added class balancing for train set in pamap cot.
RealLast Jul 22, 2025
d7a41f4
Merge branch 'preprint/evaluation' of https://github.com/StanfordBDHG…
RealLast Jul 22, 2025
e177141
Removed oversampling strategy, instead introduced BalancedBatchSample…
RealLast Jul 22, 2025
148b8c5
Updated label retrieval for batch size calculation.
RealLast Jul 22, 2025
57474b4
Added debug prints.
RealLast Jul 22, 2025
698cfd3
Temporarily removed sampler.
RealLast Jul 22, 2025
ffa153f
Re-enabled sampler.
RealLast Jul 22, 2025
ca791fe
Updated num epochs.
RealLast Jul 22, 2025
cbc9318
Removed sampler during Pamap2CoT training for now.
RealLast Jul 24, 2025
315cc60
Set sampler to None.
RealLast Jul 24, 2025
2bfe0cc
Added script to generate pamap predictions.
RealLast Jul 26, 2025
1fc4ab7
Added SleepEDFCoTQADataset and a corresponding stage stage4_sleep_cot…
RealLast Jul 26, 2025
218e204
Added debug print.
RealLast Jul 26, 2025
059a6ba
Extracted time series data from first dimension.
RealLast Jul 26, 2025
5fae3d7
Updated assertion statement.
RealLast Jul 26, 2025
aec0c62
Debug
RealLast Jul 26, 2025
477b1e4
Debug
RealLast Jul 26, 2025
f3e67cc
Debug
RealLast Jul 26, 2025
db04092
Debug
RealLast Jul 26, 2025
c9ebc45
Debug
RealLast Jul 26, 2025
8244572
Added SleepEDF CoT loader test.
RealLast Jul 26, 2025
5bd003a
Added sleep cot eval script.
RealLast Jul 26, 2025
8af4e0b
Added scripts to get example outputs for SleepCoT and then plot them.
RealLast Jul 28, 2025
2b7bd4a
Updated path
RealLast Jul 28, 2025
d781836
Changed path.
RealLast Jul 28, 2025
eb79e2d
Updated path
RealLast Jul 28, 2025
84fec71
Updated invalid path.
RealLast Jul 28, 2025
f2444a8
Updated SleepEDF evaluation.
RealLast Aug 8, 2025
0de7c8a
Merge branch 'preprint/evaluation' of https://github.com/StanfordBDHG…
RealLast Aug 8, 2025
9d0eb67
Uploaded code for ECG-QA.
RealLast Aug 8, 2025
dda59f9
Implemented ECGQA dataset, only using PTB-XL for now.
RealLast Aug 13, 2025
6087465
Added concrete errors and exceptions.
RealLast Aug 13, 2025
a4d14ca
Added parsing of answers directly from the templates.
RealLast Aug 13, 2025
15df576
Added parameter to optionally remove questions requiring to compare 2…
RealLast Aug 15, 2025
d677eed
Added ecg qa test.
RealLast Aug 15, 2025
a1e65ef
Added har_cot dataset loader.
RealLast Aug 15, 2025
79e8e57
Replaced PamapCOT with HARCot.
RealLast Aug 15, 2025
e30c025
Removed wrong sampler.
RealLast Aug 15, 2025
a42554d
Added attention attribute.
RealLast Aug 20, 2025
c7852ee
Added eager attention implementation.
RealLast Aug 20, 2025
1fd36d0
Removed compilation during evaluation due to issues with Flamingo on …
RealLast Aug 21, 2025
3412e73
Fixed local variable not found when resuming training.
RealLast Aug 21, 2025
6c85b3c
Fixed epoch var
RealLast Aug 21, 2025
13e2a8e
Fixed epoch var
RealLast Aug 21, 2025
88668d6
Fixed epoch var
RealLast Aug 21, 2025
763cd32
Fixed epoch var
RealLast Aug 21, 2025
bb79507
Fixed epoch var
RealLast Aug 21, 2025
9849ad4
Fixed epoch var
RealLast Aug 21, 2025
3f5f60a
Fixed epoch var
RealLast Aug 21, 2025
94829c9
Fixing issues with dynamo
RealLast Aug 21, 2025
88481e8
Fixing issues with dynamo
RealLast Aug 21, 2025
890b34d
Fixing issues with dynamo
RealLast Aug 21, 2025
3d06e43
Fixing issues with dynamo
RealLast Aug 21, 2025
e0b1b75
Fixing issues with dynamo
RealLast Aug 21, 2025
6cf8238
Fixing issues with dynamo
RealLast Aug 21, 2025
317f279
Fixing issues with dynamo
RealLast Aug 21, 2025
35525ee
Fixing issues with dynamo
RealLast Aug 21, 2025
6e1c30b
Fixing issues with dynamo
RealLast Aug 21, 2025
c1e44b2
Fixing issues with dynamo
RealLast Aug 21, 2025
3391087
Fixing issues with dynamo
RealLast Aug 21, 2025
a4ef803
Fixing issues with dynamo
RealLast Aug 21, 2025
6e4c662
Fixing issues with dynamo
RealLast Aug 21, 2025
926f996
Fixing issues with dynamo
RealLast Aug 21, 2025
2333660
Added distributed evaluation.
RealLast Aug 23, 2025
4c243a9
Reverted multi-gpu eval, seems not to work.
RealLast Aug 25, 2025
148e8af
Added missing property for gemma.
RealLast Aug 25, 2025
4a63f31
Gemma3 debug.
RealLast Aug 25, 2025
0426818
Check model.
RealLast Aug 25, 2025
b243da6
Debugging.
RealLast Aug 25, 2025
6b5722b
Debugging.
RealLast Aug 25, 2025
6f8c29d
Added support for gemma3.
RealLast Aug 25, 2025
fc404d5
Added tracking of losses.
RealLast Aug 25, 2025
fa4607e
Improved loss tracking.
RealLast Aug 25, 2025
5136167
Add vision eval for hugging face pipeline
max-rosenblattl Aug 25, 2025
a26155a
Added LoRA finetuning to EmbedHealthSP.
RealLast Aug 25, 2025
39cc80e
Updated info prints.
RealLast Aug 25, 2025
261785a
Unify plotting flag
max-rosenblattl Aug 25, 2025
ec48aa4
Update baseline prompt to reason step by step
max-rosenblattl Aug 25, 2025
1d024ee
Fixed loading of Checkpoints with LoRA.
RealLast Aug 26, 2025
eb030ae
Remove plotting from common_evaluator and add to own class
max-rosenblattl Aug 26, 2025
bec04b7
Remove gruver formatting to access time series data directly
max-rosenblattl Aug 27, 2025
feb3bfd
Refactor common plot evaluator
max-rosenblattl Aug 27, 2025
8fb8aaa
Fix classes in prompt
max-rosenblattl Aug 27, 2025
5ea36ac
Added ECG-QA CoT loader.
RealLast Aug 27, 2025
81961a5
Changed to deserialize to CPU first during eval as peak memory is too…
RealLast Aug 27, 2025
f605cd5
Gemma specific patches to padding.
RealLast Aug 27, 2025
93cb142
Gemma specific patches.
RealLast Aug 27, 2025
cb338ad
Updated support for gemma4 for EmbedHealthFlamingo.
RealLast Aug 27, 2025
f71b498
Reduced epochs for testing.
RealLast Aug 27, 2025
86c48bb
Unwrapping Gemma3-4b.
RealLast Aug 27, 2025
df36c82
Reverted changes.
RealLast Aug 27, 2025
8ae41f9
Set epochs to correct number.
RealLast Aug 27, 2025
abcaf1f
Merge branch 'RealLast/ECG-QA-Integration' into MaxRosenblattl/Vision…
max-rosenblattl Aug 27, 2025
b74ef9d
Add HARAccQADataset
max-rosenblattl Aug 27, 2025
197d9d1
Adapt eval pamap script
max-rosenblattl Aug 27, 2025
c323dc3
Run HAR Acc with 4o
max-rosenblattl Aug 27, 2025
7e61e54
Increased timeout.
RealLast Aug 28, 2025
670fb7a
Increased epochs.
RealLast Aug 28, 2025
4249ef2
Improved Sleep evaluation.
RealLast Aug 29, 2025
c73cce4
Updated ECG-QA-CoT loader prompt and loading.
RealLast Aug 29, 2025
e4d47b7
Added ECG-QA-CoT to curriculum_learning
RealLast Aug 29, 2025
b1475ee
Added loss history tracking
RealLast Aug 29, 2025
4d64b03
Added progress bar, removed print.
RealLast Aug 29, 2025
700f521
Removed nested progress bar.
RealLast Aug 29, 2025
73ca3c6
Added progress bar.
RealLast Aug 29, 2025
d1d9b41
Added progress bar when buiilding dataset.
RealLast Aug 29, 2025
b90715a
Added progress bar to extraction.
RealLast Aug 29, 2025
f72a933
Changed attention implementation to eager.
RealLast Aug 30, 2025
ddcdee7
Split files into text and plot evaluation
max-rosenblattl Aug 31, 2025
db81304
Throw exceptions
max-rosenblattl Aug 31, 2025
e03ceea
Added script to parse ecg_qa_cot results.
RealLast Aug 31, 2025
5b33ad7
Merge branch 'RealLast/ECG-QA-Integration' of https://github.com/Stan…
RealLast Aug 31, 2025
453fb28
Refactor plotting function as parameter
max-rosenblattl Aug 31, 2025
d4e49a2
Raise exception when saving failed.
RealLast Aug 31, 2025
7c1e91d
Move prompt to dataset
max-rosenblattl Aug 31, 2025
b6ca37f
Add parsing results script
max-rosenblattl Aug 31, 2025
d32a590
Corrected result parsing. Made get_possible_answer_for_template static.
RealLast Sep 1, 2025
8acbbe5
Merge branch 'MaxRosenblattl/VisionEval' of https://github.com/Stanfo…
RealLast Sep 1, 2025
0e06974
Fixed ecg_qa_cot results parsing.
RealLast Sep 1, 2025
544af4f
Made get_answer_for_template static.
RealLast Sep 1, 2025
2d9e9a7
Fix minor issues
max-rosenblattl Sep 1, 2025
d841487
Updated parse script.
RealLast Sep 2, 2025
acff4f8
Merge branch 'MaxRosenblattl/VisionEval' of https://github.com/Stanfo…
RealLast Sep 2, 2025
efee240
Updated ECG-QA-CoT link to point to zip file with larger dataset.
RealLast Sep 6, 2025
303f842
Fixed wrong question naming.
RealLast Sep 6, 2025
cdc86f9
Fixed using 12 leads always. Corrected time series text to match HAR-…
RealLast Sep 6, 2025
ce53428
Updated M4CaptionDataset to include all samples.
RealLast Sep 6, 2025
849ffe3
Added caching to speed up loading.
RealLast Sep 6, 2025
14b57ef
Removed specific question parts, pre_prompt should be independent of …
RealLast Sep 6, 2025
2535270
Removed leftover else.
RealLast Sep 6, 2025
a5851f6
Removed task_specific.
RealLast Sep 6, 2025
182b2fc
Updated ptbxl download link.
RealLast Sep 6, 2025
4ea4ec4
Added script for evaluating ecg_qa on baselines.
RealLast Sep 8, 2025
e7f26ce
Updated link to final ECG-QA-CoT dataset.
RealLast Sep 10, 2025
a0ac231
Increased max tokens for eval.
RealLast Sep 10, 2025
a26c5f0
Updated TSQA parser.
RealLast Sep 10, 2025
a5206bf
Added script for parsing predictiosn from baseline.
RealLast Sep 10, 2025
7368c57
Changed evaluation to not compute test loss as this means running the…
RealLast Sep 10, 2025
8bb4a25
Added check for test_predictions file.
RealLast Sep 10, 2025
03ec96b
Added debug command.
RealLast Sep 10, 2025
5191d17
Debug comments.
RealLast Sep 10, 2025
def36c7
Fixed intendation issue.
RealLast Sep 10, 2025
899c738
Add sleep plot evaluation
max-rosenblattl Sep 10, 2025
159870f
Added get_memory_use script.
RealLast Sep 12, 2025
3a3611e
Added bash script to run all memory experiments.
RealLast Sep 12, 2025
1490371
Corrected pamaramaters.
RealLast Sep 12, 2025
6b2f1bb
Fixed bugs.
RealLast Sep 12, 2025
ce0d3ae
Added model.to(device).
RealLast Sep 12, 2025
6d39d5e
Added memory use in gb as well.
RealLast Sep 12, 2025
1391a11
Updated training for epochs.
RealLast Sep 12, 2025
da9a182
Added progress bar.
RealLast Sep 12, 2025
a91afab
Changed to DataLoader.
RealLast Sep 12, 2025
55f3eae
Switched order of SP and Flamingo.
RealLast Sep 12, 2025
fc30089
Fixed max memory tracking.
RealLast Sep 12, 2025
da24edd
Improved memory tracking.
RealLast Sep 12, 2025
37a93d3
Added memory to progress bar.
RealLast Sep 12, 2025
29b29f6
Updated pynvml usage.
RealLast Sep 12, 2025
62b8739
Updated common evaluator to support ECG-QA-CoT.
RealLast Sep 12, 2025
99fdfd0
Added baseline parserers for sleep and har. Added plot_memory_usage.
RealLast Sep 13, 2025
24a474b
Readded missing scripts.
RealLast Sep 13, 2025
6efda0f
Add sleep plot results parser
max-rosenblattl Sep 13, 2025
0d37c44
Added SimulationQADataset. Added script to run missing models.
RealLast Sep 14, 2025
cdd71ae
Added SimulationQADataset.
RealLast Sep 14, 2025
a03b662
Increased dataset size.
RealLast Sep 14, 2025
6ce8b0d
Reduced size to 1000.
RealLast Sep 14, 2025
f0b5287
Changed order.
RealLast Sep 14, 2025
e29efac
Add tsqa plot eval
max-rosenblattl Sep 14, 2025
f05508b
Merge branch 'RealLast/ECG-QA-Integration' into MaxRosenblattl/Vision…
max-rosenblattl Sep 14, 2025
a6ad339
Increased max supported time series length.
RealLast Sep 14, 2025
d90e8b6
Preparing for rerunning with time series of length 10000.
RealLast Sep 14, 2025
ae8bbfb
Rerunning with time series of length 10000.
RealLast Sep 14, 2025
eb30332
Updated script for missing simulations.
RealLast Sep 14, 2025
6f3266e
Removed old datasets.
RealLast Sep 14, 2025
5758695
Reverted patch size.
RealLast Sep 14, 2025
fa626dd
Add ecg qa plot eval
max-rosenblattl Sep 14, 2025
e9ac45d
Fix naming error
max-rosenblattl Sep 14, 2025
3b725a9
Added storing correct answer in json when evalauting ecq-qa.
RealLast Sep 16, 2025
e6a6882
Added distribution for test set.
RealLast Sep 16, 2025
31b2d69
Added progress bar for validation.
RealLast Sep 16, 2025
41090c8
Increased max new tokens
RealLast Sep 16, 2025
4d9fbd2
Added script to generate eval dataset for doctors.
RealLast Sep 16, 2025
d2d4a24
Update README.md
RealLast Sep 17, 2025
2880c1e
Adapt common evaluator plot for pretrained models
max-rosenblattl Sep 18, 2025
b3c9fbd
Store intermediate results
max-rosenblattl Sep 18, 2025
a8f80ca
Add pre post prompt order
max-rosenblattl Sep 18, 2025
6f3d359
Fix brackets
max-rosenblattl Sep 18, 2025
3bf6208
Fix quotes
max-rosenblattl Sep 18, 2025
ba592b7
Add original sleep dataset
max-rosenblattl Sep 18, 2025
41ae49b
Fix tsqa parse
max-rosenblattl Sep 18, 2025
9920a43
Sanity check
max-rosenblattl Sep 19, 2025
f29ba57
Pass sanity check
max-rosenblattl Sep 19, 2025
f20752a
ECG evaluation.
RealLast Sep 19, 2025
038a8f8
Merge branch 'MaxRosenblattl/VisionEval' of https://github.com/Stanfo…
RealLast Sep 19, 2025
cdbdb5f
Preparing to rerun failed simulations.
RealLast Sep 20, 2025
c5a27f5
Temporarily increased max patches.
RealLast Sep 20, 2025
b930432
Increased max patches.
RealLast Sep 20, 2025
d1eb752
Preparing for dataset rerun.
RealLast Sep 20, 2025
7d5de4d
Added script to check missing checkpoints. Improved plotting.
RealLast Sep 20, 2025
c2c8014
Added further scripts for plotting.
RealLast Sep 20, 2025
253367f
Add Hugging Face
ThomasKaar Sep 21, 2025
938fa72
Add static llm_id mapping
ThomasKaar Sep 21, 2025
5d96f0a
Change API
ThomasKaar Sep 21, 2025
341fa31
Adjust Readme
ThomasKaar Sep 21, 2025
cf112d6
Fix code style, change hugging face behavior
ThomasKaar Sep 21, 2025
251ad2b
Add default parameters
ThomasKaar Sep 21, 2025
240a590
Remove kwargs
ThomasKaar Sep 21, 2025
3a03f9e
Adjust documentation
ThomasKaar Sep 21, 2025
dab60a0
Fix time_series not found bug
max-rosenblattl Sep 21, 2025
bdf0ce5
Remove comment
max-rosenblattl Sep 21, 2025
fc75a8d
Fixed plot eval.
RealLast Sep 23, 2025
1136c35
Merge branch 'MaxRosenblattl/VisionEval' of https://github.com/Stanfo…
RealLast Sep 23, 2025
48554b7
Rename to OpenTSLM
ThomasKaar Sep 23, 2025
36fd55b
Add renaming in test_train_sp
ThomasKaar Sep 24, 2025
a4762cf
Merge branch 'main' into ThomasKaar/RenameToOpenTSLM
ThomasKaar Sep 24, 2025
3f03de0
Add renames occured after merging
ThomasKaar Sep 24, 2025
cd1b85d
Move EmbedHealthX to OpenTSLMX files
ThomasKaar Sep 24, 2025
7b6cce7
Adapt readme
ThomasKaar Sep 24, 2025
41e41c8
Fix text
ThomasKaar Sep 24, 2025
563f6a7
Merge branch 'ThomasKaar/AdaptReadMe' into ThomasKaar/AddHuggingFace
ThomasKaar Sep 26, 2025
0f3e49b
fix readme + normalization
ThomasKaar Sep 26, 2025
3e4e6cd
Solved import issues.
RealLast Sep 27, 2025
bba98e7
added test file.
RealLast Sep 27, 2025
cb43454
Fixed patch sizes to match huggingface models.
RealLast Sep 27, 2025
62b7f46
Update dimports.
RealLast Sep 28, 2025
501ea58
Thomas kaar/adapt read me (#21)
ThomasKaar Sep 30, 2025
be0b19c
Merge branch 'ThomasKaar/RenameToOpenTSLM' into ThomasKaar/AddHugging…
RealLast Sep 30, 2025
09bc3c2
Add files via upload
RealLast Oct 1, 2025
29c6cff
Merge remote-tracking branch 'origin/main' into ThomasKaar/AddHugging…
ThomasKaar Nov 9, 2025
e063ccd
Added demo scripts to test existing huggingface checkpoints on each d…
RealLast Nov 30, 2025
93ff09e
Removed duplicate functions.
RealLast Nov 30, 2025
6e694b1
Added missing license headers.
RealLast Nov 30, 2025
6f393a8
Merge branch 'main' into ThomasKaar/AddHuggingFace
ThomasKaar Dec 1, 2025
0278fb8
fix linkspector failure for doi link
ThomasKaar Dec 1, 2025
abc45e7
fix license header of linkspector yml
ThomasKaar Dec 1, 2025
3 changes: 2 additions & 1 deletion .gitignore
@@ -8,4 +8,5 @@ raw_data

*.ts
*.zip
./__pycache__
./__pycache__
upload_to_huggingface.py
15 changes: 15 additions & 0 deletions .linkspector.yml
@@ -0,0 +1,15 @@
#
# This source file is part of the OpenTSLM open-source project
#
# SPDX-FileCopyrightText: 2025 Stanford University, ETH Zurich, and the project authors (see CONTRIBUTORS.md)
#
# SPDX-License-Identifier: MIT
#

dirs:
- .
useGitIgnore: true
ignorePatterns:
- pattern: "doc:/"
- pattern: "http://localhost"
- pattern: "https://doi.org"
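
The `ignorePatterns` entries above decide which links the checker skips. A minimal sketch of that filtering follows, assuming each pattern is treated as an unanchored regular expression searched against the URL. `is_ignored` is a hypothetical helper written for illustration and is not part of linkspector.

```python
import re

# Patterns copied from the .linkspector.yml above.
IGNORE_PATTERNS = ["doc:/", "http://localhost", "https://doi.org"]


def is_ignored(url, patterns=IGNORE_PATTERNS):
    """Hypothetical helper: True if any pattern matches somewhere in the URL.

    Assumption: patterns behave like unanchored regex searches.
    """
    return any(re.search(pattern, url) for pattern in patterns)


for url in ("https://doi.org/10.1000/example", "https://github.com/StanfordBDHG"):
    print(url, "ignored" if is_ignored(url) else "checked")
```

Under this assumption, DOI links and localhost URLs are skipped, while every other link is still checked.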
84 changes: 82 additions & 2 deletions README.md
@@ -29,7 +29,6 @@ OpenTSLM models can reason over multiple time series of any length at once, gene

</p>


## Installation

1. **Clone the Repository**
@@ -80,7 +79,83 @@ OpenTSLM has been tested and works with the following models:
Other variants may work but have not been extensively tested.


## Multi-stage training (Curriculum)

## 🚀 Quickstart with pretrained models

OpenTSLM provides a factory class called `OpenTSLM` for easily loading pre-trained models from Hugging Face Hub. The `load_pretrained` method automatically detects the model type and returns the appropriate model instance.


```python
from src import OpenTSLM, TextPrompt, TextTimeSeriesPrompt, FullPrompt

# Load model
model = OpenTSLM.load_pretrained("OpenTSLM/gemma-3-270m-pt-har-flamingo")

# Create prompt with raw time series data (normalization handled automatically)
prompt = FullPrompt(
    pre_prompt=TextPrompt("You are an expert in HAR analysis."),
    text_time_series_prompt_list=[
        TextTimeSeriesPrompt("X-axis accelerometer", [2.34, 2.34, 7.657, 3.21, -1.2])
    ],
    post_prompt=TextPrompt("What activity is this? Reason step by step providing a full rationale before replying.")
)

# Generate response
output = model.eval_prompt(prompt, normalize=True)
print(output)
```
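
The quickstart comment says normalization is handled automatically when `normalize=True` is passed. The source does not say which scheme is applied. The sketch below assumes per-series z-score normalization purely for illustration; `zscore` is a hypothetical helper and not part of the OpenTSLM API.

```python
import math


def zscore(series, eps=1e-8):
    """Hypothetical helper: shift one series to zero mean and scale by its standard deviation."""
    mean = sum(series) / len(series)
    variance = sum((x - mean) ** 2 for x in series) / len(series)
    std = math.sqrt(variance)
    # eps avoids division by zero when the series is constant.
    return [(x - mean) / (std + eps) for x in series]


# Same raw values as in the quickstart prompt above.
raw = [2.34, 2.34, 7.657, 3.21, -1.2]
normalized = zscore(raw)
print(normalized)
```

If the model was trained on normalized inputs, passing raw sensor values without an equivalent step would put them on a different scale, which is presumably why the flag exists.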

### 🤗 HuggingFace Demo Scripts

We provide ready-to-use demo scripts in the `demo/huggingface/` directory that demonstrate how to load pretrained models from HuggingFace Hub and run inference on the evaluation sets for each task:

- **`01_test_hf_tsqa.py`** - Test TSQA (Time Series Question Answering) models
- **`02_test_hf_m4.py`** - Test M4 (Time Series Captioning) models
- **`03_test_hf_har_cot.py`** - Test HAR CoT (Human Activity Recognition Chain-of-Thought) models
- **`04_test_hf_sleep_cot.py`** - Test Sleep CoT (Sleep Stage Classification) models
- **`05_test_hf_ecg_qa_cot.py`** - Test ECG QA CoT (ECG Question Answering) models

Each script:
1. Downloads the model checkpoint from HuggingFace Hub automatically (change the repo ID as needed)
2. Loads the corresponding test dataset
3. Runs inference on the evaluation set
4. Prints model outputs with sample information

**Note:** Except for ECG-QA, the scripts above use the OpenTSLM-SP models, which require less VRAM and should run on most hardware. Change the model checkpoints as needed in each file.

**Usage:**

```bash
# Run any of the demo scripts
python demo/huggingface/01_test_hf_tsqa.py
python demo/huggingface/02_test_hf_m4.py
python demo/huggingface/03_test_hf_har_cot.py
python demo/huggingface/04_test_hf_sleep_cot.py
python demo/huggingface/05_test_hf_ecg_qa_cot.py
```

**Customizing the model:**

Edit the `REPO_ID` variable at the top of each script to test different model variants. For example:

```python
# In 01_test_hf_tsqa.py
REPO_ID = "OpenTSLM/llama-3.2-1b-tsqa-sp" # Soft Prompt model
# or
REPO_ID = "OpenTSLM/llama-3.2-1b-tsqa-flamingo" # Flamingo model
```

**Available models on HuggingFace:**

All pretrained models are available under the `OpenTSLM` organization on HuggingFace Hub. Model names follow the pattern:
- `OpenTSLM/{base_model}-{dataset}-{model_type}`
  - `base_model`: `llama-3.2-1b`, `llama-3.2-3b`, `gemma-3-1b-pt`, `gemma-3-270m`
  - `dataset`: `tsqa`, `m4`, `har`, `sleep`, `ecg`
  - `model_type`: `sp` (Soft Prompt) or `flamingo` (Flamingo)

Example: `OpenTSLM/llama-3.2-1b-ecg-flamingo`
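
The naming pattern above can be composed and taken apart with plain string handling. This is a minimal sketch: `build_repo_id` and `parse_repo_id` are hypothetical helpers, not functions from the repository, and the dataset and model-type lists are copied from the bullets above. Base model names are not validated because the list above may not be exhaustive.

```python
DATASETS = ("tsqa", "m4", "har", "sleep", "ecg")
MODEL_TYPES = ("sp", "flamingo")


def build_repo_id(base_model, dataset, model_type, org="OpenTSLM"):
    """Hypothetical helper: compose a repo ID following the pattern above."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if model_type not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    return f"{org}/{base_model}-{dataset}-{model_type}"


def parse_repo_id(repo_id):
    """Hypothetical helper: split a repo ID into (org, base_model, dataset, model_type)."""
    org, name = repo_id.split("/", 1)
    # Base model names contain hyphens themselves, so split from the right.
    base_model, dataset, model_type = name.rsplit("-", 2)
    return org, base_model, dataset, model_type


print(build_repo_id("llama-3.2-1b", "ecg", "flamingo"))
print(parse_repo_id("OpenTSLM/llama-3.2-1b-tsqa-sp"))
```

Splitting from the right keeps hyphenated base model names such as `llama-3.2-1b` intact.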

## Training: Multi-stage training (Curriculum)

OpenTSLM uses curriculum learning with progressive training stages:

@@ -137,6 +212,11 @@ python curriculum_learning.py --model OpenTSLMFlamingo --eval_only
- `--gradient_checkpointing`: Enable gradient checkpointing for memory efficiency
- `--verbose`: Enable verbose logging

### Repository Naming Convention

- Repository IDs ending with `-sp` will load and return `OpenTSLMSP` models
- Repository IDs ending with `-flamingo` will load and return `OpenTSLMFlamingo` models
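
The suffix rule above can be sketched as a small dispatch function. This is an illustration only: `detect_model_type` is a hypothetical name, and it returns just a model-type label because the internals of `load_pretrained` are not shown in this diff.

```python
def detect_model_type(repo_id):
    """Hypothetical helper: infer the model type from the repo ID suffix."""
    if repo_id.endswith("-sp"):
        return "sp"
    if repo_id.endswith("-flamingo"):
        return "flamingo"
    raise ValueError(
        f"cannot infer model type from {repo_id!r}: expected a '-sp' or '-flamingo' suffix"
    )


print(detect_model_type("OpenTSLM/llama-3.2-1b-tsqa-sp"))
print(detect_model_type("OpenTSLM/llama-3.2-1b-ecg-flamingo"))
```

Raising on an unrecognized suffix, rather than guessing a default, surfaces a misnamed repository early.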


## 📁 Results Structure
