Support hourly data#54

Open
SarahAlidoost wants to merge 30 commits into main from support_hourly

Conversation

@SarahAlidoost

@SarahAlidoost SarahAlidoost commented Jun 12, 2026

Member

closes #50

This PR:

  • adds support for hourly data
  • refactors code to improve efficiency
  • adds instructions for the DKRZ JupyterHub

Issues found:

@SarahAlidoost

Member Author

@meiertgrootes and @rogerkuou I implemented support for hourly data and ran the notebook on the DKRZ JupyterHub.

  • When using hourly data, the input shape grows quickly, e.g. one month is (T, H, W) = (31×24, 160, 400). For longer periods the input becomes too large to fit in memory, so we cannot train on continuous multi-month sequences. Instead, we should group months along the batch dimension, e.g. (B, T, H, W) = (2, 31×24, 160, 400) for two months; see issue #62 (Add support for patching in time in dataset).
  • I refactored parts of the code to reduce CPU data overhead. Training is now faster, but performance is still limited by the CPU. We should move to GPU, but the code must first be debugged for GPU usage; see issue #30 (Add support GPU in the model).
  • We could run the "validation" in parallel with the training loop to improve performance, but validation is currently not a bottleneck. Inference (the forward call) is fast; the loss.backward call in train mode is the slow part.
  • In the example notebook, training was only run for 50 epochs, so the results might not be valid. They should not be used to evaluate the performance of the model in SST prediction.
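
To make the shape argument in the first point concrete, here is a rough back-of-the-envelope sketch. It is not code from this PR: the helper names (`hourly_shape`, `nbytes`, `batched_shape`) are hypothetical, float32 storage (4 bytes per value) is an assumption, and padding shorter months to the longest month in the batch is just one possible way to stack months of different lengths.

```python
import calendar


def hourly_shape(year, month, height=160, width=400):
    """Shape (T, H, W) of one month of hourly data on a height x width grid."""
    days = calendar.monthrange(year, month)[1]  # number of days in the month
    return (days * 24, height, width)


def nbytes(shape, itemsize=4):
    """Bytes needed for an array of this shape (itemsize=4 assumes float32)."""
    total = itemsize
    for dim in shape:
        total *= dim
    return total


def batched_shape(months, height=160, width=400):
    """Shape (B, T, H, W) with one month per batch sample.

    Assumption: shorter months are padded in time to the longest month
    in the batch, so that all samples share the same T.
    """
    t_max = max(hourly_shape(y, m, height, width)[0] for y, m in months)
    return (len(months), t_max, height, width)


if __name__ == "__main__":
    one_month = hourly_shape(2021, 1)
    print(one_month, nbytes(one_month) / 1e6, "MB")
    two_months = batched_shape([(2021, 1), (2021, 2)])
    print(two_months, nbytes(two_months) / 1e6, "MB")
```

Under these assumptions a single 31-day month is about 190 MB for the input alone, before any activations or gradients. The (year, month) pairs above are arbitrary examples.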

@SarahAlidoost SarahAlidoost marked this pull request as ready for review June 26, 2026 15:28

Development

Successfully merging this pull request may close these issues.

Add support for hourly data