
CUDA out of memory. #57

@FiorenzoParascandolo1

Description

Hi,
I'm using KANLinear in my own project and I'm hitting a CUDA out-of-memory error.
Specifically:

  • model A uses a single MLP layer (one and only one MLP layer in the whole network) to map a (160, 8, 197, 197) tensor to a (160, 8, 197, 197) tensor.
  • model B uses a single KANLinear layer (one and only one KANLinear layer in the whole network) to map a (160, 8, 197, 197) tensor to a (160, 8, 197, 197) tensor.

The "whole network" is a transformer based on MLP both for model A and model B.
The model A uses the 60% of the VRAM of a GPU with 24GB of VRAM, while the second model shows CUDA out of memory problem. Since the difference in the number of parameters for the two models is negligible:
the difference is equal to the difference between a single nn.Linear(197, 197) and a single KanLinear(197, 197), how is it possible to have a CUDA out of memory problem?
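One plausible explanation (a sketch, not confirmed by the maintainers here): KANLinear's memory cost is dominated by activations rather than parameters. Implementations such as efficient-kan expand every input feature into `grid_size + spline_order` B-spline basis values, so the intermediate basis tensor is roughly that many times larger than the nn.Linear activation, and autograd keeps such intermediates alive for the backward pass. The arithmetic below assumes fp32 activations and the hypothetical defaults `grid_size=5`, `spline_order=3`; your configuration may differ.

```python
# Back-of-the-envelope activation-memory estimate for the layer in question.
# Assumptions (hypothetical, not from the issue): fp32 activations, and a
# KANLinear that expands each input feature into (grid_size + spline_order)
# B-spline basis values, with grid_size=5 and spline_order=3.

shape = (160, 8, 197, 197)       # input tensor shape from the issue
grid_size, spline_order = 5, 3   # assumed KANLinear hyperparameters
bytes_per_float = 4              # fp32

n_elements = 1
for d in shape:
    n_elements *= d              # 49,675,520 input elements

linear_bytes = n_elements * bytes_per_float               # one fp32 activation
spline_bytes = linear_bytes * (grid_size + spline_order)  # expanded basis tensor

print(f"nn.Linear activation:  {linear_bytes / 2**30:.2f} GiB")   # → 0.19 GiB
print(f"B-spline basis tensor: {spline_bytes / 2**30:.2f} GiB")   # → 1.48 GiB
```

So a single basis tensor alone is ~1.5 GiB, and the Cox-de Boor recursion plus autograd typically materializes several tensors of this size, which would plausibly exhaust a 24 GB GPU that the nn.Linear version fits on comfortably.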
