Floating point exception (core dumped) problem #273

## Description
I ran into a problem while trying to reproduce the paper code [GIANT](https://github.com/amzn/pecos/tree/mainline/examples/giant-xrt). I used my own text-attributed graph dataset and followed the data processing instructions from [GIANT](https://github.com/amzn/pecos/tree/mainline/examples/giant-xrt).

**Strangely, the problem occurs at training level 1, while training level 0 completes without issue.**
**I tried to debug this issue, and the only lead I could find is that the crash may occur in the *sparse_matmul()* function called from matcher._predict().**

## Steps to reproduce
The command I ran is:
```shell
CUDA_VISIBLE_DEVICES=1 python3 -m pecos.xmc.xtransformer.train -t X.trn.txt -x X.trn.tfidf.npz -y Y.trn.npz -m xrt_models --batch-gen-workers 0
```
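Since the dataset is custom, a quick sanity check on the two matrices passed via `-x` and `-y` might help rule out degenerate inputs. The sketch below is only a guess at what could be worth checking, not a confirmed cause of the crash. It assumes both files are SciPy sparse matrices readable by `scipy.sparse.load_npz`; `summarize_inputs` is a hypothetical helper and not part of PECOS.

```python
import numpy as np
import scipy.sparse as sp


def summarize_inputs(X, Y):
    """Report basic properties of the feature matrix X and label matrix Y.

    Hypothetical helper (not a PECOS API). Works on copies so the
    caller's matrices are left untouched.
    """
    X = sp.csr_matrix(X, copy=True)
    Y = sp.csr_matrix(Y, copy=True)
    # Drop explicitly stored zeros so "empty" means no nonzero entries.
    X.eliminate_zeros()
    Y.eliminate_zeros()
    return {
        # Both matrices should have one row per training instance.
        "same_num_rows": X.shape[0] == Y.shape[0],
        # Instances whose TF-IDF vector is all zeros.
        "empty_feature_rows": int((np.diff(X.indptr) == 0).sum()),
        # Instances with no positive label.
        "instances_without_labels": int((np.diff(Y.indptr) == 0).sum()),
        # Labels (columns of Y) that no instance uses.
        "labels_without_instances": int((np.diff(Y.tocsc().indptr) == 0).sum()),
        # NaN or inf values stored in either matrix.
        "nonfinite_values": int(
            (~np.isfinite(X.data)).sum() + (~np.isfinite(Y.data)).sum()
        ),
    }


if __name__ == "__main__":
    # File names taken from the command above.
    X = sp.load_npz("X.trn.tfidf.npz")
    Y = sp.load_npz("Y.trn.npz")
    print(summarize_inputs(X, Y))
```

Non-zero counts here would not prove anything on their own, but they would be worth mentioning alongside the report.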

## Error message or code output

```
12/29/2023 13:02:58 - INFO - pecos.xmc.xtransformer.matcher - | [   5/   5][  7150/  7220] | 1373/1444 batches | ms/batch 451.6586 | train_loss 7.300417e-01 | lr 9.695291e-07
12/29/2023 13:03:24 - INFO - pecos.xmc.xtransformer.matcher - | [   5/   5][  7200/  7220] | 1423/1444 batches | ms/batch 451.0563 | train_loss 7.260027e-01 | lr 2.770083e-07
12/29/2023 13:03:24 - INFO - pecos.xmc.xtransformer.matcher - | **** saving model (avg_prec=0) to /tmp/tmpo8wg3j8h at global_step 7200 ****
12/29/2023 13:03:26 - INFO - pecos.xmc.xtransformer.matcher - -----------------------------------------------------------------------------------------
12/29/2023 13:03:36 - INFO - pecos.xmc.xtransformer.matcher - Reload the best checkpoint from /tmp/tmpo8wg3j8h
Floating point exception (core dumped)
```
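The log ends without a Python traceback because the process is killed by a signal (SIGFPE) raised in native code. One way to see which Python frame was active at that moment is the standard library's `faulthandler` module, which dumps the Python stack on SIGFPE among other fatal signals. The wrapper below is a minimal sketch; `enable_crash_traceback` and the log file name are hypothetical, and the module name is the one from the command above.

```python
import faulthandler
import runpy
import sys


def enable_crash_traceback(log_path="crash_traceback.log"):
    """Dump Python tracebacks for fatal signals (incl. SIGFPE) to log_path.

    Returns the open file object; it must stay open for as long as
    faulthandler is enabled.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    log_file = enable_crash_traceback()
    # Forward this script's arguments to the PECOS training module,
    # as if it had been started with `python3 -m`.
    module = "pecos.xmc.xtransformer.train"
    sys.argv = [module] + sys.argv[1:]
    runpy.run_module(module, run_name="__main__", alter_sys=True)
```

Saved under a name such as `run_with_faulthandler.py` (hypothetical), it would take the same arguments as the original command. Running the original command with `python3 -X faulthandler -m ...` has the same effect, with the traceback going to stderr instead. The dump only shows Python frames, so it can confirm or rule out `matcher._predict()` as the caller but will not show the native line that faulted.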

## Environment
- Operating system: Ubuntu-22.04.1 (X86)
- Python version: 3.9.18
- PECOS version: 1.2.2
- torch: 1.13.1
- numpy: 1.26.2
- scipy: 1.11.4
- transformers: 4.36.2

