[bug] With multiple GPUs, classification task scores differ depending on whether the cache is used #49

Open
akiFQC opened this issue Aug 2, 2024 · 0 comments

Collaborator

akiFQC commented Aug 2, 2024

When running on multiple GPUs, classification task scores differ depending on whether the cache is used.

Problem

Environment

  • A100 x 8GPU
  • Using https://github.com/sbintuitions/JMTEB/tree/dev

Steps

JMTEB was evaluated with the following two commands.

Experiment 1

torchrun --nproc-per-node 8 \
  src/jmteb/__main__.py \
  --embedder TransformersEmbedder \
  --embedder.model_name_or_path "$model" \
  --embedder.normalize_embeddings False \
  --embedder.max_seq_length 512 \
  --embedder.batch_size 64 \
  --save_dir $model/jmteb_evaluation \
  --embedder.model_kwargs '{"torch_dtype":"torch.bfloat16"}' \
  --overwrite_cache true \
  --evaluators src/jmteb/configs/jmteb.jsonnet  

After Experiment 1, Experiment 2 was run using the cache produced by Experiment 1.

Experiment 2 (batch size 64→16, overwrite_cache true→false)

torchrun --nproc-per-node 8 \
  src/jmteb/__main__.py \
  --embedder TransformersEmbedder \
  --embedder.model_name_or_path "$model" \
  --embedder.normalize_embeddings False \
  --embedder.max_seq_length 512 \
  --embedder.batch_size 16 \
  --save_dir $model/jmteb_evaluation \
  --embedder.model_kwargs '{"torch_dtype":"torch.bfloat16"}' \
  --overwrite_cache false \
  --evaluators src/jmteb/configs/jmteb.jsonnet  

Results

The scores for amazon_counterfactual_classification and amazon_review_classification differ substantially between Experiments 1 and 2.

amazon_counterfactual_classification

Experiment 1 (without cache)

{
    "metric_name": "macro_f1",
    "metric_value": 0.8044316348273781,
    "details": {
        "optimal_classifier_name": "logreg",
        "val_scores": {
            "knn_cosine_k_2": {
                "accuracy": 0.9184549356223176,
                "macro_f1": 0.6717093066370041
            },
            "logreg": {
                "accuracy": 0.9141630901287554,
                "macro_f1": 0.7540248086566377
            }
        },
        "test_scores": {
            "logreg": {
                "accuracy": 0.9271948608137045,
                "macro_f1": 0.8044316348273781
            }
        }
    }
}

Experiment 2 (with cache)

{
    "metric_name": "macro_f1",
    "metric_value": 0.6258438858598253,
    "details": {
        "optimal_classifier_name": "knn_cosine_k_2",
        "val_scores": {
            "knn_cosine_k_2": {
                "accuracy": 0.9163090128755365,
                "macro_f1": 0.6680366047454656
            },
            "logreg": {
                "accuracy": 0.9012875536480687,
                "macro_f1": 0.47404063205417607
            }
        },
        "test_scores": {
            "knn_cosine_k_2": {
                "accuracy": 0.8982869379014989,
                "macro_f1": 0.6258438858598253
            }
        }
    }
}
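
To make the difference easier to read, the per-classifier validation scores of the two runs can be compared directly. This is a minimal sketch and not part of JMTEB: the helper name `val_macro_f1_delta` is hypothetical, and the numbers are copied from the two outputs above.

```python
# Validation macro_f1 per classifier, copied from the two results above.
exp1_val = {"knn_cosine_k_2": 0.6717093066370041, "logreg": 0.7540248086566377}
exp2_val = {"knn_cosine_k_2": 0.6680366047454656, "logreg": 0.47404063205417607}


def val_macro_f1_delta(before: dict[str, float], after: dict[str, float]) -> dict[str, float]:
    """Return after - before for every classifier present in both runs."""
    return {name: after[name] - before[name] for name in before if name in after}


deltas = val_macro_f1_delta(exp1_val, exp2_val)
for name, delta in deltas.items():
    print(f"{name}: {delta:+.4f}")

# Classifier with the highest validation macro_f1 in each run.
best_exp1 = max(exp1_val, key=exp1_val.get)
best_exp2 = max(exp2_val, key=exp2_val.get)
print(best_exp1, best_exp2)
```

The knn_cosine_k_2 score moves by less than 0.01 between the runs, while logreg drops by about 0.28. The highest-scoring classifier in each run matches the reported `optimal_classifier_name`, so the change in the final metric appears to follow from the logreg drop.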

amazon_review_classification

Experiment 1 (without cache)

{
    "metric_name": "macro_f1",
    "metric_value": 0.6315426985818074,
    "details": {
        "optimal_classifier_name": "logreg",
        "val_scores": {
            "knn_cosine_k_2": {
                "accuracy": 0.4866,
                "macro_f1": 0.47608787259379304
            },
            "logreg": {
                "accuracy": 0.64,
                "macro_f1": 0.636986907686679
            }
        },
        "test_scores": {
            "logreg": {
                "accuracy": 0.6342,
                "macro_f1": 0.6315426985818074
            }
        }
    }
}

Experiment 2 (with cache)

{
    "metric_name": "macro_f1",
    "metric_value": 0.4944960931963851,
    "details": {
        "optimal_classifier_name": "knn_cosine_k_2",
        "val_scores": {
            "knn_cosine_k_2": {
                "accuracy": 0.4916,
                "macro_f1": 0.48080999105915334
            },
            "logreg": {
                "accuracy": 0.2,
                "macro_f1": 0.06666666666666667
            }
        },
        "test_scores": {
            "knn_cosine_k_2": {
                "accuracy": 0.5028,
                "macro_f1": 0.4944960931963851
            }
        }
    }
}
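
The logreg validation scores in Experiment 2 (accuracy 0.2, macro_f1 0.0667) are what a classifier that always predicts the same label would get on a balanced 5-class set. The sketch below checks that arithmetic in pure Python. The balanced 5-class setup and the helper `macro_f1` are assumptions for illustration and are not taken from JMTEB, so this is a hint about where to look, not a diagnosis.

```python
def macro_f1(y_true: list[int], y_pred: list[int]) -> float:
    """Unweighted mean of per-class F1; a class with no true positives scores 0."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Assumption: 5 balanced classes with 100 examples each.
y_true = [label for label in range(5) for _ in range(100)]
# A degenerate classifier that always predicts class 0.
y_pred = [0] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(accuracy, score)
```

Under these assumptions the accuracy is 0.2 and the macro F1 is about 0.0667, matching the reported logreg validation scores. If that reading is right, it would be worth checking whether logreg in the cached run collapses to a single predicted class.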