[Feature] update save config file #84
Merged
This is a sample of the evaluators:
amazon_counterfactual_classification:
  class_path: jmteb.evaluators.ClassificationEvaluator
  init_args:
    train_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: train
        name: amazon_counterfactual_classification
        text_key: text
        label_key: label
    val_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: amazon_counterfactual_classification
        text_key: text
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: amazon_counterfactual_classification
        text_key: text
        label_key: label
    average: macro
    log_predictions: false
amazon_review_classification:
  class_path: jmteb.evaluators.ClassificationEvaluator
  init_args:
    train_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: train
        name: amazon_review_classification
        text_key: text
        label_key: label
    val_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: amazon_review_classification
        text_key: text
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: amazon_review_classification
        text_key: text
        label_key: label
    average: macro
    log_predictions: false
esci:
  class_path: jmteb.evaluators.RerankingEvaluator
  init_args:
    val_query_dataset:
      class_path: jmteb.evaluators.reranking.data.HfRerankingQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: esci-query
        query_key: query
        retrieved_docs_key: retrieved_docs
        relevance_scores_key: relevance_scores
    test_query_dataset:
      class_path: jmteb.evaluators.reranking.data.HfRerankingQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: esci-query
        query_key: query
        retrieved_docs_key: retrieved_docs
        relevance_scores_key: relevance_scores
    doc_dataset:
      class_path: jmteb.evaluators.reranking.data.HfRerankingDocDataset
      init_args:
        path: sbintuitions/JMTEB
        split: corpus
        name: esci-corpus
        id_key: docid
        text_key: text
    log_predictions: false
    top_n_docs_to_log: 5
jagovfaqs_22k:
  class_path: jmteb.evaluators.RetrievalEvaluator
  init_args:
    val_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: jagovfaqs_22k-query
        query_key: query
        relevant_docs_key: relevant_docs
    test_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: jagovfaqs_22k-query
        query_key: query
        relevant_docs_key: relevant_docs
    doc_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalDocDataset
      init_args:
        path: sbintuitions/JMTEB
        split: corpus
        name: jagovfaqs_22k-corpus
        id_key: docid
        text_key: text
    doc_chunk_size: 1000000
    log_predictions: false
    top_n_docs_to_log: 5
jaqket:
  class_path: jmteb.evaluators.RetrievalEvaluator
  init_args:
    val_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: jaqket-query
        query_key: query
        relevant_docs_key: relevant_docs
    test_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: jaqket-query
        query_key: query
        relevant_docs_key: relevant_docs
    doc_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalDocDataset
      init_args:
        path: sbintuitions/JMTEB
        split: corpus
        name: jaqket-corpus
        id_key: docid
        text_key: text
    doc_chunk_size: 1000000
    log_predictions: false
    top_n_docs_to_log: 5
jsick:
  class_path: jmteb.evaluators.STSEvaluator
  init_args:
    val_dataset:
      class_path: jmteb.evaluators.sts.data.HfSTSDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: jsick
        sentence1_key: sentence1
        sentence2_key: sentence2
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.sts.data.HfSTSDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: jsick
        sentence1_key: sentence1
        sentence2_key: sentence2
        label_key: label
    log_predictions: false
jsts:
  class_path: jmteb.evaluators.STSEvaluator
  init_args:
    val_dataset:
      class_path: jmteb.evaluators.sts.data.HfSTSDataset
      init_args:
        path: sbintuitions/JMTEB
        split: train
        name: jsts
        sentence1_key: sentence1
        sentence2_key: sentence2
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.sts.data.HfSTSDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: jsts
        sentence1_key: sentence1
        sentence2_key: sentence2
        label_key: label
    log_predictions: false
livedoor_news:
  class_path: jmteb.evaluators.ClusteringEvaluator
  init_args:
    val_dataset:
      class_path: jmteb.evaluators.clustering.data.HfClusteringDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: livedoor_news
        text_key: text
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.clustering.data.HfClusteringDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: livedoor_news
        text_key: text
        label_key: label
    log_predictions: false
massive_intent_classification:
  class_path: jmteb.evaluators.ClassificationEvaluator
  init_args:
    train_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: train
        name: massive_intent_classification
        text_key: text
        label_key: label
    val_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: massive_intent_classification
        text_key: text
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: massive_intent_classification
        text_key: text
        label_key: label
    average: macro
    log_predictions: false
massive_scenario_classification:
  class_path: jmteb.evaluators.ClassificationEvaluator
  init_args:
    train_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: train
        name: massive_scenario_classification
        text_key: text
        label_key: label
    val_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: massive_scenario_classification
        text_key: text
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.classification.data.HfClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: massive_scenario_classification
        text_key: text
        label_key: label
    average: macro
    log_predictions: false
mewsc16:
  class_path: jmteb.evaluators.ClusteringEvaluator
  init_args:
    val_dataset:
      class_path: jmteb.evaluators.clustering.data.HfClusteringDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: mewsc16_ja
        text_key: text
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.clustering.data.HfClusteringDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: mewsc16_ja
        text_key: text
        label_key: label
    log_predictions: false
mrtydi:
  class_path: jmteb.evaluators.RetrievalEvaluator
  init_args:
    val_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: mrtydi-query
        query_key: query
        relevant_docs_key: relevant_docs
    test_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: mrtydi-query
        query_key: query
        relevant_docs_key: relevant_docs
    doc_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalDocDataset
      init_args:
        path: sbintuitions/JMTEB
        split: corpus
        name: mrtydi-corpus
        id_key: docid
        text_key: text
    doc_chunk_size: 10000
    log_predictions: false
    top_n_docs_to_log: 5
nlp_journal_abs_intro:
  class_path: jmteb.evaluators.RetrievalEvaluator
  init_args:
    val_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: nlp_journal_abs_intro-query
        query_key: query
        relevant_docs_key: relevant_docs
    test_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: nlp_journal_abs_intro-query
        query_key: query
        relevant_docs_key: relevant_docs
    doc_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalDocDataset
      init_args:
        path: sbintuitions/JMTEB
        split: corpus
        name: nlp_journal_abs_intro-corpus
        id_key: docid
        text_key: text
    doc_chunk_size: 1000000
    log_predictions: false
    top_n_docs_to_log: 5
nlp_journal_title_abs:
  class_path: jmteb.evaluators.RetrievalEvaluator
  init_args:
    val_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: nlp_journal_title_abs-query
        query_key: query
        relevant_docs_key: relevant_docs
    test_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: nlp_journal_title_abs-query
        query_key: query
        relevant_docs_key: relevant_docs
    doc_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalDocDataset
      init_args:
        path: sbintuitions/JMTEB
        split: corpus
        name: nlp_journal_title_abs-corpus
        id_key: docid
        text_key: text
    doc_chunk_size: 1000000
    log_predictions: false
    top_n_docs_to_log: 5
nlp_journal_title_intro:
  class_path: jmteb.evaluators.RetrievalEvaluator
  init_args:
    val_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: nlp_journal_title_intro-query
        query_key: query
        relevant_docs_key: relevant_docs
    test_query_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalQueryDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: nlp_journal_title_intro-query
        query_key: query
        relevant_docs_key: relevant_docs
    doc_dataset:
      class_path: jmteb.evaluators.retrieval.data.HfRetrievalDocDataset
      init_args:
        path: sbintuitions/JMTEB
        split: corpus
        name: nlp_journal_title_intro-corpus
        id_key: docid
        text_key: text
    doc_chunk_size: 1000000
    log_predictions: false
    top_n_docs_to_log: 5
paws_x_ja:
  class_path: jmteb.evaluators.PairClassificationEvaluator
  init_args:
    val_dataset:
      class_path: jmteb.evaluators.pair_classification.data.HfPairClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: validation
        name: paws_x_ja
        sentence1_key: sentence1
        sentence2_key: sentence2
        label_key: label
    test_dataset:
      class_path: jmteb.evaluators.pair_classification.data.HfPairClassificationDataset
      init_args:
        path: sbintuitions/JMTEB
        split: test
        name: paws_x_ja
        sentence1_key: sentence1
        sentence2_key: sentence2
        label_key: label
save_dir: /somewhere/checkpoints/checkpoint-100/jmteb_evaluation
overwrite_cache: true
log_predictions: true
embedder:
  class_path: jmteb.embedders.DataParallelSentenceBertEmbedder
  init_args:
    model_name_or_path: /somewhere/checkpoints/checkpoint-100
    batch_size: 16384
    normalize_embeddings: false
    max_seq_length: 512
    add_eos: false
    model_kwargs:
      torch_dtype: torch.bfloat16
    auto_find_batch_size: true
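The entries in the sample pair a `class_path` with `init_args`, a layout in which the dotted path names a class and the arguments are passed to its constructor, with nested entries built first. The sketch below illustrates that convention using only the standard library. The `instantiate` helper is hypothetical and is not JMTEB's loader, which may also validate types and handle defaults. `types.SimpleNamespace` stands in for the evaluator and dataset classes so the example runs without JMTEB installed.

```python
import importlib
from typing import Any


def instantiate(spec: Any) -> Any:
    """Recursively build objects from {'class_path': ..., 'init_args': ...} mappings.

    Hypothetical helper for illustration only; anything that is not such a
    mapping (strings, numbers, plain dicts like model_kwargs) is returned as-is.
    """
    if isinstance(spec, dict) and "class_path" in spec:
        # Split "package.module.ClassName" into module path and class name.
        module_name, _, class_name = spec["class_path"].rpartition(".")
        cls = getattr(importlib.import_module(module_name), class_name)
        # Nested specs inside init_args are instantiated before the parent.
        kwargs = {key: instantiate(value) for key, value in spec.get("init_args", {}).items()}
        return cls(**kwargs)
    return spec


# Stand-in spec shaped like one evaluator entry; SimpleNamespace replaces the real classes.
spec = {
    "class_path": "types.SimpleNamespace",
    "init_args": {
        "val_dataset": {
            "class_path": "types.SimpleNamespace",
            "init_args": {"path": "sbintuitions/JMTEB", "split": "validation", "name": "jsick"},
        },
        "log_predictions": False,
    },
}

evaluator = instantiate(spec)
print(evaluator.val_dataset.split)
```

Because the inner dataset mapping is resolved before the outer one, the outer constructor receives a ready object rather than a raw dictionary.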
lsz05 approved these changes on Nov 21, 2024
Thank you! LGTM!
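Related Issue / PR
#46

Change in behavior after merging this PR
When running an evaluation with JMTEB, we want to save a config file named `jmteb_config.yaml` in `--save_dir`.

What was done to achieve the change in behavior
Added code for saving the config to `src/jmteb/__main__.py`.

Verification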
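As a rough illustration of the behavior described above, the following sketch writes a nested config mapping to `jmteb_config.yaml` inside a save directory and returns the resulting path. It is not the code added to `src/jmteb/__main__.py`, which may rely on its argument parser's own dump facilities. `save_config`, `to_yaml_lines`, and `format_scalar` are hypothetical names, and the tiny emitter assumes nested mappings with simple scalar leaves whose strings need no quoting.

```python
from __future__ import annotations

import tempfile
from collections.abc import Mapping
from pathlib import Path
from typing import Any


def format_scalar(value: Any) -> str:
    """Render a leaf value in YAML style (assumes strings need no quoting)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if value is None:
        return "null"
    return str(value)


def to_yaml_lines(node: Mapping[str, Any], indent: int = 0) -> list[str]:
    """Render a nested mapping as block-style YAML lines, two spaces per level."""
    pad = " " * indent
    lines: list[str] = []
    for key, value in node.items():
        if isinstance(value, Mapping) and value:
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 2))
        elif isinstance(value, Mapping):
            # An empty mapping must be written explicitly, or it would read back as null.
            lines.append(f"{pad}{key}: {{}}")
        else:
            lines.append(f"{pad}{key}: {format_scalar(value)}")
    return lines


def save_config(config: Mapping[str, Any], save_dir: str | Path,
                filename: str = "jmteb_config.yaml") -> Path:
    """Write the config into save_dir (created if missing) and return the file path."""
    out_dir = Path(save_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / filename
    out_path.write_text("\n".join(to_yaml_lines(config)) + "\n", encoding="utf-8")
    return out_path


# Usage: save a small config into a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    config = {
        "log_predictions": True,
        "embedder": {"init_args": {"batch_size": 16, "add_eos": False, "model_kwargs": {}}},
    }
    saved_path = save_config(config, Path(tmp) / "jmteb_evaluation")
    saved_name = saved_path.name
    saved_text = saved_path.read_text(encoding="utf-8")
```

Checking that the file exists under the save directory and contains the expected keys is one simple way to confirm the new behavior after a run.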