Mod: Update README and vocab output names.
Labbeti committed Apr 18, 2024
1 parent 5ed229e commit 4e7fa82
Showing 2 changed files with 3 additions and 3 deletions.
README.md: 4 changes (2 additions, 2 deletions)
@@ -101,7 +101,7 @@ print(sents_scores)
Each metric also exists as a Python class version, like `aac_metrics.classes.cider_d.CIDErD`.

## Which metric(s) should I choose for Automated Audio Captioning?
- To evaluate audio captioning systems, I would recommend computing the `SPIDEr`, `FENSE` and `Vocab` metrics. `SPIDEr` is useful to compare with the rest of the literature, but it is highly sensitive to n-gram matching and can overestimate models trained with reinforcement learning. `FENSE` is more consistent and variable than `SPIDEr`, but uses a model not trained on audio captions. `Vocab` can give you an insight into the diversity of the model outputs. To compute all of these metrics at once, you can use for example the `Evaluate` class:
+ To evaluate audio captioning systems, I would recommend computing the `SPIDEr`, `FENSE` and `Vocab` metrics. `SPIDEr` is useful to compare with the rest of the literature, but it is highly sensitive to n-gram matching and can overestimate models trained with reinforcement learning. `FENSE` is more consistent and variable than `SPIDEr`, but it uses a model not trained on audio captions. `Vocab` can give you an insight into the diversity of the model outputs. To compute all of these metrics at once, you can use for example the `Evaluate` class:

```python
from aac_metrics import Evaluate
@@ -113,7 +113,7 @@ mult_references: list[list[str]] = ...

corpus_scores, _ = evaluate(candidates, mult_references)

- vocab_size = corpus_scores["vocab"]
+ vocab_size = corpus_scores["vocab.cands"]
spider_score = corpus_scores["spider"]
fense_score = corpus_scores["fense"]
```
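As context for the class-based API mentioned in the README excerpt above, here is a minimal sketch of using a metric class such as `CIDErD` directly. It assumes the class versions mirror the functional API and return a `(corpus_scores, sents_scores)` pair; the toy captions are hypothetical.

```python
from aac_metrics.classes.cider_d import CIDErD

# Hypothetical toy data: one list of reference captions per candidate.
candidates = ["a man is speaking", "birds are chirping"]
mult_references = [
    ["a man speaks", "someone is talking"],
    ["birds chirp in the background", "a bird is singing"],
]

# Assumption: calling the metric object behaves like the functional
# version and returns (corpus_scores, sents_scores).
cider_d = CIDErD()
corpus_scores, sents_scores = cider_d(candidates, mult_references)
print(corpus_scores["cider_d"])
```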
src/aac_metrics/classes/vocab.py: 2 changes (1 addition, 1 deletion)
@@ -62,7 +62,7 @@ def compute(self) -> Union[VocabOuts, Tensor]:

def get_output_names(self) -> tuple[str, ...]:
    return (
-       "vocab",
+       "vocab.cands",
        "vocab.mrefs_full",
        "vocab.ratio_full",
        "vocab.mrefs_avg",
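To illustrate the renaming downstream, a minimal sketch of reading the new key from this metric's output; it assumes the class exported by this module is `Vocab`, that it follows the same `(corpus_scores, sents_scores)` calling convention as the other class metrics, and that references are passed alongside candidates:

```python
from aac_metrics.classes.vocab import Vocab

# Hypothetical toy data.
candidates = ["a dog barks", "a dog barks loudly"]
mult_references = [
    ["a dog is barking"],
    ["a dog barks at a cat"],
]

vocab = Vocab()
corpus_scores, _ = vocab(candidates, mult_references)

# The candidate vocabulary size is now stored under "vocab.cands"
# instead of the previous "vocab" key.
print(corpus_scores["vocab.cands"])
```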
