With the rise of social media platforms, informal communication has surged. These platforms frequently host multilingual conversations, and much of the language used is informal and laced with sarcasm.
However, current translation tools struggle to accurately convey the intended meaning of sarcastic messages (see Figure 1). This is due to their inability to grasp the nuances of informal language and the layered meanings behind sarcasm.
This research aims to bridge this gap in the translation experience by developing a novel approach to translating sarcastic English tweets into honest Telugu interpretations.
We designed a two-pipeline approach (see Figure 3). Our focus is on Pipeline A, which handles sarcasm translation in two steps:
1. Sarcasm Interpretation:
   - Use seq2seq models to convert sarcastic English tweets into honest English.
   - Hypothesis: Transformer models (e.g., BERT) will outperform RNN-based approaches (e.g., Peled and Reichart, 2017) due to their contextual strength.
2. Telugu Translation:
   - Translate the honest English interpretation into Telugu using machine translation techniques.
This two-step design ensures the translated message accurately conveys the true meaning behind the sarcastic tweet.
Pipeline B, by contrast, attempts a direct translation from sarcastic English to honest Telugu. We hypothesize that Pipeline A will perform better because:
- Telugu is a low-resource language
- English-to-English models show stronger contextual performance than English-to-low-resource translation models
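To make the two designs concrete, here is a minimal inference-time sketch using HuggingFace pipelines. The checkpoint paths are hypothetical placeholders standing in for our fine-tuned models, not released artifacts:

```python
from transformers import pipeline

# Hypothetical local checkpoints; actual paths depend on the training runs.
interpreter = pipeline("text2text-generation", model="./bart-sarcasm-to-honest-en")  # Pipeline A, step 1
en2te = pipeline("text2text-generation", model="./mbart-honest-en-to-te")           # Pipeline A, step 2
direct = pipeline("text2text-generation", model="./mbart-sarcasm-en-to-honest-te")  # Pipeline B

tweet = "Oh great, another Monday morning. Just what I needed."

# Pipeline A: interpret the sarcasm in English first, then translate the honest text.
honest_en = interpreter(tweet, max_length=64)[0]["generated_text"]
telugu_a = en2te(honest_en, max_length=64)[0]["generated_text"]

# Pipeline B: translate sarcastic English directly into honest Telugu.
telugu_b = direct(tweet, max_length=64)[0]["generated_text"]
```

Under our hypothesis, `telugu_a` should preserve the intended meaning more reliably than `telugu_b`.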
This research aims to:
- Improve informal online communication (see Figure 2)
- Foster cross-lingual understanding in social media
- Provide a baseline for solving other open-class NMT challenges
Ultimately, by integrating sarcasm interpretation into machine translation, we aim to:
- Achieve high-quality low-resource translations
- Enhance model performance in handling complex language features
There has been significant research on sarcasm interpretation, but limited work on sarcasm translation, especially in the context of text-to-text translation of memes.
While notable progress has been made in sarcasm interpretation using multi-modal models (Desai et al., 2022), translating sarcastic content remains a challenge.
Sarcasm translation is considered an open-class Neural Machine Translation (NMT) problem, because the meaning of sarcastic expressions is not compositional: it does not arise simply from the meanings of the individual words. Models that directly translate such expressions often fail to preserve their intended meaning.
A similar challenge is seen in idiom translation, which has seen more progress. For instance, Baziotis et al. (2022) provide an evaluation and analysis of idioms, offering useful insights for handling open-class translation problems.
However, their work does not address low-resource language adaptations, which is a key focus in our project.
Our sarcasm interpretation method builds upon the approach by Peled and Reichart (2017), who framed the problem as a monolingual machine translation task. Their model, called SIGN (Sarcasm Sentiment Interpretation Generator), focuses on sentiment words that express the opposite of their literal meaning in sarcastic contexts.
Key steps in SIGN:
- Clustering sentiment words into positive and negative clusters based on semantic similarity
- Replacing sentiment words with their cluster IDs in both:
  - the sarcastic source text
  - the honest reference text
- Training a phrase-based MT model on this transformed data
- De-clustering the output at inference time to recover the honest interpretation
They propose three de-clustering strategies:
- SIGN-centroid: Replace each cluster ID with the sentiment word closest to its centroid in the word embedding space
- SIGN-context: Use point-wise mutual information with neighboring words to choose replacements
- SIGN-oracle: Use human judgment for the best replacement (upper-bound performance)
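A minimal sketch of the cluster/de-cluster idea follows, with toy clusters and stand-in embeddings; SIGN derives both from data, and the names here are purely illustrative:

```python
import numpy as np

# Toy sentiment clusters and stand-in embedding vectors (SIGN learns these).
clusters = {
    "POS_1": ["great", "wonderful", "amazing"],
    "NEG_1": ["terrible", "awful", "horrible"],
}
rng = np.random.default_rng(0)
emb = {w: rng.random(50) for ws in clusters.values() for w in ws}
word2cluster = {w: cid for cid, ws in clusters.items() for w in ws}

def cluster_text(tokens):
    """Replace sentiment words with cluster IDs before training the MT model."""
    return [word2cluster.get(t, t) for t in tokens]

def decluster_centroid(tokens):
    """SIGN-centroid: swap each cluster ID for the member word whose embedding
    lies closest to the cluster centroid."""
    out = []
    for t in tokens:
        if t in clusters:
            members = clusters[t]
            centroid = np.mean([emb[w] for w in members], axis=0)
            out.append(min(members, key=lambda w: np.linalg.norm(emb[w] - centroid)))
        else:
            out.append(t)
    return out

print(cluster_text("what a great game".split()))      # ['what', 'a', 'POS_1', 'game']
print(decluster_centroid("what a POS_1 game".split()))
```

SIGN-context and SIGN-oracle replace the centroid rule with PMI-based and human selection, respectively.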
While SIGN did not outperform baselines on automatic metrics, human evaluation showed SIGN's outputs better captured intended sentiment, especially with the context-based method.
Translating English into Telugu, a low-resource language, is a multifaceted challenge due to:
- Rich morphological structure
- High syntactic diversity
- Limited high-quality parallel corpora
Prior works like Prasad and Muthukumaran (2013) and Ramesh et al. (2023) have shown progress in Indian language MT, but Telugu still presents unique difficulties.
Advanced transformer-based models such as T5/mT5 (Raffel et al., 2020) and mBART (Tang et al., 2020) have shown exceptional performance across multilingual tasks by capturing long-range dependencies and rich context.
Additionally, custom tokenizers tailored to Telugu are necessary for proper evaluation, since they must handle:
- Sandhi (morphophonemic changes)
- Samasa (compound word formations)
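As a sketch of what such a tokenizer could look like, the following trains a SentencePiece unigram model on raw Telugu text; subword units give the model a chance to segment sandhi-joined and samasa-compound forms. The corpus path and vocabulary size are placeholders:

```python
import sentencepiece as spm

# Train a unigram subword model on a Telugu corpus (path and sizes are placeholders).
spm.SentencePieceTrainer.train(
    input="telugu_corpus.txt",    # one UTF-8 Telugu sentence per line
    model_prefix="te_subword",
    vocab_size=16000,
    character_coverage=0.9995,    # keep nearly all Telugu characters
    model_type="unigram",
)

sp = spm.SentencePieceProcessor(model_file="te_subword.model")
pieces = sp.encode("ఇది ఒక ఉదాహరణ వాక్యం", out_type=str)  # "This is an example sentence"
print(pieces)
```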
See also IndicTrans2 (Gala et al., 2023), a framework for Indian language MT that emphasizes tokenization techniques and cultural nuance preservation. This enables evaluation beyond literal correctness, focusing on cultural and contextual alignment as well.
To evaluate our sarcasm interpretation pipelines, we required a dataset containing:
- English sarcastic sentences
- Corresponding honest Telugu translations
Since such a high-quality dataset was unavailable, we extended the Sarcasm SIGN dataset (Peled and Reichart, 2017):
- Contains 2,993 unique sarcastic tweets
- Each tweet has 5 English interpretations → total of 14,965 interpretations
1. Initial Translation: we used the Google Translate API to generate Telugu interpretations for the 14,965 English ones (a sketch of this pass follows the list).
2. Manual Correction: every Telugu sentence was manually vetted and corrected by native Telugu-speaking team members, with the goal of improving semantic alignment and idiomatic correctness.
3. Common observations during correction:
   - Non-alphabetic symbols were sometimes mistranslated or returned as raw Unicode.
   - A lack of native terminology led to either leaving English terms unchanged (a common practice in Telugu) or transliterating English terms into Telugu.
   - Named entities (e.g., football teams, companies) were resolved using Telugu news sources.
The corrected translations were used as the ground truth for model training and evaluation.
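A minimal sketch of the initial machine-translation pass (step 1 above), assuming the official google-cloud-translate client; credential setup is omitted and the helper function name is ours:

```python
from google.cloud import translate_v2 as translate

# Requires GOOGLE_APPLICATION_CREDENTIALS to point at a service-account key.
client = translate.Client()

def to_telugu(texts):
    """Machine-translate English honest interpretations into Telugu drafts."""
    results = client.translate(texts, source_language="en", target_language="te")
    return [r["translatedText"] for r in results]

drafts = to_telugu(["I really did not enjoy the game last night."])
# Drafts are then manually vetted and corrected by native Telugu speakers.
```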
We present two schemes to interpret and translate English sarcasm:
1. Interpretation Phase:
   - Fine-tune a seq2seq model to convert English sarcasm → English honest interpretations
2. Translation Phase:
   - Fine-tune a machine translation (MT) model to translate English honest interpretations → Telugu honest interpretations (Pipeline A)
   - Fine-tune a translation model to map English sarcasm → Telugu honest interpretations (Pipeline B)
We used HuggingFace pre-trained models:
- English to English (interpretation): google-t5/t5-* and facebook/bart-*
- English to Telugu (translation): google/mt5-* and facebook/mbart-*
Training setup:
- Loss-based early stopping with a patience of 5
- Trained for ~15 epochs
- Hardware: 3 × NVIDIA A100-SXM4-80GB GPUs
- Batch size: 32
- Validation/test split: 20% each
Models were selected based on the best validation loss.
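A minimal sketch of this fine-tuning setup with HuggingFace's Seq2SeqTrainer, using a toy example pair in place of the real dataset splits; the hyperparameters mirror the list above:

```python
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, EarlyStoppingCallback,
                          Seq2SeqTrainer, Seq2SeqTrainingArguments)

model_name = "facebook/bart-large"  # swap in the t5/mt5/mbart variants listed above
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy data for illustration; real runs use the corrected dataset splits.
pairs = [{"src": "Oh great, another delay.", "tgt": "I am frustrated by another delay."}]

def tok(batch):
    enc = tokenizer(batch["src"], truncation=True, max_length=64)
    enc["labels"] = tokenizer(text_target=batch["tgt"], truncation=True,
                              max_length=64)["input_ids"]
    return enc

ds = Dataset.from_list(pairs).map(tok, batched=True, remove_columns=["src", "tgt"])

args = Seq2SeqTrainingArguments(
    output_dir="sarcasm2honest",
    per_device_train_batch_size=32,
    num_train_epochs=15,
    eval_strategy="epoch",            # `evaluation_strategy` on older transformers
    save_strategy="epoch",
    load_best_model_at_end=True,      # select the checkpoint with best validation loss
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=ds,
    eval_dataset=ds,                  # stand-in; a held-out split in practice
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=5)],
)
trainer.train()
```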
We used the following automated metrics:
- BLEU
- ROUGE (1, 2, and L variants)
- PINC (Chen and Dolan, 2011): an n-gram dissimilarity metric originally designed for paraphrase tasks (sketched below)
All metrics were calculated on the test set. Telugu evaluations used the manually corrected interpretations as references, and we used the pre-trained tokenizers from the respective models for fair metric computation (Ramesh et al., 2023).
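Since PINC has no standard library implementation, here is a minimal sketch following the Chen and Dolan (2011) definition: the average, over n = 1..4, of the fraction of candidate n-grams not found in the source:

```python
def ngrams(tokens, n):
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def pinc(source, candidate, max_n=4):
    """PINC (Chen and Dolan, 2011): mean fraction of candidate n-grams absent
    from the source; higher scores mean larger surface-level changes."""
    src, cand = source.split(), candidate.split()
    scores = []
    for n in range(1, max_n + 1):
        c = ngrams(cand, n)
        if not c:
            continue
        s = ngrams(src, n)
        scores.append(1.0 - len(c & s) / len(c))
    return 100.0 * sum(scores) / len(scores) if scores else 0.0

print(pinc("this movie was just great", "this movie was really bad"))
```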
We followed the approach of Desai et al. (2022).
- 25 random samples were selected from the test set.
- 7 evaluators (linguistic experts aged 20–30) were asked to rate outputs from the best models of each pipeline.
Metrics rated:
- Adequacy: accuracy of interpreting the sarcasm
- Fluency: coherency of the Telugu translation
Each output was rated on a four-point scale: Excellent, Good, Fair, or Poor.
Scores were finalized per example by majority voting, with ties broken as follows:
- 2-way ties: select the lower rating
- Longer ties: select the median rating
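A minimal sketch of this aggregation rule (the numeric encoding of the four-point scale is our assumption):

```python
from collections import Counter
from statistics import median

SCALE = {"Poor": 1, "Fair": 2, "Good": 3, "Excellent": 4}

def aggregate(ratings):
    """Majority vote over evaluator ratings; on a 2-way tie take the lower
    rating, on longer ties take the median of the tied ratings."""
    counts = Counter(SCALE[r] for r in ratings)
    top = counts.most_common()
    tied = sorted(score for score, c in top if c == top[0][1])
    if len(tied) == 1:
        return tied[0]
    if len(tied) == 2:
        return min(tied)
    return int(median(tied))

print(aggregate(["Good", "Good", "Excellent", "Fair", "Good", "Poor", "Fair"]))  # -> 3
```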
We present and compare the performance of our fine-tuned models against existing approaches.
In Table 1, we compare our fine-tuned models with SIGN (Peled and Reichart, 2017).
| Model | BLEU | ROUGE-1 | ROUGE-2 | ROUGE-L | PINC |
|---|---|---|---|---|---|
| SIGN ‡ | 66.96 | 70.34 | 42.81 | 69.98 | 47.11 |
| T5-base † | 84.34 | 87.89 | 80.90 | 87.37 | 15.97 |
| T5-large † | 85.29 | 89.28 | 82.83 | 88.95 | 13.83 |
| BART-large † | 86.32 | 86.40 | 80.73 | 86.21 | 11.06 |
Our models achieve substantially higher BLEU and ROUGE scores than SIGN, indicating more accurate interpretations. The lower PINC scores suggest smaller surface-level changes, which is expected when the sarcasm is subtle.
In Table 2, we report BLEU and ROUGE scores for both pipelines.
Pipeline A clearly outperforms direct translation (Pipeline B), especially in BLEU score (35.80 vs 31.69).
Human evaluation results (per-pipeline averages) are shown below:
| Pipeline | Adequacy (avg) | Fluency (avg) |
|---|---|---|
| A | 3.8 | 3.88 |
| B | 3.2 | 3.04 |
Pipeline A again outperforms Pipeline B in both adequacy and fluency.
Two high-rated and two low-rated samples are shown, illustrating how sentence length and punctuation affected translation quality.
- Shorter sentences tend to yield better results
- Removing punctuation led to misinterpreted sarcasm, affecting translations
In this paper, we explored two approaches for achieving accurate sarcasm interpretation and translation from English to Telugu:
- Pipeline A: English sarcasm → English honest interpretation → Telugu honest translation
- Pipeline B: English sarcasm → Telugu honest interpretation
To effectively fine-tune our models, we manually curated a Telugu dataset by correcting Google Translate outputs, ensuring semantic and contextual alignment.
We evaluated model performance using both:
- Automatic metrics (BLEU, ROUGE, PINC)
- Human evaluations (Adequacy, Fluency)
Key findings:
- Pipeline A outperformed Pipeline B on all evaluation metrics.
- Transformer-based models (e.g., T5, mBART) significantly improved results.
- Incorporating sarcasm interpretation as a preprocessing step enhanced translation quality for low-resource languages.
- Human evaluations confirmed the superiority of Pipeline A in fluency and adequacy.
While this research focused on translating literal meaning into Telugu, future work can aim to:
- Enable direct translation of sarcastic intent into Telugu or other languages
- Expand the dataset to include sarcastic phrases
References:
- Baziotis, C., Mathur, P., & Hasler, E. (2022). Automatic evaluation and analysis of idioms in neural machine translation. arXiv:2210.04545.
- Chen, D., & Dolan, W. B. (2011). Collecting highly parallel data for paraphrase evaluation. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, pp. 190–200.
- Desai, P., Chakraborty, T., & Akhtar, M. S. (2022). Nice perfume. How long did you marinate in it? Multimodal sarcasm explanation. In Proceedings of the AAAI Conference on Artificial Intelligence, 36, 10563–10571.
- Gala, J., Chitale, P. A., Raghavan, A. K., Doddapaneni, S., Gumma, V., Kumar, A., Nawale, J., Sujatha, A., Puduppully, R., Raghavan, V., et al. (2023). IndicTrans2: Towards high-quality and accessible machine translation models for all 22 scheduled Indian languages. arXiv:2305.16307.
- Peled, L., & Reichart, R. (2017). Sarcasm SIGN: Interpreting sarcasm with sentiment-based monolingual machine translation. arXiv:1704.06836.
- Prasad, T. V., & Muthukumaran, G. M. (2013). Telugu to English translation using direct machine translation approach. International Journal of Science and Engineering Investigations, 2(12), 25–32.
- Raffel, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y., Li, W., & Liu, P. J. (2020). Exploring the limits of transfer learning with a unified text-to-text transformer. Journal of Machine Learning Research, 21(140), 1–67.
- Ramesh, G., Doddapaneni, S., Bheemaraj, A., Jobanputra, M., Raghavan, A. K., Sharma, A., Sahoo, S., Diddee, H., J, M., Kakwani, D., et al. (2023). Samanantar: The largest publicly available parallel corpora collection for 11 Indic languages.
- Tang, Y., Tran, C., Li, X., Chen, P. J., Goyal, N., Chaudhary, V., Gu, J., & Fan, A. (2020). Multilingual translation with extensible multilingual pretraining and finetuning. arXiv:2008.00401.