AI and the Art of Humor: New Study Reveals Limitations of Language Models

Researchers from Cardiff University and Ca’ Foscari University of Venice have examined how well large language models (LLMs) comprehend humor. They presented their findings at the 2025 Conference on Empirical Methods in Natural Language Processing, held in Suzhou, China. They found that although LLMs can recognize the structure of puns, their comprehension is shallow and brittle.

Professor Jose Camacho Collados, from Cardiff University’s School of Computer Science and Informatics, led the study, which set out to test how well LLMs understand puns. The findings indicate that the models can pick up on pun structures but do not really understand what makes them funny.

In a battery of experiments, the researchers found that LLMs performed poorly on wordplay that was not in their training data. Their accuracy even sank below 20% when they were required to tell puns apart from non-pun sentences. In one example, the team replaced the punning word in an existing pun about LLMs with “ukulele.” The model still treated the sentence as a pun, reasoning that “ukulele” sounds a bit like “you-kill-LLM.”

Professor Camacho Collados commented on the findings, stating, “In general, LLMs tend to memorise what they have learned in their training. As such, they catch existing puns well but that doesn’t mean they truly understand them.”

Overall, the study gives a fuller picture of how LLMs engage with language. The research team showed how easy it is to trick these models by altering existing puns or stripping them of their double meanings. In these cases, the models fell back on prior associations and offered a variety of rationales to justify their outputs.

“We found that the easiest way to trick LLMs was by taking existing puns and stripping them of their double meaning, which entirely eliminated the basis of the original pun. In such constructions, models relate the sentences to previous, similar puns. Then they create every possible convoluted rationale to justify labelling them puns. In the end, we discovered that their knowledge of puns is a mirage,” Professor Camacho Collados concluded.

The implications of this research are significant. Artificial intelligence, including machine learning, deep learning, and automated decision making, is advancing quickly, yet parsing difficult human expressions and subtleties such as humor remains a hurdle for LLMs. The research illustrates how these models struggle to grasp cultural and linguistic subtleties. It also raises important ethical questions about how they are used in industries that require clear and unequivocal communication.
