21.8.07

Understanding Humor

A recurring theme in Star Trek: The Next Generation (for example, in the episode "The Outrageous Okona") was Data's attempt to understand humor. A story in New Scientist reports on two University of Cincinnati researchers who have written a program that understands a particular kind of joke.

The kind of joke the program recognizes relies on a pun:

Property broker: "I tell you this new house has no flaws at all."
Buyer: "Then what do you walk on?"

Here's how the article describes the program:
To teach the program to spot jokes, the researchers first gave it a database of words, extracted from a children's dictionary to keep things simple, and then supplied examples of how words can be related to one another in different ways to create different meanings. When presented with a new passage, the program uses that knowledge to work out how those new words relate to each other and what they likely mean. When it finds a word that doesn't seem to fit with its surroundings, it searches a digital pronunciation guide for similar-sounding words. If any of those words fits in better with the rest of the sentence, it flags the passage as a joke. The result is a bot that "gets" jokes that turn on a simple pun.

Also interesting is the sort of joke the program does not understand:

Patient: "Doctor, doctor, I swallowed a bone."
Doctor: "Are you choking?"
Patient: "No, I really did!"

Here, the program fails because the word "choking" occurs in a context (medical) in which it fits, so the program never goes looking for the similar-sounding "joking." As the researchers remark in the UC news release, the hardest part of building the program is representing the knowledge of the world required to recognize an anomalous word.
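
To make the procedure concrete, here is a minimal Python sketch of the steps the article describes: score how well each word fits its surroundings, look up similar-sounding words for any word that does not fit, and flag the passage if a sound-alike fits better. This is not the researchers' code. The word-relation table (RELATED), the pronunciation table (SOUNDS_LIKE), and the function names are all hypothetical stand-ins, hand-written to cover only the two jokes quoted here.

```python
import re

# Hypothetical toy knowledge base: for each word, the words it is "related" to.
# The real program drew on a children's dictionary plus supplied examples.
RELATED = {
    "house": {"floors", "walls", "roof", "door"},
    "walk": {"floors", "path", "road"},
    "doctor": {"patient", "choking", "medicine"},
    "swallowed": {"choking", "bone", "throat"},
}

# Hypothetical stand-in for the digital pronunciation guide.
SOUNDS_LIKE = {
    "flaws": ["floors"],
    "choking": ["joking"],
}


def tokenize(text):
    """Lowercase the passage and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def fit(word, context):
    """Count the context words that list `word` as related."""
    return sum(1 for c in context if word in RELATED.get(c, set()))


def find_pun(text):
    """Return (word, sound_alike) if the passage looks like a pun, else None."""
    words = tokenize(text)
    for i, word in enumerate(words):
        context = words[:i] + words[i + 1:]
        if fit(word, context) > 0:
            continue  # the word already fits; nothing looks anomalous
        for alt in SOUNDS_LIKE.get(word, []):
            if fit(alt, context) > 0:
                return (word, alt)  # a sound-alike fits better: flag a joke
    return None


house_joke = ("I tell you this new house has no flaws at all. "
              "Then what do you walk on?")
doctor_joke = ("Doctor, doctor, I swallowed a bone. "
               "Are you choking? No, I really did!")

print(find_pun(house_joke))
print(find_pun(doctor_joke))
```

With these toy tables the first joke is flagged, because "flaws" relates to nothing nearby while "floors" relates to both "house" and "walk." The second joke is missed for the reason given above: "choking" fits the medical context, so the sketch never checks "joking." All the real difficulty is hidden inside RELATED, which is the researchers' point.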

And this highlights something Simon DeDeo said offhandedly in "Towards an Anarchist Poetics" in Absent: "Every attempt to produce authentic text with a computer equipped only with a dictionary and syntax rules has failed." This is true. It is also silly—no one and nothing can produce "authentic text" with only a dictionary and syntax rules. Writing of any sort involves many rules, conventions, and cultural knowledge far beyond vocabulary and syntax. Finding ways to represent that knowledge and put it to use is the hardest part of programming some sort of writing machine. (See Haiku, Dog Grammar, and Computer-Generated Text I & II for a long aside.)

[The researchers' presentation is not available online. A story in The Telegraph provides different details. You can find references to other, earlier articles on Julia Taylor's web page and on the page for the UC Applied AI lab. Taylor's page includes a PDF of her Master's thesis, which contains a summary of earlier humor-recognizing programs and an extensive taxonomy of jokes based on puns.]

1 Comment:

Blogger Simon said...

I think the question is -- is it possible to have a computer say something that we could read as the "authentic" speech of someone other than the programmer? Of course if you load up your code with enough material, you will get something that says something -- but it seems that behind this will be the programmer. Can the programmer surprise himself?

29/8/07 11:03  