How Good Is GPT-4 At Solving Riddles?

Much better, and sometimes worse, than I thought.

9 min read · Jan 15, 2024

To see how good GPT-4 is at solving unique problems, I decided to test Microsoft Copilot (formerly Bing Chat) with a series of riddles that friends and I developed last year.

They were geared toward podcast super-fans, and each riddle consisted of a four-line poem and a custom visual clue I illustrated to go with it.

The idea was simple: look at the poem + visual and see if you can guess the podcast being referenced. Some are easy, some are pretty difficult, but several people were able to get all of them with the help of extensive Googling.

So now that Microsoft Copilot can describe and understand images, let’s see how it handles this somewhat complicated task.

Here is the beginning prompt I gave it:

“I made an illustration and a riddle to go with it, where both are hints for the name of a podcast. Can you describe the image and the text of the riddle, and then make your best guess on what the show is and why.”

Then I uploaded the first image and gave it no hints beyond the image itself.

Before moving ahead, if you love podcasts and want to see all the riddles without the answers to see how many you can figure out, you can find them here.

For each riddle, I’ll post the image that was uploaded to Copilot as well as a screenshot of the response.

Riddle #1 (guessed correctly)

The AI guessed: Endless Thread

Correct?: YES

Damn, freaking nailed it. For all of these, what impressed me most was its ability to both understand what the image is and how it relates to the riddle as a clue.

Riddle #2 (guessed correctly)

Written by Erik Jones
