If you’ve read my previous article about using an AI to create marketing content, you’ll know that article was written after experimenting with OpenAI’s publicly released GPT-2 model. Progress never stops: earlier this year, OpenAI released a new natural language processing model, GPT-3. The new model has 175 billion parameters, more than a hundred times as many as GPT-2’s 1.5 billion and roughly ten times as many as any comparable language model before it. They fed it data from even wider corners of the Internet, including source code (meaning GPT-3 can even write programs now). Curiosity was killing me: how would GPT-3 compare to GPT-2 for content creation or any other marketing use? Although GPT-3 has not yet been opened to the public, you can apply to their developer program to get access. So, I did just that.
Better but still wrong
My overall impression? Despite its significantly bigger training set and learning model, GPT-3 suffers from the same issues that caused GPT-2 to hiccup. Namely, it still doesn’t get real-world specifics quite right. As an example (our seed text in bold, the computer’s response below):
When did technology become a critical factor in the car buying process?
When did the car become a commodity to be judged, ranked and scored by a consumer? "If I were a manufacturer I'd be very concerned," said Christopher Essex, professor of finance at the University of Western Ontario's Ivey Business School. "I'd be trying to figure out what the hell is going on." Story continues below advertisement The numbers are startling. In the U.S., where the competition among auto companies has been fiercest, car buyers spend an average of 14 hours researching new vehicles before they buy, according to Edmunds.com, a U.S.-based auto industry website. In Canada, the number is about half that. "The competition for sales has never been fiercer and the information available to consumers has never been more abundant," said Jessica Caldwell, senior analyst at Edmunds.com.
GPT-3 is a bit better than GPT-2 when it comes to specifics. Jessica Caldwell, the person to whom the computer attributes part of its generated text, is a real person and is, in fact, the executive director of insights at Edmunds.com. However, the attribution to Christopher Essex is wrong: he’s a professor of applied mathematics at the University of Western Ontario, not a professor of finance at the Ivey Business School. All in all, GPT-3 gets much closer than GPT-2, but it still can’t really be trusted.
Data contamination
More notably, you may have noticed that we highlighted something newly problematic that GPT-2 never did: GPT-3 inserted “Story continues below advertisement” into the copy. I’d call this data contamination (the researchers use that term more narrowly, for benchmark test data leaking into the training set). In other words, because they fed the model from vast swaths of the Internet, we’re more likely to get lower quality results.
One of the larger data sets fed to GPT-3 is called “Common Crawl”, and it’s an amalgamation of web pages scraped from across the public Internet. Unfortunately, its inclusion has introduced some really annoying behaviors like “Story continues below advertisement” into the model’s output. The researchers have attempted to clean and filter the Common Crawl data, but it’s clear that they were only partially successful. While GPT-3 may be better at a great many of the tasks the researchers were experimenting with, it seems to be worse in many ways.
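If you want to clean this kind of residue out of generated copy yourself, a simple phrase filter goes a long way. What follows is only a minimal sketch in Python, not how the researchers filtered Common Crawl: the function name strip_boilerplate and the phrase list are my own assumptions, and only the first phrase actually appeared in my output.

```python
import re

# Hypothetical phrase list: only the first entry was seen in the sample
# output above; the others are illustrative guesses.
BOILERPLATE_PHRASES = [
    "Story continues below advertisement",
    "Article continues below",
    "Click here to subscribe",
]


def strip_boilerplate(text, phrases=BOILERPLATE_PHRASES):
    """Remove known boilerplate phrases and tidy the leftover spacing."""
    for phrase in phrases:
        text = re.sub(re.escape(phrase), " ", text, flags=re.IGNORECASE)
    # Collapse runs of spaces or tabs left behind by the removals,
    # keeping any line breaks intact.
    return re.sub(r"[ \t]{2,}", " ", text).strip()


sample = (
    '"I\'d be trying to figure out what the hell is going on." '
    "Story continues below advertisement The numbers are startling."
)
print(strip_boilerplate(sample))
```

A filter like this only catches phrases you already know about, so it is a cleanup step for your own output rather than a cure for what the model learned.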
Repetitive loops
Another way that it’s worse: it seems to get stuck in repetitive loops. Here’s another sample output:
The five worst pieces of advice for marketers in the automotive technology space are easy mistakes to make. They start with
1. Don’t be a salesperson. Marketing professionals should not confuse their roles with that of a salesperson. The salesperson offers claims, giveaways, and promises. Marketing professionals should not.
2. Don’t write a ‘sales’ letter. Marketing professionals should not confuse their roles with that of a salesperson. The salesperson offers claims, giveaways, and promises. Marketing professionals should not.
3. Don’t be too good to be true. Marketing professionals should not confuse their roles with that of a salesperson. The salesperson offers claims, giveaways, and promises. Marketing professionals should not.
4. Don’t think that technology is all you have to sell. Marketing professionals should not confuse their roles with that of a salesperson.
While it’s true that marketers really shouldn’t confuse themselves with salespeople, the yellow highlight shows the model “overemphasizing” this point, directly repeating the same phrase over and over. It’s almost as if the model just gets stuck repeating the parts that it thinks are important. For those who are worried that GPT-3 is somehow “smart” because they’ve fed it the entire Internet, this should be a good clue that there’s nothing really to worry about.
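Loops like this are easy to catch automatically before any copy goes out the door. Here is a rough sketch of one way to do it, assuming a naive sentence split on punctuation; the function names split_sentences, repeated_sentences and repetition_ratio are mine, not part of any GPT-3 tooling.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naively split on ., ! or ? followed by whitespace (a rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip().lower() for p in parts if p.strip()]


def repeated_sentences(text, min_count=2):
    """Return each sentence seen at least `min_count` times, with its count."""
    counts = Counter(split_sentences(text))
    return {s: n for s, n in counts.items() if n >= min_count}


def repetition_ratio(text):
    """Fraction of sentences that duplicate an earlier one (0.0 means none)."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    return 1 - len(set(sentences)) / len(sentences)


looped = "It is not for the lazy. It is not for the weak. It is not for the lazy."
print(repeated_sentences(looped))
print(round(repetition_ratio(looped), 2))
```

What counts as too repetitive is a judgment call, so treat any cutoff on the ratio as something to tune against your own samples.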
Was this just a bad run? I tried that exact same seed again, and here’s the result – different and differently flawed.
The five worst pieces of advice for marketers in the automotive technology space are easy mistakes to make. They start with
a lack of preparation. Marketing in the automotive space is entirely different from marketing any other product. It requires a unique set of skills. You’re not just a marketer. You are a salesperson. You are a technician. You are a mechanic. You are an engineer. You are in sales. You are in marketing. You are in service. You are in product development. You are in finance. You are in administration. Automotive technology is not an easy business.
It is not for the faint of heart.
It is not for the lazy.
It is not for the weak.
It is not for the foolish.
It is not for the unprepared.
It is not for the untrained.
It is not for the unskilled.
It is not for the uninformed.
It is not for the inexperienced.
It is not for the uninitiated.
It is for people who can do.
It is for people with experience.
It is for people with experience doing every job in the business.
It is for people who have done every job in the business.
It is for people who do not shy away from work.
It is for people who are lifelong learners.
It is for people who are lifelong learners who are also lifelong doers.
It is for people who are willing to learn and to teach.
It is for people who are willing to learn and to teach and to teach and to learn. Automotive technology is not for the faint of heart.
It is not for the lazy.
It is not for the weak.
It is not for the foolish.
It is not for the unprepared.
It is not for the untrained.
It is not for the unskilled.
It is not for the uninformed.
It is not for the inexperienced.
It is not for the uninitiated.
It is for people who can do.
It is for people with experience.
It is for people with experience doing every job in the business.
It is for people who have done every job in the business.
It is for people who do not shy away from work.
It is for people who are lifelong learners.
It is for people who are lifelong learners who are also lifelong doers.
Good great gravy! It almost repeats the second half of the list exactly. This example is what I’d call “low quality web trash”: it’s very reminiscent of many blogs or websites that are written as click-bait or SEO manipulation. This is one of the things that the model shouldn’t have learned when it was reading the Internet. (Although, I should note that I didn’t ask the GPT-3 model to be so complimentary about autotech marketers’ skills; that part is just a happy accident.)
This is another sign that feeding the model the contents of the Internet is a bad idea. There’s just far too much poorly written web content for that to not have an impact.
Pointless dialog
Here’s one last example of how the indiscriminate scraping of every remote corner of the web has resulted in a dramatic reduction in GPT-3’s quality. It will sometimes generate text that’s equivalent to forum exchanges or blog comment dialog.
It's my opinion that automotive events are a thing of the past. Due to the pandemic, nobody wants to be in close proximity to other humans anymore.
~~~Nathan
Forum Posts: 7 02-05-2018 12:56 PM
I noticed you failed to mention the 2nd Annual Mid-Ohio Retro Rally. This event is on August 24th and 25th. This is a small rally with only 30 cars registered so far and they have a special location for the Rally. The event is held at the Mid-Ohio Sports Car Course, in Lexington, OH. They have a track package for the Mid-Ohio, which includes a ride in a Roadster, a ride in a 360, a ride in a Formula Car, a ride in an Indy Car and a ride in a Trans Am car. The rally will be held in the infield and in the paddock area. The event will have a car show in the paddock and an awards presentation in the drivers area. I will be attending with my '92 240sx hatchback. I have registered my car for the car show and I will be participating in the 360 ride and the Formula car ride. The event is small so it will be cool to get to know some of the other participants.
Forum Posts: 0 02-05-2018 03:59 PM
I will be attending the 2018 Mid Ohio Retro Rally. I have registered my 240SX for the Car show and for the Formula Car ride. I will also be participating in the Roadster ride and the 360 ride.
Forum Posts: 0 02-05-2018 05:14 PM
I'll be there too. I'm registered for the car show, a ride in a Formula car, and a ride in a Roadster.
Forum Posts: 0 02-05-2018 06:11 PM
Is anyone else planning to attend?
Forum Posts: 0 02-05-2018 06:17 PM
I will be attending. I'm registered for a ride in a Formula car, the car show, and a ride in a Roadster.
Forum Posts: 0 02-05-2018 06:22 PM
I will be attending with my vintage Lotus Elan. I'm registered for the car show and the Roadster ride.
Not only did it somehow attribute my original seed sentence to some joker named “Nathan”, it followed with a strange unrelated thread about an Ohio Retro Rally. This time we’ve highlighted the entire response, because it’s all garbage.
This exchange feels so much like it was lifted verbatim from a real website that I tried googling several of the phrases in the generated text. I couldn’t find this copy on the Internet, so while this dialog might be entirely fabricated by the model, there’s no easy way to tell.
Forums aren’t known for their clarity of thought, language, or viewpoints; I really hate to think how all that forum reading has polluted GPT-3’s little mind.
Better in any way?
You can tune GPT-3’s input and output in several ways. I did none of that, and based on my experimentation, some tuning appears to be essential. At a minimum, GPT-3 seems to require much more seed input than GPT-2 to ensure that it gives reasonable responses. The researchers suggest either “few-shot” or “fine-tuning” strategies: with few-shot, you put several examples of what you’re looking for into the seed text itself; with fine-tuning, the model is trained further on many examples before you “set it loose”. While the broader training input to GPT-3 might make it better at some natural language tasks, it’s significantly worse from a content marketing point of view.
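To make the few-shot idea concrete, here is a small sketch of how a seed with worked examples might be assembled before it is sent to the model. The Q:/A: layout, the function name build_few_shot_prompt and the example pairs are all assumptions of mine for illustration; I have not tested which layout GPT-3 responds to best.

```python
def build_few_shot_prompt(examples, query, instruction=""):
    """Join worked examples and a final open question into one seed text."""
    parts = [instruction.strip()] if instruction.strip() else []
    for question, answer in examples:
        parts.append(f"Q: {question}\nA: {answer}")
    # Leave the last answer blank so the model is the one to complete it.
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)


# Hypothetical example pairs, written by hand for illustration only.
examples = [
    ("Write a headline about electric trucks.",
     "Electric trucks are ready for the job site."),
    ("Write a headline about in-car software.",
     "Your next car will update itself overnight."),
]
prompt = build_few_shot_prompt(
    examples,
    "Write a headline about online car buying.",
    instruction="Write short, factual marketing headlines.",
)
print(prompt)
```

The point of the pattern is that the examples show the model the format and tone you want, so the blank final answer has something to imitate.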
One way that I discovered GPT-3 is better is in writing fiction. Granted, fiction isn’t a domain we regularly engage in at Third Law. But if you’re a poet or an author, it can give you some creative seeds to use in your work. Frankly, playing around with the model – asking it questions about itself, seeing how it can complete nonsense phrases, telling it the first few lines of a mystery, or giving it science fiction scenarios and seeing how it completes them – is terrifically fun. As a way to spark the imagination, it’s an unbeatable tool.
Playing smart
GPT-3 acts smart, but it couldn’t pass the Turing test. There’s no thinking behind what it says, and it’s easily led astray. Although GPT-3 conversations don’t feel like real intelligence, they do leave you with an eerie impression of other-worldliness, and they deliver some surprising insights.
From time to time, you can almost get the sense that you’re talking with an entity that’s near-sentient. I can’t wait for GPT-4.