AI vs Human Written Metadata: Which Leads to Higher CTRs?

Ever since ChatGPT was released in November 2022, digital marketers have been continually testing and pushing the boundaries of the tool.

Seer is no exception - from keyword categorization to internal linking, we’ve been on a mission to figure out how ChatGPT can drive efficiency to benefit our team and clients.

Our culture of testing was highlighted within Anthony’s post from May 2023, regarding using ChatGPT to write metadata too.

Spoiler alert: Anthony’s test uncovered that using ChatGPT led to decreased performance.

However, one of his noted variables was the small sample size of 4 pages in the test.

So, we decided to test how the results may vary with a larger sample size.

Test Results TLDR

When it comes to writing metadata with ChatGPT vs having our team write it manually, this is what we found:

+176% lift in click-through rate when manually writing vs ChatGPT-4

-21.5% decline in CTR for pages with ChatGPT-4 written metadata

>6x faster to write metadata with ChatGPT vs manually

While there are benefits of utilizing ChatGPT, the resulting performance is not comparable enough to justify utilizing this tool exclusively for metadata optimization.

There are multiple variables at play, so a third test is needed. Read on for a more detailed breakdown of this test.

Hypotheses

If ChatGPT-4 can provide metadata with comparable performance, then we should utilize the tool more often than manually providing metadata optimizations.
If Google overwrites meta descriptions and titles, then we should limit the time we spend writing them from scratch.

Methodology

Test groups

We categorized a list of 237 pages from the Seer blog into three groups:

Method	Number of Pages in Group	Author	Process
Manual Test Group	57	Meghan Evans	Our current process of keyword research and bespoke titles and descriptions.
GPT Test Group	90	ChatGPT-4 + LinkReader plugin	We used the LinkReader plugin so the tool would be able to crawl the content of each page to provide relevant metadata.
Google Test Group	90	N/A	We created this group with the intention of leaving all title tags and meta descriptions blank for Google to automatically fill in.

Generating the Metadata

ChatGPT-4 Metadata

We “trained” ChatGPT-4 by asking the tool to explain what an SEO optimized meta description and title tag would look like and to “act as an SEO practitioner” to ensure it was as up to the task as possible.

Google Metadata

As mentioned, the Google test group’s metadata was initially intended to be left blank. However, we discovered that HubSpot has a limitation and won’t allow the publishing of a page without these elements.

Due to this, we added each article's name in the title tag and followed two options for the meta description:

Left as a period. (Example)
Or input “We are not including an optimized meta description here to test how often Google rewrites metadata.” (Example)

We did this so the meta descriptions wouldn’t contain any information relevant to the page and give Google full rein to update as it pleased.

Measurement

Our measurement plan included standard SEO metrics:

URL Click-through-rate (CTR)
URL rankings
# of rewritten titles and descriptions

We also assessed efficiency (time spent) using ChatGPT-4 for metadata vs. the typical manual process.

Results

After a month of testing our meta description optimizations, here’s what we saw for each group:

Method	CTR (Pre)	CTR (Post)	CTR PoP % Change
Manual	0.96%	1.12%	+16.26%
Google	0.25%	0.27%	+10.55%
ChatGPT-4	0.17%	0.13%	-21.48%

SEO Metrics

These metrics help us determine what actually performs best in terms of driving traffic to the website.

Winner: Manually Written Metadata

After testing these groups over the course of a month, the Manual group outperformed both ChatGPT-4 and Google methods.

ChatGPT-4 was the only group that experienced a decrease in CTR (-21.5%).

Manual Method CTR Data:

+176% lift vs ChatGPT-4
+54.1% lift vs Google rewrite
+16.3% lift vs previous period Manual CTR
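
The post does not state how the lift figures were calculated. As a hedged sketch, the +176% and +54.1% values are consistent with comparing each group's period-over-period (PoP) % change from the Results table rather than the raw CTRs. The function name `relative_lift` is our own label for illustration, not a term from the test.

```python
# Assumption: "lift" is the relative difference between two groups'
# PoP % changes, as reported in the Results table.
pop_change = {"Manual": 16.26, "Google": 10.55, "ChatGPT-4": -21.48}


def relative_lift(winner: float, baseline: float) -> float:
    """Relative difference between two PoP % changes, in percent."""
    return (winner - baseline) / abs(baseline) * 100


lift_vs_gpt = relative_lift(pop_change["Manual"], pop_change["ChatGPT-4"])
lift_vs_google = relative_lift(pop_change["Manual"], pop_change["Google"])

print(f"Manual vs ChatGPT-4: +{lift_vs_gpt:.0f}%")
print(f"Manual vs Google:    +{lift_vs_google:.1f}%")
```

Under this reading, the lift describes how much more the Manual group's CTR improved over the test period, not how much higher its absolute CTR was.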

Efficiency Metrics

These metrics help us understand what lowers the level-of-effort for the practitioners creating the metadata.

Winner: ChatGPT-4 Written Metadata

ChatGPT-4 enabled us to complete metadata >6x faster than the manual process.

ChatGPT-4 Method Efficiency Data:

2.5 minutes per page vs 19.1 minutes per page manually writing metadata
3.75 hours for 90 pages vs 18.12 hours manually writing metadata for 57 pages
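
Because the two groups contained different numbers of pages, the fair comparison is time per page. A minimal sketch of that arithmetic, using only the totals listed above (the helper name `minutes_per_page` is ours):

```python
def minutes_per_page(total_hours: float, pages: int) -> float:
    """Convert a total time in hours into average minutes per page."""
    return total_hours * 60 / pages


gpt_minutes = minutes_per_page(3.75, 90)      # ChatGPT-4 group
manual_minutes = minutes_per_page(18.12, 57)  # Manual group
speedup = manual_minutes / gpt_minutes

print(f"ChatGPT-4: {gpt_minutes:.1f} min/page")
print(f"Manual:    {manual_minutes:.1f} min/page")
print(f"Speedup:   {speedup:.1f}x")
```

The per-page ratio from these totals is in line with the ">6x faster" figure reported above.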

What’s Next?

So I shouldn’t use ChatGPT for Metadata?

We don’t believe that it’s time to give up manually writing metadata yet, but "don't use ChatGPT for Metadata" is not the conclusion we’ve drawn here.

This technology is evolving at a quicker clip than we’ve ever seen before, so we will continue to test and learn in this space.

The main benefit of using ChatGPT-4 is efficiency; however, at this time it doesn’t seem like the performance is comparable enough to justify generating metadata exclusively with this tool.

Next test

Due to the multiple variables in this test, our next test is going to be swapping half of the Manual group with metadata written by ChatGPT-4, following a similar process as before.

Will this change result in a significant shift in CTR once variables are as consistent as possible? Stay tuned.

In our next test, Supernova™ will be utilized to a greater capacity.

We’ll integrate both paid data (such as converting keywords) and organic features (such as People Also Ask questions) in order to provide ChatGPT additional context for metadata optimizations.

Additional Data + Test Variables

Organic rankings

Did it impact the test? Yes.

The average organic ranking across the test periods experienced minimal shifts, so we can’t determine a correlation between organic rankings and metadata methods.

While the increase was incremental, the ChatGPT-4 group experienced the largest improvement in average ranking across test periods. This ranking shift did not drive an increase in CTR, possibly further highlighting the decreased performance of utilizing this tool for metadata.

Method	Beginning Avg. Rank	End Avg. Rank	Difference
Manual	18.0	17.6	-2.2%
ChatGPT-4	22.3	21.4	-4.0%
Google	23.3	24.3	+4.3%

Rewritten metadata

Did it impact the test? Yes.

We used Supernova, our cloud-based data platform, to analyze how often our implemented metadata was rewritten by Google in the search results. Here’s an example of how that report is viewed within the product’s dashboard:

[Image: rewritten metadata report in the Supernova dashboard]

Manually written meta descriptions were rewritten 130% more often than those in the Google-written group. This includes the Screaming Frog Guide page’s meta description, which was rewritten 17 times over the course of 30 days.

Method	Avg. Title Rewrites	Avg. Description Rewrites
Manual	1.30	3.77
ChatGPT-4	0.74	1.72
Google	0.85	1.64

SERP Features

Did it impact the test? Yes.

The Manual group was shown alongside 8,001 different SERP Features, compared to 2,272 and 1,534 for the Google and ChatGPT-4 groups, respectively. This gap can be attributed to the better average ranking of this group compared to others as shown in the Organic Rankings table above.

People Also Ask results were, by far, the most visible SERP Features; however, Image results drove the highest CTR when visible alongside our testing pages.

SERP Feature	Count	CTR
PAA	7,968	0.99%
Image	2,378	1.42%
Video	924	1.01%
Answer Box - Paragraph	318	1.28%
Answer Box - List	183	1.34%
Answer Box - Other	34	0.48%
Answer Box - Table	2	0.00%
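
As a quick sanity check, the counts in this table can be totaled and compared against the per-group SERP Feature counts quoted earlier in this section. The sketch below uses only the figures from the table; the variable names are ours.

```python
# (count, CTR %) for each SERP Feature, copied from the table above.
serp_features = {
    "PAA": (7968, 0.99),
    "Image": (2378, 1.42),
    "Video": (924, 1.01),
    "Answer Box - Paragraph": (318, 1.28),
    "Answer Box - List": (183, 1.34),
    "Answer Box - Other": (34, 0.48),
    "Answer Box - Table": (2, 0.00),
}

total_features = sum(count for count, _ in serp_features.values())
group_total = 8001 + 2272 + 1534  # Manual + Google + ChatGPT-4 groups

# Share of all observed SERP Features that were People Also Ask results.
paa_share = serp_features["PAA"][0] / total_features * 100

# Feature with the highest CTR when shown alongside the test pages.
top_ctr_feature = max(serp_features, key=lambda name: serp_features[name][1])

print(total_features, group_total)
print(f"PAA share: {paa_share:.1f}%")
print(f"Highest CTR: {top_ctr_feature}")
```

The table total matches the sum of the three group counts, and PAA accounts for roughly two thirds of all SERP Features observed.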

Impressions

Did it impact the test? Not as much as expected.

If one group had an outlying number of impressions, it could increase the margin of error for our test. Measuring the impressions of each group is important in order to understand if each test method had a comparable sample size of CTR data.

As expected, the Manual group had the largest number of impressions. However, there was just an 11.4% (21,177) difference between the high and low impression groups.

Method	Impressions	CTR
Manual	206,364	1.12%
Google	196,151	0.27%
ChatGPT-4	185,187	0.13%

While the pages within the Manual grouping were viewed the most, there was only a 5.2% difference in impressions compared to the Google group, despite a 38% average ranking difference.
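
The percentage differences quoted in this section can be reproduced from the impression counts and the end-of-test average ranks reported earlier. This is a sketch that assumes each difference is measured relative to the lower (or better-ranked) value; the helper name `pct_diff` is ours.

```python
def pct_diff(value: float, baseline: float) -> float:
    """Percent difference of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100


impressions = {"Manual": 206_364, "Google": 196_151, "ChatGPT-4": 185_187}

high = max(impressions.values())
low = min(impressions.values())
spread_abs = high - low
spread_pct = pct_diff(high, low)

manual_vs_google = pct_diff(impressions["Manual"], impressions["Google"])

# End-of-test average ranks: Manual 17.6, Google 24.3 (lower is better).
rank_gap = pct_diff(24.3, 17.6)

print(f"High vs low group: {spread_abs:,} impressions ({spread_pct:.1f}%)")
print(f"Manual vs Google impressions: {manual_vs_google:.1f}%")
print(f"Manual vs Google avg. rank gap: {rank_gap:.0f}%")
```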

Conclusions

We know that we’re still in the early stages of what will be possible with AI for SEO.

Sign up for the Seer newsletter or get in touch with us to hear more about our tests and how we’re using AI to impact our team and clients.
