With the Google leak making headlines, this research data seems to back up its claims. I started this research to find out what kind of content ranks #1 on Google.

I gathered data across a broad range of categories so that no single domain would dominate simply because of its depth in one topic.

Here’s what I did.

  • Ran 100 queries for each of the 1000 categories over the past 2 weeks
  • Most queries were 2 to 5 words long, typically 3 or 4
  • Used the Google Custom Search API

I pulled the list of categories from anywhere I could find them: the internet, LLMs, Wikipedia, everyday life, and so on, until I had 1000 categories. There was a lot of overlap. (Note that running 10000 queries in a single industry would give completely different results.)

I then constructed prompts like this one: “Give me 100 questions in Health and Fitness using 4 keywords only just like how people would search on internet.”

I ran these prompts through LLM APIs, which gave me results like these: “1. Best exercises core strength, 2. Best exercises for legs…”
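To make this step concrete, here is a rough sketch of how a prompt like the one quoted above could be built and how a numbered reply could be split into individual queries. The function names and the regex are my own assumptions for illustration, not the exact code used, and the call to the LLM API itself is left out.

```python
import re


def build_prompt(category: str, n_questions: int = 100, n_keywords: int = 4) -> str:
    """Fill the prompt template quoted in the post (hypothetical helper)."""
    return (
        f"Give me {n_questions} questions in {category} using {n_keywords} "
        "keywords only just like how people would search on internet."
    )


# A list item starts with "<number>. " at the start of the text or right
# after whitespace or a comma. This is an assumption about the reply format.
_ITEM_START = re.compile(r"(?:^|[\s,])\d+\.\s+")


def parse_numbered_list(text: str) -> list:
    """Split '1. foo, 2. bar' (or one item per line) into ['foo', 'bar']."""
    parts = _ITEM_START.split(text)
    cleaned = (part.strip(" ,\n\t") for part in parts)
    return [part for part in cleaned if part]
```

In practice LLM replies vary in format, so the parsed queries would need a quick sanity check (count, length in words) before being sent anywhere.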

After this, I ran the generated queries through the Google Custom Search API.
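For anyone who wants to try something similar, a minimal sketch of fetching the #1 result from the Custom Search JSON API could look like this. It assumes you have your own API key and search engine ID (the `cx` parameter); the helper names are hypothetical, and there is no retry, quota, or error handling.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query: str, api_key: str, engine_id: str, num: int = 1) -> str:
    """Build the request URL; `num` is how many results to ask for (1 to 10)."""
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num}
    return API_ENDPOINT + "?" + urllib.parse.urlencode(params)


def top_result(response: dict) -> Optional[str]:
    """Return the link of the first item in a parsed response, or None if empty."""
    items = response.get("items") or []
    return items[0]["link"] if items else None


def fetch_top_result(query: str, api_key: str, engine_id: str) -> Optional[str]:
    """Run one query against the API and return the #1 link (needs network access)."""
    url = build_search_url(query, api_key, engine_id)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return top_result(json.load(resp))
```

Keep in mind that results from the API are not guaranteed to match what a person sees on google.com for the same query.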

Here are the top-ranking domains by number of #1 search results:

Base URL        Count    Percentage    Domain Authority
reddit           1224        12.07%                  92
youtube           677         6.68%                 100
forbes            216         2.13%                  94
linkedin          215         2.12%                  99
indeed            155         1.53%                  91
healthline        109         1.08%                  91
hbr               107         1.06%                  92
wikipedia          82         0.81%                  98
mayoclinic         75         0.74%                  92
investopedia       70         0.69%                  92

There were 3236 different domains among the 10137 search results, and Reddit clearly outperformed all other domains.
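The counts and percentages come down to grouping the #1 URLs by base name and dividing by the total. Here is a small sketch of that tally; the way the base name is extracted is a simplifying assumption on my part (it takes the second-to-last label of the hostname, so it would mishandle suffixes like `.co.uk`).

```python
from collections import Counter
from urllib.parse import urlparse


def base_name(url: str) -> str:
    """Naive base name: 'https://www.reddit.com/r/x' -> 'reddit'."""
    host = (urlparse(url).hostname or "").lower()
    labels = host.split(".")
    return labels[-2] if len(labels) >= 2 else host


def tally(urls):
    """Return (base name, count, percentage of all results), most frequent first."""
    counts = Counter(base_name(u) for u in urls)
    total = sum(counts.values())
    return [(name, n, round(100 * n / total, 2)) for name, n in counts.most_common()]
```

As a check on the arithmetic, 1224 out of 10137 results works out to 12.07%.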

Is Reddit filling the gaps left by other websites, or were the answers simply better?

The answer is mixed.

The worst result had a total engagement of only 2 (1 upvote and 1 comment) and was clearly lacking in various ways. At the other end, the most engaged result had a total engagement of 12,851 and was clearly beneficial. The median engagement was 97, which I found interesting.

There were 20 posts with a total engagement under 10 sitting at the top of the search results (very niche content).
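For reference, the engagement numbers above can be summarized along these lines. This sketch assumes total engagement means upvotes plus comments, which is how the lowest result adds up; the function names and the `(upvotes, comments)` tuple format are hypothetical.

```python
from statistics import median


def engagement(upvotes: int, comments: int) -> int:
    """Total engagement, assumed here to be upvotes + comments."""
    return upvotes + comments


def summarize(posts, low_threshold: int = 10) -> dict:
    """Summarize a non-empty list of (upvotes, comments) pairs."""
    totals = [engagement(up, com) for up, com in posts]
    return {
        "min": min(totals),
        "max": max(totals),
        "median": median(totals),
        # Posts ranking #1 despite very little engagement.
        "low_count": sum(t < low_threshold for t in totals),
    }
```

The median is more useful than the mean here, since a single post with thousands of upvotes would otherwise pull the average far above what a typical #1 Reddit result looks like.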

Limitations of this research:

  • This is not an accurate representation of global search results.
  • It is just the result of one test.
  • Google search results change all the time, and they are changing as I write this.

My primary focus next is to look into the characteristics of top-ranking content.

I also want to:

  • Increase the dataset to reinforce findings.
  • Incorporate more 1-2 word queries.
  • Cover additional industries for a diverse dataset.

Before I continue on this research, I want to hear from you:

What would you like to see from this research? Where would you want me to investigate further?

Thanks for reading! I invite any collaborations or comments.

submitted by /u/edytai