Why keeping spam out of Search is so important
When you come to Search with a query in mind, you trust that Google will find a number of relevant and helpful pages to choose from. We put a lot of time and effort into improving our search systems to ensure that’s the case.
Working on improvements to our language understanding and other search systems is only part of why Google remains so helpful. Equally important is our ability to fight spam. Without our spam-fighting systems and teams, the quality of Search would be reduced, and it would be a lot harder to find helpful information you can trust.
If low quality pages spammed their way into the top results, there would be a greater chance that people could get tricked by phony sites trying to steal personal information or infect their computers with malware. If you’ve ever gone into your spam folder in Gmail, that’s akin to what Search results would be like without our spam detection capabilities.
Every year we publish a Webspam Report that details the efforts behind reducing spam in your search results and supporting the community of site creators whose websites we help you discover. To coincide with this year’s report, we wanted to give some additional context for why spam-fighting is so important, and how we go about it.
We’ve always designed our systems to prioritize the most relevant and reliable webpages at the top. We publicly describe the factors that go into our ranking systems so that web creators can understand the types of content that our systems will recognize as high quality.
We define “spam” as using techniques that attempt to mimic these signals without actually delivering on the promise of high quality content, or other tactics that might prove harmful to searchers.
Our Webmaster Guidelines detail the types of spammy behavior that are discouraged and can lead to a lower ranking: everything from scraping pages and keyword stuffing to participating in link schemes and implementing sneaky redirects.
Fighting spam is a never-ending battle, a constant game of cat-and-mouse against existing and new spammy behaviors. This threat of spam is why we’ve continued to be very careful about how much detail we reveal about how our systems work. However, we do share a lot, including resources that provide transparency about the positive behaviors creators should follow to create great information and gain visibility and traffic from Search.
Spotting the spammers
The first step of fighting spam is detection. So how do we spot it? We employ a combination of manual reviews by our analysts and a variety of automated detection systems.
We can’t share the specific techniques we use for spam fighting because that would weaken our protections and ultimately make Search much less useful. But we can share some information about spammy behavior that can be detected systematically.
After all, a low quality page might include the right words and phrases that match what you searched for, so our language systems wouldn’t be able to detect unhelpful pages from content alone. The telltale signs of spam are in the behavioral tactics used and how they try to manipulate our ranking systems against our Webmaster Guidelines.
Our spam-fighting systems detect these behaviors so we can tackle this problem at scale. In fact, the scale is huge. Last year, we observed that more than 25 billion of the pages we find each day are spammy. (If each of those pages were a page in a book, that would be more than 20 million copies of “War & Peace” each day!) This leads to an important question: once we find all this spam, what happens next?
Stopping the spammers
How we handle spam depends on the type of spam and how severe the violation is. For most of the 25 billion spammy pages detected each day, we’re able to automatically recognize their spammy behavior and ensure they don’t rank well in our results. But that’s not the case for everything.
As with anything, our automated systems aren’t perfect. That’s why we also supplement them with human review, a team that does its own spam sleuthing to understand if content or sites are violating our guidelines. Often, this human review process leads to better automated systems. We look to understand how that spam got past our systems and then work to improve our detection, so that we catch the particular case and automatically detect many other similar cases overall.
In other cases, we may issue what’s called a manual action, when one of our human spam reviewers finds content that isn’t complying with our Webmaster Guidelines. This can lead to a demotion or a removal of spam content from our search results, especially if it’s deemed to be particularly harmful, like a hacked site that has pages distributing malware to visitors.
When a manual action takes place, we send a notice to the site owner via Search Console, which webmasters can see in their Manual Actions Report. We send millions of these notices each year, and they give site owners the opportunity to fix the issue and submit for reconsideration. After all, not all “spam” is purposeful, so if a site owner has inadvertently tried tactics that run afoul of our guidelines, or if their site has been compromised by hackers, we want to ensure they can make things right and have their useful information available again to people in Search. This brings us back to why we invest so much effort in fighting spam: so that Search can bring you good, helpful and safe content from sites across the web.
Discovering great information
It’s unfortunate that there’s so much spam, and so much effort that has to be spent fighting it. But that shouldn’t overshadow the fact there are millions upon millions of businesses, publishers and websites with great content for people to discover. We want them to succeed, and we provide tools, support and guidance to help.
We publish our own Search Engine Optimization Starter Guide to provide tips on how to succeed with appropriate techniques in Search. Our Search Relations team conducts virtual office hours, monitors our Webmaster Community forums, and (when possible!) hosts and participates in events around the world to help site creators improve their presence in Search. We provide a variety of support resources, as well as the Search Console toolset to help creators with search.
We’d also encourage anyone to visit our How Google Search Works site, which explains more generally how our systems work to generate great search results for everyone.