Penguin 4.0: How the Real-Time Penguin-in-the-Core-Alg Model Changes SEO – Whiteboard Friday

Posted by randfish

The dust is finally beginning to settle after the long-awaited rollout of Penguin 4.0. Now that our aquatic avian friend is a real-time part of the core Google algorithm, we’ve got some changes to get used to. In today’s Whiteboard Friday, Rand explains Penguin’s past, present, and future, offers his analysis of the rollout so far, and gives advice for going forward (hint: never link spam).


Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week, it is all about Google Penguin. So Google Penguin is an algorithm that’s been with us for a few years now, designed to combat link spam specifically. After many, many years of saying this was coming, Penguin 4.0 rolled out on Friday, September 23rd. It is now real-time in Google’s algorithm, Google’s core algorithm, which means that it’s constantly updating.

So there are a bunch of changes. What we’re going to talk about today is what Penguin 1.0 to 3.x looked like and how that’s changed as we’ve moved to the Penguin 4.0 model. Then we’ll cover a little bit of what the rollout has looked like and how it’s affecting folks’ sites and specifically some recommendations. Thankfully, we don’t have a ton.

Penguin 1.0-3.x

But it’s important to understand: if people ask you about Penguin, or about the penalties that used to come from Penguin, you’ve got to know that, back in the day…

  • Penguin 1.0 to 3.x, it used to run intermittently. So every few months, Google would collect a bunch of information, they’d run the algorithm, and then they’d release it out in the wild. It would now be in the search results. When that rollout happened, that was the only time, pretty much the only time that penalties from Penguin specifically would be given to websites or removed.

    This meant that a lot of the time, you had this slow process. If you got penalized by Penguin because you did something bad, you did some sketchy link building, you went through all the processes of getting that penalty lifted, and Google said, “Fine, you’re in good shape. The next time Penguin comes out, your penalty is lifted.” You could wait months. You could wait six months or more before that penalty got lifted. So a lot of fear here and a lot of slowness on Google’s side.

  • Penguin also penalized, much like Panda, by looking at a portion of the site and hitting the whole thing. Maybe these pages are the only ones on the whole domain that got bad links to them, but old Penguin did not care. Penguin would hit the entire website.

    It would basically say, “No, you’re spamming to those pages, I’m burying your whole domain. Every page on your site is penalized and will not be able to rank well.” Those sorts of penalties are very, very tough for a lot of websites. That, in fact, might be changing a little bit with the new Penguin algorithm.

  • Old Penguin did not require a reconsideration request process, though manual penalties and, some SEOs believed, Penguin penalties, too, did lift often in conjunction with disavowing old links, proving to Google that you had gone through the process of trying to get those links removed.

    It wasn’t often enough to just say, “I’ve disavowed them.” You had to tell Google, “Hey, I tried to contact the site where I bought the links or I tried to contact the private blog network, but I couldn’t get them to take it down or I did get them to take it down or they blackmailed me and forced me to pay them to take it down.” Sometimes people did pay and Google said that was bad, but then sometimes would lift the penalties and sometimes they told them, “Okay, you don’t have to pay the extortionist and we’ll lift the penalty anyway.” Very manual process here.

  • Penguin 1.0 to 3.x was really designed to remove the impact of link spam on search results, but doing it in a somewhat weird way. They were doing it basically through penalties that affected entire websites that had tried to manipulate the results and by creating this fear that if I got bad links, I would be potentially subject to Penguin for a long period.

I have a theory here. It’s a personal theory. I don’t want you to hold me to it. I believe that Google specifically went through this process in order to collect a tremendous amount of information on sketchy links and bad links through the disavow file process. Once they had a ginormous database of what sketchy and spammy bad links looked like, that they knew webmasters had manually reviewed and had submitted through the disavowal file and thought could harm their sites and were paid for or just links that were not editorially acquired, they could then machine learn against that giant database. Once they’ve acquired enough disavowals, great. Everything else is gravy. But they needed to get that huge sample set. They needed it not to just be things that they, Google, could identify but things that all of us distributed across the hundreds of millions of websites on the planet could identify. Using those disavowal files, Google can now make Penguin more real-time.
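For anyone who has not built one, here is a minimal sketch of assembling a disavow file. It assumes the plain-text format Google's disavow tool accepts: one URL per line, `domain:` entries to cover a whole host, and `#` for comment lines. The function name, parameters, and example URLs are hypothetical, made up purely for illustration.

```python
from urllib.parse import urlparse


def build_disavow_file(bad_urls, whole_domains=(), note=""):
    """Return the text of a disavow file (hypothetical helper).

    bad_urls: individual linking URLs to disavow.
    whole_domains: hostnames to disavow entirely, written as "domain:host".
    note: optional comment recorded at the top of the file.
    """
    lines = []
    if note:
        # Comment lines start with "#".
        lines.extend(f"# {part}" for part in note.splitlines())

    domains = sorted({d.lower() for d in whole_domains})
    lines.extend(f"domain:{d}" for d in domains)

    for url in sorted(set(bad_urls)):
        host = (urlparse(url).hostname or "").lower()
        if host in domains:
            continue  # already covered by a domain: entry (exact host match only)
        lines.append(url)

    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_disavow_file(
        ["http://spam.example/a", "http://other.example/b"],
        whole_domains=["spam.example"],
        note="paid links, owner did not respond to removal request",
    ))
```

The note field is where the "I tried to contact the site" record from the old process would go, assuming you still want to keep one.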

Penguin 4.0+

So challenges here, this is too slow. It hurt too much to have that long process. So in the new Penguin 4.0 and going forward, this runs as part of the core algorithm, meaning…

  • As soon as Google crawls and indexes a site and is able to update that in their databases, that site’s penalty is either lifted or incurred. So this means that if you get sketchy links, you don’t have to wait for Penguin to come out. You could get hurt tomorrow.
  • Penguin no longer necessarily penalizes an entire domain. It still might. It could be the case that if lots of pages on a domain are getting sketchy links or some substantive portion or Google thinks you’re just too sketchy, they could penalize you.

Remember, Penguin is not the only algorithm that can penalize websites for getting bad links. There are manual spam penalties, and there are other forms of spam penalties too. Penguin is not alone here. But it may be simply taking the pages that earn those bad links and discounting those links or using different signals, weighting different signals to rank those pages or search results that have lots of pages with sketchy links in them.

  • It is also the case — and this is not 100% confirmed yet — but some early discussion between Google’s representatives and folks in the webmaster and SEO community has revealed to us that it may not be the case that Penguin 4.0 and moving forward still requires the full disavow and whole reconsideration request process.

That’s not to say that if you incur a penalty, you should not go through this. But it may not be the case that that’s the only way to get a penalty lifted, especially in two cases: no-fault cases, meaning you did not get those links, they just happened to come to you, or specifically negative SEO cases.

I want to bring up Marie Haynes, who does phenomenally good work around spam penalties, along with folks like Sha Menz and Alan Bleiweiss. All three of them have been concentrating on Google penalties, along with many, many other SEOs and webmasters. Marie wrote an excellent blog post detailing a number of case studies, including a negative SEO case study where the link penalty had been lifted on the domain. You can see her results of that. She’s got some nice visual graphs showing the keyword rankings changing after Penguin’s rollout. I urge you to check that out, and we’ll make sure to link to it in the transcript of this video.

  • Penguin 4.0 is a little bit different from Penguin 1.0 to 3.x in that it’s still designed to remove the impact of spam links on search results, but it’s doing it by not counting those links in the core algo and/or by less strongly weighting links in search results where many folks are earning spammy links.

So, for example, your PPC, your porn, your pills, your casino searches, those types of queries may be places where Google says, “You know what? We don’t want to interpret, because all these folks have nasty links pointing to them, we are going to weight links less. We’re going to weight other signals higher.” Maybe it’s engagement and content and query interpretation models and non-link signals that are offsite, all those kinds of things, clickstream data, whatever they’ve got. “We’re going to push down the value of either these specific links or all links in the algo as we weight them on these types of results.”
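To make the contrast between the two models concrete, here is a toy sketch. It is not Google's algorithm. The scores, the `penalty` factor, and every name in it are invented for illustration; it only encodes the two behaviors described above: old Penguin demoting the whole domain, and Penguin 4.0 (possibly) just not counting the spammy links.

```python
from dataclasses import dataclass


@dataclass
class Link:
    target: str   # page on our site that the link points to
    value: float  # made-up link value, for illustration only
    spammy: bool  # whether the link is considered link spam


def old_style_scores(pages, links, penalty=0.5):
    """Penguin 1.0-3.x as described: any spammy link demotes every page."""
    scores = {page: 0.0 for page in pages}
    for link in links:
        scores[link.target] += link.value
    if any(link.spammy for link in links):
        # Domain-wide penalty: every page is pushed down, clean or not.
        scores = {page: score * penalty for page, score in scores.items()}
    return scores


def new_style_scores(pages, links):
    """Penguin 4.0 as described: spammy links are simply not counted."""
    scores = {page: 0.0 for page in pages}
    for link in links:
        if not link.spammy:
            scores[link.target] += link.value
    return scores


if __name__ == "__main__":
    pages = ["/home", "/pills"]
    links = [Link("/home", 10.0, False), Link("/pills", 8.0, True)]
    print(old_style_scores(pages, links))  # both pages demoted
    print(new_style_scores(pages, links))  # only the spammed page loses out
```

In the toy version, the clean page keeps its full score under the new model, which is the practical difference for a site with only a handful of bad links.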

Penguin 4.0 rollout

So this is what we know so far. We definitely will keep learning more about Penguin as we have more experience with it. We also have some information on the rollout.

  • Started on Friday, September 23rd, few people noticed any changes.

In fact, the first few days were pretty slow, which makes sense. It fits with what Google said about the rollout being real-time and them needing time to crawl and index and then refresh all this data. So until it rolls out across the full web and Google’s crawled and indexed all the pages, gone through processing, we’re not going to get there. So little effect that same day, but…

  • More SERP flux started three to five days after, that next Monday, Tuesday, Wednesday. We saw very hot temperatures starting that next week in MozCast, and Dr. Pete has been detailing those on Twitter.
  • As far as SEOs noticing, yes, a little bit.

So I asked the same poll on Twitter twice, once on September 27th and once on October 3rd, so about a week apart. Here is the data we got. “Nope, nothing yet” went from 76% to 72%, so a little more than a quarter of SEOs have noticed some changes.

A lot of folks noticing rankings went up. Moz itself, in fact, benefitted from this. Why is that the case? Well, any time a penalty rolls out to a lot of other websites, bad stuff gets pushed down and those of us who have not been spamming move up in the rankings. Of course, in the SEO world, which is where Moz operates, there are plenty of folks getting sketchy links and trying things out. So they were higher in the rankings, they moved down, and Moz moved up. We saw a very nice traffic boost. Thank you, Google, for rolling out Penguin. That makes our Audience Development team’s metrics look real good.

Four percent and then six percent said they saw a site or page in their control get penalized, and two percent and then one percent said they saw a penalty lifted. So penalties lifted are still pretty light, but there are some penalties coming in. There are a few of those. Then there’s the nice benefit of if you don’t link spam, you do not get penalized. Every time Google improves on the Penguin algorithm, every time they improve on any link spam algorithm, those of us who don’t spam benefit.

It’s an awesome thing, right? Instead of cheering against Google, which you do if you’re a link spammer and you’re very nervous, you get to cheer for Google. Certainly Penguin 4.0 is a good time to cheer for Google. It’s brought a lot of traffic to a lot of good websites and pushed a lot of sketchy links down. We will see what happens as far as disavows and reconsideration requests for the future.

All right, everyone, thanks for joining. Look forward to hearing about your experiences with Penguin. We’ll see you next week for another edition of Whiteboard Friday. Take care.

