Penguin Pussy Cat
Reading between the lines and looking at the data reveals some further interesting insights (beyond our Penguin 4 review) into the nature of the current implementation of Google Penguin 4 and the removal of Google Penguin 3. Let me jump in and make some statements:
1) Penguin 4 is no longer a penalty, instead it's as soft as this pussy cat.
2) Google's manual spam team has been largely designed out with Penguin 4!
3) Google are implying that disavow files are now largely redundant (do they finally get it?)
4) Google's new granular formula takes a long time to fully compute, and it will take months to come into full effect...
5) Penguin 3 was largely removed around 2nd September to make way for Penguin 4, which was introduced on 23rd September
Understanding the changes
In the old format of detecting link spam (even before Penguin), most spam was detected by algorithms watching for suspicious activity: multiple links from one IP to another, gaining too many links too quickly, too many (spammy) anchor texts pointing at a domain, spam reports filed by competitors, and so on. Once Google's suspicions had been aroused, the spam team might take a manual look, then kill the linking-out domain and leave notes against any penalised target domain. All of this was totally invisible to the public, apart from the ranking changes they could see. The team might also add a penalty score to the guilty parties that would automatically expire after some variable time.
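To picture the kind of rule-based detection described above, here is a toy sketch. Everything in it (the thresholds, the event fields, the scoring) is invented for illustration; Google has never published its actual rules.

```python
# Hypothetical sketch of old-style, rule-based link-spam scoring.
# All thresholds and field names are invented for illustration only.

def suspicion_score(link_events):
    """Score a target domain from a list of inbound-link events.

    Each event is a dict like:
    {"source_ip": "1.2.3.4", "anchor": "buy cheap widgets", "day": 12}
    """
    score = 0

    # Many links from the same source IP look coordinated.
    ips = [e["source_ip"] for e in link_events]
    if len(ips) - len(set(ips)) > 10:
        score += 1

    # Too many new links in too short a window (link velocity).
    days = [e["day"] for e in link_events]
    if days and len(days) / (max(days) - min(days) + 1) > 50:
        score += 1

    # Repeated exact-match, commercial anchor text.
    anchors = [e["anchor"] for e in link_events]
    if anchors and max(anchors.count(a) for a in set(anchors)) / len(anchors) > 0.5:
        score += 1

    return score  # a high score might queue the domain for manual review
```

A domain tripping two or three of these rules is what would, in the old world, have landed in a human reviewer's queue.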
Google's New Pussy Cat
The new method of detecting link spam improves on these systems and introduces a few massive changes, chief among them comparing every new link Google finds against the entire link graph of that domain, and also against all the other sites that are linked to, looking for unnatural patterns. Iterating through arrays of links in this way is slow and time consuming; it is a massive task at any scale, no matter how much computing power you have! BUT it is the most compelling and intelligent method of finding unnatural links, because domains that spam almost always have one thing in common: they don't do it in isolation, they do it as a common practice, and these practices can be detected by comparing everything against everything else (read: slowly).
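One crude way to picture this "compare everything against everything" idea is pairwise overlap: if the sites linking to you share an unusually large fraction of their own inbound sources, those links look coordinated rather than independent. The sketch below is purely illustrative; the real signals Google compares are unknown.

```python
# Toy illustration of graph-level comparison. We ask: do the domains
# linking to a target share suspiciously many of their *own* backlink
# sources? Independent sites mostly don't; networks mostly do.

def jaccard(a, b):
    """Overlap between two sets of linking domains."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def coordination_score(backlinks_of):
    """backlinks_of maps each domain linking to the target to the set
    of domains that link to *it*. Returns average pairwise overlap;
    values near 1.0 suggest a coordinated network."""
    domains = list(backlinks_of)
    pairs = [(x, y) for i, x in enumerate(domains) for y in domains[i + 1:]]
    if not pairs:
        return 0.0
    return sum(jaccard(backlinks_of[x], backlinks_of[y]) for x, y in pairs) / len(pairs)
```

Note the quadratic pair loop: even this toy version is O(n²) in the number of linking domains, which is exactly why the post argues the full computation takes Google months.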
Google's new method includes a function to classify networks owned by single entities: if they discover one entity owns more than one domain, that group of domains will be grouped and classified as a network. These groups will be the key to understanding the value of a link, and the system will be able to distinguish natural links within networks from unnatural ones, according to the commonalities between the domains linked within the networks (read: granular) and the cumulative level of suspicion between the two entities (groups).
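The grouping step can be imagined as a simple union-find over ownership signals. To be clear, the signals here (shared registrant, shared hosting) are my own guesses; Google has not said what it uses to tie domains to one entity.

```python
# Hypothetical sketch: cluster domains into "networks" when they share
# an ownership signal (e.g. same registrant email or hosting account).
# The signals themselves are invented for illustration.

def group_networks(signals):
    """signals maps a domain to a set of ownership signals.
    Domains sharing any signal end up in the same network."""
    parent = {d: d for d in signals}

    def find(d):
        while parent[d] != d:
            parent[d] = parent[parent[d]]  # path halving
            d = parent[d]
        return d

    def union(a, b):
        parent[find(a)] = find(b)

    # Union any two domains that share a signal.
    first_seen = {}
    for domain, sigs in signals.items():
        for s in sigs:
            if s in first_seen:
                union(domain, first_seen[s])
            else:
                first_seen[s] = domain

    # Collect the resulting networks.
    networks = {}
    for d in signals:
        networks.setdefault(find(d), set()).add(d)
    return list(networks.values())
```

The point of the sketch is the transitivity: if a.com shares a registrant with b.com, and b.com shares hosting with c.com, all three land in one network even though a.com and c.com share nothing directly.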
That means you will be judged on your total indiscretions across multiple domains, and the penalty is cumulative rather than isolated to each domain. The focus is now on all domains owned by one entity: once suspicion has been aroused, Google will compare other related domains and look for patterns. All of this focus is more specific to the target domains (domains higher in the ranks) than to the general link graph. Because Penguin 4 is about manipulation of search ranks, the whole focus is now on removing manipulation, and doing it algorithmically. There is going to be more white noise, or collateral damage, with this approach, so it has been necessary simply to remove the positive effect of suspect links rather than actually penalise domains with any certainty.
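The "devalue, don't penalise" consequence described above might look something like this in miniature. The threshold and data shapes are, again, purely hypothetical:

```python
# Hypothetical: suspicion accumulates per *network*, not per domain,
# and links from an over-threshold network are devalued (worth nothing)
# rather than triggering a penalty on the target. Threshold is invented.

def devalue_links(network_scores, links, threshold=5):
    """network_scores: network id -> suspicion accumulated across all
    of that network's domains.
    links: list of (source_network, target_domain, link_value).
    Returns the same links with suspect ones zeroed, not penalised."""
    return [
        (src, target, 0 if network_scores.get(src, 0) >= threshold else value)
        for src, target, value in links
    ]
```

This is the soft-pussy-cat point from the top of the post: the target keeps ranking on its remaining clean links instead of being demoted outright.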
How accurate can this new method be? (Read: an aggressive tiger or a soft penguin?)
Because this is a highly intelligent (processing-heavy) method of eliminating the effect of manipulation (spam), the effects of this new system are going to be much more granular and, as a result, much more effective (read: laser-precision weapons instead of carpet-bombing towns). So much so that Google's confidence in this new method has allowed it to retire the old spam team and their old methods, no doubt with huge savings on manual labour, offset by the cost of much more processing power now being applied to the problem. This also means the effects on the SERPs will be significantly greater than anything seen in the past, so my estimate of the noticeable effects is something like:
Penguin 3: percentage of the SERPS affected 6%
Penguin 4: percentage of the SERPS affected 18%
However, the changes will not come into full effect in one huge update; instead it should take many months to fully recompute the current link graph and then reclassify it as Google knows it!