Google’s problematic Right to be forgotten algorithm

As I stated in my last blog post, in 17 years of working in the field of SEO I can recall only two reproducible and severe errors by Google. Now I have found a third. Well, it may be an error from my point of view; Google may view it as “by design”.


There has been a big discussion about the judgement of the highest European court pertaining to Google and the right to be forgotten. It has raised more questions than it answered. Since June 26, 2014, Google has been removing entries from its search results. A small piece of text at the bottom of the results page explains that, due to European data privacy regulations, some results “may have been removed”. The implementation itself is already under discussion as well.

Google has stated that several tens of thousands of requests have already been received. When you search for a name and see the statement at the bottom of the results page, you know that something happened in the past that this person wants forgotten.

This has actually occurred to the following people (click on the link to see the full search results page):

Amjad Bashir
Andrew Lewer
Diane James
Janice Atkinson
Louise Bours
Lucy Anderson
Margot Parker
Neena Gill
Ray Finch
Theresa Griffin
Tim Anker

And many, many more. The screenshots are in German because the behavior cannot be reproduced in the same way in English. The results seem to differ depending on the language. It is also possible that this happens because the new technology is still rolling out to all Google data centers.

Don’t you know these people? I don’t know them either. However, they are all members of the European Parliament.

About 20% of all members want to be forgotten.

UK names are more forgotten than German names

The strange thing is that with the German names of the European Parliament members, the problem cannot be reproduced in the same way. The Google algorithm seems to prefer English names.

Obviously there is a problem with the algorithm. Additionally, the following names have that statement at the bottom:

Fred Parker, Hans Parker, Geraldine Parker, Abbey Parker, Alicia Parker, Johannes Parker, Sven Parker, Timo Parker

but not

Matt Parker, James Parker, Thomas Parker, Francis Parker, Mary Parker

It looks as if the statement appears whenever the first name and the last name in a query each belong to a name on the list of the forgotten ones, even if they come from different people. The question remains whether any result is actually removed.
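To make this guess concrete, here is a minimal Python sketch of the matching rule as I suspect it works. It is only my hypothesis, not Google's actual algorithm, and the requester names in the usage example are invented placeholders, since the real request list is not public. The function names are hypothetical as well.

```python
def split_name(full_name):
    """Return (first name, last name) in lower case; middle names are ignored."""
    parts = full_name.strip().lower().split()
    return parts[0], parts[-1]


def build_name_parts(requesters):
    """Collect the first names and last names of all (hypothetical) requesters."""
    first_names, last_names = set(), set()
    for name in requesters:
        first, last = split_name(name)
        first_names.add(first)
        last_names.add(last)
    return first_names, last_names


def would_show_notice(query, first_names, last_names):
    """Hypothesis: the notice appears if the first name and the last name
    each occur somewhere on the request list, even for different people."""
    first, last = split_name(query)
    return first in first_names and last in last_names


# Invented placeholder data, not real removal requests.
requesters = ["Fred Smith", "Anna Parker"]
first_names, last_names = build_name_parts(requesters)

print(would_show_notice("Fred Parker", first_names, last_names))  # True
print(would_show_notice("Matt Parker", first_names, last_names))  # False
```

Under this rule, "Fred Parker" gets the notice although no Fred Parker ever submitted a request, which is the kind of false positive described above.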

As we can see, Google states only that results “may have been removed”, which leaves it unclear whether anything was actually removed. And why is there a difference between searching in English and searching in German?

Matt Cutts, the noted head of the Google web spam team, told me: “Note that we show the notice for queries that look like names in Europe.” This sounds as if the statement appears at the bottom of the results whenever a name is similar to one for which a removal request has been submitted. However, it still looks strange.

Let me ask you a question: What would you say if that statement appears in the results when you search for your own name? And what if you have a rare and unique name? Then most of the results will be about you. Everybody that searches for your name will also see it. Which is exactly what has occurred with Geraldine DeRuiter.

For this case as well, the statement appears in the German results. It also appears at times in the English results, but not always.

Geraldine DeRuiter is the wife of Rand Fishkin, a well-known person in the SEO industry and the founder of Moz.com. He was surprised by this and told me that his wife had not submitted any such request for removal.

So I tried other names that start with “Geraldine”, and sometimes you get that statement of removal, sometimes not. It still looks as if “Geraldine” is the first name of a person who submitted a request, and if you combine it with the last name of another person who submitted a request, the statement appears. But not always and not in every language.

The more people submit such requests, the more often this statement will appear in the results. In the future it’s possible that it will appear in most name searches. Does that make sense? To me it looks like an algorithmic error. Is it possible that Google wants it that way? Is it their version of protesting against the judgement?

If I were Geraldine DeRuiter and I had not submitted such a request, I would see that as a false-positive signal. I would not want this below the search results for my name. The question is whether Google will change their algorithm and whether somebody will fight against this. In any case, my personal opinion is that this is wrong.

I will now demonstrate how to check a bulk list of names for the removal statement. If you want to check a list, you can also use Forecheck for it or just drop me a line and I will have a look at it.

How to check bulk names

You can check all these names by hand, but here is how to execute it in an automated way with bulk lists of names in Forecheck:


First, save the list of names as a text file. Each name then needs a link to its Google results page. To build these links, copy the URL for search strings into the first column of an Excel list and the names into the second column.

Second, copy the two columns into a text file and delete all tabs, so that each URL and name join into a single address. Now you have a list of all the names with the URLs of their search results pages.
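If you prefer a script to Excel, the two steps above can be sketched in a few lines of Python. This is just a minimal sketch: the base URL `https://www.google.de/search?q=` is my assumption for a German results page, and the function and file names are hypothetical. Adjust them to whatever search URL you actually use.

```python
from urllib.parse import quote_plus

# Assumption: the search URL prefix; replace it with the one you want to test.
SEARCH_URL = "https://www.google.de/search?q="


def build_search_urls(names, base_url=SEARCH_URL):
    """Turn a list of names into a list of search result URLs."""
    urls = []
    for name in names:
        name = name.strip()
        if not name:
            continue  # skip empty lines
        # quote_plus encodes spaces as "+" and escapes special characters
        urls.append(base_url + quote_plus(name))
    return urls


def write_url_list(names, path):
    """Write one URL per line, ready to be opened as a URL list."""
    with open(path, "w", encoding="utf-8") as handle:
        for url in build_search_urls(names):
            handle.write(url + "\n")


names = ["Fred Parker", "Geraldine DeRuiter"]
for url in build_search_urls(names):
    print(url)
```

Calling `write_url_list(names, "names_urls.txt")` would then produce the text file that is opened in the next step.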

In Forecheck, you can open a list of URLs.


Next, open the generated text file. You will now see the list of all URLs in Forecheck. Click on the arrow at the top to start the analysis.


Now Forecheck will run through all the URLs.

After finishing the list, you can go to the Search tab and search for a string such as

“may have been removed”.

Pay attention not to use a string that could be part of the result without the removal statement. Searching for “removed” could lead to false-positive results when just the word “removed” is a description text of a result.


You can search the Content of all pages (which is the visible content) or the Source (source code). Searching the content will be faster, of course.
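The same check can be sketched in Python if you already have the visible text of each results page. The sketch below is not how Forecheck works internally; the function names and the sample texts are made up for illustration. It shows why the full phrase is a safer search string than the single word “removed”.

```python
NOTICE_PHRASE = "may have been removed"


def has_removal_notice(page_text, phrase=NOTICE_PHRASE):
    """Check the visible page text for the full phrase, ignoring case
    and collapsing line breaks and repeated spaces."""
    normalized = " ".join(page_text.split()).lower()
    return phrase.lower() in normalized


def names_with_notice(pages):
    """pages maps a name to the visible text of its results page."""
    return [name for name, text in pages.items() if has_removal_notice(text)]


# Made-up sample texts, not real search results.
pages = {
    "Name A": "Result one ... Some results may have been\nremoved.",
    "Name B": "Forum post: comment removed by moderator",
}

print(names_with_notice(pages))  # ['Name A']
# Searching only for "removed" would wrongly flag Name B as well:
print([n for n, t in pages.items() if "removed" in t])  # ['Name A', 'Name B']
```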

I hope that Google will refine its algorithm and change the way it interprets names. Perhaps Google wants this discussion to occur, as they probably deem this judgement a mistake. Of course, this judgement is not only a lot of work for Google, it also brings up many additional questions. And this problem will not only hit Google, it will hit other websites too, at least the popular ones. How will they handle it? Only the future knows.

About Thomas Kaiser

Thomas Kaiser, founder and CEO of Forecheck LLC and cyberpromote GmbH, launched his first company at 23. He developed the first MPEG-2 video coder for Windows at the Technical University of Munich. In 1997 he invented “RankIt!!”, the first SEO software program in Germany. He has also written several books and is a sought-after speaker at SEO conferences and events. He loves playing guitar, enjoys his 5 kids and has drunk SEO milk since birth. You can write him at thomas /at/ forecheck.com.
  • forecheck

    Right, still the question is, why did they not build the name database first before launching it? Some people with unusual names who apply for a job these days may get some uncomfortable questions.

  • http://www.notprovided.eu Jan-Willem Bobbink

    Thomas, Google said they would show it for every name so you will probably see it for every name you search for. Read http://www.seroundtable.com/google-right-to-be-forgotten-fake-requests-18774.html

    It also states “may have” so it doesn’t mean there really is an entry removed.

  • Pingback: Als auch Google sich das Recht auf Vergessen nahm! - SEO-united.de Blog

  • forecheck

    The whole thing is just rolling out to all data centers, so the German Parliament members may follow. The question is: Why do they show that message at all? I think they were not forced to put it there.

  • http://www.elcario.de/ elcario

    I can confirm Rayn. I see it for my relatively unique name as well and I did not request anything, so this is basically a standard procedure/warning on all name related searches. The assumption (German vs. UK Parliament Member) is sadly wrong, but that would have been interesting facts. ;-)

  • Guest

    So, Google is showing this for any name search based on their ever growing database of names. They realized that they can’t only show it for the names where things were removed – because then they indicate that that person had something removed.

    E.G. Eventually, it will show for every name search – regardless of whether something has been removed.
