Image Hijacking in Google Image Search: How it works and how to protect yourself

Image hijacking in the Google search results is an issue with a long history. It is an easy way to redirect traffic to other sites, mainly spammy ones. It has looked like a Google bug ever since it first occurred in 2007, so let’s dive in to understand what is happening.

Reports about image hijacking go back to 2007, when Philipp Lenssen wrote about it on his blog Google Blogoscoped. Barry Schwartz also reported on the phenomenon at Search Engine Land. So this is an old issue, but it still exists! Here is an example. It is in German, but there aren’t many cases like it to choose from, and the Google algorithm seems to correct each one automatically after about a day. However, Google does not seem to have fixed the underlying issue in how its algorithm indexes pictures.

When you search for “weihnachtskarten” (Christmas cards) on the German Google, you get something like this:

[Screenshot: Google search results for “weihnachtskarten”]

As you can see, this is not the image search, yet images appear all the same. Normally, the organic results include a few images like these, and each one leads to the website that contains it. The second image, for example, leads to a page on trendstrom.de. In the screenshot below, you can see the same image at the bottom left.

That said, I do not know why anyone would buy a Christmas card with a racy woman on it, like the one at the top right. However, it really is on that page… hey, it’s a free world.

[Screenshot: the trendstrom.de page]

The image hijacking effect

That is what we are used to. Then suddenly, from one day to the next, most of the images lead to another page:

[Screenshot: the spam page]

You can see the same image on that page, along with some additional ones. They are the first images that Google shows in the organic search results for “weihnachtskarten”. So what happened? First, let’s take a look at the URL behind the image in the Google search results:

http://images.google.de/imgres?tbnid=GkKkdSsUDjSaFM:&tbnh=90&tbnw=126&sa=X&imgurl=http://www.trendstrom.de/weihnachten/motive/weihnachtskarte-8.jpg&imgrefurl=http://golfdenantescarquefou.com/de-tag-weihnachtskarte%2B_%2B.html&h=299&w=420&tbm=isch&gbv=1&sei=tQxOVuufD8WqsQHs3pHoDQ
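Such a link can be taken apart with a few lines of Python. This is only a sketch: the function name split_imgres_link is made up for illustration, and the "hosts differ" check is my own rough heuristic, not anything Google documents. A differing host is also normal for sites that serve images from a CDN, so treat it as a hint, not proof.

```python
from urllib.parse import urlsplit, parse_qs

# The result link from the article, split over several lines for readability.
RESULT_LINK = (
    "http://images.google.de/imgres?tbnid=GkKkdSsUDjSaFM:&tbnh=90&tbnw=126&sa=X"
    "&imgurl=http://www.trendstrom.de/weihnachten/motive/weihnachtskarte-8.jpg"
    "&imgrefurl=http://golfdenantescarquefou.com/de-tag-weihnachtskarte%2B_%2B.html"
    "&h=299&w=420&tbm=isch&gbv=1&sei=tQxOVuufD8WqsQHs3pHoDQ"
)


def split_imgres_link(link):
    """Return (image URL, landing page URL, hosts_differ) for an imgres link."""
    params = parse_qs(urlsplit(link).query)
    img_url = params["imgurl"][0]      # where the image file lives
    ref_url = params["imgrefurl"][0]   # the page the search result links to
    hosts_differ = urlsplit(img_url).hostname != urlsplit(ref_url).hostname
    return img_url, ref_url, hosts_differ


img_url, ref_url, hosts_differ = split_imgres_link(RESULT_LINK)
print("image file  :", img_url)
print("landing page:", ref_url)
print("hosts differ:", hosts_differ)
```

Run against the link above, the image file is still on trendstrom.de while the landing page sits on a different domain.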

The imgrefurl parameter is the URL of the page that the result links to, and this is what has changed. Now let’s look at the source code of that spammy page:

[Screenshot: source code of the spam page]

As you can see, the page embeds the image directly from its original source! The link URL still carries the same imgurl, pointing to the original site, and clicking on it opens the image in full size. The question is: why does Google use the URL of that page instead of the page where the image was originally used?

First, we can see that the page is in French. Therefore, it doesn’t make sense to link to that page from German search results. Next, the website http://golfdenantescarquefou.com/ has no real trust or high rankings in the search results. Therefore, there is no logical reason to link to that page.
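The pattern described above, a page that embeds an image from somebody else's server, can be spotted mechanically. The following is a minimal sketch using only the Python standard library; the class name HotlinkFinder is hypothetical, and the snippet fed into it is the markup from the spam page, written with plain quotes. It only compares host names, so it will also flag harmless cases such as images served from a CDN.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class HotlinkFinder(HTMLParser):
    """Collect <img> sources that are served from a host other than the page's own."""

    def __init__(self, page_url):
        super().__init__()
        self.page_host = urlsplit(page_url).hostname
        self.hotlinked = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        src = dict(attrs).get("src")
        host = urlsplit(src).hostname if src else None
        # Relative URLs have no host and therefore belong to the page itself.
        if host and host != self.page_host:
            self.hotlinked.append(src)


SNIPPET = """
<a href='http://www.trendstrom.de/weihnachten/motive/weihnachtskarte-8.jpg' rel='prettyPhoto'>
<img src='http://www.trendstrom.de/weihnachten/motive/weihnachtskarte-8.jpg' style='height:100px;margin-right:5px;margin-bottom:5px;' /></a>
"""

finder = HotlinkFinder("http://golfdenantescarquefou.com/de-tag-weihnachtskarte%2B_%2B.html")
finder.feed(SNIPPET)
print(finder.hotlinked)
```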

One day later, the situation has changed. Google seems to have corrected the result, as the link goes to the original source again. Everything’s fine? No, it isn’t! Two days later, it’s the same situation, but with a different target:

[Screenshot: brokediscount]

That page is in German, but it’s a different domain! So what has happened?

Going back to the first spam page, I get this:

[Screenshot: golf-error]

Obviously that page has been removed, and not only the page but the whole domain. This could be the reason for the change.

But the pages tell me that somebody is making a business out of this hijacking. So I looked at the WHOIS information of both domains and, bam, they are the same. So let’s run a reverse IP check on that domain:

[Screenshot: reverse IP check]

As you can see, there are many domains on the same server with the same IP address. Some of them have real content, but the following domain also sits on that server:

[Screenshot: marketinghack]

I imagine you have come to the same conclusion as I have: that sounds pretty much like what they are doing…
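A real reverse IP check needs a third-party database, but the simpler question of whether a handful of known suspect domains share one server can be answered with a DNS lookup. Below is a rough sketch; the function name group_by_ip, the example domains and the addresses in the stub lookup table are all invented for illustration (192.0.2.x is a documentation-only range). In practice you would leave out the resolve argument so that socket.gethostbyname does a live lookup.

```python
import socket
from collections import defaultdict


def group_by_ip(domains, resolve=socket.gethostbyname):
    """Map each IP address to the list of domains that resolve to it."""
    groups = defaultdict(list)
    for domain in domains:
        try:
            groups[resolve(domain)].append(domain)
        except OSError:
            # The domain no longer resolves, e.g. because it was taken offline.
            groups[None].append(domain)
    return dict(groups)


# Stub resolver with made-up data so the example runs without network access.
FAKE_DNS = {
    "spam-one.example": "192.0.2.10",
    "spam-two.example": "192.0.2.10",
    "unrelated.example": "192.0.2.77",
}


def fake_resolve(domain):
    if domain not in FAKE_DNS:
        raise OSError("name does not resolve")
    return FAKE_DNS[domain]


suspects = ["spam-one.example", "spam-two.example", "unrelated.example", "gone.example"]
groups = group_by_ip(suspects, resolve=fake_resolve)
for ip, names in groups.items():
    print(ip, names)
```

Domains that end up in the same group share an IP address, which on its own proves little (shared hosting is common) but fits together with matching WHOIS data.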
To sum up: even though Google’s algorithm corrects an individual case after some time, the problem persists because the spammers use numerous domains for the image hijacking. That is something Google should solve in general. Here, the problematic domains went offline, so it is unclear whether the algorithm corrected the problem itself or whether taking those domains offline solved it indirectly. Either way, image hijacking still seems to work, especially when someone moves an image to another domain every day in a game of cat and mouse. Done periodically with specially prepared domains, this could hijack any image, and with it the traffic.

Two days later, another domain gets the traffic:

[Screenshot: semiodata]

In this case, the previous spam page redirects to this new spam page! So you want some traffic? Here is how it works:

The image hijacking hack

1) You need numerous domains with ads on them and at least some links to those domains so that they get indexed.
2) Grab the top search result images and integrate them into a spam page.
Now loop:
  a) Wait until the hijacking happens. Then wait about 48 hours.
  b) Switch everything to a new spam page on another domain. Redirect from the previous spam page to the new one with a 301.
If you organize it well, you can direct a lot of traffic for different keywords to those pages. If a domain gets banned, just use the other ones. This process can be optimized very well.

Possible solutions

When you take a closer look at the source code of the spam page, you can see rel="prettyPhoto" within the <a> tag.

<a href='http://www.trendstrom.de/weihnachten/motive/weihnachtskarte-8.jpg' rel='prettyPhoto'>
<img src='http://www.trendstrom.de/weihnachten/motive/weihnachtskarte-8.jpg' style='height:100px;margin-right:5px;margin-bottom:5px;' /></a>

This rel attribute could be one cause of the problem. There is clearly a bug in the Google algorithm, but it’s possible that the rel attribute is what leads the algorithm to misinterpret the link. In SEO, rel is best known for its nofollow value, but officially there are other possible values as well:

alternate, author, bookmark, help, license, next, noreferrer, prefetch, prev, search and tag

The tag value marks a keyword for the current document. prettyPhoto is not on that list (it is the hook of the jQuery lightbox plugin of the same name), but browsers simply ignore rel values they do not know, so rel="prettyPhoto" does no harm. For our client, I will add the attribute to the page source. Maybe this will help. I will update this blog article when I have new information. Most likely it only matters that there is a rel attribute and the value is irrelevant, but I will try the prettyPhoto value.

Another solution could be to change the URL of the image on the original website and hope that Google fetches it soon and swaps in the new URL. But of course, spammy websites may run an automated process that regularly checks the image results and updates their pages accordingly. If you do this, you might put an image at the original URL that contains some text like “Welcome to a crappy spammy website”. Or add the words “Sandra, I love you!” and show it to your beloved. Hey! Twenty years ago you sprayed the name on a wall and you were happy! Did you ever try to write something on a third-party website without hacking? Now is your chance!

And the last solution, of course, would be for Google to finally fix this issue, eight years after it first appeared. Yes, it is not easy for an algorithm to work out which page is the origin. But the algorithm is evidently able to solve the problem; it just needs about two days to do so.

If you have had the same problem and you have other suggestions or ideas, please share them!

About Thomas Kaiser

Thomas Kaiser, founder and CEO of Forecheck LLC and cyberpromote GmbH, launched his first company at 23. He developed the first MPEG-2 video coder for Windows at the Technical University of Munich. In 1997 he invented “RankIt!!”, the first SEO software program in Germany. He has also written several books and is a sought-after speaker at SEO conferences and events. He loves playing guitar, enjoys his 5 kids and has drunk SEO milk since birth. You can write him at thomas /at/ forecheck.com.