New top story on Hacker News: Tell HN: A case of negative SEO I caught on my service and how I dealt with it

40 by santah | 2 comments on Hacker News.
Recently, my service https://ift.tt/3qT7y8w experienced a huge drop in Google rankings. As I've been running it for more than 15 years, this is far from the first time this has happened. Usually I've been able to attribute big fluctuations (positive or negative) either to something I did, a Google algo change, or some external factor.

For example, about 2 years ago, something similar happened. While digging through my Search Console I discovered that Russian websites had generated thousands of links pointing to a page on Next Episode, with pornographic keywords used as link anchors. This was so effective that they managed to get those keywords to the top of the "Top linking text" report in Google Search Console, (most likely) resulting in a drop in rankings for the regular keywords and the domain in general.

About a week ago, while trying to investigate the current drop in rankings and browsing through my "Latest links" external links export from Google Search Console, I noticed something funny. There were thousands of links in there (from 3 domains) following the same URL structure as on Next Episode: domain/show-name, domain/show-name/browse, domain/show-name/season-1, etc.

Following these links revealed something even funnier: all of them displayed content directly from my site! Not even scraped/cached content: they were dynamically pulling content from my server and displaying it on their domain. Even the search worked, as did the news archive and the top charts. Here is a list of those domains as an image: https://ift.tt/3jFDLOa . I've since blocked their access, so opening any of them will not show my website right now, but here is how it looked: https://ift.tt/3jKbGW3

Now, my first thought was that they were maybe scraping the content as part of a link farm (to spam with ads?), but I also wanted to know more.
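One way to spot this kind of pattern in a links export is to group the linking URLs by domain and count how many of their paths also exist on your own site. The following is only a rough Python sketch, assuming the export has already been reduced to a plain list of URLs. The function name `mirror_candidates`, the `min_hits` threshold, and the example domains are all hypothetical and not part of any Google tool.

```python
from collections import Counter
from urllib.parse import urlparse


def mirror_candidates(linking_urls, own_paths, min_hits=3):
    """Return {domain: hits} for external domains whose URL paths
    repeatedly match paths that exist on our own site."""
    # Normalize our own paths so "/show-name/" and "/show-name" compare equal.
    own = {p.rstrip("/") for p in own_paths}
    hits = Counter()
    for url in linking_urls:
        parsed = urlparse(url)
        path = parsed.path.rstrip("/")
        # Skip bare home pages (empty path) and count only exact path matches.
        if path and path in own:
            hits[parsed.netloc.lower()] += 1
    # Keep only domains that mirror at least `min_hits` of our paths.
    return {domain: n for domain, n in hits.items() if n >= min_hits}


# Hypothetical example data: placeholder domains, not the ones from the post.
example_links = [
    "http://mirror.example/show-name",
    "http://mirror.example/show-name/browse",
    "http://mirror.example/show-name/season-1",
    "http://blog.example/some-review",
]
example_own_paths = ["/show-name", "/show-name/browse", "/show-name/season-1"]
suspects = mirror_candidates(example_links, example_own_paths)
```

A normal site that links to you will rarely reproduce your own URL paths on its domain, so a domain with many exact path matches is worth opening by hand.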
I experimented with Google searches that included pages from my website, like "Hot Shows - Next Episode", and ones with very specific news post subjects like "Streaming Services Availability added to Episodes and Movies" (posted in September last year). Imagine my surprise when I discovered that not only were the domains above indexed by Google (and listed in the search results), but there were 4-5 more domains that did the same thing, and some of them even outranked mine! Here is a full list of the domains that I discovered by searching for my news post subjects: https://ift.tt/2NgviVR . If you Google for site:domain.com you'll see some of them have thousands of pages indexed by Google.

Trying out more keyword searches, I was also able to discover these domains: https://ift.tt/3aUe3la (as they've cached the content, they still work). Those all seem to be part of the same operation, but they serve a different purpose: they have only scraped the home page of Next Episode, and all their links point to inside pages on the other domains. I suspect this is to generate incoming links to the other domains and give them some credibility.

As with the links with adult keywords as anchor text mentioned above, I suspect this whole thing is a negative SEO campaign. I don't see any other reason for it to be happening, and it seems to be achieving its goal.

Once I found all I could about the domains involved in this, I took some action:

1) I disavowed all those domains through the Google disavow tool.
2) I investigated whether I could redirect their pages to mine (as they were dynamically pulling the content, I could change it to whatever I wanted). I managed to make it work through JavaScript (though interestingly, it had to be obfuscated, as they were doing some sanitizing when pulling my content and replacing strings like "window.location.href" with "window.loc1ion.href"), but in the end I decided against it.
3) I blocked their IPs through CloudFlare (all Russian IPs).
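For anyone wanting to repeat step 1: the disavow tool takes an uploaded plain text file with one entry per line, where a `domain:` prefix disavows a whole domain and lines starting with `#` are comments. Below is a minimal Python sketch for generating such a file. `build_disavow_file` is just an illustrative helper, and the domains are placeholders rather than the ones from the lists above.

```python
def build_disavow_file(domains, note=""):
    """Build the text of a disavow file: an optional '#' comment line
    followed by one 'domain:<name>' line per unique domain."""
    lines = []
    if note:
        lines.append(f"# {note}")
    # Normalize, drop empty entries, de-duplicate, and sort for stable output.
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    lines.extend(f"domain:{d}" for d in unique)
    return "\n".join(lines) + "\n"


# Hypothetical placeholder domains, for illustration only.
disavow_text = build_disavow_file(
    ["Mirror-B.example", "mirror-a.example", "mirror-a.example "],
    note="Domains proxying or scraping my content",
)

# Write the result to a .txt file, which is what the tool expects to be uploaded.
with open("disavow.txt", "w", encoding="utf-8") as fh:
    fh.write(disavow_text)
```

Disavowing at the domain level rather than listing individual URLs fits a case like this one, where each domain exposes thousands of generated pages.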
An interesting thing here is that once I blocked an IP, the domain would somehow automatically switch to another IP to pull my content from, but once I had blocked 10 or 15 of them, they seem to have run out of IPs, and now they stay blocked.

I looked for a way to report those domains to Google, but as of today I've not found the place to do it. Does anybody know?

Today, about a week after I blocked the domains that pulled content from my site, they still have thousands of my pages indexed in Google and are ranking better than me in some search results. I'm guessing that with time Google will catch up with the fact that they don't show any content anymore and will delist those pages.

This whole thing was very new to me, so I hope it'll raise awareness that this is going on and maybe help someone else catch it happening to their website. I'd appreciate any feedback on this, and I'm around if you have any questions. It would also be interesting to hear about anyone's related experiences. Cheers!
