Google’s Classification Algorithms

As I said in my last post, I’m trying my best to post more often. The one thing I want to stress is that I don’t post unless I have something worthy of posting – I don’t want to become a one-post-a-day blog where I end up telling you that title tags are AMAZINGGGG for ranking. But anyway, today I want to talk about Google’s Classification Algorithms. Long and short, guess what – the meta tags do matter (you know, in my opinion) a lot more than you probably think right now.

Let’s take a few common-knowledge concepts in the SEO world and really think about them and how they work:

Links From Related Sites Are More Powerful
I don’t think anyone out there will really argue this one. A related PR4 link will “carry” more weight than a random PR4 link. That’s what we’ve all been taught and told by the almighty SEO gurus, right? Well, let’s assume for a moment that this is really 100% true and ask why it works. The real question you should have asked yourself the first time someone said this to you is: how does Google tell if the sites are related? It’s not just keyword density, or incoming anchor text to that page – what if it’s a lot more? Google has consistently said that incoming links can never, ever hurt your website because it would make it easy for blackhats to ‘kill’ your rankings – right? If you actually go back and listen to their technicians speak on camera about it (answering questions from the general public) you’ll notice they always give conflicting answers while they dance around this subject. I recall one day seeing one technician say that incoming links can’t hurt you, then looking at the next video in the list where the technician said “if for some reason you think another site was to blame for this, submit a reinclusion request”. I quote that because I think that’s what he/she said, but I can’t remember completely.

So the real question is, can this happen? Yes, it can. I’ve done it a few times. Just figure out the triggers and give it a whirl, it’s a lot easier than you’d ever think. But let’s not get off topic, shall we? Let’s get back to classification algorithms. So let’s look at how Google will classify your site. Everyone says that meta tags are pretty much useless when it comes to ranking. But what if they were actually helpful in determining how much link juice is passed to you from another site, via related content? What if your site and their site both had a meta tag saying your niche was ‘weight loss’, do you think your link would be unrelated or related? Ever think that maybe meta tags are there to help classify the actual subjects of your site, therefore affecting the incoming link juice from other sites? Give it a whirl, you’d be amazed at the actual results. Personally I won’t say that the rankings were specifically from this change, but I’ve had a site jump from the high 150’s to low 40’s in the span of 3 days with no link building by just adding some meta tags. Coincidence? Maybe, maybe not. Just food for thought.
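If you want to play with the idea, here is a minimal Python sketch of what a "related or not" check based on meta tags could look like. To be clear, this is my own guess at the shape of such a check, not anything Google has confirmed: the function names (meta_keywords, niche_overlap), the use of the keywords meta tag, and the Jaccard overlap score are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collects the comma-separated terms from <meta name="keywords">."""

    def __init__(self):
        super().__init__()
        self.keywords = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "keywords":
            content = attr.get("content") or ""
            for term in content.split(","):
                term = term.strip().lower()
                if term:
                    self.keywords.add(term)


def meta_keywords(html):
    """Return the set of lowercased meta keywords found in an HTML string."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return parser.keywords


def niche_overlap(html_a, html_b):
    """Hypothetical relatedness score: Jaccard overlap of meta keywords (0.0 to 1.0)."""
    a, b = meta_keywords(html_a), meta_keywords(html_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up example pages: both declare 'weight loss' as a keyword.
linking_page = '<html><head><meta name="keywords" content="weight loss, diet, fitness"></head></html>'
my_page = '<html><head><meta name="keywords" content="Weight Loss, diet plans"></head></html>'
print(niche_overlap(linking_page, my_page))
```

A score above zero just means the two pages claim at least one niche term in common. Whether a real classifier weighs that at all is exactly the thing to test.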

Too Many Links Too Fast Will Result in Sandboxing
There are a lot more factors in play with this one that make it appear that way, yeah. You first need to clarify exactly what sandboxing is. To some people sandboxing is this little place where your site goes before it can prove that it’s normal. It reminds me a lot of the kid in preschool that stays in the corner looking at the paint dry, personally. For a lot of people sandboxing is a 3rd stage for sites. There’s sandboxed, normal, and authority. Other people (me included) believe there are literally 2 types, which are normal and authority. To me sandbox’d is nuked to all fucking hell, not ranking for damn near anything including its own name. The only time you’ll see this site is if you do or What if it wasn’t necessarily that too many links too fast results in sandboxing, but merely too many links without resulting traffic to a site results in sandboxing? If you don’t think that Google actually shares information between analytics and the SERP engines, then you’re not really looking too deep into things.
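Here is a rough Python sketch of that "links without traffic" idea, in case you want something concrete to reason about. Everything in it is hypothetical: the function names, the min_links and min_ratio thresholds, and the assumption that referral visits per new link is the signal being watched. It only illustrates the theory; I am not claiming this is how the real thing works.

```python
def clicks_per_link(new_links, referral_visits):
    """Average referral visits per newly acquired link (None if there are no new links)."""
    if new_links <= 0:
        return None
    return referral_visits / new_links


def looks_like_dead_links(new_links, referral_visits, min_links=50, min_ratio=0.1):
    """Flag a burst of new links that sends almost no visitors.

    min_links and min_ratio are made-up thresholds for illustration only.
    """
    ratio = clicks_per_link(new_links, referral_visits)
    if ratio is None or new_links < min_links:
        return False  # too few links to call it a burst
    return ratio < min_ratio


print(looks_like_dead_links(500, 3))    # burst of links nobody clicks
print(looks_like_dead_links(500, 400))  # burst of links sending real visitors
```

Under this toy rule the first burst gets flagged and the second does not, which is the whole point of the theory: same link velocity, different outcome depending on click-through.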

I implore you to try a little test. Try link building fast where people will actually click through to your site. Article sites, directories with traffic that people will actually click through, etc. Throw all of them at your site overnight and see what happens. Try it on a fresh domain even. The biggest determining factor here is domain age. If the domain is aged, it will go quite well and you’ll magically rank higher across most of your terms. If not, you’ll most likely drop out of the SERPs for a few days (not a deindex, though) and then bounce back with higher rankings. There’s a lot more to Google’s madness than people will give them credit for. Their algorithm is one of the most complex algorithms I’ve seen in a long ass time, but it’s not impossible to figure out how it works. It’s no different than any other algorithm at this time, which means it was written by humans and it can be figured out by humans. You just need to get into their mindset. Chug a redbull or two, pop a NODOZ and buckle down in a very poorly lit room with techno blaring and just think about every-little-stinkin-detail and you’ll start seeing a little bit about it that’s different (Disclaimer: I do not recommend anyone doing what I just said, I merely exaggerated for effect). Never overlook anything.

What I Don’t Have in Quality, I Make Up In Quantity
A lot of the people that I talk to about link building usually say the same thing: I don’t need stinkin PR8 links when I can throw 200 PR5’s at it. And to a certain point, that’s true. But let’s consider the last algorithm update. Most won’t argue that Social Bookmarking links are fairly well nuked now (stay classy, autopligg). The update really started making me think. What if Google decided to change the algorithm so that your potential ranking power was changed from (individual link power * links) to (total link power / links)? Then, for example, 500 PR4 links would outweigh 5000 PR3’s. Would your sites still rank? Would they still carry so much authority that the new post you just wrote gets top 5 for the title? Keep that in mind. The goal of a good search engine marketer is to be able to not just adapt to algorithm changes, but anticipate future algorithm changes and plan ahead for them. Make it work all the time, not just when you scramble to fix it.
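To see what that switch would do to the numbers, here is a quick Python sketch comparing the two formulas. It assumes, purely for illustration, that the toolbar PR number can be used directly as a link's "power", which is a big simplification. The function names summed_power and averaged_power are mine.

```python
def summed_power(link_powers):
    """Quantity model: (individual link power * links), i.e. the total."""
    return sum(link_powers)


def averaged_power(link_powers):
    """Quality model: (total link power / links), i.e. the average."""
    if not link_powers:
        return 0.0
    return sum(link_powers) / len(link_powers)


many_weak = [3] * 5000     # 5000 PR3 links
fewer_strong = [4] * 500   # 500 PR4 links

print(summed_power(many_weak), summed_power(fewer_strong))
print(averaged_power(many_weak), averaged_power(fewer_strong))
```

Under the summed model the 5000 PR3's win easily. Under the averaged model the 500 PR4's come out ahead, and every weak link you add drags your average down.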

One of the experiments I had lately was a 4-5 year old .com domain with only 5 links, but all 5 links were from Yahoo Directory. It ranked top 50 for its term (and the term, btw, had 900,000 on allintitle) with nothing but a blank front page with the term in the h1 and title. Pretty cool, eh? I threw 50 or 100ish PR2-5 at it and I got kicked to 200+ for the same term. What the hell, right? It looks at least to me like it’s definitely possible for the changes to be in testing with certain things. The goal of Google is to provide the best user experience (credit to Steve when I forgot all about that one) so wouldn’t they try to discourage or possibly even attempt to derail certain ways to manipulate the SERPs? Yes my friends, SEO is a way to modify and manipulate the SERPs. Just keep that in mind when you start your tests and try different techniques and methods.

Google Only Listens To What You Tell Them
Now this is going to be a fun one. If you set the text in a title tag to “hey this is my site”, the title for your site will always be “hey this is my site”, right? Nope. Not always. One of my clients actually is a prime example. They own the domain <term>. Without my knowledge they changed their site completely, and the title was reset to index. I went to check on their rankings, and for the term <term> their title was not index, but actually <term>. I click through and see index. Confused as hell, I click back and check the Google cache date. It’s current, today in fact. I query <term> inc, and their title was <term> inc. Same pages, same caches, different terms, different titles. Interesting, eh? Do you need more proof about backend classification?

Here’s another interesting one for you to sleep on. A member over at WickedFire actually followed up with me about one of the previous posts where I mentioned graphics being read via OCR and that contributing to your ranking. He PM’d me to actually let me know that one of the Google bots (or whatever) actually took the tag line in his logo, and used it for the meta description in the SERPs. It’s nowhere else on the pages, not in alt tags or comment tags even – just in the image. After reading his message again I definitely laughed on the inside, and then stood up with a very loud “WHAT THE FUCK?” Chea son. Dis shit happened. About 15 minutes later I started to blur the backgrounds behind the tag lines on all of my logos to a solid color. Right now in your head you should be saying “shit just got real”. Yeah.

The bottom line that you should take from this is not what the intention of your site is. It’s what Google takes as your intention. Sometimes you need to spell shit out. Sometimes you need to make sure they can’t mistake the point behind one of your sites or pages. You need to make sure that Google understands exactly what it’s about.

Anyway, it’s almost 2 AM here but this was my random little session for the night. I’ll still be up for another hour or two if you want to poke me on Skype ( is the username). Also if anyone wants to have a drunken idea session at Affiliate Summit, I’ll be there. Leave a comment with some ideas ;) Peace!


Posted 6 years ago in Search Engine Marketing
About the Author

My name is Rob Adler and I'm an algo-holic. I spend most of my time coding, data mining, spidering and consulting for SEO. I hope the posts here are beneficial for you, and hopefully I can blow your mind every now and again.

12 Comments to Google’s Classification Algorithms
    • Alan Bleiweiss
    • Great article – and while I don’t off-the-cuff take your word for it on each aspect, I completely agree that we need to think things through and test stuff. Too many people play the “but everyone says..” game.

    • Contempt
    • Indeed, and thanks for the comment. I do understand a lot of the things that I say would spark the “yeah right” mentality but I’m glad I could make a few people here and there think differently. :)

    • Victory
    • Links too fast is bad if your content doesn’t merit it. It seems like people have this delusion that G takes the same number of CPU cycles for each page that it visits, which is patently untrue.

      In reality G is going to scrutinize your site more if it seems to be more relevant at the moment. Relevant at the moment can be guessed upon via the number of new links that are pointing to it.

      So if you get too many links too fast, then G is going to go through many, many CPU cycles to see if your page is really something it should be getting out to its users. If not, or if those links are crap, well then G can guess why you are getting them.

    • Contempt
    • To an extent, yes. I agree completely with the point you make about relevance. We’ll talk tomorrow; if you can message me on Skype I’ll show you what I mean. There are a few factors that I can’t write about that show exactly what merits links and what doesn’t. :)

    • axi
    • Hi,
      first, excuse my poor English. I read some of your articles, and some are quite interesting.
      I am curious what you think about content translations.
      For example, if I have a French movie site, I can scrape content from Amazon reviews or editorial opinions, translate it, and put it on my site.
      I tried a little test in the past, but Google seems to detect those translations.

      What do you think? Did you test something like that?

    • Contempt
    • Using translations as content “spinning” is nothing new. To be honest, as long as I’ve been doing this I’ve known about that one. But yeah, in my opinion – though I haven’t completely tested it – they can tell the difference depending on who spiders it. In other words, spiders english and spanish is different than if and spiders it.

    • axi
    • I have had a doubt for a while concerning penalties on some exact keyword searches.
      For example, I have one site related to wireless networks, and
      if you search “wifi networks” it doesn’t appear in the results or appears at position 600+, and if you search “wifi wireless networks” it is first in the ranking because that term has a lot of exact anchors.
      Or for example my domain name has the wifi keyword, and all of the links to my site have the wifi keyword in their anchors, but if you search “wifi” my site doesn’t appear in that search.

      The obvious thought is that it’s related to the competitiveness of the keywords, but this is not always true; for example, “wifi networks” is not a very competitive search in my language.
      No keyword stuffing.. no bh techniques, it just happens.

      Do these kinds of penalties (or maybe they aren’t penalties, and are indexation techniques) disappear with time? Do you have experience with this?

      Thanks in advance

    • SEO hosting
    • “I threw 50 or 100ish PR2-5 at it and I got kicked to 200+ for the same term. What the hell, right?”

      You mentioned that your site got kicked; however, I’m curious to know whether it got back up (stronger and higher) after a few days?
      In my experience, when I throw a few sudden links at my sites, they lose rankings for a few days and then jump back up.
      I’ve never seen any of my sites lose rankings because of too many incoming links suddenly.

      Of course, xRumer has caused problems for me but those links are really spammy. However, regular legit-looking links have never been a problem for me; 200-500, definitely not.

    • Contempt
    • Yeah it got a lot higher with auth links. Auth ftw.

      I always use a mix of different types of link building within all of my sites now, it’s a lot easier to avoid it almost entirely.

    • I loved this article. I never thought of Google looking at not just links, but traffic from links when considering whether to dump a site in the sandbox. Makes sense though, and made me reevaluate a few things, so thanks.
