If You’ve Said It Once, You Should Say It Again – Repetition is Good for Google
May 18, 2009 by Aaron Rubman
While Google is arguably the largest data retrieval system in the world, it is not built like the physical retrieval systems that are familiar to most of us who work in office (and home office) environments.
Most of us employ some sort of grouping or filing system when we are given a document that we think we may need to find again in the future. This might be a Rolodex, a well-maintained cabinet of hanging files, or even (perish the thought) ill-defined stacks of paper where the most recently used documents tend to gravitate towards the top.
Basically, you know which documents are enough like each other that you would look for them together. But I’d be willing to bet that at least once during your life you have walked into the room or office of someone who uses a different sort of filing style and been completely baffled. It may make perfect sense for one person to store “opera tickets” next to “open bank accounts” while another would consider them so categorically different that they’d each be stored in entirely different cabinets.
Google Has to Deal With it All
No matter how someone structures their website, or which concepts they think go together on a page, Google still has to know how to retrieve it.
Google is now big enough that they could theoretically set up an organizing scheme like the Dewey Decimal system or the Standard Industrial Classification, and people would flock to it to keep from losing their place in Google’s listings. However, this would also force anyone wishing to search Google to be familiar with their classification system, and anything that makes a search engine less user-friendly will also make it less popular.
Instead, Google uses some variation on a full text search. In other words, they send programs through every web page to read every word that appears on that page. These programs are frequently referred to as ‘bots’ (short for robots) or ‘spiders’ (because they wander the Web).
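To make the idea of a full text search a little more concrete, here is a minimal sketch in Python. It is not Google's method, which is not public; it only shows the general technique of reading every word on every page and remembering which pages each word appeared on. The page names, the sample text, and the function names are all made up for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into individual words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(pages):
    """Map each word to the set of page names that contain it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in tokenize(text):
            index[word].add(name)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical pages, standing in for what a spider might have read.
pages = {
    "tickets.html": "Buy opera tickets online",
    "bank.html": "Open bank accounts online",
}
index = build_index(pages)
print(sorted(search(index, "opera tickets")))
```

Notice that nobody had to file the two pages under a category first. The searcher just types words, and the index finds the pages where those words appear.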
What Do These ‘Spiders’ Look For?
Google is, quite sensibly, not giving all the details. They want to inspire people to write good content, and not poor content that just happens to mesh well with their formula.
However, until someone comes out with a way for a machine to actually understand what it reads, the most likely method is the use of a concordance file. In plain speech, that is a computer-generated index that tracks the number of times a phrase is used, the portion of the copy that the phrase makes up, or some combination of both.
This is why anything worth saying once online is worth saying again. If you repeat yourself, Google will be more likely to realize you are actually talking about a topic and not just bringing it up in passing.
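For the curious, a concordance of the kind described above can be sketched in a few lines of Python. This is only an illustration of counting phrases and measuring what share of the copy they make up. It assumes a "phrase" is simply a run of consecutive words, and the function names and sample text are hypothetical, not anything Google has published.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into individual words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def concordance(text, phrase_length=2):
    """Count each run of `phrase_length` consecutive words.

    Returns a dict mapping phrase -> (count, share), where share is
    the phrase's count divided by the total number of phrases found.
    Punctuation is ignored, so phrases may span sentence boundaries.
    """
    words = tokenize(text)
    phrases = [
        " ".join(words[i:i + phrase_length])
        for i in range(len(words) - phrase_length + 1)
    ]
    total = len(phrases)
    counts = Counter(phrases)
    return {phrase: (n, n / total) for phrase, n in counts.items()}

# Hypothetical page copy that repeats its main topic.
copy = "Opera tickets on sale. Cheap opera tickets for every opera."
result = concordance(copy, phrase_length=2)
count, share = result["opera tickets"]
print(count, round(share, 2))
```

In this sample, "opera tickets" is the only two-word phrase that shows up more than once, which is exactly the kind of signal that separates a page's real topic from a passing mention.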
With this in mind, is there anything in this blog entry that you feel bears repeating?

