Saturday 26 February 2011

Google Revamps Its Search Engine to Fight Cheaters

By AMIR EFRATI

Google Inc., long considered the gold standard of Internet search, is changing the secret formula it uses to rank Web pages as it struggles to combat websites that have been able to game its system.

The Internet giant, which handles nearly two-thirds of the world's Web searches, has been under fire recently over the quality of its results. Google said it changed its mathematical formula late Thursday in order to better weed out "low-quality" sites that offer users little value. Some such sites offer just enough content to appear in search results and lure users to pages loaded with advertisements.

Google has changed its search algorithm in an effort to filter out data from "content farms" in search results. Marcelo Prince, Jessica Vascellaro and Simon Constable discuss how this affects site rankings and revenues for businesses.

Google generates billions of dollars from advertising linked to its search engine, whose influence as a front door to the world's online content and commerce continues to grow by the year. Google's power over the fortunes of so many other companies has made it a target of competitor complaints. It has also faced government investigations, including scrutiny by regulators in the U.S. and Europe.

The Silicon Valley company built its business on the strength of algorithms that yield speedy results. The company constantly refines those formulas, and sometimes takes manual action to penalize companies that it believes use tricks to artificially rise in search rankings. In recent weeks, it has cracked down on retailers J.C. Penney Co. and Overstock.com Inc.

Last month, Google acknowledged it "can and should do better" to beat back sites that "copy content from other websites" or provide information that is "just not very useful" but are ranked highly anyway.

"I've never seen Google be attacked on the relevancy of their results the way they have these past couple of months," said Danny Sullivan, editor of a widely read blog about the field called Search Engine Land.

The debate about Google's results was sparked by a recent blog post by Vivek Wadhwa, a former technology executive and a visiting scholar at the University of California-Berkeley. He wrote that his students had trouble finding basic information about the founders of start-up companies on Google.

"The problem is that content on the internet is growing exponentially and the vast majority of this content is spam," or of little use, he wrote. "Google has become a jungle."

On Friday, Mr. Wadhwa said in an interview that he had previously "written Google off" but is now "optimistic they may well get this under control," though it will take time to see whether there are improvements. "It's not rocket science; they know who the bad guys are, they compensate the companies" by letting them post Google ads and share revenue, he said.

Google search engineer Amit Singhal said in an interview that the company added numerous "signals," or factors it would incorporate into its algorithm. Among those signals are "how users interact with" a site.

It also used feedback from hundreds of people it regularly hires to evaluate changes. These "human raters" were asked to look at search results and decide whether they would give their credit card number to a site or follow its medical advice, Mr. Singhal said.

On Thursday night, Mr. Singhal and a colleague wrote in a blog post that most of the changes would be "so subtle that very few people notice them" but "it's a big step in the right direction of helping people find ever higher quality in our results."

About 12% of U.S.-based queries would be affected by the change, Google said, and the changes would expand to non-U.S. users in the near future.

Google didn't give examples of Web pages that rose or dropped in its rankings for certain queries, setting off a wave of speculation by professionals whose job it is to help sites rise in Google's results.

"It has to be that some sites will go up and some will go down," the Google engineers wrote, adding that sites with original content "such as research, in-depth reports, thoughtful analysis and so on" will move up.

Many sites rely on Web traffic from Google, and even a small drop in the rankings could have a large impact and potentially reduce revenue. On Friday some large content creators, such as HubPages.com and ChaCha.com, said they noticed significant changes to traffic for some of their pages.

Demand Media Inc., which recently went public and runs large content sites such as eHow.com and Answerbag.com, said "we haven't seen a material net impact."

Mr. Sullivan, the blogger, said an eHow page with what he characterized as "shallow" content previously appeared as the first Google search result when users searched "how to get pregnant fast." Since Google's change Thursday, the eHow page has dropped out of the top results.

Thursday's move was an example of Google's tremendous influence over the Web, which has drawn scrutiny from U.S. and overseas governments that have launched probes to see whether it is involved in anticompetitive behavior. More recently, some websites have complained that Google is placing links to its own services ahead of links to its competitors.

Google says it acts in the best interest of users and that frustration by some sites is understandable.

"Google has an enormous amount of power to make or break businesses," said Scott Jones, chief executive of ChaCha Search Inc., a question-and-answer site, who said he was seeing some negative effects from Thursday's algorithm change, especially for Web pages on his site that have short, "bite-sized" content.

"It's unfair, I think, that Google made some wide, paint-brush decisions here in their algorithm that didn't take into account a site like ChaCha that does have unique content created at fairly high cost," he said.

Paul Edmondson, chief executive of HubPages.com, which shares ad revenue with writers who publish Web content on topics ranging from making scarves to Mexico's Day of the Dead holiday, said it was too early to tell how his site would fare under the changes.

Web traffic sent by Google to a HubPages article about nose piercing rose by 40% since yesterday, he said, while traffic to an article on "what happens if you abandon your home and let it foreclose" dropped by 80%.

Google said the effort that resulted in the latest search change has been underway for about a year. In order to learn which sites users find to be of poor quality, Google earlier this month began offering software for its Chrome browser that allows users to block sites from their search results if they deem them to be low quality.

Once blocked, the sites won't appear during future searches. Google on Thursday said that while it didn't use data from the experiment to influence the changes it made to its algorithm, it found that the algorithm change covered 84% of the Internet sites that were the "most-blocked" by users.

One new competitor to Google, start-up search engine Blekko, relies on its users to weed out what they believe are poor sites in categories such as health, cars and personal finance.

"Overall Google has done a great job and there are very few cracks in the system," said Seth Besmertnik, chief executive of Conductor Inc., a firm that helps companies such as General Electric Co. and Federal Express rank highly on search engines. "But spammers are getting smarter and Google needs to keep getting smarter."

Write to Amir Efrati at amir.efrati@wsj.com