2012-04-25 13:28:10
It only takes half a second for Google to return a search based on keywords you type in, but there’s a whole lot more happening behind the scenes to give you the results you need. Google on Monday launched a video that explains the science behind how the massive search engine actually works.
Matt Cutts, a software engineer and head of Google’s webspam team, details in a YouTube video how the search engine giant scours the web on a daily basis to provide the most up-to-date results to users.
“There are three things you need to do to be the best search engine in the world. First, you need to crawl the web comprehensively and deeply, then you want to rank or serve those pages and return the most relevant ones first,” Cutts said.
Although Google crawls the web on a daily basis, that wasn’t always the case.
“We used to crawl for 30 days… and then index for about a week and push that data out — and that would take about a week,” Cutts said. “Sometimes you would hit a data center with new data and sometimes you would hit a data center with old data.”
But this approach left much of the index out of date. In 2003, Google switched to crawling a significant portion of the web each day and updating its index incrementally as new content was found.
“We have gotten even better over time, and at this point, we can keep it very fresh,” Cutts said.
To keep the index fresh, PageRank is the main factor in deciding how early a page gets crawled: “We basically take page rank as the primary determinant and the more page rank you have, that is, the more people that link to you and the more reputable those people are, the more likely it is that we will discover your page relatively early in the crawl,” Cutts said.
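As a rough illustration of crawling the most reputable pages first, here is a minimal Python sketch. The scores and URLs are made up, and the function name `crawl_order` is hypothetical; it only shows a priority queue ordering a crawl by score and is not Google's actual crawler.

```python
import heapq

def crawl_order(scores):
    """Return URLs in the order a score-prioritized crawler would visit them.

    `scores` maps URL -> a hypothetical reputation score (higher is better).
    """
    # heapq is a min-heap, so negate each score to pop the highest first.
    heap = [(-score, url) for url, score in scores.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        _, url = heapq.heappop(heap)
        order.append(url)
    return order

# Made-up scores for illustration only.
scores = {
    "example.com/popular": 0.9,
    "example.com/obscure": 0.1,
    "example.com/average": 0.5,
}
print(crawl_order(scores))
```

Pages with higher scores come out of the queue first, so they would be discovered earlier in the crawl.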
Google also places a lot of emphasis on word order and proximity. For example, a search for pop singer “Katy Perry” will favor results with those two words next to each other, rather than pages where “Katy” and “Perry” show up in different parts of the content.
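One simple way to picture word proximity is to measure how far apart two query words sit in a page. The Python sketch below is an assumption-laden illustration (the function `min_word_gap` is hypothetical, and real ranking combines many more signals); a gap of 1 means the words are adjacent and in order.

```python
def min_word_gap(text, first, second):
    """Smallest number of word positions from `first` to a later `second`.

    Returns None if `second` never appears after `first`.
    """
    # Lowercase and strip basic punctuation so "Perry." matches "perry".
    words = [w.strip('.,!?"\'').lower() for w in text.split()]
    first, second = first.lower(), second.lower()
    best = None
    last_first = None  # position of the most recent occurrence of `first`
    for i, word in enumerate(words):
        if word == first:
            last_first = i
        elif word == second and last_first is not None:
            gap = i - last_first
            if best is None or gap < best:
                best = gap
    return best

print(min_word_gap("Katy Perry released a new album", "Katy", "Perry"))
print(min_word_gap("Katy went to the store with Mr. Perry", "Katy", "Perry"))
```

A ranking function could then prefer the first page, where the gap is smallest.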
Finding the right balance between word proximity, page reputation and the links pointing to a page is the key.
“That’s kind of the secret sauce,” Cutts added.
Google then sends that query out to hundreds of different machines all at once, which look through their fraction of the web that has been indexed to find the best match.
“We say, ‘what’s the best page that matches this query across our entire index?’” Cutts said. “We take that page and we try to show it with a useful snippet, so we show the keywords in the context of the document and get it all back in under half a second.”
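The idea of sending one query to many machines, each holding a fraction of the index, can be sketched in a few lines of Python. Everything here is an assumption for illustration: the shards are small in-memory dictionaries, threads stand in for separate machines, and the scoring (counting query words) is a placeholder rather than anything Google has described.

```python
from concurrent.futures import ThreadPoolExecutor

def search_shard(shard, query):
    """Score each document in one shard by how often query words appear."""
    terms = query.lower().split()
    results = []
    for url, text in shard.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            results.append((score, url))
    return results

def search(shards, query, top_k=3):
    """Send the query to every shard in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = list(pool.map(lambda shard: search_shard(shard, query), shards))
    merged = [hit for part in partials for hit in part]
    # Highest score first; break ties by URL so the order is deterministic.
    merged.sort(key=lambda hit: (-hit[0], hit[1]))
    return [url for _, url in merged[:top_k]]

# Two made-up shards, each holding a fraction of a tiny "index".
shards = [
    {"a.example": "katy perry tour dates"},
    {"b.example": "perry mason reruns", "c.example": "cooking tips"},
]
print(search(shards, "katy perry"))
```

Each shard only looks through its own documents, and the merge step picks the best matches across all of them.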
Source: mashable.com