Search engines have a short list of critical
operations that allows them to provide relevant web
results when searchers use their system to find
information.
Crawling the Web
Search engines run automated programs, called "bots"
or "spiders", that use the hyperlink structure of
the web to "crawl" the pages and documents that make
up the World Wide Web. Estimates are that of the
approximately 20 billion existing pages, search
engines have crawled between 8 and 10 billion.
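As a rough illustration of how a spider follows hyperlinks, the sketch below walks a tiny, made-up link graph breadth-first. The page names and the LINKS table are hypothetical stand-ins for real URLs, and a real crawler would fetch and parse pages over the network rather than read a dictionary.

```python
from collections import deque

# Hypothetical in-memory "web": each page maps to the pages it links to.
LINKS = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["widget", "gadget"],
    "widget": [],
    "gadget": ["widget"],
    "orphan": ["home"],  # nothing links here, so a spider never finds it
}

def crawl(start, links):
    """Return pages in the order a breadth-first spider discovers them."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order

discovered = crawl("home", LINKS)
```

Note that the "orphan" page is never discovered: it links out, but no crawled page links to it.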
Indexing Documents
Once a page has been crawled, its contents can be
"indexed" - stored in a giant database of documents
that makes up a search engine's "index". This index
needs to be tightly managed so that requests which
must search and sort billions of documents can be
completed in fractions of a second.
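One common way to make lookups this fast is an inverted index, which maps each term to the documents that contain it. The sketch below is a minimal version under simple assumptions (whitespace tokenizing, a three-document corpus invented for the example); it is not a description of any particular engine's index.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

# Hypothetical corpus for illustration only.
DOCS = {
    1: "car and driver magazine reviews",
    2: "the driver parked the car",
    3: "gardening magazine",
}

index = build_index(DOCS)
car_docs = index["car"]            # documents containing "car"
magazine_docs = index["magazine"]  # documents containing "magazine"
```

Because the index is keyed by term, answering a query means reading a few sets rather than scanning every document.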
Processing Queries
When a request for information comes into the search
engine (hundreds of millions do each day), the
engine retrieves from its index all the documents
that match the query. A document matches if the
terms or phrase are found on the page in the manner
specified by the user. For example, a search for
car and driver magazine at Google returns 8.25
million results, but a search for the same phrase in
quotes ("car
and driver magazine") returns only 166 thousand
results. In the first search, commonly called "Findall"
mode, Google returned all documents that had the
terms "car", "driver", and "magazine" (it ignores
the term "and" because it's not useful for
narrowing the results), while in the second search,
only those pages with the exact phrase "car and
driver magazine" were returned. Other advanced
operators (Google has a
list of 11) can change which results a search
engine will consider a match for a given query.
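The difference between the two searches above can be sketched in a few lines. The stop-word list, the helper names, and the sample page texts are assumptions made for the example; real engines tokenize and match far more carefully.

```python
STOP_WORDS = {"and"}  # assumption: a tiny stop-word list for illustration

def tokens(text):
    """Split text into lowercase words."""
    return text.lower().split()

def matches_all_terms(query, text):
    """'Find all' mode: every non-stop-word term appears somewhere in the text."""
    words = set(tokens(text))
    return all(term in words for term in tokens(query) if term not in STOP_WORDS)

def matches_phrase(phrase, text):
    """Exact-phrase mode: the words appear together, in order."""
    wanted, words = tokens(phrase), tokens(text)
    return any(
        words[i:i + len(wanted)] == wanted
        for i in range(len(words) - len(wanted) + 1)
    )

QUERY = "car and driver magazine"
page_a = "Car and Driver magazine subscription"
page_b = "a magazine for every car driver"
```

Here page_a satisfies both modes, while page_b matches only in "find all" mode, which is why the quoted search returns far fewer results.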
Ranking Results
Once the search engine has determined which results
are a match for the query, the engine's algorithm (a
mathematical equation commonly used for sorting)
runs calculations on each of the results to
determine which is most relevant to the given query.
The engine sorts these on the results pages in order
from most relevant to least so that users can make a
choice about which to select.
Although this list of operations is short,
systems like Google, Yahoo!,
AskJeeves, and MSN are among the most complex,
processing-intensive computers in the world, managing
millions of calculations each second and funneling
demands for information to an enormous group of users.
Speed Bumps & Walls
Certain types of navigation may hinder or entirely
prevent search engines from reaching your website's
content. As search engine spiders crawl the web, they
rely on the architecture of hyperlinks to find new
documents and revisit those that may have changed. In
the analogy of speed bumps and walls, complex links
and deep site structures with little unique content
may serve as "bumps." Data that cannot be accessed by
spiderable links qualify as "walls."
Possible "Speed Bumps" for SE Spiders:
URLs with 2+ dynamic
parameters; e.g. http://www.url.com/page.php?id=4&CK=34rr&User=%Tom%
(spiders may be reluctant to crawl complex URLs like
this because they often result in errors with
non-human visitors)
Pages with more than
100 unique links to other pages on the site (spiders
may not follow each one)
Pages buried more
than 3 clicks/links from the home page of a website
(unless there are many other external links pointing
to the site, spiders will often ignore deep pages)
Pages requiring a
"Session ID" or Cookie to enable navigation (spiders
may not be able to retain these elements as a
browser user can)
Pages that are split
into "frames" can hinder crawling and cause
confusion about which pages to rank in the results.
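Several of these speed bumps can be checked mechanically. The sketch below tests a page against the three numeric thresholds from the list above (2+ dynamic parameters, more than 100 links, more than 3 clicks from the home page). The function name, the warning labels, and the idea that you already know a page's link count and click depth are assumptions for the example.

```python
from urllib.parse import urlsplit, parse_qsl

MAX_PARAMETERS = 1    # the list flags URLs with 2 or more dynamic parameters
MAX_LINKS = 100       # more than 100 links on a page may not all be followed
MAX_CLICK_DEPTH = 3   # pages deeper than 3 clicks are often ignored

def speed_bumps(url, link_count, click_depth):
    """Return a list of likely crawling obstacles for one page."""
    warnings = []
    parameters = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    if len(parameters) > MAX_PARAMETERS:
        warnings.append("dynamic parameters")
    if link_count > MAX_LINKS:
        warnings.append("too many links")
    if click_depth > MAX_CLICK_DEPTH:
        warnings.append("buried too deep")
    return warnings

report = speed_bumps(
    "http://www.url.com/page.php?id=4&CK=34rr&User=%Tom%",
    link_count=150,
    click_depth=2,
)
clean = speed_bumps("http://www.url.com/about.html", link_count=20, click_depth=1)
```

The thresholds are rules of thumb taken from the list, not limits published by any search engine.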
Possible "Walls" for SE Spiders:
Pages accessible only
via a select form and submit button
Pages requiring a
drop down menu (an HTML select element) to access them
Documents accessible
only via a search box
Documents blocked
purposefully (via a robots meta tag or robots.txt
file)
Pages requiring a
login
Pages that re-direct
before showing content (search engines call this
cloaking or bait-and-switch and may actually ban
sites that use this tactic)
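To see how a robots.txt "wall" works in practice, the sketch below uses Python's standard urllib.robotparser module to test two URLs against a small rule set. The rules, the "ExampleBot" user agent, and the example.com URLs are made up for the illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block every spider from /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is an invented spider name.
can_crawl_public = parser.can_fetch(
    "ExampleBot", "http://www.example.com/index.html"
)
can_crawl_private = parser.can_fetch(
    "ExampleBot", "http://www.example.com/private/report.html"
)
```

A well-behaved spider runs a check like this before requesting a page and skips anything the file disallows.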
The key to ensuring that a site's contents are
fully crawlable is to provide direct, HTML links to
each page you want the search engine spiders to index.
Remember that if a page cannot be accessed from the
home page (where most spiders are likely to start
their crawl), it is likely that it will not be indexed
by the search engines. A sitemap (which is
discussed later in this guide) can be of
tremendous help for this purpose.
Measuring Relevance and Popularity
Modern commercial search engines rely on the
science of information retrieval (IR). That science
has existed since the middle of the 20th century, when
retrieval systems powered computers in libraries,
research facilities, and government labs. Early in the
development of search systems, IR scientists realized
that two critical components made up the majority of
search functionality:
Relevance - the degree
to which the content of the documents returned in a
search matched the user's query intention and terms.
The relevance of a document increases if the terms
or phrase queried by the user occur multiple times
and show up in the title of the work or in
important headlines or subheaders.
Popularity - the
relative importance, measured via citation (the act
of one work referencing another, as often occurs in
academic and business documents) of a given document
that matches the user's query. The popularity of a
given document increases with every other document
that references it.
These two items were translated to web search 40
years later and manifest themselves in the form of
document analysis and link analysis.
In document analysis, search engines look at
whether the search terms are found in important areas
of the document - the title, the meta data, the
heading tags, and the body of text content. They also
attempt to automatically measure the quality of the
document (through complex systems beyond the scope of
this guide).
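A toy version of document analysis might count query terms in each area of a page and weight the more important areas more heavily. The weights, field names, and sample pages below are hypothetical; real engines combine many more signals and do not publish their weights.

```python
# Hypothetical weights: a term in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "headings": 2.0, "body": 1.0}

def relevance(query, document):
    """Score a document by weighted counts of the query terms in each field."""
    terms = query.lower().split()
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = document.get(field, "").lower().split()
        score += weight * sum(words.count(term) for term in terms)
    return score

PAGES = {
    "page_a": {"title": "Loan providers", "headings": "", "body": "compare loan rates"},
    "page_b": {"title": "Home finance", "headings": "", "body": "a loan guide"},
}

scores = {name: relevance("loan", page) for name, page in PAGES.items()}
# Sort from most relevant to least, as a results page would.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Here page_a outranks page_b because the term appears in its title as well as its body.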
In link analysis, search engines measure not only
who is linking to a site or page, but what they are
saying about that page/site. They also have a good
grasp on who is affiliated with whom (through
historical link data, the site's registration records,
and other sources), who is worthy of being trusted
(links from .edu and .gov pages are generally more
valuable for this reason), and contextual data about
the site the page is hosted on (who links to that
site, what they say about the site, etc.).
Link and document analysis combine and overlap
hundreds of factors that can be individually measured
and filtered through the search engine algorithms (the
set of instructions that tells the engines what
importance to assign to each factor). The algorithm
then determines scoring for the documents and
(ideally) lists results in decreasing order of
importance (rankings).
Information Search Engines Can Trust
As search engines index the web's link structure
and page contents, they find two distinct kinds of
information about a given site or page - attributes of
the page/site itself and descriptives about that
site/page from other pages. Since the web is such a
commercial place, with so many parties interested in
ranking well for particular searches, the engines have
learned that they cannot always rely on websites to be
honest about their importance. Thus, the days when
artificially stuffed meta tags and keyword-rich pages
dominated search results (pre-1998) have vanished and
given way to search engines that measure trust via
links and content.
The theory goes that if hundreds or thousands of
other websites link to you, your site must be popular,
and thus, have value. If those links come from very
popular and important (and thus, trustworthy)
websites, their power is multiplied to even greater
degrees. Links from sites like NYTimes.com, Yale.edu,
Whitehouse.gov, and others carry with them inherent
trust that search engines then use to boost your
ranking position. If, on the other hand, the links
that point to you are from low-quality, interlinked
sites or automated garbage domains (aka link farms),
search engines have systems in place to discount the
value of those links.
The most well-known system for ranking sites based
on link data is the simplistic formula developed by
Google's founders - PageRank. PageRank, which relies
on a mathematical formula (based around finding a
given document in a random pattern of clicking on
links), is
described by Google in their technology section:
PageRank relies on the
uniquely democratic nature of the web by using its
vast link structure as an indicator of an individual
page's value. In essence, Google interprets a link
from page A to page B as a vote, by page A, for page
B. But, Google looks at more than the sheer volume
of votes, or links a page receives; it also analyzes
the page that casts the vote. Votes cast by pages
that are themselves "important" weigh more heavily
and help to make other pages "important."
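The "votes" idea can be sketched with the commonly published form of the PageRank calculation, in which each page repeatedly shares its score among the pages it links to. This is an illustrative sketch only: the damping value of 0.85, the iteration count, and the four-page graph are assumptions, and it is not a description of Google's production systems.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate PageRank by repeated redistribution of scores.

    `links` maps every page to the list of pages it links to; every
    link target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical four-page web: C receives the most votes, D receives none.
GRAPH = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(GRAPH)
top_page = max(ranks, key=ranks.get)
```

Page A also scores well despite having only one inbound link, because that link comes from the highly ranked page C. This is the "important pages cast heavier votes" effect described in the quote.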
Google uses a PageRank "proxy" value, which
logarithmically translates the actual PageRank of a
document to a value between 0 and 10, to rank Web
sites listed in its
directory (which offers a PageRank order or an
Alphabetical order for listings) and in its toolbar,
which includes an icon that shows a PageRank value
from 0 to 10.
PageRank is, in essence, a rough system for
estimating the value of a given link based on the
links that point to the host page. Since PageRank's
inception in the late '90s, more subtle and
sophisticated link analysis systems have taken the
place of PageRank. Thus, in the modern era of SEO, the
PageRank measurement in Google's toolbar, directory,
or through sites that query the service is of limited
value. Pages with PR8 can be found ranked 20-30
positions below pages with a PR3 or PR4. In addition,
the toolbar numbers are updated only every 3-6 months
by Google, making the values even less useful. Rather
than focusing on PageRank, it's important to think
holistically about a link's worth.
Here's a small list of the most important factors
search engines look at when attempting to value a
link:
The Anchor Text of Link -
Anchor text describes the visible characters and
words that hyperlink to another document or location
on the web. For example, in the phrase "CNN
is a good source of news, but I actually prefer
the BBC's take on
events," two unique pieces of anchor text exist
- "CNN" is the anchor text pointing to http://www.cnn.com,
while "the BBC's take on events" points to
http://news.bbc.co.uk. Search engines use this
text to help them determine the subject matter of
the linked-to document. In the example above, the
links would tell the search engine that when users
search for "CNN", SEOmoz.org thinks that http://www.cnn.com
is a relevant site for the term "CNN" and that
http://news.bbc.co.uk is relevant to "the BBC's
take on events". If hundreds or thousands of sites
think that a particular page is relevant for a given
set of terms, that page can manage to rank well even
if the terms NEVER appear in the text itself (for
example, see the BBC's explanation of why Google
ranks certain pages for the term "Miserable
Failure").
Global Popularity of the Site -
More popular sites, as denoted by the number and
power of the links pointing to them, provide more
powerful links. Thus, while a link from SEOmoz may
be a valuable vote for a site, a link from bbc.co.uk
or cnn.com carries far more weight. This is one area
where PageRank (assuming it was accurate) could be a
good measure, as it's designed to calculate global
popularity.
Popularity of Site in Relevant
Communities - In the example above, the
weight or power of a site's vote is based on its raw
popularity across the web. As search engines became
more sophisticated and granular in their approach to
link data, they acknowledged the existence of
"topical communities"; sites on the same subject
that often interlink with one another, referencing
documents and providing unique data on a particular
topic. Sites in these communities provide more value
when they link to a site/page on a relevant subject
rather than a site that is largely irrelevant to
their topic.
Text Directly Surrounding the Link
- Search engines have been noted to weight the text
directly surrounding a link as more important
and relevant than the other text on the page. Thus,
a link from inside an on-topic paragraph may carry
greater weight than a link in the sidebar or footer.
Subject Matter of the Linking Page
- The topical relationship between the subject of a
given page and the sites/pages linked to on it may
also factor into the value a search engine assigns
to that link. Thus, it will be more valuable to have
links from pages that are related to the site/page's
subject matter than those that have little to do
with the topic.
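To make the anchor text factor concrete, the sketch below tallies the anchor text that a set of links uses for each target page. The link list is invented for the example (it reuses the CNN and BBC links from the anchor text item), and a real engine would gather this data from its crawl of the whole web.

```python
from collections import Counter, defaultdict

# Hypothetical (target URL, anchor text) pairs collected from many pages.
LINKS = [
    ("http://www.cnn.com", "CNN"),
    ("http://news.bbc.co.uk", "the BBC's take on events"),
    ("http://www.cnn.com", "CNN"),
    ("http://www.cnn.com", "breaking news"),
]

def anchor_profile(links):
    """Count how often each anchor text points at each target page."""
    profile = defaultdict(Counter)
    for target, text in links:
        profile[target][text.lower()] += 1
    return profile

profile = anchor_profile(LINKS)
most_common_for_cnn = profile["http://www.cnn.com"].most_common(1)
```

The more sites that use the same words to link to a page, the stronger the hint that the page is relevant to those words, even if the words never appear on the page itself.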
Link metrics are in place so that search engines
can find information to trust. In the academic world,
greater citation meant greater importance, but in a
commercial environment, manipulation and conflicting
interests interfere with the purity of citation-based
measurements. Thus, on the modern WWW, the source,
style, and context of those citations are vital to
ensuring high quality results.
The Anatomy of a HyperLink
A standard hyperlink in HTML code looks like this:
<a href="http://www.seomoz.org">SEOmoz</a>
In this example, the code simply indicates that
the text "SEOmoz" (called the "anchor text" of the
link) should be hyperlinked to the page http://www.seomoz.org.
A search engine would interpret this code as a
message that the page carrying this code believed
the page http://www.seomoz.org to be relevant to the
text on the page and particularly relevant to the
term "SEOmoz".
A more complex piece of HTML code for a link may
include additional attributes such as:
<a href="http://www.seomoz.org" title="Rand's Site" rel="nofollow">SEOmoz</a>
In this example, new elements such as the link
title and rel attribute may influence how a search
engine views the link, despite its appearance on the
page remaining unchanged. The title attribute may
serve as an additional piece of information, telling
the search engine that http://www.seomoz.org, in
addition to being related to the term "SEOmoz", is
also relevant to the phrase "Rand's Site". The rel
attribute, originally designed to describe the
relationship between the linked-to page and the
linking page, has, with the recent emergence of the
"nofollow" descriptive, become more complex.
"Nofollow" is a value designed specifically for
search engines. When assigned to a link's rel
attribute, it tells the engine's ranking system that
the link should not be considered an editorially
approved "vote" for the linked-to page. Currently, 3
major search engines (Yahoo!, MSN, & Google) all
support "nofollow". AskJeeves, due to its unique
ranking system, does not support nofollow, and
ignores its presence in link code. For more
information about how this works, visit
Danny Sullivan's description of nofollow's inception
on the SEW blog.
Some links may be assigned to images, rather than
text:
<a href="http://www.seomoz.org/randfish.php"><img
src="rand.jpg" alt="Rand Fishkin of SEOmoz"></a>
This example shows an image named "rand.jpg" linking
to the page - http://www.seomoz.org/randfish.php.
The alt attribute, designed originally to display in
place of images that were slow to load or on
voice-based browsers for the blind, reads "Rand
Fishkin of SEOmoz" (in many browsers, you can see
the alt text by hovering the mouse over the images).
Search engines can use the information in an
image-based link, including the name of the image
and the alt attribute to interpret what the
linked-to page is about.
Other types of links may also be used on the web,
many of which pass no ranking or spidering value due
to their use of re-direct, Javascript, or other
technologies. A link that does not have the classic <a
href="URL">text</a> format, be it image or text,
should be generally considered not to pass link value
via the search engines (although in rare instances,
engines may attempt to follow these more complex style
links).
In this example, the redirect used scrambles the
URL by writing it backwards, but unscrambles it
later with a script and sends the visitor to the
site. It can be assumed that this passes no search
engine link value.
<a href="redirectiontarget.htm">SEOmoz</a>
This sample shows the very simple piece of
Javascript code that calls a function referenced in
the document to pull up a specified page. Creative
uses of Javascript like this can also be assumed to
pass no link value to a search engine.
It's important to understand that, based on a
link's anatomy, search engines can (or cannot)
interpret and use the data therein. Whereas the right
sort of links can provide great value, the wrong sort
will be virtually useless (for search ranking
purposes). More detailed information on links is
available at this resource -
anatomy and deployment of links.
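The way a spider might read a link's anatomy can be sketched with Python's standard html.parser module. The LinkExtractor class is a hypothetical helper, and the first sample link (with title and rel attributes) is assembled from the attribute values this section describes rather than copied from a real page.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect href, title, nofollow status, and anchor text for each link."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # the link currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            rel_values = (attrs.get("rel") or "").lower().split()
            self._current = {
                "href": attrs["href"],
                "title": attrs.get("title"),
                "nofollow": "nofollow" in rel_values,
                "text": "",
            }
        elif tag == "img" and self._current is not None:
            # For image links, the alt attribute stands in for anchor text.
            self._current["text"] += attrs.get("alt") or ""

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["text"] = self._current["text"].strip()
            self.links.append(self._current)
            self._current = None

SAMPLE_HTML = """
<a href="http://www.seomoz.org" title="Rand's Site" rel="nofollow">SEOmoz</a>
<a href="http://www.seomoz.org/randfish.php"><img
src="rand.jpg" alt="Rand Fishkin of SEOmoz"></a>
"""

extractor = LinkExtractor()
extractor.feed(SAMPLE_HTML)
links = extractor.links
```

Links built with redirects or Javascript would yield no usable href here, which mirrors why such links are assumed to pass little or no value.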
Keywords and Queries
Search engines rely on the terms queried by users
to determine which results to put through their
algorithms, order, and return to the user. But, rather
than simply recognizing and retrieving exact matches
for query terms, search engines use their knowledge of
semantics (the science of language) to construct
intelligent matching for queries. An example might be
a search for loan providers that also
returned results that did not contain that specific
phrase, but instead had the term lenders.
The engines collect data based on the frequency of
use of terms and the co-occurrence of words and
phrases throughout the web. If certain terms or
phrases are often found together on pages or sites,
search engines can construct intelligent theories
about their relationships. Mining semantic data
through the incredible corpus that is the Internet has
given search engines some of the most accurate data
about word ontologies and the connections between
words ever assembled artificially. This immense
knowledge of language and its usage gives them the
ability to determine which pages in a site are
topically related, what the topic of a page or site
is, how the link structure of the web divides into
topical communities, and much, much more.
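Co-occurrence data of this kind can be approximated by counting how often two terms appear on the same page. The sketch below does this for a three-page corpus invented for the example; it illustrates the general idea only and makes no claim about how any engine actually mines semantic data.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(pages):
    """Count, for each pair of terms, how many pages contain both."""
    counts = Counter()
    for text in pages:
        # Sort so each pair is always stored in the same (alphabetical) order.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts

# Hypothetical pages for illustration only.
PAGES = [
    "loan providers and lenders",
    "lenders offer loan rates",
    "garden tools",
]

counts = cooccurrence(PAGES)
loan_and_lenders = counts[("lenders", "loan")]
loan_and_garden = counts[("garden", "loan")]
```

Terms that share many pages, like "loan" and "lenders" here, are candidates for being treated as related when matching queries.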
Search engines' growing artificial intelligence on
the subject of language means that queries will
increasingly return more intelligent, evolved results.
This heavy investment in the field of natural language
processing (NLP) will help to achieve greater
understanding of the meaning and intent behind their
users' queries. Over the long term, users can expect
the results of this work to produce increased
relevancy in the SERPs (Search Engine Results Pages)
and more accurate guesses from the engines as to the
intent of a user's queries.
Sorting the Wheat from the Chaff
In the classic world of Information Retrieval, when
no commercial interests existed in the databases, very
simplistic algorithms could be used to return high
quality results. On the world wide web, however, the
opposite is true. Commercial interests in the SERPs
are a constant issue for modern search engines. With
every new focus on quality control and growth in
relevance metrics, there are thousands of individuals
(many in the field of SEO) dedicated to manipulating
these metrics in order to control the SERPs, typically
by aiming to list their sites/pages first.
The worst kind of results are what the industry
refers to as "search spam" - pages and sites with
little real value that contain primarily re-directs to
other pages, lists of links, scraped (copied) content,
etc. These pages are so irrelevant and useless that
search engines are highly focused on removing them
from the index. Naturally, the monetary incentives are
similar to email spam - although few visit and fewer
click on the links (which are what provide the spam
publisher with revenue), the sheer quantity is the
decisive factor in producing income.
Other "spam" results range from sites that are of
low quality or affiliate status that search engines
would prefer not to list, to high quality sites and
businesses that are using the link structure of the
web to manipulate the results in their favor. Search
engines are focused on clearing out all types of
manipulation and hope to eventually achieve fully
relevant and organic algorithms to determine ranking
order. So-called "search engine spammers" engage in a
constant battle against these tactics, seeking new
loopholes and methods for manipulation, resulting in a
never-ending struggle.
This guide is NOT about how to manipulate the
search engines to achieve rankings, but rather how to
create a website that search engines and users will be
happy to have ranking permanently in the top
positions, thanks to its relevance, quality, and user
friendliness.
Paid Placement and Secondary Sources in
the Results
The search engine results pages contain not only
listings of documents found to be relevant to the
user's query, but other content, including paid
advertisements and secondary source results. Google,
for example, serves up ads from its well-known
AdWords program (which currently fuels more than
99% of Google's revenues), as well as secondary
content from its
local search,
product search (called Froogle), and
image search results.
(Screenshot of Google's search engine results page,
showing the source of each content area.)
The sites/pages ranking in the "organic" search
results receive the lion's share of searcher eyeballs
and clicks - between 60% and 70%, depending on factors such
as the prominence of ads, relevance of secondary
content, etc. The practice of optimization for the
paid search results is called SEM, or Search Engine
Marketing, while optimizing to rank in the secondary
results requires unique, advanced methods of targeting
specific searches in arenas such as local search,
product search, image search, and others. While all of
these practices are a valuable part of any online
marketing campaign, they are beyond the scope of this
guide. Our sole focus remains on the "organic"
results, although links at the bottom of this paper
can help direct you to resources on other subjects.
Google Launches Google Finance At long last,
Google has launched its own Google Finance
service. For years, those seeking specialty
financial information via Google have been sent
to competitors such as Yahoo and MSN. Now
Google's providing financial information
directly to its own users.
Search Engine Forums Spotlight Links to the
week's topics from search engine forums across
the web: Google Considers European Retail Push -
Using Link Farms as a Weapon? - What Do You Love
& Hate About Google - How I Made a Million in 3
Months - What Is A 'Poison Keyword?' - Is a
Site's Age a Factor? and more.
Finding the Values of U.S. Homes Talking
about the 'real estate bubble' is a popular
parlor game these days. Now you can find your
own estimates for how over-or-under valued your
neighborhood is thanks to a new web site called
Zillow.
Winning a Search Advertising Bid War Your
pay-per-click campaign is advancing steadily
until suddenly, the ranking officer at some
joker's company launches a surprise attack,
overruns your position, and pins you down in a
bloody bidding war. What do you do? Pull back?
Hunker down? Or counter-attack?
Search Engine Forums Spotlight Links to the
week's topics from search engine forums across
the web: Microsoft's Windows Live Search Opens;
Bye-Bye MSN Search? - Demographics Added to
AdWords - It's Illegal And Unethical And They
Know It! - Click Fraud Issue Growing - MSN
AdCenter Primer for New Users, and more.
Search Engine Optimization for Podcasts Podcasting
is comparatively new, though there are already
numerous podcast search engines and it's
important to optimize your audio files if you
want listeners to find your spoken content.
Exalead's Advanced Query Operators Via Gary
Price - Exalead, a major competitior in the 2nd
tier search engines, has an incredible array of
advanced operators to help sophisticated users
refine queries. The offerings include: An
advanced search page with the ability to modify
virtually every aspect of keywords on/off page A
set of advanced operators to narrow your search
Another set that can widen your search A ...
Calculating Relevancy of a Search Result The
poor results for SEO got me thinking about the
ways in which engineers would measure the
quality and relevance of the search results for
a query. I came up with a few quick ideas, but
I'd love to hear feedback and ideas. Perhaps a
small article/project can bubble up from this
brew. Psychology of Searcher Informational vs.
Transactional query (looking to research or buy)
Navigational vs. Search query (going to a
specific place or browsing for the right place)
Broad scope vs. Narrow scope (researching a
category or searching for an exact item)
Open-Ended vs. Specific-Minded (Is open to
receiving multiple viewpoints or ...
Domain Searches Using the Overture KW Tool Lots
of people use the Overture Keyword Tool to
estimate search volumes and get ideas for
additional keywords to target. I went
there over the weekend to do a few domain
searches. I felt that it would be a good
way to measure "brand strength" or get some idea
of the number of type-ins a domain might be
getting.Some of the results that I found were
staggering, others sobering, and some were
rather funny. I typed in the URLs for most
of my sites and only one of them had a
meaningful count - about 500. ...
SEOmoz's News Bend I've been getting a lot
of feedback about SEOmoz's blog shying away from
many of the most mainstream and oft-discussed
topics in the field. Everything from who's
sueing Google to stock earnings to acquisitions
and bannings gets written about at sites like
SERoundtable, Threadwatch & SEW, but Rand never
mentions it... why? The answer is that I don't
think it's worth providing additional coverage
of those topics. If they relate to something
we're writing about or we can add a completely
new angle, that's great, but in general posts
like the on I made...
A Morning Stroll through Hypertext Town Over
at Debra's Linkspiel blog, guest writer Bill
Slawski is talking about offline advertising for
online services. This was my favorite bit:
Walking through a parking lot, I feel an urge to
click on cars, noticing that some Maryland and
Pennsylvania license plates include URLs instead
of state mottos. Kind of hard to mouse over
vehicles while they are barreling down the
highway, but it must help spread the word that
these states have websites. As online push
marketing becomes more and more expensive, gu...
China Search Marketing Tour in Shanghai &
Nanjing More from David Temple on the China
Search Marketing Tour: The China Search
Marketing Tour left Beijing and headed to
Shanghai. All the tour members liked the "vibe"
in Shanghai, a very modern city. Shopping was
the first priority and everyone headed down to
Nanjing Road, a famous shopping area of
Shanghai. Our bargaining skills came into play
and there were good deals to be had. We had
lunch at a Japanese restaurant, Ajisen Ramen,
(of all places) but the food was good and the
price was right. That evening we attended a
fantastic acrobatic show. Women spinning plates,
men juggling, feats of strength, etc. The show
cul...
Dancing on the dark side Rand didn't just
ask his viewing public to vote 'yea' or 'nay' on
black hat topics, he asked the SEOMoz team as
well. Since I still hope to be an elf when
I grow up, I gave Rand my usual elven advice:
don't do it -- unless you want people to talk
about you.Black hat is another of those SEO buzz
expressions that gets done to death, but right
now it's still cool to talk about hats and hat
colors. Personally, I think everyone
should be classified a grey hat.In time, we'll
probably use Black Hat to mean something
entirely different from "violating search engine
guidelines". No expression keeps its
original meaning for long.&nb...
Top Results for SEO Over the 3 years or so
that I've been watching the ranking websites for
the term "SEO" at Google, there have been some
massive changes. From the dropouts of big sites
like SEOInc and SEO-Guy, to the incredibly quick
climb of MattCutts (now ranking #4 at my DC),
the SEO SERPs have been anything but static.
While Matt's been rising, Aaron's SEOBook has
fallen a few spots, and the stock quote for "STORA
ENSO OYJ" still appears at the top (only at G...
Wikipedia Link Building Suggestion For
anyone who uses Wikipedia's lax security to add
links, let me make a suggestion - add your link
in the "discussion" section for the page
(example) and ask others to review whether the
link you're providing will be pertinent or not.
This way, you get a real, non-nofollow,
indexable link from Wikipedia and you are at a
much lower risk of pissing off the community of
editors. Granted, Wikipedia's fact pages do
provide some nice click-through traffic, and the
value of being on the actual page, both for
visitors and search engines is greater, but I
think if you have doubts about the validity of
the...
B-Ball Madness, Web Video Style? These guys
are hungry, that?s why we watch them. No
contract disputes, no ego tirades ? at least not
yet ? just pure college basketball and lots of
it. Starting this week, we?re pulling together
as much NCAA tournament coverage...
Know Any Good Engineers or Operations Managers? One
of the benefits of del.icio.us now being part of
Yahoo is that we can afford to hire more people
to give the service the attention it needs to
grow bigger, faster, and better. In fact, we're
looking to beef...
A Chat with Andrei Broder (Part III) A while
back, Andrei Broder, a Yahoo! Research Fellow
and Vice President of Emerging Search
Technology, spent an afternoon telling us a bit
about his decades-long history within the search
industry and talking about his future projects.
To wrap up...
Achtung Maybe: Report from the ETech Attention
Zone O'Reilly's fifth Emerging Technology
Conference wrapped up in San Diego on Thursday.
The weather was unsettled and unseasonably cool,
but it never put a chill on the flow of big
ideas or diminished the quality of conversation
in the...
A chat with Andrei Broder (Part II) Last
week, we published the first of a three-part
interview with Andrei Broder, Yahoo! Research
Fellow and VP of emerging search technology for
Yahoo! In today?s segment, we spend some time
chatting with Andrei about what he means by
?search...
Making Money with Shopping APIs, and More Yahoo!
is now accepting applications for a commercial
version of our Shopping APIs featuring product
search, price comparison, ratings & reviews, and
shopping browse. The program, which is now in
limited beta, is similar to an affiliate model
in that...
Searching out the video buzz on Hollywood?s
biggest night Here at Yahoo!, we work on
bringing you lots of ways to tap the web for
whatever interests you ? movies, music, games,
etc ? both on Yahoo and from across the web.
This weekend, we?re featuring news, images,
video...
"Search without a box" - A chat with Andrei
Broder (Part 1) A while back, we spent an
hour interviewing a new colleague of ours,
Andrei Broder. Andrei joins our talented team
here at Yahoo!, in the role of Yahoo! Research
Fellow and Vice President of Emerging Search
Technology. Andrei's decades-long career...
Going deeper into the Wikipedia We've been a
big fan of Wikipedia for a while now, and we've
been working together with the Wikimedia folks
to make the Wikipedia even more accessible and
easy to use. Now, as part of our larger effort
to get...
What's been going on with Yahoo! Answers? It?s
hard for us on the core product team to believe
that it?s only been 2 months since we launched
the beta of Yahoo! Answers. Since then, we've
added many features to the site in a steady
stream of improvements....
Sniffing out the Best in Show on Yahoo! Video
Search Some may argue that the ?mockumentary?
Best in Show did little to portray competitive
dog showing in a positive light, but it can also
be argued that the movie did a great deal to
help bring the sport to mainstream...
Yahoo! Toolbar for Mozilla Firefox, reloaded Firefox
and Yahoo! fans take note. Mozilla.org released
Firefox 1.5 last November and is gaining
converts faster than ever. The Yahoo! Firefox
Toolbar 1.0 worked well with Firefox 1.5, and
since that release launched, we've been working
hard to fix...
My Web 2.0 Update We wanted to give you a
quick update on what we've been up to with My
Web. Not everything around here happens with
thunderous fanfare, though we have been known to
jump up and down when the occasion calls. :-)...
Super Bowl XL - that's 40, not extra large,
tough guy Ah, Super Bowl. For some, it's a
time-honored tradition, for others, just an
opportunity to brush up on Roman numerals. This
annual gridiron classic, now four decades deep,
certainly attracts a diverse crowd of sports
fans and not-so-sports fans who...
Search in the Future Over on the Unofficial
Yahoo Weblog, Joe poses a thought-provoking
question: Is search really the future for Yahoo!
(or anyone else for that matter)? You bet it is.
As we've said in the past, search is one of
Yahoo's...
Happy St. Patrick's Day Today's the day to
celebrate the Irish. Remember to wear some green
and join in the festivities. We have. Quick
Links:* Holiday History * St. Patrick's
Biography * All About Ireland...
Welcome Ask.com Italia! I'm excited to
announce that now Italians too can enjoy Ask's
advanced search technology. Our new Ask.com
Italia was officially launched in beta on March
8. Italian users may have caught a glimpse of
Ask.com and its remarkable technology...
Benvenuto Ask.com Italia! I am delighted to
announce that Italians too can now take
advantage of Ask's advanced search technology.
The new Ask.com Italia was officially launched
in beta on March 8. Italian users have probably
already had the chance...
Ask.com France: A Fresh Alternative The
French community has been waiting a long time
for a newcomer in the search engine arena. The
wait is over! At the same time as Ask.com is
unveiling a new image for its American, British,
Japanese, Spanish and...
Ask.com France: une alternative fraîcheur
The French community has long been waiting for
a newcomer to arrive on the search engine scene.
The wait is over! Ask.com took advantage of the
unveiling of the new image of its American,
British, Japanese,...
Inside Ask Maps The new Maps product we
launched last week has had a fantastic response
from our users, the press, and the blogosphere.
This has turned out to be both a good and bad
thing for Maps. We are getting...
Code Red Party Recap As many of you know,
yesterday was kind of a big day for us. We
finally unveiled the new Ask.com and Barry
Diller got a chance to articulate why our search
engine is a relevant part of the search world...
The New Ask.com Blasts Off It's...alive!!!
On Sunday night at around 8:15pm PST we
officially launched the new Ask.com into orbit.
As usual with these things, it's not without
some bugs here and there, so we'll get right on
those. But she's up there and...
Another Brand Retirement of Note: Teoma
With all of the focus on our flagship brand
changing, we wanted to point out another brand
shift we'll be making soon: we are rolling the
Teoma brand into Ask, pointing the Teoma.com
domain to http://search.ask.com. As we've...
Live from New York, It’s…. Special Thanks
to the WebmasterRadio.FM folks who have offered
to webcast and podcast Barry Diller’s SES
keynote address Monday. The opening keynote will
be broadcast live in a special edition of The
Daily Search Cast with SES Chair Danny
Sullivan...
I like the search in Google Finance (Just a
quick note. I’ve been trying to get to bed by
midnight each night.) Here’s the thing. I don’t
check our stock price that often. When I do,
it’s mostly to assure myself that I can still
afford plenty of cat food and/or cat toys to
keep our cat in the style to which she [...]
Fortune Cookie Sometimes I forget I’m a
geek. Then I get a fortune cookie like There
will always be delightful mysteries in your
life. Lucky numbers: 10 22 25 19 31 41 and I
immediately notice “Hmm. Those numbers could
form an IP address of 102.225.193.141. Should I
ping it?” By the way, the best fortune cookie I
ever got? It was [...]
Googlebot: Keep out! Okay, that last post
was pretty earnest, so I feel the need to post
something really technical now. At SES New York,
someone asked “Why don’t you provide a
parameter, like ‘?googlebot=nocrawl’ to say
‘Googlebot, don’t index this page’?” That was a
pretty good question. The short answer would be
that on pages you don’t want [...]
Google++ If you’re not a Googler, please
ignore this post. Okay, it’s just us Googlers
now, right? I’m sure you’ve seen Danny
Sullivan’s post about 25 things he loves about
Google and 25 things he hates about Google. If
your service got a shout-out on the love list,
congratulations. There’s a ton of stuff that
Google is [...]
SEO Advice: clean house before press releases
(Just a quickie post) A quick tip related to
checking your own site: if you’re going to send
out a press release about your SEO company, make
sure your site is clean before you do the press
release. Site in Internet Explorer: Site after
pressing the <ctrl>-A key: We often read press
releases too. If you’re going to attract [...]
SEO Advice: check your own site Remember a
while ago when I said that you should check your
website for spam before doing a reinclusion
request? In general, any time you think Google
may be dropping your site, it’s a good idea to
check for spam on your own site. For example, a
site called “The People’s Cube” recently posted
an [...]
Review: ShuttlePRO Multimedia Controller
Short review: It just works. I love it. Highly
recommended if you’re looking for a flexible USB
input device. It looks like this: Longer review:
I was in an Apple store with a friend this
weekend who was buying a video iPod, and I saw
the ShuttlePRO2 from Contour Design and bought
it on an impulse. [...]
Send. More. Spam reports. I’m looking for
some more spam reports. Not in the comments on
my blog; please use Google’s spam report form.
I’m especially looking for
Chinese/Japanese/Korean spam. Oh, and any sites
that consist mostly of English keyword stuffing
would be nice too. No special keywords are
needed, but feel free to put “cjk” on any
Chinese/Japanese/Korean [...]
How to sign up for WebmasterWorld
WebmasterWorld (also referred to as WMW) is one
of the earliest forums to talk about search
engine optimization (SEO) and site owner issues.
Back in 2001 or so, there were 3-4 main public
SEO forums, including WMW, JimWorld (now found
at searchengineforums.com), and Doug Heil’s
forum. Each of those sites continues to discuss
exciting SEO-related [...]
Back to Work Okay, a good night’s sleep
helps cure the crankiness, so it’s back to work.
No Etech, no Cebit, no SXSW, no GDC for me. No
SES in China (though that would be fun), no
AD:TECH. Just a nice month or two of solid work,
I hope. Time to get back into the swing [...]