"The first law of e-commerce is that if users cannot find the product, they cannot buy it either." (Jakob Nielsen)
Internet search engines are special sites on the Web that are designed to help people find information stored on other sites. There are differences in the ways various search engines work, but they all perform three basic tasks:
1. They search the Internet, or select pieces of the Internet, based on important words.
2. They keep an index of the words they find, and where they find them.
3. They allow users to look for words or combinations of words found in that index.
Google has one of the largest databases of Web pages, including many other types of web documents (blog posts, wiki pages, group discussion threads) and document formats (e.g., PDFs, Word or Excel documents, PowerPoints). Despite the presence of all these formats, Google's popularity ranking often makes pages worth looking at rise near the top of search results.
Google alone is often not sufficient, however. Less than half the searchable Web is fully searchable in Google. Overlap studies show that about half of the pages in any search engine database exist only in that database. Getting a second opinion is therefore often worth your time, for example from Ask.com or Yahoo! Search.

Features in common

Things You CAN Do in Google, Yahoo!, and Ask.com
- Phrase searching by enclosing terms in double quotes
- OR searching with capitalized OR
- Use - to exclude a term and + to require the exact form of a word
- Limit results by language in Advanced Search

Things NOT Supported in Google, Yahoo!, or Ask.com
- Truncation: use OR searches for variants (airline OR airlines)
- Case sensitivity: capitalization does not matter
Size, type
  Google: HUGE. Size not disclosed in any way that allows comparison. Probably the biggest.
  Yahoo!: HUGE. Claims over 20 billion total "web objects."

Features
  Google: Popularity ranking using PageRank. Indexes the first 101KB of a Web page, and 120KB of PDFs. ~ before a word sometimes finds synonyms (~help > FAQ, tutorial, etc.).
  Yahoo!: Shortcuts give quick access to dictionary, synonyms, patents, traffic, stocks, encyclopedia, and more.

Sub-Searching
  Add terms.

Results ranking
  Google: Based on page popularity measured in links to it from other pages: high rank if a lot of other pages link to it. Fuzzy AND also invoked. Matching and ranking based on "cached" version of pages that may not be the most recent version.
  Yahoo!: Automatic Fuzzy AND.
  Ask.com: Based on Subject-Specific Popularity, links to a page by related pages.

Truncation, stemming
  Google: No truncation. Stems some words. Search variant endings and synonyms separately, separating with OR (capitalized): airline OR airlines
  Yahoo!: Neither. Search with OR as in Google.

Other rows of the table (cell contents not legible in this copy): Boolean logic; +Requires/-Excludes; Field limiting.
Role of search engines for e-commerce
- 80% of traffic determined by search
- 60% would use search to research a purchase
- 67% would choose a natural search result

Examples (searches each month in the UK):
- 500,000 for shopping
- 100,000 for clothes, shirts & shoes
- 1,000,000 for mobile phone
- 250,000 for furniture
- 25,000 for bed linen
Analysis suggests that roughly 30% of searchers will click a top-three result, and another 20% will click elsewhere on page one (the top ten).
The terms "natural" and "organic" search engine listing describe the "editorial" search results on any particular engine. These results are professed to be unbiased, meaning that the engine will not accept money to influence the rankings of individual sites.
Web Search Engine / 17 (59) A/Prof. Yang, Zhonghua
[Chart: legend entries MSN, AltaVista, Ask Jeeves, Others; one value legible: 9.6%]
Indexability
- The site must be navigable by robots and spiders
- Its content must be readable
- Robots don't like frames
- Robots don't like Flash
- Robots can't read into product catalogues
Relevance
- The content of your site must be relevant
- It must reflect the keywords
- Keywords are the words or phrases that web users use to search for information on the web
- Where and how you place and present these keywords in your site is vital

Where to put keywords
- Page title (the single most important place)
- Description meta tag (appears in listings)
- Body headers (H1) and copy
- Image/file names
- Image alt tags
- URLs
- Keywords meta tag
- Offsite descriptions (directories etc.)
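The placements listed above can be checked mechanically. The following is a minimal sketch, not how any search engine actually reads a page: it uses Python's standard `html.parser` to report which of four placements (title, description meta tag, H1, image alt text) contain a keyword. The class name, function name, and sample page are made up for illustration.

```python
from html.parser import HTMLParser


class KeywordPlacementParser(HTMLParser):
    """Collects the text found in a few of the keyword placements."""

    def __init__(self):
        super().__init__()
        self.found = {"title": "", "description": "", "h1": "", "alt": ""}
        self._current = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.found["description"] += attrs.get("content") or ""
        elif tag == "img":
            self.found["alt"] += " " + (attrs.get("alt") or "")

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current:
            self.found[self._current] += data


def keyword_placements(html, keyword):
    """Return the placements whose text contains the keyword (case-insensitive)."""
    parser = KeywordPlacementParser()
    parser.feed(html)
    return [place for place, text in parser.found.items()
            if keyword.lower() in text.lower()]


# Hypothetical page used only for this example.
sample_page = """
<html><head><title>Stamp collecting for beginners</title>
<meta name="description" content="A guide to stamp collecting."></head>
<body><h1>Getting started</h1>
<img src="stamp-album.jpg" alt="album of rare stamps"></body></html>
"""

print(keyword_placements(sample_page, "stamp collecting"))  # ['title', 'description']
```

Here the keyword is missing from the H1 and the alt text, which is the kind of gap the list above suggests closing.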
Popularity
- Determined primarily by number of inbound & relevant links
- Influenced by frequency and recency of updates
- Visible in Google's PageRank

- Category page (make it relevant)
- Product page (make it relevant)
- Offsite relevance (directories, links)

How do I increase popularity?
Get lots of people to link to your site (with the right keywords). Common approaches:
- Get in the important directories
- Self-managed affiliate programmes
- Develop valuable content
- Research, surveys and quizzes
- Weblogs (blogs)
- Social bookmarks (del.icio.us)
Searching a database
Search Engines for the general web (like all those listed above) do not really search the World Wide Web directly. Each one searches a database of the full text of web pages selected from the billions of web pages out there residing on servers.
When you search the web using a search engine, you are always searching a somewhat stale copy of the real web page. When you click on links provided in a search engine's search results, you retrieve from the server the current version of the page.
Robots: Spider
Search engine databases are selected and built by computer robot programs called spiders (also known as web crawlers).
Although it is said they "crawl" the web in their hunt for pages to include, in truth they stay in one place. They find the pages for potential inclusion by following the links in the pages they already have in their database (i.e., already "know about"). They cannot think or type a URL or use judgment to "decide" to go look something up and see what's on the web about it.
If a web page is never linked to in any other page, search engine spiders cannot find it. The only way a brand new page - one that no other page has ever linked to - can get into a search engine is for its URL to be sent by some human to the search engine companies as a request that the new page be included.
All search engine companies offer ways to do this.
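The link-following behaviour described above can be sketched as a breadth-first traversal. This is an illustration under simplifying assumptions, not a real crawler: the "web" is a small in-memory dictionary, and the `*.example` page names and the `crawl` function are hypothetical.

```python
from collections import deque

# Hypothetical link graph standing in for the web: page -> pages it links to.
LINKS = {
    "a.example/home": ["a.example/products", "b.example/blog"],
    "a.example/products": ["a.example/home"],
    "b.example/blog": ["a.example/products"],
    "c.example/new-page": ["a.example/home"],  # links out, but nothing links to it
}


def crawl(seed_urls, links):
    """Discover pages by following links outward from pages already known."""
    known = set(seed_urls)
    queue = deque(seed_urls)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in known:
                known.add(target)
                queue.append(target)
    return known


# Starting from the home page, the unlinked page is never discovered.
found = crawl(["a.example/home"], LINKS)
print("c.example/new-page" in found)  # False

# Once a human submits its URL, it becomes a seed and is included.
found_after_submission = crawl(["a.example/home", "c.example/new-page"], LINKS)
print("c.example/new-page" in found_after_submission)  # True
```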
Indexing
After spiders find pages, they pass them on to another computer program for "indexing."
This program identifies the text, links, and other content in the page and stores it in the search engine database's files so that the database can be searched by keyword and whatever more advanced approaches are offered, and the page will be found if your search matches its content.
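A common structure for this kind of keyword lookup is an inverted index, which maps each word to the pages (and positions) where it occurs. The sketch below is a toy version under that assumption; the function names and sample pages are invented for the example, and real engines store far more per entry.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to {url: [positions where the word occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for url, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word][url].append(position)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    return set.intersection(*(set(index.get(word, {})) for word in words))


# Hypothetical pages used only for this example.
pages = {
    "example.com/stamps": "Stamp collecting guide for new stamp collectors",
    "example.com/coins": "Coin collecting guide",
}
index = build_index(pages)

print(search(index, "stamp guide"))          # {'example.com/stamps'}
print(index["stamp"]["example.com/stamps"])  # [0, 5]
```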
"Spiders" take a Web page's content and create key search words that enable online users to find pages they're looking for.
What to look
Meta Tags
When the Google spider looked at an HTML page, it took note of two things:
1. The words within the page
2. Where the words were found
Meta tags allow the owner of a page to specify key words and concepts under which the page will be indexed.
There is, however, a danger in over-reliance on meta tags, because a careless or unscrupulous page owner might add meta tags that fit very popular topics but have nothing to do with the actual contents of the page. To protect against this, spiders will correlate meta tags with page content, rejecting the meta tags that don't match the words on the page.
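One simple way to picture this correlation step is to keep a meta keyword only when all of its words also appear in the visible page text. This is an assumed rule chosen for illustration; the source does not say how any particular spider performs the check, and the function name is hypothetical.

```python
import re


def accepted_meta_keywords(meta_keywords, body_text):
    """Keep only the meta keywords whose words all appear in the page body."""
    body_words = set(re.findall(r"[a-z0-9]+", body_text.lower()))
    accepted = []
    for keyword in meta_keywords:
        keyword_words = re.findall(r"[a-z0-9]+", keyword.lower())
        if keyword_words and all(word in body_words for word in keyword_words):
            accepted.append(keyword)
    return accepted


# Hypothetical page: the last meta keyword is a popular but unrelated topic.
meta = ["stamps", "rare stamps", "cheap flights"]
body = "Our shop sells rare stamps and albums for collectors."

print(accepted_meta_keywords(meta, body))  # ['stamps', 'rare stamps']
```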
Words occurring in the title, subtitles, meta tags and other positions of relative importance were noted for special consideration during a subsequent user search.
The Google spider was built to index every significant word on a page, leaving out the articles "a," "an" and "the." Other spiders take different approaches.
The meta description tag allows you to influence the description of your page in the crawlers that support the tag.
Google, however, may ignore the meta description tag and instead automatically generate its own description for the page.
Indexing: weight
The robots tag lets you specify that a particular page should NOT be indexed by a search engine.
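The robots tag is usually written as `<meta name="robots" content="noindex">`. As a rough sketch of how an indexer might honour it, the following reads that tag with Python's standard `html.parser`; the class and function names are hypothetical, and a real indexer would consider other signals as well.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def may_index(html):
    """Return False when the page asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives


private_page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
public_page = "<html><head><title>Welcome</title></head></html>"

print(may_index(private_page))  # False
print(may_index(public_page))   # True
```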
To make for more useful results, most search engines store more than just the word and URL. An engine might store the number of times that the word appears on a page. The engine might assign a weight to each entry, with increasing values assigned to words as they appear near the top of the document, in sub-headings, in links, in the meta tags or in the title of the page.
Each commercial search engine has a different formula for assigning weight to the words in its index.
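To make the idea of weighted entries concrete, here is a small sketch in which every occurrence of a word adds a weight that depends on where it appears. The weight values and field names are invented for the example; as noted above, real formulas differ from engine to engine.

```python
import re
from collections import defaultdict

# Illustrative weights only (assumed values, not taken from any real engine).
FIELD_WEIGHTS = {"title": 5.0, "heading": 3.0, "body": 1.0}


def weighted_entries(fields):
    """Return {word: weight} for one page, adding a weight per occurrence."""
    weights = defaultdict(float)
    for field, text in fields.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            weights[word] += FIELD_WEIGHTS[field]
    return dict(weights)


# Hypothetical page split into the three fields weighted above.
page = {
    "title": "Stamp collecting",
    "heading": "Rare stamp albums",
    "body": "Every stamp album needs care",
}
entries = weighted_entries(page)

print(entries["stamp"])       # 9.0 (title 5.0 + heading 3.0 + body 1.0)
print(entries["collecting"])  # 5.0 (title only)
```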
How do crawler-based search engines go about determining relevancy? They follow a set of rules, known as an algorithm.
Exactly how a particular search engine's algorithm works is a closely-kept trade secret.
One of the main rules in a ranking algorithm involves the location and frequency of keywords on a web page. Call it the location / frequency method, for short.
Pages with the search terms appearing in the HTML title tag are often assumed to be more relevant than others to the topic. Search engines will also check to see if the search keywords appear near the top of a web page. Frequency is the other major factor in how search engines determine relevancy. A search engine will analyze how often keywords appear in relation to other words in a web page.
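The location / frequency method can be sketched as a score with three parts: a bonus for the keyword in the title, a bonus for the keyword near the top of the page, and the keyword's share of all words in the body. The bonus values, the 20-word cutoff for "near the top", and the function names are assumptions made for this example.

```python
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def location_frequency_score(title, body, keyword, top_words=20):
    """Score one page for a single-word keyword by location and frequency."""
    keyword = keyword.lower()
    body_words = tokenize(body)
    score = 0.0
    if keyword in tokenize(title):
        score += 2.0  # location: keyword appears in the title
    if keyword in body_words[:top_words]:
        score += 1.0  # location: keyword appears near the top of the page
    if body_words:
        # frequency: occurrences relative to all words in the body
        score += body_words.count(keyword) / len(body_words)
    return score


# Hypothetical pages used only for this example.
score_a = location_frequency_score(
    "Stamp albums", "Stamp prices vary. Buy a stamp today.", "stamp")
score_b = location_frequency_score("Hobbies", "Many hobbies exist.", "stamp")

print(score_a > score_b)  # True
```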
However, all major search engines follow the general rules below.
How Search Engines Rank Web Pages: "off the page" ranking criteria
Off-the-page factors are those that webmasters cannot easily influence. Chief among these is link analysis.
In addition, sophisticated techniques are used to screen out attempts by webmasters to build "artificial" links designed to boost their rankings.
Another off-the-page factor is click-through measurement: a search engine may watch what results someone selects for a particular search, then eventually drop high-ranking pages that aren't attracting clicks, while promoting lower-ranking pages that do pull in visitors.
By analyzing how pages link to each other, a search engine can both determine what a page is about and whether that page is deemed to be "important"
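A simplified PageRank-style iteration gives a feel for how link analysis can rate a page's importance. It is a sketch only: the damping factor of 0.85, the iteration count, the function name, and the four-page graph are assumptions, and production systems add many refinements (including the screening of artificial links).

```python
def link_scores(links, damping=0.85, iterations=50):
    """Iteratively share each page's score among the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for target in pages:
                    new_scores[target] += damping * scores[page] / n
        scores = new_scores
    return scores


# Hypothetical graph: three pages link to "c", and "c" links back to "a".
graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = link_scores(graph)

print(max(scores, key=scores.get))  # c
```

Page "c" ends up with the highest score because the most pages link to it, and "a" comes second because it is linked from the highly rated "c".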
Your keywords need to be reflected in the page content. Consider "expanding" your text references, where appropriate.
For example, a stamp collecting page might have references to "collectors" and "collecting." Expanding these references to "stamp collectors" and "stamp collecting" reinforces your strategic keywords in a legitimate and natural manner.
Build Inbound Links
Every major search engine uses link analysis as part of its ranking algorithm. By building links, you can help improve how well your pages perform in link analysis systems.
You want links from good web pages that are related to the topics you want to be found for.
Here's one simple means to find those good links:
1. Using a search engine, search for your target keywords.
2. Look at the pages that appear in the top results.
3. Visit those pages and ask the site owners if they will link to you.
Not everyone will, especially sites that are extremely competitive with yours.
Most search engines will index the other pages from your web site by following links from a page you submit to them, so it is enough to submit the top two or three pages that best summarize your web site.
Some types of pages and links are excluded from most search engines by policy. Others are excluded because search engine spiders cannot access them. Pages that are excluded are referred to as the "Invisible Web": what you don't see in search engine results.
Submitting To Directories: Yahoo & The Open Directory
The Open Directory Project (aka ODP or DMOZ) is a volunteer-built guide to the web.
It is provided as an option at many major search engines, including Google. Given this, being listed with the Open Directory can add value to any site. Submission is absolutely free.
Paid Search Advertising: Google AdWords, Yahoo Search Marketing & Microsoft adCenter
Every major search engine with significant market share accepts paid listings.
This unique form of search engine advertising guarantees that your site will appear in the top results for the keyword terms you target within a day or less. Paid search listings are also called sponsored listings and/or Pay Per Click (PPC) listings.
- Size of database: How many documents does the search engine claim it has? How much of the total web are you able to search?
- Freshness ("up-to-dateness"): Search engine databases consist of copies of web pages and other documents that were made when their crawlers or spiders last visited each site. How often is the database refreshed to find new pages? How often do their crawlers update the copies of the web pages you are searching?
- Completeness of document text: Is the database really "full" text, or only parts of the pages? Is every word indexed?
- Types of documents offered: All search engines offer web pages. Do they also have extensive PDF, Word, Excel, PowerPoint, and other formats like WordPerfect? Are they full-text searchable?
- Speed and consistency: How fast is it? How consistent is it? Do you get different results at different times?
The search engine's capabilities: All search engines let you enter some keywords and search on them. What happens inside? Can you limit in ways that will increase your chances of finding what you are looking for?

- Basic Search options and limitations: Automatic default of AND assumed between words? Accepts " " to create phrases? Is there an easy way to allow for synonyms and equivalent terms (OR searching)? Can you OR phrases or just single words?
- Advanced Search options and limitations: Can you require your search terms in specific fields, such as the document title? Can you require some words in certain fields and others anywhere? Can you restrict to documents only from a certain domain (org, edu, gov, etc.)? Limit to more than one or only one? Can you limit by type of document (pdf or excel, etc.)? More than one? Can you limit by language? How reliably and easily can you limit to date last updated?
- General limitations and features: What do you have to do to make it search on common or stop words? Maximum limit on search terms or on search complexity? Ability to search within previous results? Can you count on consistent results from search to search and from day to day? Can you customize the search or display? Is there a "family" filter? Does it work well? Is it easy to turn on or off?
Results display
All search engines return a list of results they "think" are what you are looking for. How well does the engine "think like you expect it to think"?
- Ranking: Are they ranked by popularity or relevancy or both? Do pages with your words juxtaposed (like a phrase) rank highest? Do you get pages with only some of your words, perhaps in addition to pages with them all?
- Display: Are your keywords highlighted in context, showing excerpts from the web pages which caused the match? Some other excerpt from the page?
- Collapse pages from the same site: If it shows only one or a few pages from a site, does it show the one(s) with your terms? How easy is it to see all from the site? Can this be changed and saved as your preferred search method?
Other features
Search engine designers try to come up with all kinds of features and services that they hope will lure you to their sites.
Summary
- Importance of search engines
- How they work
- How to use them in terms of ranking
- Features