Searching the Internet
General idea:
- First look for a category in a directory like Yahoo or Google Directory.
[Human-built, limited coverage, nicely structured]
- If there is no category, use the flat list of hits from a search engine like Google or Alta Vista.
[Machine-built, wide coverage, but unstructured]
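The "directory first, search engine second" strategy can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the category path, the page texts, and the example.org URLs are all invented, and real directories and search engines are far more elaborate.

```python
# Toy data, invented for illustration only.
# A "directory": hand-picked topics, each mapped to one structured category.
DIRECTORY = {
    "shakespeare": "Arts/Literature/Shakespeare",  # hypothetical category path
}

# A "search engine index": many pages, no structure (hypothetical URLs and text).
PAGES = {
    "http://example.org/rds-concerts": "Concerts at the RDS, Simmonscourt, Dublin",
    "http://example.org/dublin-castles": "Simmonscourt Castle and other castles in Dublin",
    "http://example.org/hamlet": "Notes on Hamlet by Shakespeare",
}


def find_starting_point(topic, directory, pages):
    """Prefer a directory category; otherwise fall back to a flat list of hits."""
    key = topic.lower()
    if key in directory:
        # Well-known topic: someone has already built a category for it.
        return ("category", directory[key])
    # Obscure topic: all we can do is list every page that mentions it.
    hits = [url for url, text in pages.items() if key in text.lower()]
    return ("hits", hits)


print(find_starting_point("Shakespeare", DIRECTORY, PAGES))
print(find_starting_point("Simmonscourt", DIRECTORY, PAGES))
```

The first call returns a single category; the second finds no category and falls back to a flat list of the two pages that mention the word.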
For a topic about which there is a lot of information,
like "Shakespeare":
- A search engine gives too many hits, with not enough structure. We want something more constrained.
- A structured category in a directory is better.
For more obscure, or more localised, topics:
- You won't find a category in Yahoo or any other directory.
- You have no choice but a linear list of sites (unstructured, but hopefully at least sorted with the best ones near the top).
- e.g. "Simmonscourt" (a place in Dublin where the RDS have concerts, and also a castle I am interested in).
Directories v. Search engines
Directories - Hand-built. Hierarchical structure.
Information is nicely organised.
Search engines - Machine-built. Flat list of sites.
Much more disorganised.
But because they are machine-built, they index millions more pages than a directory.
Directories like Yahoo, because they are built by hand,
will always lag behind.
Directories - Only list the home page of each site.
Search engines - May list every single page on a site.
Sometimes this is an advantage, sometimes a huge disadvantage.
Directories - Good for well-known, universal topics:
finding a good place to start on the Web,
and finding good pages to link to as starting points for your readers.
Search engines - Good for obscure, once-off, heavy-duty, user-driven searches.
Tips on Yahoo
Yahoo itself can be confusing.
Type "CGI" and you get a confusing page of hits,
a mix of categories and actual sites:
http://search.yahoo.co.uk/search?query=cgi
What we are really looking for, though, is a dedicated
category for CGI. Near the top of the list above
we find it:
http://www.yahoo.com/Computers_and_Internet/CGI/
This, rather than the search results,
is where we want to start our exploration of CGI.
This is also the kind of page that is good to link to,
if you want to provide a "Starting point for CGI" to your users.
Tips on search engines
Read the help page. Use all options.
Alta Vista has Boolean logic:
valera AND collins
valera OR collins
title:"de valera"
url:dcu.ie
url:dcu.ie AND ca2
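To make the meaning of these queries concrete, here is a minimal Python sketch of how such a query could filter a set of pages. It is not how Alta Vista works internally: the matching is plain substring search, only a single AND or a single OR is handled, and the pages (including the paths under dcu.ie and example.org) are hypothetical.

```python
# Hypothetical pages, invented for illustration only.
PAGES = [
    {"url": "http://example.org/valera", "title": "De Valera biography",
     "body": "Eamon de Valera was born in 1882."},
    {"url": "http://example.org/collins", "title": "Michael Collins",
     "body": "Collins and Valera disagreed over the Treaty."},
    {"url": "http://www.dcu.ie/hypothetical/notes.html", "title": "Course notes",
     "body": "Notes for ca2."},
    {"url": "http://www.dcu.ie/", "title": "DCU home", "body": "Welcome."},
]


def matches_term(page, term):
    """Test one term: title:..., url:..., or a plain word anywhere in the text."""
    term = term.lower()
    if term.startswith("title:"):
        return term[len("title:"):].strip('"') in page["title"].lower()
    if term.startswith("url:"):
        return term[len("url:"):] in page["url"].lower()
    return term in (page["title"] + " " + page["body"]).lower()


def search(pages, query):
    """Handle 'A', 'A AND B' or 'A OR B' (no parentheses, no mixing)."""
    if " AND " in query:
        terms, combine = query.split(" AND "), all   # every term must match
    elif " OR " in query:
        terms, combine = query.split(" OR "), any    # at least one term must match
    else:
        terms, combine = [query], all
    return [p["url"] for p in pages if combine(matches_term(p, t) for t in terms)]


for q in ["valera AND collins", "valera OR collins", 'title:"de valera"',
          "url:dcu.ie", "url:dcu.ie AND ca2"]:
    print(q, "->", search(PAGES, q))
```

On this toy data, AND narrows the results to the one page mentioning both names, OR widens them to two pages, and url:dcu.ie AND ca2 keeps only the dcu.ie page whose text mentions ca2.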
Exercise
- Find on the Web:
- Pi to 1000 digits
- Darwin's grandfather's birthday
- The Treaty Debates of 1921-22
- The teachings of Scientology
- What movies are on in Dublin now
- A graph of e^x
- Aerial pictures of the North Korean concentration camps
- Copies of any 18th or 19th century newspaper
- Transcripts of any 18th or 19th century trial
- U.S. Election ads from any election before 2004
- List of pirate radio stations in Ireland
- For each item above,
what is the best hit to link to?
e.g. Authoritative site.
Site that links to other sites.
Site that will still exist in 10 years' time.
- Suggest something that cannot be found on the Web.
- Suggest something that cannot be found on the Web
for market reasons.
- Suggest something that cannot be found on the Web
for copyright reasons.
- Suggest something that cannot be found on the Web
for privacy reasons.
- Suggest something that cannot be found on the Web
for logistical reasons.
- Suggest something that cannot be found on the Web,
and we could be waiting 500 years to see it online.