Columbia State Community College Finney Memorial Library

Seven Searching Mistakes

(compiled from SearchDay, Mar. 28, 2002)

In the lighthearted spirit of the popular books for "idiots" and "dummies," here's a look at seven common blunders that are virtually guaranteed to deliver useless, nonsensical, or completely worthless search results. Some of these gaffes might surprise you.  But once you recognize them, it's easy to banish these little gremlins forever from your Web search tool kit.

1.   Stop Words
2.   Boolean Operators
3.   Common Words
4.   Heteronyms
5.   Capitalization
6.   Proximity
7.   Wrong Search Engine

1.  Stop Words

Some search engines simply ignore certain words. They are never used to find a matching document, despite what amounts to a direct command when you type them into a search form.

These are called "stop words" because the search engine doesn't "stop" when the words are found in the index (if they are even indexed at all). Why not? Because stop words are either too common to generate meaningful results, or are parts of speech like adverbs, conjunctions, prepositions, or forms of "be" that mean nothing unless they're part of a phrase with more "important" nouns and verbs.

If you use a stop word in a query you may get wildly irrelevant results. For example, the phrase "searching the web" contains two stop words: "the" and "web." Though it's not a particularly common word, web is used so frequently on the Internet that it's virtually worthless as a finding aid.

Stripping out the stop words, "searching the web" becomes "searching," which will naturally lead to results describing everything from criminal manhunts to quests for enlightenment—and if you're lucky, maybe even something about searching the web.
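To see how that stripping works, here is a minimal sketch in Python. The stop list is a hypothetical one made up for this illustration; every search engine keeps its own list and its own rules, so treat this as a rough picture rather than how any particular engine does it.

```python
# Hypothetical stop list, for illustration only. Real engines use their own.
STOP_WORDS = {"a", "an", "and", "of", "the", "to", "web"}


def strip_stop_words(query, stop_words=STOP_WORDS):
    """Return the query terms that survive once stop words are dropped."""
    return [term for term in query.lower().split() if term not in stop_words]


print(strip_stop_words("searching the web"))  # ['searching']
```

Only "searching" survives, which is why the results can wander so far from what you meant.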

How can you identify stop words? Google tells you when it's ignoring a stop word, at the very top of a results page. You can force Google to include a stop word in a query by putting a plus sign in front of it. AlltheWeb takes a different approach -- it often automatically rewrites your query to include a stop word as part of a quoted phrase with other query terms. Check out the link below to the 300 most common words in English, many of which are stop words.

300 Most Common Words in English
http://www.zingman.com/commonWords.html
Many of the words in this list are treated as stop words.

2.  Boolean Operators

Boolean operators, like "and," "or," and "not," can help narrow search results—when used properly. The problem is that Boolean operators, because of their apparent simplicity, appear to be easy to use. Maybe, and/or not really.

According to Ran Hock, author of The Extreme Searcher's Guide to Web Search Engines, search engines implement Boolean features in different ways. For example, while some accept a simple "not," others require "and not" for the same effect. Additionally, some engines require that Boolean operators be capitalized, while others do not (or and do not?).
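Whatever syntax a given engine wants, the logic underneath is ordinary set arithmetic. The sketch below assumes a tiny, made-up collection of three documents and a toy index built from it; the names DOCS, build_index, and lookup are invented for this example and do not describe any real engine.

```python
# Hypothetical three-document collection, for illustration only.
DOCS = {
    1: "jaguar the big cat of south america",
    2: "jaguar car dealers and prices",
    3: "big cat rescue and sanctuary",
}


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def lookup(term):
    """Return the ids of documents containing the term (empty set if none)."""
    return INDEX.get(term, set())


both = lookup("jaguar") & lookup("cat")     # jaguar AND cat: narrows
either = lookup("jaguar") | lookup("cat")   # jaguar OR cat: broadens
without = lookup("jaguar") - lookup("car")  # jaguar AND NOT car: excludes

print(sorted(both))     # [1]
print(sorted(either))   # [1, 2, 3]
print(sorted(without))  # [1]
```

AND shrinks the result list, OR grows it, and NOT removes documents from it. How you have to type those operators is the part that changes from engine to engine.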

If you really want to use Boolean operators, learn how to use them. Two outstanding tutorials on using these features are listed below.

Search Engine Math
http://www.searchenginewatch.com/facts/math.html
Most search engines implement a simple form of Boolean logic that's relatively easy to master. This tutorial by Search Engine Watch's Danny Sullivan shows how to use this search engine math.

Boolean Searching on the Internet
http://library.albany.edu/internet/boolean.html
An outstanding, detailed, and comprehensive overview of Boolean searching on the Internet, from the University at Albany Libraries.

3.  "Vulgar" (common) Words

Vulgar comes from the Latin vulgus, meaning the common people. Like some educated sophisticates, search engines have a problem with common words. It's not that they're being snotty or pretentious. It's that some words are so common that they appear in literally millions of documents, making them virtually useless as a finding aid.

Take weather, for example. There are thousands of sites providing weather information, from local forecasts to elaborate treatises on meteorology. Tighten your query by using focusing words to narrow the scope of your search. Rather than merely searching for "weather," construct a query like "Cicely Alaska annual snowfall," or something equally specific.

4.  Heteronyms

Be careful when a word has multiple meanings. Think of the word "bond" as an example. If you enter just the single word "bond" as a query, the search engine has to figure out if you're looking for information about financial bonds, chemical bonds, or even James Bond. Make it easier for the engine to help you. Ask yourself the question before the search engine does for you, and phrase your query accordingly.

Search engines are also easily confused by heteronyms, words that are spelled identically but have different meanings when pronounced differently. For example, "lead," pronounced LEED, means to guide. Pronounced as LED, though, the word refers to the metal element. When you can, use concrete synonyms instead of heteronyms.

The Heteronym Home Page
http://www-personal.umich.edu/~cellis/heteronym.html
Using synonyms in search queries can be an effective way of narrowing the focus of your search -- unless they're heteronyms. For more examples, see this page.
 

5.  Capitalization

Yet another problem for the searcher is whether to use capital letters in a query. Some engines are case-sensitive, while others are not. As a rule of thumb, it's a good idea to always use lowercase letters when you search. This will typically return results that contain both uppercase and lowercase letters.

If you use uppercase letters in a query to a case-sensitive engine, results will include only documents that use the same capitalization. This is usually a good thing for proper nouns like names or places, which use initial uppercase letters anyway. But it might cause you to miss other documents where case sensitivity is less important.
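The rule of thumb is easier to remember once you see it in action. This sketch assumes a case-sensitive engine that behaves as described here: an all-lowercase term matches any capitalization, while a term containing a capital letter must match exactly. The function names and the sample sentence are invented for illustration.

```python
def term_matches(query_term, document_word):
    """Compare one query term with one document word.

    An all-lowercase term matches any capitalization; a term that
    contains an uppercase letter must match the word exactly.
    """
    if query_term == query_term.lower():
        return query_term == document_word.lower()
    return query_term == document_word


def search(query_term, text):
    """Return the words in text that match the query term."""
    return [word for word in text.split() if term_matches(query_term, word)]


TEXT = "Turkey is a country and turkey is a bird"
print(search("turkey", TEXT))  # ['Turkey', 'turkey']
print(search("Turkey", TEXT))  # ['Turkey']
```

The lowercase query catches both the country and the bird. The capitalized query finds only the country, which is handy if that's what you want and a blind spot if it isn't.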

6.  Proximity

Most search engines do a good job at matching simple phrases, like "Afghan refugees," or "space shuttle missions." The distance between one word and another in a document is referred to as proximity. Some search engines will give a positive result if your query words appear anywhere on a page, whether or not they are near each other, or are used together in a phrase.

If you're searching for something where your keywords must be near each other to get good results, your only option is to use AltaVista's advanced search and the NEAR operator in your query. This finds documents in which the specified words or phrases appear within 10 words of each other.
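Here is a rough sketch of what a proximity test involves. It is not AltaVista's implementation: the near function, its tokenizing rule, and its handling of single words only (no phrases) are all assumptions made for this example. The default window of 10 words comes from the description above.

```python
import re


def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def near(text, first, second, max_distance=10):
    """Return True if the two words occur within max_distance words of each other."""
    tokens = tokenize(text)
    first_positions = [i for i, word in enumerate(tokens) if word == first.lower()]
    second_positions = [i for i, word in enumerate(tokens) if word == second.lower()]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


close = "Afghan families became refugees after the fighting."
far = "afghan " + "filler " * 15 + "refugees"

print(near(close, "Afghan", "refugees"))  # True
print(near(far, "Afghan", "refugees"))    # False
```

Both sample texts contain both words, so an engine that ignores proximity would return each of them. Only the first passes the proximity test.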

7.  Using the Wrong Search Engine

And now for the number one most common searching mistake:

If you're determined to find what you're looking for on the Web, be sure you're using the right tools for the job. Search engines vary widely in scope, function, and quality. You'll waste a lot of time if you don't choose the best search engine for each specific searching task.

Should you use a crawler-based search engine, or a human-compiled web directory? How about a specialized search site, a database, or an invisible web resource? By analyzing your needs and comparing them with the strengths and weaknesses of each search service *before* you search, you'll likely get better results.

This may sound like a chicken-and-egg problem -- how do you know which search tool is best without trying them out first? Experience helps. There's also a terrific table created by Debbie Abilock that's a virtual cheat sheet for a wide variety of information needs.

If you're relatively new to searching and get stuck, don't be hard on yourself. One of the most ridiculous misconceptions I've ever heard is that "you can find anything on the Internet." This is about as true as saying that there are diamonds in every coal mine.  And though it may sound like heresy coming from someone who lives and breathes web search, sometimes your best bet for finding information is to log off and take a trip to your local library.

Libraries have tons of resources that aren't available on the Web. And librarians are trained experts who are usually more than willing to help you find what you're looking for. When you're getting nowhere on the Web, take advantage of these (usually very nice) "human search engines."

Updated 04/09/03; KD


Last Updated 16 May 2008

   

Columbia State Community College is a Tennessee Board of Regents Institution
and is an affirmative action and equal opportunity employer committed to
the education of a nonracially identifiable student body. © Copyright 2007