Defining the “Internet”
The Internet has been
defined as “a network of networks”. It’s a system of computers linked together
by phone lines and cable systems. Many different kinds of information are
stored on these computers, and thus available over the
Internet. So many different kinds of research can be done there.
Some of this information can
be found by using a “search engine” website, such as Google,
Yahoo, Lycos, Ask Jeeves, etc., or by typing in a
known URL (Uniform Resource Locator, or Internet address). The quality and
accuracy of information found on these free websites vary greatly, as will be
explained below.
Other information you find
on the Internet is the electronic version of print resources, such as
encyclopedias and other reference books, and periodical indexes (searchable
lists of articles from newspapers, popular magazines, and scholarly journals).
Generally speaking, a much larger percentage of this information is going to be
accurate. This information is stored in what we call article databases. Access
to these databases seems free to you, but that’s because you are a student at
Columbia State. Your tuition, fees, and taxes pay for the library to give you
access to these databases.
That’s not to say that you
should never use a search engine, or that you should never believe information
you see on a website you found using a search engine. Which approach you use
depends on the type of research you are doing. To decide how to approach your
research, let’s first talk a little bit about the two types of websites
and how they are created.
What do Search Engines do?
(This section adapted from Finding Information on the Internet: A Tutorial)
Search engines don’t search
the Internet directly. Each one searches lists of web pages chosen from all the
web pages in existence. So when you search the Web with a search engine, you
are using information that’s not completely up to date, although the webpages you get when you click on a link in your search
are the up-to-date pages.
Computer programs called
spiders create these lists, usually referred to as “indexes”. The spider
programs don’t “think,” and they can’t use judgment to “decide” how to
look up all the websites on a specific subject, or figure out different ways to
look up a subject to be sure they have found all the websites pertaining to it.
Remember, computers are not smart; they are just dumb, very fast.
So how do these programs
find the websites they include in an index? By following the links in websites
already in the index. The index is built out of the structure of the Web
itself, without any judgment as to the validity or usefulness of the websites
listed.
After spider programs
discover webpages that are not already in their
indexes, another program categorizes the information in the page and stores it
in the search engine’s index files. When you use search engines, whether you do
a keyword search or use the engine’s directory search, you are searching these
index files.
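The process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny WEB dictionary stands in for the real Web, and the page names and the functions crawl, build_index, and search are hypothetical names made up for this example. No real search engine is this simple.

```python
from collections import deque

# A hypothetical miniature "Web": each page has some text and links to other pages.
WEB = {
    "home.htm": {"text": "welcome to the library", "links": ["nursing.htm", "hours.htm"]},
    "nursing.htm": {"text": "nursing books and journals", "links": ["home.htm"]},
    "hours.htm": {"text": "library hours", "links": []},
    "orphan.htm": {"text": "nursing secrets", "links": []},  # no page links to this one
}

def crawl(start_page):
    """Follow links outward from start_page, the way a spider does."""
    found = []
    queue = deque([start_page])
    seen = {start_page}
    while queue:
        page = queue.popleft()
        found.append(page)
        for link in WEB[page]["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return found

def build_index(pages):
    """Record which pages contain each word, with no judgment about quality."""
    index = {}
    for page in pages:
        for word in WEB[page]["text"].split():
            index.setdefault(word, []).append(page)
    return index

def search(index, keyword):
    """A keyword search looks in the index, not at the Web itself."""
    return index.get(keyword, [])

pages = crawl("home.htm")
index = build_index(pages)
print(search(index, "nursing"))
```

Notice that "orphan.htm" never shows up in a search, even though it mentions nursing, because no page in the index links to it. The sketch also makes no attempt to decide whether any page is accurate.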
How do websites get on the Internet?
A website is a collection of
files, stored on a computer connected to the Internet. These files can be many
types. Some are files of text (and usually images) designed to be used as is.
You open them up, read them, and close them without making any changes to them.
These are called “static” pages, because they aren’t created or changed by the
user. Others are databases of information, such as the search engine’s indexes,
the library’s catalog database which lists everything the library owns, or the
reference or periodical indexes mentioned above. The pages where you input your
search terms are static pages, but the pages of results are temporary, or “dynamic”
pages. These dynamic pages are not stored on the website’s computers, so the
spiders cannot find them.
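To make the difference between static and dynamic pages concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: CATALOG is a made-up three-book list standing in for a real database, and SEARCH_FORM and results_page are hypothetical names, not part of any actual library system.

```python
# A hypothetical library catalog, standing in for a real database.
CATALOG = [
    "Fundamentals of Nursing",
    "Nursing Care Plans",
    "Introduction to Astronomy",
]

# A static page: stored as a file and handed to every visitor unchanged.
SEARCH_FORM = "<form>Search the catalog: <input name='q'></form>"

def results_page(query):
    """Build a dynamic page on demand; it exists only for this one request."""
    matches = [title for title in CATALOG if query.lower() in title.lower()]
    items = "".join(f"<li>{title}</li>" for title in matches)
    return f"<h1>Results for {query}</h1><ul>{items}</ul>"

print(results_page("nursing"))
```

The results page is assembled from the database each time someone searches and is never saved as a file, so a spider following links has nothing to find.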
No Internet “police”
A website can be put up by
any person or organization that can afford to buy space on an
Internet-connected computer. Anybody can put up websites that claim anything
they like. There is no organization that polices the Internet, no Board of
Directors that evaluates which websites relay accurate information, no editors
to check out the claims made on the Web. This is the biggest problem with using
search engines to do research. Remember our discussion of spider programs
above? These programs make no judgments about the quality or accuracy of the
websites they index.
Libraries and Materials Selection
On the other hand, lots of
human judgment goes into the materials made available to you through a library,
whether these are “hard-copy” (paper, videotape, CD or DVD, etc.) or electronic.
First, the editorial staff
of the company producing the material decides whether they want to publish it
or not. Now, admittedly, not every publisher considers the accuracy of the
material the top criterion for publishing it. But as savvy media consumers, we
have a feel for which publications or publishers are generally trustworthy.
Most people take the information published by, say, The National Enquirer, with a big grain of salt. Time and Newsweek magazines are more reliable, but because of their general
reader audience, they may publish some generalizations that an expert in that
field would object to. The New England
Journal of Medicine, since it’s written for professionals in the medical
field, is more reliable yet.
(We are not considering here
the political bias or social agenda of any given publication; that falls
outside the scope of this essay.)
In a library, the people
buying the books and subscribing to the periodicals add another layer of human
judgment in deciding what to include. The library staff also uses judgment in
deciding which reference databases and periodical indexes to provide to their
patrons. We look at the reputation of the publisher and their track record in
the past and we read reviews printed in reviewing publications with a good
track record for accuracy. We also consider such things as the appropriateness
of the material to our patrons, and, in the case of an academic library,
whether our school teaches about the subject covered in the publication.
Based on all these factors,
we choose whether or not to add materials to our collection. For example, the
Columbia State library catalog lists eight times as many books on nursing as it
does on astronomy, since we don’t teach astronomy here, but we have a big
nursing program. And the Columbia State library doesn’t provide access to
periodical databases designed for children the way your local public library
does.
So if you are doing academic
research, or if you are looking for information that was published in hard copy
at some point, you want to use the library’s article databases, or its
online catalog. And please do come into the library, the first few times you do
this. Yes, Columbia State gives you access from off campus to the online
catalog and to our article databases. And it may be appealing to “do
research in your bunny slippers”, but when you are first learning how to do
electronic research, having a librarian help you navigate all the different
sources can save you a lot of time and effort. If the person behind the
reference desk looks busy, it’s just because we keep busy until someone comes
along to fulfill our primary purpose in sitting there -- to help our patrons.
And we don’t just help with
the catalog and the article databases. Depending on your research, you may
also want to use websites. But, since the search engines are not evaluating
what is listed in their indexes and directories, you will need to do some
evaluation of these sources yourself. We can also help you here.
Evaluating websites
The quickest way to evaluate
a website is to figure out the author of the site, and the credentials of that
author.
One way to do this is to
look at the URL, or Internet address. Although there isn’t anyone out there
policing the Internet, there is an organization responsible for coordinating how
domain names (the first part of a URL) are assigned, and it has some rules for doing so. Let’s look at a few URLs
and decode them.
http://www.columbiastate.edu/index.htm The URL for Columbia State’s homepage
http://www.mtv.com The URL for the MTV homepage
Remember,
websites are collections of files. Just as the hard drive of your computer has
folders, the computer on which these files are stored has directories. So just as an assignment you have saved on the hard
drive may have a path of C:\My Documents\CIS 013\First Assignment, website
addresses have pathnames as well. Many times you will see a
homepage with an address that ends in “index.htm” or “index.html”, such as
Columbia State’s homepage. Other websites don’t use that naming
convention.
http://www.columbiastate.edu/library/Tutorials/tutorials_index.htm
Here’s
a much longer URL. It’s on the Columbia State website, in the “library”
directory, which has in it a directory named “Tutorials”, and
the file it is pointing to is “tutorials_index.htm”.
(Notice
that, unlike Microsoft Windows pathnames and filenames, URLs don’t have spaces
in them.)
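If you would like to see this decoding done by a program, here is a small sketch using Python's standard urllib.parse module. The function name decode_url is hypothetical, and the rule that the last piece of the path counts as a file only when it contains a dot is a simplifying assumption for this example, not a rule of the Web.

```python
from urllib.parse import urlparse

def decode_url(url):
    """Split a URL into its domain, its directories, and the file it points to."""
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: the last segment is a file only if it has a dot, as in "index.htm".
    if segments and "." in segments[-1]:
        filename = segments.pop()
    else:
        filename = None
    return {"domain": parts.netloc, "directories": segments, "file": filename}

print(decode_url("http://www.columbiastate.edu/library/Tutorials/tutorials_index.htm"))
```

For the long Columbia State URL above, this reports the domain "www.columbiastate.edu", the directories "library" and "Tutorials", and the file "tutorials_index.htm". For a bare homepage address such as http://www.mtv.com, there are no directories and no file name.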
Here
are some more URLs:
To tell
something about who is responsible for a website, find the first part of the
URL (the domain). Look at the 3-letter code at the end of the domain
name (the extension).
http://www.irs.gov/pub/irs-pdf/f1040.pdf
Websites
that are owned by governmental entities in the United States end in .gov .
These
include such useful sites as the IRS, the National Library of Medicine (http://www.nlm.nih.gov), and the Federal
Trade Commission (http://www.ftc.gov/bcp/menu-internet.htm is the URL for its
webpages on e-commerce and the Internet).
Not
all of these are federal sites. For example, many (but not all) of the official
websites for U.S. state governments use a URL formed by “www” and a dot, the
state’s name and a dot, and the extension “gov”:
www.tennessee.gov, www.alabama.gov, etc.
http://www.donhr.navy.mil/Jobs/default.asp
Websites
that are owned by parts of the United States military end in
.mil .
After
these two designations, it gets a little fuzzier.
http://www.earthlink.net/about/policies/privacy.html
Internet
networks are allowed to use the ending of .net.
However, since they are in the business of selling space on their computers to
other people and organizations, you could easily see a URL that looks like
this:
http://home.earthlink.net/~tsmith12/index.html
This
is the homepage for a (hypothetical) customer of the Earthlink company.
The
same thing can be true of the .com ending, used for commercial sites --
although many .com addresses are online stores, or sites belonging to
for-profit entities:
www.hondacars.com
www.amazon.com
www.mtv.com
Many
of the Internet Service Providers, who sell or give space to people wanting to
post websites, also end their domain names with .com, so a customer’s webpage might have a (hypothetical) address like one of these:
http://www.aol.com/members/~jjones/homepage.html
http://www.TheBestISP.com/~tsmith/myblog
A
similar situation exists with the .edu code –
http://www.columbiastate.edu/links.htm is the URL for the “Important
Links” page of the Columbia State website. It’s an official page of the
college. If a faculty member put up a website, its URL would look something like this:
http://www.columbiastate.edu/tsmith/homepage.htm
Some educational institutions allow their students to put up websites as well.
So just because a URL’s domain ends in .edu, don’t
assume it’s an official page of the college or university.
Just
the opposite can be true of the .org extension.
http://www.pbs.org/wgbh/masterpiece/
Websites
that are owned by (usually) non-profit organizations end in .org. However, if
an organization doesn’t register its own domain name, it could have a URL like
this:
www.maurywebpages.com/mcas.htm The URL of the
Maury County Animal Shelter
So
looking at that 3-letter code isn’t always foolproof. But it is a start. Keep
it in mind as you look over the website – look for a link to “about us” or
other kinds of information identifying the author of the page. If you can’t
figure out who the author is, beware! If the author doesn’t identify him or
herself, it’s usually because they don’t want to stand behind what they have
written.
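The rules of thumb above can be collected into a short Python sketch. The EXTENSION_HINTS table and the extension_hint function are hypothetical, written only for this illustration; the hints summarize the discussion above and are a starting point, not a verdict on any website.

```python
from urllib.parse import urlparse

# Rough meanings of the extensions discussed above; hints, not guarantees.
EXTENSION_HINTS = {
    "gov": "U.S. governmental entity",
    "mil": "U.S. military",
    "edu": "educational institution (possibly a personal faculty or student page)",
    "org": "usually a non-profit organization",
    "com": "commercial site, or a customer page hosted by a provider",
    "net": "Internet network provider, or one of its customers",
}

def extension_hint(url):
    """Return the domain extension and a first guess at who is behind the site."""
    if "://" not in url:
        url = "http://" + url  # allow bare addresses such as www.amazon.com
    host = urlparse(url).hostname or ""
    extension = host.rsplit(".", 1)[-1].lower()
    hint = EXTENSION_HINTS.get(extension, "unknown; look for an 'about us' page")
    return extension, hint

for address in ["http://www.irs.gov/pub/irs-pdf/f1040.pdf",
                "http://www.pbs.org/wgbh/masterpiece/",
                "www.maurywebpages.com/mcas.htm"]:
    print(address, "->", extension_hint(address))
```

The last address shows why the code is only a start: the animal shelter's page comes back as "com" even though the shelter is not a business, so you still need to look at the site itself to find out who wrote it.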
When to use which
There are many times when
using a search engine and websites is a perfectly legitimate way of looking up
something.
The United States government
is disseminating a lot of its information over the Internet. A website ending in
.gov or .mil is just as legitimate as the paper copy of the same information.
The same can be said for the official websites of smaller government entities:
www.tennessee.gov - State of Tennessee official website
www.maurycounty-tn.gov - Maury County official website
Are you looking up
information on a medical condition? A museum? A special interest association?
Many of these have websites, and the information published here is as accurate
as it would be if it had been published on paper:
www.diabetes.org - American Diabetes Association
www.memphisrocknsoul.org - Memphis Rock 'n' Soul Museum
www.nascar.com - Official NASCAR website
Academia is publishing online
as well. You may not always be able to access the webpages of a professor at
another institution, but if you can, they may have valuable information. Be aware,
however, that information published on a campus website may very well not go
through as strict a review process as an article published in an academic
journal.
Popular culture information
is frequently published first on the Web. Again, keep in mind that this
information will be of varying reliability. Information about your favorite TV
show may be more reliable if it appears on the TV network’s “official site” for
the show.
Much consumer information is
available over the Internet. From designing your new car to finding the weekly
sales at the local grocery, it’s available online. However, you again need to be
careful when using these sites – look for the privacy and security statements
before entering any personal information.
(Did you know that for many
years, nothing was allowed to be sold over the Internet? Hard to believe, isn’t
it?)
However, if you are doing any
kind of in-depth research on historical events, scientific or medical
information, or any kind of academic research, do not rely on freely available
websites alone. Keep in mind that these websites are primarily self-monitored,
and the information on them is only as reliable as its source. Information in
article databases has been through varying degrees of editorial scrutiny
and review.
And of course, if you are
doing an academic assignment, follow whatever guidelines for acceptable sources
your instructor gives you!
How Websites and article databases come together
A little bit earlier I said
that the library staff could help you evaluate websites. There are several ways
we do this. One is simply to recommend a website that we know about from
our professional reading and research. Another is to point you to a reliable
directory of websites, created by humans using human judgment. Some of these
are:
(you will notice that most of
these were compiled by libraries)
And another way is to point
you to an article database that includes listings for reliable websites. Two
of these are the Encyclopaedia Britannica and the SIRS Knowledge Source database.
Library resources and Internet websites – complementary sources for information
When doing research, it’s not
a one-or-the-other proposition. The Internet is merely a communication medium.
Some of the information on it is perfectly legitimate, just as if it had been
published in paper format by a reliable publisher. Other websites are no more
substantial than a flyer photocopied at your local copy shop. Consider the
source, and the purpose for your research. And please let the library help –
that’s what we’re here for!
For more information:
See Columbia State’s “Doing Internet Research” webpage.