

26

Searching Gopherspace with Veronica

How can you find a copy of the General Accounting Office's 1993 transition report on health care reform, get a great recipe for a chicken dish, and find a summer job at Cornell University, all without leaving your home? It's simple if you use the networking tool created by Fred Barrie and Steve Foster at the University of Nevada. This tool, called Veronica, is an indexer that simplifies the search for resources on the Internet.

The development of the Internet gave all computer users, even those in remote areas, access to computer resources formerly available only in big cities or on university campuses. However, new users (and even some old-timers) face a problem: they need to know what resources exist, and where those resources are located, before they can effectively use the Internet to gain information. Gopher, a tool developed at the University of Minnesota, helped by letting users browse menus that list the many Internet resources. With Gopher, users no longer have to know exactly where resources are located; they can browse Gopher menus to find them.

As more and more information became available on the Internet, Gopher became extremely popular, but the increase in information created problems of its own. As the number of Gopher servers increased, it took users longer to find the information they wanted. On the Gopher newsgroup, there were daily postings requesting information about whether a piece of information existed in Gopher and how to retrieve it. It became almost impossible to wade through the Gopher menus and make any sense out of them. To effectively use Gopher, Gopher gurus had to "burrow" through Gopher servers, adding interesting Gopher items to their bookmark lists almost daily so that when the need arose, they would be ready. That's where Veronica comes in. It helps simplify searching for information and making connections to information in Gopherspace.

With Veronica, Gopher users simply have to type keywords to initiate a search of the titles of menu items in Gopher. For example, if you type the word chicken, you receive a list of approximately 577 titles that contain the word chicken. There are chicken jokes and recipes ranging from chicken curry to chicken casserole. Then you can select the menu item that contains the information you want. If it's in Gopher, Veronica can help you find it quickly.

How To Use Veronica

One of the great features of Veronica is that if you have mastered Gopher, you already know how to use Veronica. Veronica was built specifically for Gopher: it answers a search by creating Gopher menus that contain direct links to the information requested. Earlier network indexing tools offered only hints on how to get the information, such as "FTP to this host and retrieve that file." A novice can use Veronica for the first time and get exactly what he or she wants, yet Veronica also has many options for the advanced user. It allows a user to restrict or expand the search of Gopher menu titles, and it supports logical query operators for constructing complicated searches. Keep in mind, however, that Veronica is not a full-text index. It indexes only the titles of Gopher menu items.

To access Veronica, start your Gopher client and look for a Veronica item on your default Gopher server. If your default server does not have a link to Veronica, connect directly to the main Gopher server for Veronica, gopher.unr.edu. If you have a UNIX Gopher client, for example, type gopher gopher.unr.edu and select the Veronica directory. The Veronica menu on this server lists the Veronica search items that are currently available.

You should be aware of some restrictions in Veronica. First, searches are not case-sensitive: the queries veronica and VERONICA are identical. In addition, all Veronica searches have a default limit of 200 menu items, which means that only the first 200 items found are returned when you submit a query. This limit protects users who accidentally type the keyword Gopher from receiving the more than 10,000 matching menu items in the Veronica database. If you want to see more than the default 200 Gopher items, add the option -m followed by an integer. To get the first 400 results, for example, add -m400 to your query. To see fewer items than the default of 200, use the -m option with an integer less than 200. If you want to see all records in the Veronica database that match your criteria, add the -m option without any number to your query.

One of the simplest searches in Veronica is on a single keyword. The following example shows a search on the keyword internet:

Internet Gopher Information Client 2.0 pl10

Search ALL of Gopherspace ( 3300 servers ) using veronica

1. Search gopherspace at NYSERNet <?>

2. Search gopherspace at PSINet <?>

--> 3. Search gopherspace at University of Pisa <?>

4. Search gopherspace at University of Cologne <?>

5. Search Gopher Directory Titles at NYSERNet <?>

6. Search Gopher Directory Titles at PSINet <?>

+------------------Search gopherspace at University of Pisa------------------+
|                                                                            |
| Words to search for                                                        |
|                                                                            |
| internet                                                                   |
|                                                                            |
|                  [Cancel: ^G] [Erase: ^U] [Accept: Enter]                  |
+----------------------------------------------------------------------------+

Press ? for Help, q to Quit, u to go up a menu Searching..\

This search returns a menu list of records that have the word internet in the title field:

Internet Gopher Information Client 2.0 pl10

Search gopherspace at University of Pisa: internet

--> 1. CA-91:17.DECnet-Internet.Gateway.vulnerability.

2. CA-91:18.Active.Internet.tftp.Attacks.

3. CA-92:03.Internet.Intruder.Activity.

4. CA-93:14.Internet.Security.Scanner.

5. b-43.ciac-decnet-internet-gateway.

6. c-16.ciac-net-internet-intrusions.

7. CA-91:17.DECnet-Internet.Gateway.vulnerability.

8. CA-91:18.Active.Internet.tftp.Attacks.

9. CA-92:03.Internet.Intruder.Activity.

10. CA-93:14.Internet.Security.Scanner.

11. b-43.ciac-decnet-internet-gateway.

12. c-16.ciac-net-internet-intrusions.

13. Internet Resource Guide/

14. Internet FYI Series/

15. Internet Resource Guide (ds.internic.net)/

16. Internet Resources/

17. Internet RFC documents/

18. The Hitchhikers Guide to the Internet.

You can further restrict the results of a query to a certain set of Gopher types. This restriction is done by adding the -t (for type) option to your query. The -t flag is followed by a list of Gopher types you want to see. You can specify more than one type in the query. Simply put all the types together with no spaces after the -t. A partial list of common Gopher types can be found in Table 26.1 (refer to Chapter 25, "Using and Finding Gophers," for more information on Gopher types). To restrict the search to just Gopher directories, for example, add -t1 to your search. If you want to restrict your search to directories and text files, add -t01 to your search. The items Search Gopher Directory Titles... on the Veronica menu are Veronica searches with the -t1 option already supplied.

Table 26.1. Common Gopher types (partial list).

Type  Description
----  -----------
0     Item is a file
1     Item is a directory
2     Item is a CSO (qi) phone-book server
3     Error
4     Item is a BinHexed Macintosh file (discouraged)
5     Item is a DOS binary archive of some kind (discouraged)
6     Item is a UNIX uuencoded file (discouraged)
7     Item is an Index-Search server (like Veronica)
8     Item is a pointer to a Telnet session
9     Item is a binary file of some sort
I     Item is an image
s     Item is a sound
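One way to picture the -t option is as a filter over the type character of each matching item. The following Python fragment is only a sketch of that idea, not Veronica's code; the function name filter_by_type and the sample items are made up for illustration.

```python
def filter_by_type(items, types):
    """Keep only items whose one-character Gopher type appears in `types`.

    `items` is a list of (gopher_type, title) pairs; `types` is the string
    that follows -t in a query, for example "1" or "01".
    """
    return [(gtype, title) for gtype, title in items if gtype in types]


# Hypothetical sample data, for illustration only.
sample = [
    ("0", "chicken-curry"),
    ("1", "Internet Resource Guide"),
    ("7", "Search Nevada Gopher menus"),
    ("I", "chicken.gif"),
]

directories_only = filter_by_type(sample, "1")   # like adding -t1
files_and_dirs = filter_by_type(sample, "01")    # like adding -t01
```

Because the types are single characters, a string such as "01" works as a set of allowed types, which mirrors the way the types are run together after -t.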

You can expand the search Veronica performs by adding the metacharacter * to the end of a string of characters. Veronica finds all words that start with that string of letters. For example, a search for chick* matches chick, chicken, chickens, and so on. This form of word stemming allows you to find all forms of a certain word—useful when you want to find the singular and plural forms of a keyword (such as star and stars).
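The effect of the * metacharacter can be sketched as a prefix test. The helper below is hypothetical (word_matches is not a Veronica function) and assumes, as the text describes, that the asterisk appears only at the end of the keyword and that matching ignores case.

```python
def word_matches(pattern, word):
    """Case-insensitive match; a trailing * matches any word with that prefix."""
    pattern, word = pattern.lower(), word.lower()
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


stem_hits = [w for w in ["chick", "Chicken", "chickens", "check"]
             if word_matches("chick*", w)]
```

Without the asterisk, the same helper requires the whole word to match, so chick alone would not find chicken.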

Veronica understands the logical operators AND, OR, and NOT, as well as parentheses. The AND operator matches Gopher menu items that have both words in the title. The OR operator matches Gopher menu items that have one word or the other (or both) in the title. The NOT operator is another form of restriction: it matches titles that contain the first word but not the second. Parentheses change the order in which the other operators are interpreted.

The last option Veronica recognizes is -l, which creates a link file suitable for use in a Gopher server or a user's bookmark file. This option is used mostly by Gopher server administrators who want to add Gopher menu items returned by Veronica to their Gopher servers. If a Gopher server is dedicated to environmental information, for example, the server's administrator can do a search on environ* -m -l to receive a list of all titles that have words starting with environ and immediately put this list in the Gopher server.


TIP When using the -l option, the trick is to save the file first before viewing it. When you view a file returned by a Gopher search item (of which Veronica is one), the client normally adds highlighting to the words that meet your search criteria. This highlighting causes the Gopher server to not recognize the links properly.

When you submit a query, Veronica reads it from right to left and interprets operators as they are encountered. The query chicken and wine is processed in the following order: wine and chicken. If two keywords are next to each other, Veronica inserts an implied AND between them. If in doubt about the order of interpretation, use parentheses. Also note that Veronica options cannot be run together; each option must be separated from the next by at least one space.
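To make the operator rules concrete, here is a small Python sketch that tests one title against a query using AND, OR, NOT, parentheses, and the implied AND. It is not Veronica's parser. As an assumption, it groups everything after an operator to the right, which is one possible reading of right-to-left interpretation, and it expects options such as -t1 to have been removed from the query already. The names tokenize and matches are invented for this example.

```python
import re


def tokenize(query):
    # Parentheses become separate tokens; everything else splits on spaces.
    return re.findall(r"\(|\)|[^\s()]+", query)


def matches(query, title):
    """Return True if `title` satisfies the (non-empty) `query`."""
    words = set(re.findall(r"[a-z0-9]+", title.lower()))
    tokens = tokenize(query.lower())
    pos = 0

    def operand():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = expr()
            pos += 1                      # skip the closing parenthesis
            return value
        return tok in words

    def expr():
        nonlocal pos
        left = operand()
        if pos >= len(tokens) or tokens[pos] == ")":
            return left
        op = "and"                        # implied AND between adjacent keywords
        if tokens[pos] in ("and", "or", "not"):
            op = tokens[pos]
            pos += 1
        right = expr()                    # the rest of the query groups to the right
        if op == "and":
            return left and right
        if op == "or":
            return left or right
        return left and not right         # "x not y"

    return expr()
```

With this sketch, matches("chicken (wine or curry)", "chicken-curry") is true and matches("chicken (wine or curry)", "chicken pot pie") is false, in line with the advanced examples later in this chapter.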

Sometimes, it is beneficial to know where a Gopher menu item is located. The Gopher server that contains the menu item Veronica returns usually has related information you may find useful. Currently, Veronica does not maintain this information. However, your Gopher client can help you in this department. By using the Get Info about this Item option in Gopher, you can find out which host and port the items are located at and view other pertinent information. On the UNIX Gopher client, for example, you do this by pressing the = key.

Example Searches

Following are some simple searches you can do with Veronica:

internet

Search on the keyword internet. This search returns a menu list of at most 200 records that have the word internet in the title field.

internet -m1000

Search on the keyword internet, but show 1,000 items instead of the default of 200.

women -t1

Search on the keyword women and have only Gopher directories in the menu list.

chicken and wine

Search on the keywords chicken and wine. This search returns a menu list of at most 200 records that have both chicken and wine in the title field.

chicken or wine -t1

Search on the keywords chicken or wine, requesting directories only. This menu contains directories with the words chicken or wine or both in the title.

Following are some advanced searches you can do with Veronica:

Chinese food not MSG

Search for all the titles with the words Chinese and food but not MSG. Remember that there is an implied AND between the adjacent words Chinese and food.

chicken (wine or curry) -m

List all titles in the Veronica database that have the words chicken and either wine or curry: chicken wine and chicken curry match but not chicken pot pie.

(chicken or wine) NOT (MSG or growing)

List titles with the words chicken or wine but not MSG or growing.

chicken* or wine*

Search for all titles with the words chicken, chickens, and so on, or wine, wines, wineries, and so on.

The following chart summarizes options and operators used by Veronica:

Option  Description
------  -----------
-t      Select matching Gopher types.
-m      Specify how many Gopher menu items to return (default is 200).
-l      Create a link file suitable for a Gopher server.
AND     Both words must be in the title.
OR      Either word must be in the title.
NOT     The first word must be in the title but not the second word.
*       Matches word stems.
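The options in this chart are simply extra words in the query, so separating them from the keywords is straightforward. The following Python sketch shows one way to do it; split_query and its return layout are assumptions made for illustration, not part of Veronica.

```python
def split_query(query, default_limit=200):
    """Separate Veronica-style options from the search words.

    Returns (words, types, limit, link_file). `types` is None when -t is
    absent; `limit` is None when -m is given without a number, meaning
    "return every match".
    """
    words, types, limit, link_file = [], None, default_limit, False
    for token in query.split():
        if token.startswith("-t") and len(token) > 2:
            types = token[2:]
        elif token == "-m":
            limit = None
        elif token.startswith("-m") and token[2:].isdigit():
            limit = int(token[2:])
        elif token == "-l":
            link_file = True
        else:
            words.append(token)
    return " ".join(words), types, limit, link_file


parsed = split_query("environ* -m -l")
```

Splitting on spaces reflects the rule that every option must be separated from the next by at least one space.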

How Veronica Works

Veronica has three distinct phases: a search and harvest phase, an index phase, and a user query phase. In the harvest phase, Veronica builds a database of Gopher menu items from Gopher servers. After the database is collected by Veronica, it is indexed for quick retrieval by keyword searches. The last phase, the user query, is the phase most people associate with Veronica: the searching of keywords that meet the user's criteria.

The Harvest Phase

Veronica and Gopher are both built on top of the Gopher protocol. This protocol enables individual users to communicate with servers. Without the protocol, it would be nearly impossible for a Macintosh Gopher to communicate with a VMS server. Veronica also takes advantage of this protocol when communicating between Gopher server and harvester. Because of this protocol, Gopher menu items have a common structure. The following line is a representation of a Gopher menu item's structure:

[Gopher Type][Title][tab][Selector String][tab][Hostname][tab][Port Number]

The Gopher Type is a one-character representation of the item (for example, a 0 for a text file and a 1 for a directory). The Title is the actual string of characters displayed by the Gopher client. The proper method of retrieving this item is to connect to the Hostname at the specified Port Number and send the Selector String to the server. If the item is a 0 (that is, a text file), the server sends back a text file; if the item is a 1 (that is, a directory), the server sends back a listing similar to this:

1About Gopher	1/AboutGopher	gopher.unr.edu	70
1Computer Services Help Desk Gopher Server		fremont.scs.unr.edu	70
1UNR Campus Information	1/UNR-Campus	gopher.unr.edu	70
1Internet Services and Other Gophers	1/OtherServices	gopher.unr.edu	70
1Libraries and Reference Services	1/Libraries	gopher.unr.edu	70
1Documentation about the Internet	1/Network-Docs	gopher.unr.edu	70
1Discipline-Specific Topics	1/Selected	gopher.unr.edu	70
1Search ALL of Gopherspace ( 3300 servers ) using veronica	1/veronica	gopher.unr.edu	70
7Search Nevada Gopher menus by title keyword(s) (2 servers)		futique.scs.unr.edu	8013
.

In the preceding example, the first line represents a directory (Type=1) whose title string is About Gopher. The proper way to retrieve this subdirectory is to send the selector string 1/AboutGopher to the host gopher.unr.edu at port 70. The last item in this example represents a search item (Type=7) whose title string is Search Nevada Gopher.... Again, the proper way to retrieve this search item is to send an empty selector string (because no selector string was given) to the host futique.scs.unr.edu at port 8013. The final line, a period by itself, marks the end of the listing.
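The menu item structure and the retrieval steps just described can be sketched in a few lines of Python. This is a minimal illustration, not the code used by Veronica or by any Gopher client; MenuItem, parse_menu_line, and fetch_menu are names invented here, and the hosts shown in this chapter may no longer answer.

```python
import socket
from typing import NamedTuple


class MenuItem(NamedTuple):
    gopher_type: str
    title: str
    selector: str
    host: str
    port: int


def parse_menu_line(line):
    """Parse one tab-separated Gopher menu line into a MenuItem."""
    display, selector, host, port = line.rstrip("\r\n").split("\t")[:4]
    return MenuItem(display[0], display[1:], selector, host, int(port))


def fetch_menu(host, port, selector=""):
    """Connect, send the selector string, and parse the directory listing."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(selector.encode("latin-1") + b"\r\n")
        data = b""
        while chunk := conn.recv(4096):
            data += chunk
    items = []
    for line in data.decode("latin-1").split("\n"):
        line = line.rstrip("\r")
        if line == ".":                   # a lone period ends the listing
            break
        if line.count("\t") >= 3:         # skip anything that is not a menu line
            items.append(parse_menu_line(line))
    return items


item = parse_menu_line("1About Gopher\t1/AboutGopher\tgopher.unr.edu\t70\r\n")
```

Here item.gopher_type is "1" and item.selector is "1/AboutGopher"; passing those fields back to fetch_menu would request that subdirectory from a live server.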

The search and harvest phase begins by creating a connection to a Gopher server. The harvester sends the proper selector string to the server so that the server sends back a listing of its top level menu items. For example, the following menu incorporates the list shown in the preceding example.

Internet Gopher Information Client v1.1

Root gopher server: gopher.unr.edu

1. About Gopher/

2. Computer Services Help Desk Gopher Server/

3. UNR Campus Information/

4. Internet Services and Other Gophers /

5. Libraries and Reference Services/

6. Documentation about the Internet/

7. Discipline-Specific Topics/

--> 8. Search ALL of Gopherspace ( 3300 servers ) using veronica/

9. Search Nevada Gopher menus by title keyword(s) (2 servers) <?>

Press ? for Help, q to Quit, u to go up a menu Page: 1/1

For each item in the list of the top level menu, the harvester checks to see whether the item is local to the Gopher server being harvested or if it is a link to another Gopher server. The Veronica harvester does this check to eliminate redundant Gopher links and to record new Gopher servers so that it can harvest them later. Veronica then builds a database by adding local Gopher menu items to the database.

If the item is local to the current Gopher server, the harvester must decide whether the item is a directory. A Gopher server can be thought of as a tree structure (similar to a file system structure). By traversing the subdirectories, it is possible to search the entire Gopher server. Veronica adds the directory's selector string to a list of directories for the current Gopher server. After the harvester is finished with the current directory, a new connection is made to the next subdirectory. The selector string for that directory is sent to the server and the whole process starts again for the new subdirectory.

If, on the other hand, the item is a link to another server, the harvester must decide whether the item points to a new Gopher server or to some other network provider, such as an anonymous FTP server. In the opinion of the creators of Veronica, a text file or directory is the only true test of whether a host-and-port combination is a real Gopher server. Other links, such as Telnet sessions and index searches, are often not Gopher servers. If the item is a text file or directory, the hostname and port number of the server are recorded so that a harvest of that new server can be made later.

After sending the selector string to the Gopher server for the top menu items, the harvester records seven subdirectories that are local to gopher.unr.edu, one link to a Gopher server (fremont.scs.unr.edu, port 70), and the index search for use in the Veronica database (Search Nevada Gopher Menus). For each of the seven subdirectories, a separate connection is made to the Gopher server. In this manner, a Gopher server can be harvested completely. After the current server is finished, the harvester starts to search the next server in the list of Gopher servers. In this manner, all of Gopherspace can be harvested.
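The harvest described above amounts to a traversal of each server's directory tree plus a queue of newly discovered servers. The Python sketch below imitates that process on a tiny in-memory "Gopherspace" so that it runs without a network. It is a simplification under stated assumptions, not the Veronica harvester: the function harvest, the FAKE data, and the .example hostnames are all hypothetical, and only local items are added to the database.

```python
from collections import deque


def harvest(fetch_menu, start):
    """Walk Gopherspace breadth-first from the server `start` = (host, port).

    `fetch_menu(host, port, selector)` must return a list of
    (gopher_type, title, selector, host, port) tuples.
    """
    database = []
    servers = deque([start])
    seen_servers = {start}
    while servers:
        server = servers.popleft()
        directories = deque([""])         # empty selector = top-level menu
        seen_dirs = {""}
        while directories:
            for item in fetch_menu(server[0], server[1], directories.popleft()):
                gtype, title, selector, host, port = item
                if (host, port) == server:
                    database.append(item)              # local item: record it
                    if gtype == "1" and selector not in seen_dirs:
                        seen_dirs.add(selector)
                        directories.append(selector)   # visit this subdirectory later
                elif gtype in ("0", "1") and (host, port) not in seen_servers:
                    seen_servers.add((host, port))     # new server: harvest it later
                    servers.append((host, port))
    return database


# Hypothetical two-server Gopherspace, for illustration only.
FAKE = {
    ("a.example", 70, ""): [
        ("1", "Recipes", "1/recipes", "a.example", 70),
        ("1", "Other server", "", "b.example", 70),
        ("8", "Library catalog", "", "telnet.example", 23),
    ],
    ("a.example", 70, "1/recipes"): [
        ("0", "chicken-curry", "0/recipes/curry", "a.example", 70),
    ],
    ("b.example", 70, ""): [
        ("0", "About this server", "0/about", "b.example", 70),
        ("1", "Back to A", "", "a.example", 70),
    ],
}


def fake_fetch(host, port, selector):
    return FAKE.get((host, port, selector), [])


titles = [item[1] for item in harvest(fake_fetch, ("a.example", 70))]
```

The seen_servers and seen_dirs sets play the role of the redundancy check: the link from the second server back to the first is noticed and not harvested again, and the Telnet link is never treated as a Gopher server.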

The harvesting of Gopherspace is done approximately every two weeks. It takes approximately two to three days to index the world's Gopher servers. The computational time, along with the strain placed on Gopher servers during the harvest, prohibits more frequent harvests.

Index Phase

In the index phase, Veronica creates indexes of the words in the titles of Gopher menu items. The Veronica indexer uses a stop word list, which contains some of the most common words in the English language (for example, a, the, an, and for). The stop list has a twofold purpose. First, it makes the indexes smaller. Second, it keeps user queries for these common words from wasting computation time on results that are unlikely to be useful. A search on the letter A, for example, is usually not very beneficial.
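A title index with a stop word list can be sketched as an inverted index: a table that maps each word to the titles containing it. The Python below is an illustration only. The stop list holds just the example words mentioned above (Veronica's actual list is not given in this chapter), and build_index is a hypothetical name.

```python
import re
from collections import defaultdict

# Only the example stop words named in the text; the real list is longer.
STOP_WORDS = {"a", "an", "the", "for"}


def build_index(titles):
    """Map each non-stop word to the set of title numbers that contain it."""
    index = defaultdict(set)
    for number, title in enumerate(titles):
        for word in re.findall(r"[a-z0-9]+", title.lower()):
            if word not in STOP_WORDS:
                index[word].add(number)
    return index


sample_titles = [
    "The Hitchhikers Guide to the Internet",
    "Internet Resource Guide",
    "chicken-curry",
]
index = build_index(sample_titles)
```

In this sketch the word internet points at titles 0 and 1, while the has no entry at all, which is how the stop list keeps the index smaller.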

User Query Phase

The user query phase starts when the user selects the search menu item. The Gopher client sends a message to the Veronica server with the text of the query (for example, chicken). The Veronica server receives the query and splits the text into atomic words. For each word, a record list is created with the information on how to retrieve the matching Gopher menu items from the Veronica database. After each word's record list is created, any query logic is applied. A search for Alpha and Beta, for example, produces a list of items that have both Alpha and Beta in their titles. The resulting list of Gopher menu items that meet all the user's criteria is sent back to the client, where it is presented as a menu of Gopher items that the user can immediately access.
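The record-list step can be pictured as a set intersection. The following Python sketch handles only the simplest case, keywords joined by AND, and uses a hand-built index so that it stands alone; search, the sample titles, and the index contents are assumptions made for illustration rather than Veronica's data structures.

```python
def search(index, titles, query, limit=200):
    """AND together the record lists of every keyword in `query`."""
    keywords = [w for w in query.lower().split() if w != "and"]
    record_lists = [index.get(word, set()) for word in keywords]
    if not record_lists:
        return []
    hits = set.intersection(*record_lists)     # titles present in every list
    return [titles[n] for n in sorted(hits)][:limit]


# Hypothetical titles and their word index, for illustration only.
titles = ["alpha beta report", "alpha notes", "beta release"]
index = {
    "alpha": {0, 1},
    "beta": {0, 2},
    "report": {0},
    "notes": {1},
    "release": {2},
}

result = search(index, titles, "Alpha and Beta")
```

Only the first title appears in both record lists, so it is the only item returned; the limit argument stands in for the default of 200 items.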

In the example of the keyword chicken, the Gopher client presents the following screen:

Internet Gopher Information Client v1.1

Search gopherspace at NYSERNet: chicken

--> 1. why.did.the.chicken.cross.the.road.

2. LIVER ACETONE POWDER CHICKEN.

3. Chicken_Little,tomato_sauce,and_agriculture-who_will_produce_tomor.

4. Chicken And Egg Problem With Uuencode....

5. Last Night'S Chicken&Egg Revisited.

6. Re: philosphic chicken fryers.

7. Re: philosphic chicken fryers.

8. Re: philosphic chicken fryers.

9. chicken-adobo.

10. chicken-aprict.

11. chicken-asian.

12. chicken-basil.

13. chicken-bourbn.

14. chicken-broth.

15. chicken-cacc-1.

16. chicken-cinn-1.

17. chicken-curry.

18. chicken-curry2.

Press ? for Help, q to Quit, u to go up a menu Page: 1/12

The final step in the user query phase comes when the user selects an item from the menu Veronica created dynamically. If you select the first item in this menu (why.did.the.chicken.cross.the.road), your Gopher client connects to the Gopher server that holds the item and displays the following text file:

Why did the chicken cross the road?

Aristotle: To actualize its potential.

Roseanne Barr: Urrrrrp. What chicken?

George Bush: To face a kinder, gentler thousand points of headlights.

Julius Caesar: To come, to see, to conquer.

Candide: To cultivate its garden.

Bill the Cat: Oop Ack.

Buddha: If you ask this question, you deny your own chicken-nature.

Moses: Know ye that it is unclean to eat the chicken that

has crossed the road, and that the chicken that crosseth the

road doth so for its own preservation.

Future for Veronica

Veronica is not a mature service. It is constantly under scrutiny by the developers for improvements. Veronica was and still is an experiment in network information discovery and retrieval. As the developers learn more about the network, Veronica changes. Some of the ideas the Veronica development team have in mind are listed here:

Veronica truly lives up to its name: Very Easy Rodent Oriented Netwide Index to Computerized Archives. The developers of Veronica are constantly trying to give even more meaning to the first two words, Very Easy.
