24

Archie: An Archive of Archives

Anonymous FTP can be one of the most exciting Internet discoveries you can make. Once you become aware of the enormous amount of information and software available on the Internet, the challenge becomes finding the particular item you need when you need it. This is where Archie comes to the rescue. Says Peter Deutsch, one of the creators of Archie:

The Archie service is a collection of resource discovery tools that together provide an electronic directory service for locating information in an Internet environment. Originally created to track the contents of anonymous FTP archive sites, the Archie service is now being expanded to include a variety of other online directories and resource listings.

Archie's primary use is to locate specific files available with anonymous FTP somewhere on the Internet. However, the latest versions of the Archie software can be applied to information gathering and distributed database tasks beyond anonymous FTP. Archie's anonymous FTP database is not absolutely comprehensive. There is a limit to the number of FTP sites that can be included in Archie's database; however, the database is extensive and indexes many of the most popular anonymous FTP locations.

Archie started as a project of the McGill University School of Computer Science to meet their internal needs for locating anonymous FTP sites. As with many other Internet resources, a good idea, once made public, spreads quickly. This is true of Archie. From its original site at McGill University in Canada, the Archie service has spread throughout the world and is accessible from anywhere on the Internet.

In 1992, Peter Deutsch and Alan Emtage (two of Archie's original creators) created a company named Bunyip Information Systems to develop a commercially supported version of the software. Version 3.0 of Archie was the initial result of this effort.

Why Use Archie?

Archie is not the only way to find files available for anonymous FTP. Often, electronic mailing lists and newsgroups can be good sources of FTP information related to a particular topic or computing system. These forums, however, require that you do one of the following:

Depending on the activity level of a particular mailing list or newsgroup, either of these occurrences is possible. It is obvious, however, that either scenario requires patience and perhaps a little bit of luck.

Archie has the advantage of being automated and immediately or almost immediately accessible; it also enables you to conduct multiple attempts to search for a particular item of interest. As you will see, you can access Archie in a number of ways, and Archie offers several methods of searching for a program or file.

How Is Archie Used?

There are three primary ways to access the Archie service. If you use a computer directly connected to the Internet, it may be possible to run an Archie client program. The client performs the communication tasks with an Archie server to accomplish your search. The client program must be installed on your computer. (Client programs are available for many popular computer systems. Clients are most numerous for UNIX operating systems; however, DOS, Windows, and Macintosh clients are also available.) Using an Archie client enables you to enter one command (usually archie) to specify your search parameters. You can use various command options to control how the search is accomplished (more about Archie clients later).

With a direct connection to the Internet and access to a Telnet client program, you can establish a remote terminal session with an Archie server and enter commands interactively. You simply Telnet to the closest Archie server and log in as archie. (For a list of Archie servers, see the sidebar entitled, "Archie Servers Throughout the World," later in this chapter.) You can then enter commands at the prompt; the results return to your terminal session screen. Most Archie servers support a limited number of direct connections, so the Telnet method is often the most difficult to accomplish when the servers are busy. Because client programs and electronic mail both demand fewer resources of the Archie server, it may actually be quicker in the long run to use either of those methods for performing your Archie search.

If you do not have direct access to the Internet, you can still use Archie if you can send electronic mail to the Internet. You can send e-mail to an Archie server with your search commands in the mail message body; the Archie server e-mails the results back to you. This is the least interactive method of using Archie, but it is often useful when you want to do an unattended search of the Archie database. You can just fire off a mail message and later, when you next read your mail, the results will be awaiting you.

The Different Parts of Archie

The heart of the Archie service is a database of the file systems of anonymous FTP sites (which number in the thousands). Each server maintains its own database. Special resource discovery software runs each night to update about one thirtieth of the database so that each file system image is updated approximately once a month. This procedure ensures a reasonably accurate representation of contents at the FTP sites without creating an overly large amount of traffic on the Internet. Servers are set up to share information as well, so that starting a new server does not necessarily require a comprehensive resource discovery operation.
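The rolling update described above can be pictured with a small sketch. It is illustrative only: the `nightly_batch` helper and the site names are assumptions made for this example, not part of the Archie software. The idea is simply that each site falls into one of 30 slices and one slice is refreshed per night, so every site is revisited about once every 30 nights.

```python
def nightly_batch(sites, night, cycle=30):
    """Return the sites to refresh on a given night (counting from 0).

    Site number i belongs to slice i % cycle, so each site is
    refreshed once every `cycle` nights.
    """
    slot = night % cycle
    return [site for i, site in enumerate(sites) if i % cycle == slot]


# Hypothetical site list; a real server tracked thousands of hosts.
sites = [f"ftp{i}.example.org" for i in range(90)]

for night in range(3):
    print(night, nightly_batch(sites, night))
```

Spreading the work this way keeps the nightly network load at roughly one thirtieth of a full rescan, at the cost of listings that can be up to a month out of date.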

A second database maintained at Archie servers is the whatis database. The whatis database cross-references numerous terms with associated file or directory names. Not surprisingly, you can search the whatis database using the whatis command. Because the Archie database indexes filenames within directory structures, your search for a name that seems intuitive may not find the exact software you want. If you search for a term in the whatis database, the server returns the names of associated files that you can then locate with a normal Archie command. Note that you can only search the whatis database using the Telnet or e-mail methods of using Archie. The Archie client programs are only capable of searching the file system database.

The various Archie servers are obviously a very important part of the Archie service. They are, in fact, the very foundation of the information management process. As you have seen so far, the servers collect and maintain the anonymous FTP site information, maintain the whatis database, and accept and process queries to either database using one of the three methods described earlier. The Archie servers are generally computers running the UNIX operating system and are connected to the Internet. These computers are maintained for the most part by universities and network service organizations and are offered as a public resource to Internet users.


ARCHIE SERVERS THROUGHOUT THE WORLD

As you can see from the following list, Archie is a world-wide service. You can specify any of the addresses below as the primary server for an Archie client. You can Telnet to these addresses to use Archie interactively, or you can send e-mail to archie@<server address> with appropriate search commands. This list was adapted from a list found on the Bunyip anonymous FTP site. The file can be found at: ftp://ftp.bunyip.com/pub/bunyip-docs/serverlist.

Server Address            Numeric Address     Location
archie.au                 139.130.23.2        Australia
archie.univie.ac.at       131.130.1.23        Austria
archie.bunyip.com         192.77.55.2         Canada
archie.cs.mcgill.ca       132.206.51.250      Canada
archie.uqam.ca            132.208.250.10      Canada
archie.funet.fi           128.214.6.102       Finland
archie.univ-rennes1.fr    129.20.254.2        France
archie.th-darmstadt.de    130.83.22.1         Germany
archie.ac.il              132.65.16.8         Israel
archie.unipi.it           131.114.21.10       Italy
archie.wide.ad.jp         133.4.3.6           Japan
archie.hana.nm.kr         128.134.1.1         Korea
archie.sogang.ac.kr       163.239.1.11        Korea
archie.uninett.no         128.39.2.20         Norway
archie.rediris.es         130.206.1.2         Spain
archie.luth.se            130.240.12.23       Sweden
archie.switch.ch          130.59.1.40         Switzerland
archie.switch.ch          130.59.10.40        Switzerland
archie.ncu.edu.tw         192.83.166.12       Taiwan
archie.doc.ic.ac.uk       146.169.16.11       UK
archie.doc.ic.ac.uk       146.169.17.5        UK
archie.doc.ic.ac.uk       146.169.2.10        UK
archie.doc.ic.ac.uk       146.169.32.5        UK
archie.doc.ic.ac.uk       146.169.33.5        UK
archie.doc.ic.ac.uk       146.169.43.1        UK
archie.doc.ic.ac.uk       155.198.1.40        UK
archie.doc.ic.ac.uk       155.198.191.4       UK
archie.hensa.ac.uk        129.12.43.17        UK
archie.sura.net           128.167.254.195     USA (MD)
archie.unl.edu            129.93.1.14         USA (NE)
archie.internic.net       192.20.225.200      USA (NJ)
archie.internic.net       192.20.239.132      USA (NJ)
archie.internic.net       198.49.45.10        USA (NJ)
archie.rutgers.edu        128.6.18.15         USA (NJ)
archie.ans.net            147.225.1.10        USA (NY)

As you can see from the preceding sidebar, there is probably an Archie server near you. When selecting a server, it is best to use the one that is closest to you on the Internet. If you attempt to use Archie with Telnet and that server is busy, you can select an alternative location. You should take care, however, in selecting an alternative site. If you are in the western United States and the University of Nebraska server is busy, it probably does not make sense to try the one in Korea. You can, but you will be adding to the load on a transoceanic network link and probably will not receive the best network performance, either. The New York or Maryland server may be a better choice.

The Archie client programs provide you with the most direct access to the Archie FTP file system database. Because the clients are based on the Prospero file system, they do not have access to the whatis database. The Prospero file system is software used to organize and search file references (and is partially the basis for the Archie service itself).


ARCHIE CLIENT PROGRAMS

Listed below are the most commonly used Archie client programs. There may be client programs available for other systems; however, these are the ones most widely available on anonymous FTP sites.

Filename                        Author Name and Address                      Description
c-archie-1.[1-3].tar.Z          Brendan Kehoe (brendan@cs.widener.edu)       Command-line program written in C
c-archie-1.[2,3]-for-vms.com    Brendan Kehoe (brendan@cs.widener.edu)       Command-line program written for DECVAX/VMS
archie.el                       Brendan Kehoe (brendan@cs.widener.edu)       Command-line interface written for emacs
perl-archie-3.8.tar.Z           Khun Yee Fung (clipper@csd.uwo.ca)           Command-line program written in perl
xarchie-1.[1-3].tar.Z           George Ferguson (ferguson@cs.rochester.edu)  X11(R4) client program using the Athena widget set
archie-one-liner.sh             Mark Moraes, University of Toronto           UNIX shell program
NeXTArchie.tar.Z                Scott Stark (me@superc.che.udel.edu)         NeXTStep client program
mac-archie-client-09.hqx        Chris J McNeil (cmcneil@macc2.mta.ca)        Macintosh client
Anarchie-140.sit                Peter Lewis (peter@kagi.com)                 Macintosh FTP and Archie client
archie.zip                      Brad Clemens (bkc@omnigate.clarkson.edu)     PC DOS client program
wsarch06.zip                    David Woakes (david@maxwell.demon.co.uk)     Windows WinSock client

The Archie client programs commonly available usually run on computers with UNIX operating systems. The reason for this may be that UNIX computers have traditionally been the foundation for many Internet networks and protocols; the Archie service itself is greatly dependent on UNIX systems. Clients are available, however, for NeXTStep (admittedly a flavor of UNIX), VAX VMS, DOS, and even IBM's VM/CMS (see the sidebar entitled "A Note for IBM Mainframe Users").


ARCHIE CLIENT PROGRAM ANONYMOUS FTP SITES

These sites on the Internet are good places to look for the most common Archie client programs. Most of them maintain copies of the various client programs listed in the preceding sidebar and can be accessed with anonymous FTP.

ftp://ftp.bunyip.com/pub/archie-clients
ftp://gatekeeper.dec.com/.3/net/infosys/archie/clients
ftp://cs.columbia.edu/archives/mirror2/uunet/networking/info-service/archie/clients
ftp://sunsite.unc.edu/pub/packages/infosystems/archie
ftp://ftp.mr.net/pub/Info/archie/clients
ftp://ftp.sura.net/pub/archie/clients
ftp://ftp.uu.net/networking/info-service/archie/clients
ftp://ftp.luth.se/pub/infosystems/archie/clients

Anarchie Sites:

ftp://redback.cs.uwa.edu.au//Others/PeterLewis/
ftp://amug.org/peterlewis/
ftp://nic.switch.ch/software/mac/peterlewis/


A NOTE FOR IBM MAINFRAME USERS

Even if you use an IBM mainframe, Archie may still be accessible to you. An MVS Archie client was written by Alasdair Grant at the University of Cambridge Computer Laboratory. This served as the basis for a VM/CMS client written by Arthur J. Ecock (ECKCU@CUNYVM.CUNY.EDU) at the City University of New York. The VM/CMS client requires IBM's VM/TCP version 2 or greater and the freely available RXSOCKET software package. The Archie client is distributed with version 2 of the RXSOCKET software. The package is available on the Bitnet Listserv installation at CUNYVM. You can retrieve the package by sending a mail message to listserv@cunyvm.cuny.edu with SENDME RXSOCKET PACKAGE as the first line of the message.

Using Archie

A client program may be the simplest way to use Archie, and the simplest way to use a client program may be as follows:

archie string

In this syntax, string is any search string you want to specify. By default, the search tries to exactly match the string you specify. For example (using the popular Kermit public domain file transfer protocol as our guinea pig), if you know that there is a file out there named c-kermit, the following command should return one or more sites where a file with that name is available:

archie c-kermit

Not all files are that easy, however, and often the names are longer than your search string or they have the string embedded somewhere within them. Fortunately, the Archie clients have some options you can use to be more or less specific when doing your search. For example, this command returns all occurrences of files that contain the exact substring c-kermit (capitalization is respected):

archie -c c-kermit


NOTE For the sake of this discussion, the examples reflect the use of the Archie C client, written by Brendan Kehoe. Other clients, however, support the same or similar command options. The DOS client options, for example, are quite similar to the C client options. The Windows and Macintosh clients provide fields to enter a search string and provide an interface for specifying a search type and a target Archie server. If you retrieve a particular client with anonymous FTP, be sure to retrieve any documentation or manual files that accompany that client. These files provide the exact command syntax and options.


A SAMPLE ARCHIE CLIENT QUERY

The following is an example of using an Archie command-line client program to perform a substring search of the Archie database in which case is ignored. For the sake of space, the output listing has been edited. A normal search returns a default maximum of 95 "hits" of the search string.

archie -s c-kermit

Host gatekeeper.dec.com
Location: /.2/usenet/comp.sources.unix/volume1
FILE -r--r--r-- 6994 Jun 1 1989 c-kermit.ann.Z
FILE -r--r--r-- 3208 Dec 1 1986 c-kermit.old.Z

Host uhunix2.uhcc.hawaii.edu
Location: /pub/amiga/fish/f0/ff026
FILE -rw-r--r-- 238032 Jul 9 1992 C-kermit.lha

Host osl.csc.ncsu.edu
Location: /pub/communications
FILE -rw-r--r-- 617243 Aug 31 1992 C-Kermit_5a.tar.Z

Host ftp.shsu.edu
Location: /KERMIT.DIR;1
FILE -rw-r-x-w- 2 Dec 1 1992 C-KERMIT-V5A-DOC.ZIP-LST;1
FILE -rw-r-x-w- 3 Dec 1 1992 C-KERMIT-V5A-EXE.ZIP-LST;1
FILE -rw-r-x-w- 11 Dec 1 1992 C-KERMIT-V5A-SRC.ZIP-LST;1

Host ftp.utoledo.edu
Location: /KERMIT_ROOT.DIR;1/KERMIT_BINARY.DIR;1
FILE -r—x-w- 1084 Apr 4 1993 VMS-C-KERMIT-MULTINET.EXE;1
FILE -r—x-w- 1133 Apr 4 1993 VMS-C-KERMIT-WOLLONGONG.EXE;1

Host wuarchive.wustl.edu
Location: /systems/amiga/aminet/comm/misc
FILE -rw-rw-r-- 697822 Dec 11 1992 C-Kermit-5A-188.lha

Location: /systems/mac/info-mac/Old/comm
FILE -rw-r--r-- 198461 Sep 29 1991 mac-kermit-098-63.hqx

Host ugle.unit.no
Location: /pub/kermit/os2
DIRECTORY drwxrwxr-x 1024 Jan 11 1993 c-kermit
Location: /pub/kermit/vms
DIRECTORY drwxrwxr-x 1024 Aug 16 09:46 c-kermit

The preceding sidebar shows yet another variation on this search example. The -s option specifies a search in which the string is matched anywhere within a target filename and the case of the letters is ignored. You can see by the example that the results vary quite a bit. Note the variations: c-kermit, C-kermit, C-KERMIT, VMS-C-KERMIT, and mac-kermit (yes, mac-kermit contains the substring c-kermit).
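To see why the three basic search types return such different results, it can help to mimic them on a handful of filenames. The following Python sketch is an approximation written for this discussion; the `matches` function and its mode names are assumptions, and the real matching is done by the Archie server, not by code like this.

```python
def matches(name, pattern, mode="exact"):
    """Mimic the three basic search types on a single filename."""
    if mode == "exact":      # like -e: the whole name must be identical
        return name == pattern
    if mode == "subcase":    # like -c: substring, case respected
        return pattern in name
    if mode == "sub":        # like -s: substring, case ignored
        return pattern.lower() in name.lower()
    raise ValueError(f"unknown mode: {mode}")


names = [
    "c-kermit",
    "c-kermit.ann.Z",
    "C-kermit.lha",
    "VMS-C-KERMIT-MULTINET.EXE;1",
    "mac-kermit-098-63.hqx",
]

for mode in ("exact", "subcase", "sub"):
    hits = [n for n in names if matches(n, "c-kermit", mode)]
    print(f"{mode:8} {hits}")
```

Only the first name survives the exact search, the case-sensitive substring search adds c-kermit.ann.Z and mac-kermit-098-63.hqx, and the case-insensitive search returns all five.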

A note about the output Archie returns is in order. The word Host identifies each Internet anonymous FTP host (so far, pretty simple). Under each host, you see one or more Location: lines, each of which shows the directory path to the file. In other words, once connected to the host by anonymous FTP, you can move to the directory where the file is stored with the command cd path, where path is the complete string that follows Location:. After the location is the designation FILE or DIRECTORY, depending on which is found, followed by a UNIX-style permissions listing, the size in bytes, a date, and the filename itself. This information is usually enough to do a successful anonymous FTP download to retrieve the file you want.
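Because the listing follows a regular Host / Location / FILE pattern, it is easy to pick apart with a short script. The sketch below is one possible approach, not a tool that ships with any Archie client; the `parse_listing` name is an assumption, and the parser assumes filenames contain no spaces.

```python
def parse_listing(text):
    """Turn client-style output into (host, directory, name) tuples."""
    host = location = None
    results = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Host "):
            host = line.split()[1]
        elif line.startswith("Location:"):
            location = line.split(":", 1)[1].strip()
        elif line.startswith(("FILE", "DIRECTORY")):
            # The name is the last whitespace-separated field.
            results.append((host, location, line.split()[-1]))
    return results


# Two entries copied from the sample query output.
sample = """\
Host ftp.shsu.edu
Location: /KERMIT.DIR;1
FILE -rw-r-x-w- 2 Dec 1 1992 C-KERMIT-V5A-DOC.ZIP-LST;1

Host ugle.unit.no
Location: /pub/kermit/os2
DIRECTORY drwxrwxr-x 1024 Jan 11 1993 c-kermit
"""

for host, directory, name in parse_listing(sample):
    print(f"ftp://{host}{directory}/{name}")
```

Each tuple carries everything needed for the FTP session: the host to connect to, the path to give to cd, and the name to retrieve.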

Following are more of the options available with the C Archie client program. They not only tell you how to control your Archie search, but also reveal a bit about the operation of the client itself. (If you use the client on a UNIX operating system, you can usually enter the command man archie to see these options as well as other information.)

Option         Description
-              A - by itself enables you to search for a string that begins with the - character. For example, the command archie -c - -v5 looks for occurrences of -v5 in a filename.
-c             Searches substrings while respecting the case of the letters.
-e             Performs an exact string match (the default).
-h hostname    Tells the client to query the Archie server specified by hostname.
-L             Shows a list of the Archie servers known to the client program when it was compiled, as well as the name of the client's default Archie server. For a current server list, send e-mail to archie@archie.mcgill.ca (or to archie at any Archie server) with the text servers as the body of the message.
-m hits        Specifies the maximum number of search string matches (database hits) to return (the default is 95).
-o filename    Specifies the name of a file in which to store the results of a query.
-r             Specifies that the next argument is a regular expression for specifying the search string.
-s             Searches substrings without respecting the case of the letters.
-t             Sorts the results by descending date.
-V             Prompts the server to print some status comments while performing long searches.

The options -c, -r, and -s are mutually exclusive; if more than one of these is specified, only the last one is used. Using -e with any of these three options causes the server to first check for an exact match and then perform a substring or regular expression search.
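If you call the client from a script, one way to avoid passing conflicting search options is to let a small helper choose exactly one of them. The following sketch only assembles the argument list and does not run anything. The `build_command` function and its parameter names are assumptions made for this example; the flags themselves are the ones listed above.

```python
import shlex

# One flag per search type, so -c, -r, and -s can never be combined.
SEARCH_FLAGS = {"exact": "-e", "subcase": "-c", "sub": "-s", "regex": "-r"}


def build_command(pattern, search="exact", server=None, max_hits=None, outfile=None):
    """Assemble an argument list for the C client."""
    args = ["archie"]
    if server:
        args += ["-h", server]
    if max_hits is not None:
        args += ["-m", str(max_hits)]
    if outfile:
        args += ["-o", outfile]
    # The search flag goes immediately before the search string.
    args += [SEARCH_FLAGS[search], pattern]
    return args


cmd = build_command(".*[cC]-[kK]ermit.*", search="regex",
                    server="archie.sura.net", max_hits=20)
# shlex.join quotes the pattern so a UNIX shell leaves it alone.
print(shlex.join(cmd))
```

Building a list rather than a single string also makes the result safe to hand to a process launcher without a shell, which sidesteps the quoting issues that regular expressions otherwise raise.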

Accessing an Archie Server with Telnet

If your computer supports remote terminal sessions over the Internet with a Telnet client program, you can connect to an Archie server directly. As mentioned previously, this is not always the most efficient way to perform a search, and the use of client programs is encouraged to save both network and server resources. One thing you can easily do through a direct connection, which clients do not support, is a search of the whatis database.

To connect to an Archie server, enter this command:

telnet server

In this syntax, server is one of the servers listed in the sidebar, "Archie Servers Throughout the World," earlier in this chapter. You are prompted for a username, to which you should reply archie. At this point, you receive one of two messages. The server may tell you that there are too many people using it and to try a different site. The server follows such a message by immediately closing the Telnet session. The alternative to this rejection message is a greeting by the server followed by an Archie prompt.


A SAMPLE ARCHIE TELNET SESSION

The following shows a sample Archie query done in a Telnet session to an Archie server. The server used is kept anonymous to prevent it from bearing the brunt of too many test sessions. This example shows a typical session using a whatis command, setting some search parameters, and finding a file.

telnet archie.xxx.xxx
Trying xxx.xx.x.xx ...
Connected to xxxxxxx.xxx.xxx.
Escape character is '^]'.
SunOS UNIX
login: archie
password: archie
Last login: Sun Jan 2 21:46:06
SunOS Release 4.1.2 #1: Wed Dec 16 12:10:12 EST 1992
#############################################################################

Welcome to the ARCHIE server.

Please report problems to archie-admin@foo.edu. We encourage
people to use client software to connect rather than actually logging in.
Client software is available on ftp.xxx.xxx in the /pub/archie/clients directory

If you need further instructions, type help at the archie> prompt.
#############################################################################

archie> whatis kermit

c-kermit.ann C-Kermit & USENET
ckermit The 'C' implementation of Kermit
cu-shar Allows kermit, cu, and UUCP to all share the same lines
dialout Kill getty and kermit programs
kermit Communications software package
kermit.hdb Kermit patches to enable dial to use HDB database
okstate UUCP Access to Kermit Distribution
unboo.bas Decode Kermit boo format

archie> set search sub
archie> set maxhits 5
archie> set sortby hostname
archie> prog c-kermit

# matches / % database searched: 5 / 7%
Sorting by hostname

Host ftp.cs.uni-sb.de (134.96.7.254)
Last updated 00:05 6 Jul 1993

Location: /pub/comm
FILE rw-r--r-- 412196 Jul 2 1990 C-kermit.tar.Z

Host ftp.uu.net (192.48.96.9)
Last updated 05:27 31 Jul 1993

Location: /usenet/comp.sources.unix/volume1
FILE rw-rw-r-- 3208 Nov 30 1986 c-kermit.old.Z
FILE rw-rw-r-- 6994 May 31 1989 c-kermit.ann.Z

Host imag.imag.fr (129.88.32.1)
Last updated 00:37 19 Sep 1993

Location: /a/durga/Ftp/archive/macintosh/serial-com
FILE rw-r--r-- 198461 May 20 1992 mac-kermit-098-63.hqx
Location: /ftp.old/archive/macintosh/serial-com
FILE rw-r--r-- 198461 May 20 1992 mac-kermit-098-63.hqx

archie> quit
Connection closed by foreign host.

You can enter a number of commands at the Archie prompt. The command prog string executes a database search for the particular string. The set command allows you to control how the search is accomplished. The search in the preceding sidebar used the following commands:

Command                Meaning
set search sub         Specifies a substring search without respect for case
set maxhits 5          Sets the maximum number of matches to five
set sortby hostname    Specifies that the results are sorted alphabetically by host name
prog c-kermit          Begins the actual search on the string c-kermit

The maxhits number of 5 is chosen to keep the example output brief. While doing the search, some servers keep you apprised of how many hits are found as well as the percentage of the database that is searched. In the preceding example, notice that the server had to search only about 7 percent of the database before finding 5 matches. It is also obvious that this is not a complete search, so by controlling the maxhits variable (by default 100), you can control how much of the database is searched. If you find what you are looking for early in the search, you may not need to change the maxhits value. Depending on the frequency of the string, you may have to increase the maxhits value or be more restrictive in the string for which you search (see "About Archie Regular Expressions," later in this chapter, for information).
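The relationship between maxhits and the percentage of the database searched can be illustrated with a toy search loop. This is a sketch under simple assumptions (a flat list scanned in order, a case-insensitive substring test, and a made-up database); the real server's data structures are certainly different.

```python
def search(database, pattern, maxhits=100):
    """Scan entries in order, stopping once maxhits matches are found.

    Returns the matches and the percentage of the database examined.
    """
    hits = []
    examined = 0
    for name in database:
        examined += 1
        if pattern.lower() in name.lower():
            hits.append(name)
            if len(hits) >= maxhits:
                break
    percent = 100 * examined / len(database) if database else 0.0
    return hits, percent


# Hypothetical database of 1,000 names in which every tenth one matches.
database = [
    f"c-kermit-{i}.tar.Z" if i % 10 == 0 else f"file{i}.txt"
    for i in range(1000)
]

hits, percent = search(database, "c-kermit", maxhits=5)
print(f"# matches / % database searched: {len(hits)} / {percent:.0f}%")
```

A small maxhits ends the scan early, which is quick but can miss better matches further on; a large value examines more of the database at the cost of a longer wait.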

Some of the commands and their parameters you can enter at the Archie prompt are listed in Table 24.1 (these apply to version 3.0 of the Archie server).

Command               Meaning
bye                   Closes your session with the Archie server.
exit                  Acts the same as bye.
help topic            You can type help by itself to receive a general message and to enter the help system, or you can enter it followed by a supported topic name, for example, help set.
list string           You can enter list by itself to obtain a list of all known sites in the Archie database. Enter list with a search string to match a string or regular expression to the site names.
prog string           Specifies a search string or regular expression depending on the value of the search variable (refer to Table 24.2).
set variable value    Enables you to control your search by setting various variables to arbitrary or specific values (refer to Table 24.2).
quit                  Acts the same as exit.
site string           Lists the files found at a particular archive site.
unset variable        Clears the value of the specified variable and resets it to the default value (if any).
whatis string         Searches the whatis database for the term matching string.

The commands set and prog may be the most used when you access an Archie server directly. The variables that can be set are listed in Table 24.2.

Variable             Meaning
autologout number    Sets the number of minutes before an automatic log out occurs.
mailto string        Sets an address to which output is to be mailed (rather than printed on the screen).
maxhits number       Specifies the maximum number of matches before the server stops searching.
pager                The command set pager specifies that the output pauses between pages; unset pager turns off this feature.
search value         Specifies how the database search is to occur. The possible values are sub (a case-insensitive substring match), subcase (a case-sensitive substring match), exact (an exact match), and regex (a regular expression search). Compound searches can be done by combining values using the format value1_value2: if value1 is not successful, the server attempts a search with value2 (for example, exact_sub).
sortby value         Controls how the search results are sorted. The possible values are none (an unsorted listing), filename (sorted by filename), hostname (sorted by host name), size (sorted by file size), and time (sorted by modification time). Note that any of these options can be specified with an r immediately preceding the value to receive a listing in reverse order (you can even specify rnone, although this option has no effect).
status               The command set status causes the server to report search progress; unset status turns off this feature.
term string          Enables you to describe your terminal; for example, if you use a VT-100, you can type set term vt100.
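The compound form of the search variable (for example, exact_sub) amounts to a fallback chain: try the first search type, and move on to the second only if nothing is found. The sketch below imitates that behavior in Python. The `prog` function and `MATCHERS` table are names invented for this example, and Python's re module stands in for Archie's own regular expression handling, which is not identical.

```python
import re

# One matching rule per value of the search variable.
MATCHERS = {
    "exact":   lambda name, s: name == s,
    "sub":     lambda name, s: s.lower() in name.lower(),
    "subcase": lambda name, s: s in name,
    "regex":   lambda name, s: re.search(s, name) is not None,
}


def prog(names, string, search="exact_sub"):
    """Try each search type in turn; return the first non-empty result."""
    kinds = search.split("_")
    for kind in kinds:
        hits = [n for n in names if MATCHERS[kind](n, string)]
        if hits:
            return kind, hits
    return kinds[-1], []


names = ["C-kermit.tar.Z", "c-kermit.old.Z", "mac-kermit-098-63.hqx"]
print(prog(names, "c-kermit", "exact_sub"))
```

Here no file is named exactly c-kermit, so the exact pass comes back empty and the substring pass supplies the results. Had an exact match existed, the broader search would never have run.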

Accessing Archie with E-Mail

You can reach Archie with electronic mail by sending a message to the address archie@server, where server is any of the servers listed in the sidebar, "Archie Servers Throughout the World," earlier in this chapter. The server commands should be placed in the subject and body of the e-mail message. Although most servers respond to an e-mail request, a response is not guaranteed. If you have a question about a specific server, send e-mail to the address archie-admin@server.

Archie servers have one feature that can be both helpful and annoying. If the server receives a message with an unknown command, an incorrect command, or no command, it treats the message as a help request and returns the complete help file. This is helpful if you want the help file but annoying if you just happened to misspell your command. A help request also supersedes all other commands. So if you make a mistake, all you get is help.

The server parses your subject line for a command. This means that if you need to issue only one command, you can send a mail message with the command as the subject line and no message body. Alternatively, you can put your commands in the message body and either include a valid command as the subject or leave the subject blank.
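As an illustration of how such a request might be assembled by a program, the following sketch builds the message with Python's standard email package. It only constructs and prints the message and sends nothing. The `archie_request` function and the sender address are assumptions for this example; the server name is taken from the server list earlier in this chapter.

```python
from email.message import EmailMessage


def archie_request(server, commands, sender):
    """Build (but do not send) a mail query for an Archie server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = f"archie@{server}"
    if len(commands) == 1:
        # A single command can travel in the subject line alone.
        msg["Subject"] = commands[0]
        msg.set_content("")
    else:
        # Several commands go in the body, one per line; no subject is set.
        msg.set_content("\n".join(commands) + "\n")
    return msg


msg = archie_request(
    "archie.sura.net",
    ["set search sub", "set maxhits 5", "prog c-kermit"],
    "user@example.org",  # hypothetical return address
)
print(msg)
```

Double-checking the spelling of each command before sending is worthwhile, since a single unrecognized command turns the whole reply into the help file.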

The commands supported in an e-mail message search are identical to the ones valid for a direct Telnet connection. You can even send the command help topic to find information about a particular feature without receiving the entire help file. The following sidebar shows an example of output returned from an e-mail query.


A SAMPLE ARCHIE E-MAIL SEARCH

The following listing shows what to expect if you send an e-mail message to an Archie server. The first part of the output shows the result of a whatis command. The search itself is limited to five hits, a case-insensitive substring search is used, and the output is sorted by last modification time.

>> whatis kermit
c-kermit.ann C-Kermit & USENET
ckermit The 'C' implementation of Kermit
cu-shar Allows kermit, cu, and UUCP to all share the same lines
dialout Kill getty and kermit programs
kermit Communications software package
kermit.hdb Kermit patches to enable dial to use HDB database
okstate UUCP Access to Kermit Distribution
unboo.bas Decode Kermit boo format

>> set search sub
>> set maxhits 5
>> set sortby time
>> prog c-kermit

# Search type: sub.

Host freebsd.cdrom.com (192.153.46.2)
Last updated 03:49 11 Nov 1993
Location: /pub/aminet/comm/misc
FILE -rw-r--r-- 697822 bytes 00:00 12 Dec 1992 C-Kermit-5A-188.lha

Host ftp.wustl.edu (128.252.135.4)
Last updated 09:12 22 Dec 1993
Location: /systems/amiga/aminet/comm/misc
FILE -rw-rw-r-- 47 bytes 00:00 12 Dec 1992 C-Kermit-5A-188.readme

Host freebsd.cdrom.com (192.153.46.2)
Last updated 03:49 11 Nov 1993
Location: /pub/aminet/comm/misc
FILE -rw-r--r-- 47 bytes 00:00 12 Dec 1992 C-Kermit-5A-188.readme

Host ftp.wustl.edu (128.252.135.4)
Last updated 09:12 22 Dec 1993
Location: /systems/amiga/aminet/comm/misc
FILE -rw-rw-r-- 697822 bytes 00:00 11 Dec 1992 C-Kermit-5A-188.lha
Location: /systems/mac/info-mac/Old/comm
FILE -rw-r--r-- 198461 bytes 23:00 29 Sep 1991 mac-kermit-098-63.hqx

The output from a successful e-mail query echoes the commands you specify, prefixed by two greater-than signs (>>). The rest of the output should look rather familiar by now.

Accessing Archie with Gopher

Archie searches can be performed with a Gopher client courtesy of a Gopher gateway to the Archie service. Unfortunately, success in using the Gopher gateway still depends on getting access to the very busy Archie servers. In addition, popular Gopher sites that support the Archie gateway may themselves be so busy that access is impossible. The Gopher interface does provide a sometimes useful alternative to client, e-mail, or Telnet access to Archie, especially if you can access it at low-use times.

Choosing an FTP Site and Files from an Archie Listing

By definition, Archie returns listings of files available for anonymous FTP. However, because any search may return numerous site listings for the same file, you are often faced with a decision about which site to use. Following are some guidelines for selecting a site.

About Archie Regular Expressions

If you are a UNIX guru, you probably know lots about regular expressions. The rest of us mere mortals must pick up as much about this topic as we can keep in our heads, just enough to be useful. What are regular expressions? They are simply symbolic ways to specify patterns of strings by using wildcards and other substitution characters. Archie regular expressions are a combination of UNIX regular expressions and Archie's own syntax.

You can use a regular expression as a search string in an Archie query by using the -r option with an Archie client or using the command set search regex when doing a Telnet or e-mail search. When using regular expressions with a client program (especially on a UNIX system), the string should be enclosed in double or single quotation marks.

The possible elements of an Archie regular expression are given in the following chart:

Element    Meaning
^          When used at the start of an expression, anchors the characters following to the beginning of the string. In other words, ^c-kermit finds all file occurrences beginning with the exact string c-kermit.
$          When used at the end of an expression, anchors the preceding characters to the end of the string. For example, txt$ finds all file occurrences ending with the exact string txt.
.          Represents an arbitrary character.
*          Represents zero or more repetitions of the character immediately preceding it.
\          Causes the character immediately following it to be treated as a literal part of the string. For example, \.txt searches for the actual string .txt rather than an arbitrary character followed by txt.
[]         Encloses lists or ranges of characters that can be substituted in that position. For example, a[b-d]e matches abe, ace, or ade. Single characters, lists, and ranges of characters can be included within one pair of brackets. Prefixing the contents of the brackets with ^ matches any character that is not one of those listed.

With these elements in hand, you can be quite detailed in your search specification. Note that regular expression searches are acted on as if they are case-sensitive substring searches, once the expression has been interpreted.

Here are some examples:

Search String      What It Does
^C.*\.txt$         Matches any string that begins with C (^C), is followed by any number of arbitrary characters (.*), is then followed by the occurrence of a period (\.), and ends in the string txt (txt$)
^kermit\.[Vv]5$    Returns kermit.v5 or kermit.V5
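You can try out patterns like these before sending them to a server. The sketch below uses Python's re module, whose handling of ^, $, ., *, \, and [] agrees with the elements described above for simple cases like these; treating it as a stand-in for Archie's matcher is an assumption, and the test filenames are made up for the example.

```python
import re


def is_match(pattern, name):
    """Return True if the pattern matches somewhere in the name."""
    return re.search(pattern, name) is not None


# Each pattern is paired with a few made-up filenames to try.
trials = {
    r"^C.*\.txt$": ["C-Kermit.txt", "c-kermit.txt", "C-Kermit.txt.Z", "Ctxt"],
    r"^kermit\.[Vv]5$": ["kermit.v5", "kermit.V5", "kermit.v50", "xkermit.v5"],
}

for pattern, names in trials.items():
    for name in names:
        verdict = "match" if is_match(pattern, name) else "no match"
        print(f"{pattern:18} {name:16} {verdict}")
```

For the first pattern only C-Kermit.txt matches: the lowercase c fails the ^C anchor, the .Z suffix fails the txt$ anchor, and Ctxt has no literal period. For the second, only kermit.v5 and kermit.V5 match.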

The sidebar entitled "A Sample Regular Expression Search" shows what you can specify if you are looking for occurrences of the string c-kermit associated with the word sun. The C, the K, and the S may or may not be capitalized. The search succeeds in returning some matches. (These are not program files; they appear to be archived news messages that may have the information you desire.)


A SAMPLE REGULAR EXPRESSION SEARCH

Following is a search done with an Archie client program and using the regular expression feature of the server. The search looks for the occurrence of any number of arbitrary characters (.*), followed by either case of the letter c ([cC]), followed by the character -, followed by either case of the letter k ([kK]), followed by the string ermit, followed by any number of arbitrary characters, followed by either case of the letter s ([sS]), followed by the string un, followed by any number of arbitrary characters. (Phew!)

archie -m 500 -r '.*[cC]-[kK]ermit.*[sS]un.*'
Host lth.se
Location: /pub/netnews/sys.sun/volume90/nov
FILE -r--r--r-- 1213 Nov 26 1990 C-Kermit.Sun.and.vi
Location: /pub/netnews/sys.sun/volume91/jun
FILE -r--r--r-- 336 Jun 5 1991 C-Kermit.needed.for.Sun
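Because regular expression searches behave like substring searches once the expression is interpreted, the leading and trailing .* in this pattern should not change which names are found. The following Python sketch checks that reasoning with re.search, which also looks for a match anywhere in the name. Using Python's matcher in place of Archie's is an assumption, and the last two filenames are invented for contrast.

```python
import re

FULL = r".*[cC]-[kK]ermit.*[sS]un.*"   # pattern used in the sample search
SHORT = r"[cC]-[kK]ermit.*[sS]un"      # same pattern without the outer .*


def contains(pattern, name):
    """Substring-style regular expression test."""
    return re.search(pattern, name) is not None


names = [
    "C-Kermit.Sun.and.vi",
    "C-Kermit.needed.for.Sun",
    "c-kermit.ann.Z",        # no "sun" at all
    "sun-c-kermit.tar",      # "sun" comes before "c-kermit"
]

for name in names:
    print(f"{name:26} full={contains(FULL, name)} short={contains(SHORT, name)}")
```

Both forms accept the two names returned by the sample search and reject the other two, since the pattern requires sun to appear after c-kermit.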

The Future of Archie

Bunyip is not content to rest on its laurels. Future applications of Archie may include other databases collected directly from the Internet, using the model that has proved to be so successful for organizing the anonymous FTP archives. These additions may include databases of information about Internet services, people on the Internet, and other databases and information resources.

Archie

To contact Bunyip, use the following postal address:

Bunyip Information Systems
310 St. Catherine St. West
Suite 202
Montreal, Quebec, Canada H2X 2A1
Phone: 1-514-875-8611
Fax: 1-514-875-8134

You can send e-mail to the following addresses:

info@bunyip.com            For information about Bunyip Information Systems
archie-group@bunyip.com    For inquiries about Archie
archie-admin@server        For information about the administration of an individual server
