The Electronic Superhighway: What's On It for Chemists?

Steven M. Bachrach

Department of Chemistry
Northern Illinois University
DeKalb, IL 60115

Introduction

Over the past year or two, the concept of the Internet has crossed over from computer elitists to the mass market. Articles on the "Internet", "Electronic Superhighway", "Infobahn", etc. have appeared in popular magazines, newspapers, and television news, inspiring many to connect for fear of being left behind. However, many users tend to be quite disappointed in the information content of the Internet, and have countered that what is being offered is an "electronic superhypeway". The same can be said of the chemical community's efforts on the net. In this article I examine just what information of use to chemists is currently available on the Internet and what tools are being used to access it. I will also speculate as to what the future holds for chemists on the Internet.

Brief Background

The Internet is a loose association of computer networks that span the globe. What makes the Internet so special is a set of rules (called protocols) that define how data is transferred from one computer to the next. The initial developments of the Internet were funded from government initiatives (frequently from the military). However, over the past decade, most of the protocols and technical initiatives have been developed by the private sector, frequently by the users of the networks themselves, acting in a spirit of community cooperation, to develop a means for free and easy distribution of information. The major users of the Internet are now academic institutions, where Internet access is becoming universal -- students expect to have an Internet address when they arrive on campus.

There is still an open, unstructured feeling on the network, an attitude of non-conformity that leads to users "surfing" the net, looking for interesting tidbits. This lack of structure may be forbidding to scientists who are accustomed to well-defined order in their information delivery systems: the library catalogs its holdings according to, for example, the Library of Congress system; handbooks collect and collate data and present them in ordered tables; journals are bound together in sequence and have tables of contents and author indices; journal articles are abstracted, indexed, and made searchable, etc. The Internet has fostered an environment that appears to lack any order, any collective indices of where to go to find certain information, any rhyme or reason.

There is a certain charm, perhaps naïveté, to this apparent disorder and this looseness does foster some amazing creativity which I hope to convey here. Further, there have been a few efforts to bring some order to the net, specifically the Gopher and World-Wide Web systems, that offer an exciting avenue for information distribution.

In this paper, I will first briefly describe a few of the more useful Internet tools. Next, I will describe some of the more interesting and novel resources or applications on the Internet. Then, I will describe some of the chemistry resources on the Internet. Finally, I will conclude with a comparison of what chemists have done on the net with respect to the rest of the "virtual" community and where we can expect to move in the future.

Internet Tools

The beauty of the Internet is the ability to easily connect to remote computers around the world and exchange data at extremely rapid rates. This has led to the development of a number of tools to meet a variety of needs. Examples of these tools, which I will discuss below, include electronic mail for exchanging messages, and Gopher and the World-Wide Web (WWW) for navigating the Internet.

Electronic Mail

Electronic mail (email) is probably the most used Internet tool. The digital analogue of the postal system, electronic mail delivery allows users to send messages, essentially instantaneously, across the globe. There are many advantages to email. Messages can be sent at any time of day, avoiding problems with missed phone calls, differing time zones, and annoying phone mail systems. Recipients can read their mail at their leisure. Messages tend to be informal and brief, so that turn-around time for replies is often remarkably quick. Finally, since most Internet users do not see any financial charges associated with their access, they are not limited by cost considerations. (Keep in mind that the Internet is not free; there are charges for access, line rentals, computer purchases, network upkeep, software, etc.)

There are many programs for handling email. I will not discuss any in particular, but rather just highlight some general features. Good mail packages allow you to send and receive messages with the same tool. Simple techniques are available to respond to messages, include external files in messages, send copies to a list of people, and forward mail to other locations.

Groups of people with similar interests have formed discussion groups, which operate with a central server that acts as a receiver of all messages and then redirects them to all members of the group (sometimes called a discussion list). The Usenet is a collection of over 5000 such discussion lists. Access to the Usenet is through a tool called news, which allows a user to read and contribute messages to any of the Usenet groups.

Gopher

In an effort to bring some order to the Internet, a group of software developers at the University of Minnesota developed the Gopher system. Gopher is a client/server information system. The server collects information and provides it to clients through a menu-based list. Clients simply select an option on this menu and the server delivers the documents.
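The menu exchange described above can be sketched in a few lines of Python. This is only an illustration, assuming the line format given in the Gopher protocol specification (RFC 1436): one item-type character, followed by tab-separated display string, selector, host, and port, with a lone period ending the menu. The function names, the sample menu, and the example.org hosts are hypothetical and do not come from any real server.

```python
import socket
from typing import List, NamedTuple


class MenuItem(NamedTuple):
    item_type: str   # "0" = text file, "1" = menu, "8" = telnet session, ...
    display: str     # text shown to the user
    selector: str    # string the client sends back to request this item
    host: str
    port: int


def parse_gopher_menu(raw: str) -> List[MenuItem]:
    """Turn the text of a Gopher menu into a list of MenuItem records."""
    items = []
    for line in raw.split("\r\n"):
        if line == ".":          # a lone period marks the end of the menu
            break
        if not line:
            continue
        fields = line[1:].split("\t")
        if len(fields) < 4:      # skip malformed lines
            continue
        display, selector, host, port = fields[:4]
        items.append(MenuItem(line[0], display, selector, host, int(port)))
    return items


def fetch_menu(host: str, port: int = 70, selector: str = "") -> str:
    """Request a menu: send the selector plus CRLF, read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(selector.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("latin-1")


# Hypothetical menu text, as a server might return it.
SAMPLE_MENU = (
    "0About this server\t/about.txt\tgopher.example.org\t70\r\n"
    "1Chemistry resources\t/chem\tgopher.example.org\t70\r\n"
    "8Library catalog (telnet)\t\tlibrary.example.org\t23\r\n"
    ".\r\n"
)

if __name__ == "__main__":
    for item in parse_gopher_menu(SAMPLE_MENU):
        print(item.item_type, item.display, "->", item.host, item.port)
```

Selecting a menu entry amounts to calling fetch_menu again with that entry's host, port, and selector, which is how one menu can lead the user to a different server.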

The significant development of the Gopher system is that the information delivered is not restricted to text. The latest Gopher servers can deliver graphics and audio documents. Most importantly, Gopher can contain links to other Gopher servers (which may contain other information of interest), telnet sessions (to connect to remote computers), and WAIS sessions. WAIS (Wide Area Information Servers) is a search system that indexes every word in a document, freeing the WAIS owner from selecting keywords and allowing users to search on any term they desire.
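The idea of indexing every word, rather than a chosen set of keywords, can be illustrated with a small inverted index. The sketch below is not how WAIS itself is implemented; it is a minimal Python illustration of full-text indexing, and the document titles and function names are invented for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words; every word becomes searchable."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical documents, for illustration only.
DOCS = {
    "doc1": "Cold fusion experiments in heavy water",
    "doc2": "Fullerene synthesis and cold trapping",
    "doc3": "Basis sets for ab initio calculations",
}

if __name__ == "__main__":
    index = build_index(DOCS)
    print(sorted(search(index, "cold")))
    print(sorted(search(index, "cold fusion")))
```

Because no keyword list is curated in advance, a query on any word that appears in a document will find it, at the cost of a larger index.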

The gopher software is available for free to academic institutions. For commercial sites, there is a moderate charge for the server software. Gopher can be obtained by anonymous ftp to boombox.micro.umn.edu in the pub/gopher directory.
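For readers who prefer to script such a retrieval, an anonymous ftp session can be sketched with Python's standard ftplib module. This is a minimal sketch, not a tested recipe: the host and directory are those quoted above and may no longer be reachable, and the function name and the ftp_factory parameter (included only so the logic can be exercised without a network) are assumptions of this example.

```python
from ftplib import FTP


def list_anonymous_ftp(host, directory, ftp_factory=FTP):
    """Log in anonymously and return the file names found in a directory."""
    with ftp_factory(host) as ftp:
        ftp.login()          # no arguments means the user "anonymous"
        ftp.cwd(directory)   # move to the directory of interest
        return ftp.nlst()    # list of names in that directory


if __name__ == "__main__":
    # Host and directory as quoted in the text; availability is not guaranteed.
    for name in list_anonymous_ftp("boombox.micro.umn.edu", "pub/gopher"):
        print(name)
```

Retrieving a file rather than listing a directory would use the same session with ftplib's retrbinary method.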

World-Wide Web

Gopher has been amazingly popular but there are limits to its use. Programmers at CERN have developed the World-Wide Web (frequently called WWW or Web) as an alternative to gopher. The Web is an information server that can handle all the operations that gopher provides (including links to gopher servers) along with a few significant new features. Principally, the Web is not menu-driven but rather works on a HyperText delivery system.

HyperText is a text-based linked collection of documents. It is perhaps best described with a hypothetical example. Suppose you are interested in Winston Churchill. Using traditional information services, you might start by reading the encyclopedia entry on Great Britain, come across a list of Prime Ministers, see Churchill's name, then pick up the 'C' volume and read the Churchill entry, which might include a photograph of him. In a HyperText environment, Churchill's name in the Great Britain entry would be highlighted, indicating that selection of this item (usually by a mouse click on the item) will bring further information, in the form of a new document. Selection of Churchill's name brings the Churchill document, which includes an embedded picture of him. As you read the article, you are intrigued that he was a painter and select the word painter, which is a HyperText link to a new document: a graphic of one of Churchill's own paintings. Returning to the Churchill page (by a simple mouse click), you note that he was a great orator, and selecting this word brings an audio clip of one of his famous speeches. HyperText links allow the user to follow information along any path they desire, unrestricted by a menu.
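The Churchill example can be modeled as a small graph of documents in which each highlighted word points to another document. The Python sketch below is purely illustrative; the document names, their contents, and the follow function are invented here and do not represent any actual WWW software.

```python
# Each document has some content and a set of links: highlighted word -> target.
DOCUMENTS = {
    "great_britain": {
        "content": "Entry on Great Britain, with a list of Prime Ministers.",
        "links": {"Churchill": "churchill"},
    },
    "churchill": {
        "content": "Entry on Winston Churchill, with an embedded photograph.",
        "links": {"painter": "painting", "orator": "speech"},
    },
    "painting": {"content": "[graphic] One of Churchill's paintings.", "links": {}},
    "speech": {"content": "[audio] One of Churchill's speeches.", "links": {}},
}


def follow(start, selections):
    """Return the documents visited when the reader selects each word in turn."""
    current = start
    visited = [current]
    for word in selections:
        current = DOCUMENTS[current]["links"][word]   # KeyError if no such link
        visited.append(current)
    return visited


if __name__ == "__main__":
    print(follow("great_britain", ["Churchill", "painter"]))
    print(follow("great_britain", ["Churchill", "orator"]))
```

The reader, not the author of a menu, chooses which branch to take at each document, which is the essential difference from the Gopher model.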

The WWW will transmit text with embedded graphics, separate graphics images, audio, and video. HyperText links can be to other HyperText documents residing on any server, gopher servers, WAIS, or telnet sessions. Like gopher, WWW is a client/server system. There are a number of different versions of the client and server software. The most popular client is NCSA Mosaic (available for Macintosh, Windows, and UNIX computers). NCSA also has developed a simple server called httpd. Both can be obtained by anonymous ftp to ftp.ncsa.uiuc.edu.

Internet Resources

The chemistry community is not well represented on the Internet. The number of resources that contain interesting and useful information is relatively small. Before I present some of the more interesting chemistry resources, I will present a small sampling of novel and exciting resources on the Internet as examples of what is on the net, in the hope of inspiring the chemical community to add their services to the Internet.

The WWW is the fastest growing segment of the Internet. In fact, in terms of bytes transmitted, the WWW is now the single biggest component of all Internet traffic. I will therefore restrict my discussion of non-chemistry resources to WWW sites.

As a brief aside, WWW information is located using a URL (Uniform Resource Locator). I will provide the URL for every site I discuss below. These are actually active links in this document, so that you may go off and explore these sites as you wish.
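A URL packs the access method, the host, an optional port, and the location of the item on that host into one string. The Python sketch below takes a few URLs apart using the standard urllib.parse module. The first URL is a made-up example; the gopher and ftp URLs are assembled here from host names and a port number mentioned elsewhere in this article, and that assembled form is an assumption of this example rather than an address given by the sites themselves.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Break a URL into the pieces a client needs to retrieve the item."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,    # access method: http, gopher, ftp, ...
        "host": parts.hostname,    # machine that holds the item
        "port": parts.port,        # None when the default port is implied
        "path": parts.path,        # location of the item on that machine
    }


EXAMPLES = [
    "http://www.example.edu/chemistry/index.html",   # hypothetical WWW page
    "gopher://infomeister.osc.edu:73/",              # assembled from host and port in the text
    "ftp://ftp.ncsa.uiuc.edu/",                      # assembled from host in the text
]

if __name__ == "__main__":
    for url in EXAMPLES:
        print(describe_url(url))
```

The same notation therefore covers WWW, gopher, and ftp resources, which is what allows a single client to reach all three.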

Examples of Non-Chemistry Sites

Corporate Sites

Corporate use of the Internet has grown substantially in the past year. In fact, there are over 200 commercial WWW sites, a list of which is available. As might be expected, most of these commercial sites are computer or software companies and I have listed a few of these below. However, there are some very interesting WWW sites representing other commercial interests.

O'Reilly and Associates, a publishing firm, operates the Global Network Navigator (GNN) which offers, among other things, commercial advertising opportunities on the Internet.

The Stanford Shopping Mall, located in Palo Alto CA, has a web site that describes all the stores in the mall and offers directions to the mall.

One of the most novel uses of the web is by the University of California at Irvine Bookstore. This web site has a searchable list of all books and recordings, which can be purchased by email. They are advertising a recently published Ansel Adams photographic essay of the University of California. These photographs have been digitized and can be downloaded for local viewing before purchasing notecards, posters, or book versions. The UC-Irvine bookstore has simply developed the electronic home catalog of the future.

While not yet operational, Bank of America has posted a prototype of electronic banking in the truly electronic age, when financial transfers will occur over a global network.

Computer Companies

Sports Sites

One of the most successful WWW sites this past year was the server for the 1994 Winter Olympics at Lillehammer, Norway (Americas server; European server). When originally announced, this server logged over 100,000 accesses in the first day! Over the course of the Winter Olympics, this server presented updated results of all competitions, medal rankings, and color photographs of many of the events.

Following up on the huge success of the Winter Olympics web site, Sun Microsystems, in cooperation with other organizations, brought the 1994 World Cup Web site (Americas server; European server; Asia) to the Internet. Up-to-date results, standings, and photos were supplied at these locations.

Academic/Museum Sites

Academic institutions are far and away the major users of the Internet and many have established WWW sites. I will list only a few here which offer some unusual features.

The National Institute for Nuclear Physics in Naples and its Department of Physics have installed a museum exhibit of early physics instrumentation. Making use of text and high-resolution graphics, the user can learn of the history and development of modern experimental physics.

Other museum WWW sites include the Dead Sea Scrolls, the University of California-Berkeley Museum of Paleontology, a Library of Congress exhibit on Columbus' voyage, and the Louvre, which won a Best of the Web contest this year.

Chemistry Resources on the Internet

Many disciplines rely heavily upon the Internet. For example, the physics group at Los Alamos National Laboratory operates a preprint archive that contains thousands of papers and is used by physicists world-wide on a daily basis. A recent strike by the archive organizers (in an attempt to obtain further financial support of the archive) led to a sustained international uproar that bordered on an international political crisis! In contrast, the use of the Internet by chemists is minimal.

I will not attempt to give a detailed or comprehensive listing of chemistry resources on the Internet. A few such lists are available; the best is by Gary Wiggins of the Indiana University Chemistry Library. Instead, I will highlight a few of the more prominent and useful examples.

Discussion Groups

The major Usenet group concerning chemistry is sci.chem, which serves the entire chemical community. One subgroup, sci.chem.organomet, serves the organometallic chemistry community.

A number of chemistry-related discussion lists (groups) have grown quite large and have become valuable tools for their participants. Of these, two discussion groups, the Computational Chemistry List (CCL) and Chemical Information Sources Discussion list (CHMINF-L), have made a significant impact on their respective fields.

The CCL serves the computational chemistry community, with computational chemistry defined in its broadest sense. There are over 1700 subscribers from over 40 countries on this list. Discussions range from the value and use of various programs and announcements of new resources to detailed discussions of results. The list is maintained by Jan Labanowski. A searchable archive and a database of software are available by anonymous ftp, gopher (infomeister.osc.edu port 73) or through WWW. To subscribe to the CCL, send a message to chemistry-request@osc.edu. You will receive information on how to participate in discussions and the rules of the list. A survey of the CCL subscribers was recently taken using a WWW interface. Results of this survey will be posted in the near future.

The CHMINF-L serves the chemical information community; its postings cover how to use various information resources, how to find information, and the management of library and information resources. The list is maintained by Gary Wiggins. To subscribe to CHMINF-L, send the message subscribe chminf-l firstname lastname to listserv@iubvm.ucs.indiana.edu.
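A subscription request of this kind is an ordinary mail message whose body carries the command. The Python sketch below composes such a message with the standard email package. The sender address and the name are placeholders, the function name is invented for this example, and actually sending the message (for instance with smtplib through a local mail server) is left out, since that depends on the reader's own mail setup.

```python
from email.message import EmailMessage

LISTSERV_ADDRESS = "listserv@iubvm.ucs.indiana.edu"   # address given in the text


def build_subscribe_message(sender, first_name, last_name):
    """Compose the one-line subscription request for CHMINF-L."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = LISTSERV_ADDRESS
    # The command goes in the body of the message, not in the subject line.
    msg.set_content(f"subscribe chminf-l {first_name} {last_name}")
    return msg


if __name__ == "__main__":
    # Placeholder sender and name, for illustration only.
    message = build_subscribe_message("chemist@example.edu", "Marie", "Curie")
    print(message)
```

The list server reads the command automatically, so the wording of the body line should be kept exactly as specified by the list.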

On-line Databases

Only a few on-line databases for chemists exist. Perhaps the best known is the Fullerene database operated at the University of Arizona. This database catalogues all papers and accounts related to fullerene chemistry. Access to this database is via a telnet session to sabio.arizona.edu. After entering the system, select databases and remote libraries followed by Buckballs database. Searching by author, article title, journal title, and keyword (in the title only) is available. A related database, the Fullerene Contents Alert, is operated by Elsevier. This database provides information on fullerene chemistry to be published by Elsevier.

A database of somewhat more dubious distinction (not for the quality of the database, but rather the reputation of its contents) is the Cold Fusion Bibliography, which is a WAIS database. A WAIS database uses no keywords, so you are free to search on any term you desire. This database contains articles related to all aspects of cold fusion.

Chemists at the University of Missouri-St. Louis have a database on chemistry textbooks in print. It is simply a list of textbooks broken down into categories relating to the various chemical disciplines.

The Protein Data Bank is available on-line. This database is operated by Brookhaven National Laboratory and holds crystal structure data that can be searched.

Electronic Journal Access

There are currently no exclusively electronic journals in the field of chemistry. Starting January 1995, Megalon S.A. will begin publishing the Journal of Molecular Modeling for on-line access. Subscribers will have gopher and ftp access to the journal articles immediately upon acceptance. At the end of the year a hard-copy and CD-ROM version will be published. Information on this journal can be obtained by sending a message to jmolmod@organik.uni-erlangen.de.

A Chemical Physics Preprint Archive is being maintained at Brown University. Borrowing heavily on technology developed at the Physics Preprint Archive, the Chemical Physics Preprint Archive houses preprints that are available before acceptance in a journal. Unfortunately, not many people have made use of this facility and the number of preprints is relatively small.

To try to decrease the time between acceptance of a manuscript and the date of actual publication, the Journal of Chemical Physics has initiated JCP Express which is a gopher server that makes accepted papers available on-line before publication. Manuscripts prepared in the appropriate format are placed on the gopher months before publication date.

The tables of contents and abstracts of the Journal of Computational Chemistry, International Journal of Quantum Chemistry, and Theoretica Chimica Acta (no abstracts) are available at the CCL gopher server.

The American Chemical Society has placed the supplementary material for Journal of the American Chemical Society and Chemical Reviews on their gopher server for downloading. This service is offered for free for the remainder of 1994. Journal submission guidelines for authors of all ACS journals can be obtained on-line from the ACS gopher. Guidelines for a number of other journals are available from the NIU Web server.

Gopher and WWW Sites

There are many chemistry-related gopher and WWW sites. Most of these are run by a chemistry department at a university and principally offer information about that department, such as faculty research interests, graduate program descriptions, and course offerings. Instead of offering a comprehensive list of these servers (a good list is available), I will note here those servers offering unique and widely useful information.

The NIH Molecular Modeling Home Page is a central server of information pertaining to molecular modeling. It includes a nice form-based front-end to the Protein Data Bank and The NIH Guide to Molecular Modeling, an electronic textbook on molecular modeling.

A database of various representations of a wide assortment of biochemical compounds has been prepared as part of the Klotho project.

An essential component of ab initio quantum chemistry is the selection of the basis set. A form-based query system for obtaining basis sets formatted for many different quantum mechanics programs is operational at the Pacific Northwest Laboratory.

A project to serve chemistry talks and presentations to the Internet community at large is ongoing at Imperial College. A very nice paper describing this project and other Internet applications has recently been accepted by Journal of the Chemical Society, Chemical Communications and is also available on-line.

The Quantum Chemistry Program Exchange (QCPE) operates a gopher site that lists the program descriptions for their holdings. Ordering information is available and programs can now be delivered by anonymous ftp.

The archives of the first on-line chemistry conference (on chemical education) are available for perusal. This conference, held in August 1993, contained 15 papers that were available by anonymous ftp. Discussions among the participants were held via a discussion group; these messages have been archived.

In November 1994, the First Electronic Computational Chemistry Conference will be held, using the WWW for distribution and viewing of the submitted papers. Discussions of these papers will occur through a discussion list. To register for the conference, which also subscribes you to the discussion list, send the message subscribe ECCC firstname lastname to listproc@hackberry.chem.niu.edu.

Finally, I am operating an experimental gopher and WWW site at NIU, funded by a grant from the Camille and Henry Dreyfus Foundation, to explore the use of the Internet for distributing chemical information. Our server supplies conference listings, including the only on-line source of the Gordon Research Conference schedules, a chemists' email address directory, journal submission guidelines, stock prices, and a quantum chemistry acronyms database. We recently initiated an on-line academic employment clearinghouse. An example of the poor penetration of the net into the chemistry community is that as of July 1994 only 160 people had placed their addresses into the directory.

The Future

While chemists have been slow to take advantage of the Internet, I am optimistic that growth on the net will be explosive in the future. As more of us become net-aware, new uses of the Internet will be developed and become a standard means for chemical communication. For example, Elsevier Scientific is developing a gopher site for advertising its wares and is looking into electronic publication. The Gordon Research Conferences will be operating their own gopher/web site in the near future and are looking into electronic registration. A prototype of how ACS abstract submission could be done has appeared on the web. This is a form-based entry of the abstract that produces PostScript output that could be directed to a printer and to automatic registration.

The real future of the Internet as a research tool for chemists will come with the advent of an electronic journal. While there are many problems with electronic publication that still need to be worked out (such as copyright protection, how to maintain the records over long times, particularly with the ever changing technology, subscription bases, etc.), I believe that a chemical electronic journal is likely to appear before the turn of this century. Conventional printed reference materials, like Chemical Abstracts (now available for access via the Internet), Beilstein, CRC Handbooks, etc., will become available on-line.

In fact, the better way to imagine the future of chemical information is to think in new ways. For example, the development of the web allows an individual to "publish" their work privately, for general access by the community. Perhaps individual publication to the net will become the norm. The advantages are many: quick access to the most recent work, ease of updating materials as new results are obtained, universal access to materials, etc. This does potentially lead to an even more fragmented collection of chemical knowledge, but the future clearly holds remarkable new means for communication.

Acknowledgment is made to the Camille and Henry Dreyfus Foundation for their generous support of this research.