The Cole Papers

Sam knows HTML: AskSam, a free-form text database with a fast indexing engine, can also read and write HTML, the markup language of the Web.

The decision after deciding to go on the Web: how to do it

You've decided to take the big leap.
You're going to do it; you're going digital. You're going to open up your own on-ramp to the InfoBahn; you're going to be an information provider on the Internet.

Yes! You've made the big decision, and you have only one question:

Now what?

You're experienced -- you know what to look for in a word processor; you know how to tell a good output device from one that is inadequate for your needs.

But how do you decide on an Internet server OS? Which flavor of UNIX is best? If you get what you pay for, is Linux worthless because it's free? Can you use a Mac? Can you get started on-line without bankrupting the newsroom?

We can't cover all Internet tools in a single article; in fact, we could fill this issue with a brief look at the tools in a single category, such as web servers. So we'll concentrate on the tools you can use to get started the way your publisher would want -- without spending a ton of money.

If you just want someone to do it all for you, you can buy a prepackaged server setup (see The Cole Papers, August 1995). For now, however, we'll operate under the premise that you want to do this yourself.

First, choose an operating system. Tip No. 1: Unless you or someone you love is already a UNIX maven, avoid it.

Now, before we get the usual 47,132 letters from enraged Linux types, remember that having a Linux system is a lot like owning a boat: While it's the center of your life, most of your friends just don't understand how you can get so worked up over caulking.

Linux is a very fine Posix-compliant version of UNIX that can be downloaded free from the Internet -- but that's also the support plan.

If you're building a small system for demonstration purposes, think about a Mac. This is especially true if some of your designers are just dying for a new PowerMac. Get them a new machine, appropriate the old one -- be a big hero.

Here's the scoop on server performance: The bottleneck, as with most networks, is in the I/O, not the processor. For web servers, for example, that means the number of pages that can be held in RAM -- rather than accessed from a disk -- is much more important than the speed of the processor.

A web server on a IIci stuffed with RAM will generally kick the butt of one on a PowerMac with, say, eight megabytes.
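
To see why RAM matters more than clock speed, here is a minimal sketch in Python of a page cache. It is an illustration only, not how any server named in this article works; the class name PageCache and the sample page are invented for the example. The first request for a page goes to disk, and every later request is answered from memory.

```python
import os
import tempfile


class PageCache:
    """Hold page contents in memory so repeat requests skip the disk."""

    def __init__(self):
        self._pages = {}      # path -> bytes already loaded into RAM
        self.disk_reads = 0   # how many times we had to touch the disk

    def get(self, path):
        # Only the first request for a given path reads the file.
        if path not in self._pages:
            with open(path, "rb") as f:
                self._pages[path] = f.read()
            self.disk_reads += 1
        return self._pages[path]


def demo():
    """Request the same throwaway page three times; count disk reads."""
    with tempfile.TemporaryDirectory() as root:
        page = os.path.join(root, "index.html")
        with open(page, "wb") as f:
            f.write(b"<html><body>Front page</body></html>")
        cache = PageCache()
        for _ in range(3):
            cache.get(page)
        return cache.disk_reads
```

Calling demo() serves the page three times but reads the disk only once. The more pages that fit in memory, the more requests take the fast path, which is the point of stuffing the machine with RAM.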

Here's one word about using DOS for a server: Don't. As for Windows 3.X -- you must be kidding. Windows 95? There are products out there for Win95, but that operating system is too new to be judged fairly.

If you want to set up an industrial-strength server, you should take a look at Win95's big brother -- Windows NT. Although NT shares the Windows name with Microsoft's other operating systems, it is in fact a completely separate 32-bit operating system. Short of UNIX, it is the closest thing to a mainframe operating system you can get for a microcomputer.

There's an NT explosion on the Internet right now, and for good reason. NT is fast -- it has superior multitasking and multithreading abilities, allowing it to handle multiple connections with aplomb. It's secure -- it meets the government's C2 security specifications. It's a snap to set up, maintain and use -- compared to a UNIX system.

As always, the surge in NT servers has led to a surge in NT products as suppliers rush to meet the demand. Consequently, some of the most serious server setups are now available for NT, such as Netscape Communications' Commerce Server, complete with transaction security.

Another advantage of NT is that it is scalable. You can start your site on a '486 (as long as it is loaded with RAM) and move up to a Pentium later. If after a while you need still more power, you can move up to a Pentium Pro, a multiprocessor machine or jump to RISC systems such as the DEC Alpha or the PowerPC -- all while staying with NT.

Resources
As soon as you've decided on an NT system, pick up the phone and order a copy of the Windows NT Resource Kit from Microsoft Press.

OK, OK, we know how you feel; Bill Gates has quite enough of your money. Still, do this for yourself, not for the Gates Guy.

The four books in the set come in a slipcase. The whole deal is about the size (and heft) of a cinder block. And while the books are certainly good -- packed with more information on NT than seems absolutely necessary -- the thing you really want is the CD-ROM packed in the kit.

This contains just about everything you need to build your own Internet server, including: the Emwac World-Wide Web server; the Emwac Gopher server; the Emwac Wide Area Information Servers (WAIS) server; a WAIS tool, and a Domain Name System (DNS) server.

About the only other stuff you need to get going is a T-1 line and a mail server.

Another must-buy book is the Internet Server Construction Kit for Windows by Greg Bean, published by John Wiley & Sons Inc. Again, the attraction is the CD-ROM, although the book is well written and filled with good advice.

The disk is filled with shareware and freeware Internet server products, including web servers, sample web pages and configuration files, sample common gateway interface (CGI) files and instructions, and Finger servers -- in fact, way too much good stuff to do it justice here.

The products on these CDs are probably not the best choice if you are setting up a full-blown commercial service immediately. But they are powerful enough to do almost anything you want, and for a minimal investment you can get everything you need to get started and learn what you want to do before you pony up the big bucks.

Commercial servers
When you are ready to spend, there are a couple of products you should consider, though neither costs what you could call really big bucks.

The first is Netscape Commerce Server. This is pretty much the same product as the industry-leading server Netscape of Mountain View, Calif., has been turning out for UNIX, complete with the Secure Sockets Layer (SSL) for encrypted transaction processing, which means it is set up to handle credit card transactions.

The Commerce Server for NT runs $2995. For $795, you can buy a version that comes without the secure transaction processing.

A terrific product that newspapers should take a close look at is O'Reilly & Associates' WebSite. WebSite, which has a street price between $300 and $400, combines a pretty good World-Wide Web server with a set of strong web configuration and design tools, including a full-text indexing engine and a CGI search program for the index.

Just load it up and, voila! Instant library. Better yet, WebSite allows you to set up an autoindex directory -- any file placed in the directory automatically is added to the index. Combine that with the Windows NT Command Scheduler from the NT Resource Kit and you have a library that will pretty much run itself.
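
The autoindex idea can be sketched in a few lines of Python. This is not WebSite's implementation, just a rough illustration under simple assumptions: plain-text files, a word-to-file-names index kept in memory, and a hypothetical AutoIndex class whose refresh() method a scheduler would call every so often.

```python
import os
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9]+")


class AutoIndex:
    """Index every file dropped into one directory (illustrative only)."""

    def __init__(self, directory):
        self.directory = directory
        self.index = defaultdict(set)   # word -> names of files containing it
        self.seen = set()               # files indexed on earlier passes

    def refresh(self):
        """Index files not seen before; return the names just added."""
        added = []
        for name in sorted(os.listdir(self.directory)):
            path = os.path.join(self.directory, name)
            if name in self.seen or not os.path.isfile(path):
                continue
            with open(path, encoding="utf-8", errors="replace") as f:
                for word in WORD.findall(f.read().lower()):
                    self.index[word].add(name)
            self.seen.add(name)
            added.append(name)
        return added

    def search(self, word):
        """Return the sorted names of files containing the word."""
        return sorted(self.index.get(word.lower(), ()))
```

Each scheduled call to refresh() picks up only the files that arrived since the last pass, which is what lets the library tend itself. A real engine would also have to notice changed and deleted files, which this sketch ignores.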

There is one drawback to all this wonderfulness, however: You'll not mistake WebSite's index engine for one of those engines idling under a Formula One hood. During testing, we dumped 60 megabytes of text files -- about 20,000 files -- into the web pages directory and let WebSite have at it.

Two days later, we decided to call O'Reilly's Sebastopol, Calif., offices, and were told that, never fear, WebSite should be finished in "two or three more days -- probably less than a week."

It's good, but it's no speed demon.

There are too many web servers on the market to cover here. These two products offer something other than the run-of-the-mill.

Web authoring tools
The World-Wide Web is based on HyperText Markup Language (HTML), an application of the Standard Generalized Markup Language (SGML) whose permitted tags and structure are spelled out in a Document Type Definition (DTD).

Old newspaper techno-types usually are delighted when they get their first look at HTML, because it resembles the coding we all spent so many years cranking out on Atex or SII or TMS systems. As well it should: It's a lot like those formatting codes (though because it's based on SGML, HTML defines structure, not format).
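
To make the structure-versus-format point concrete, here is a small Python sketch using the standard library's html.parser module. The sample page is invented for illustration. The tags say what each piece of text is (a title, a headline, a paragraph, an emphasized word) and nothing about typeface or point size; the code simply prints the nesting.

```python
from html.parser import HTMLParser

# A made-up story page: the tags name the parts, not their appearance.
PAGE = """<html><head><title>City Council Votes</title></head>
<body><h1>Council approves budget</h1>
<p>The council voted <em>unanimously</em> Tuesday.</p>
</body></html>"""


class Outline(HTMLParser):
    """Record each opening tag, indented by how deeply it is nested."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


def outline(markup):
    parser = Outline()
    parser.feed(markup)
    parser.close()
    return parser.lines


if __name__ == "__main__":
    print("\n".join(outline(PAGE)))
```

The result is an outline of the document: html contains head and body, body contains a headline and a paragraph, and the paragraph contains one emphasized word. How any of that looks on screen is left to the browser.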

Fortunately, though, there are some better tools than we had to put up with in the old days. Many allow the user to work in WYSIWYG mode -- What You See Is What You Get.

The Internet is stuffed with web authoring tools. There are freeware web authors, shareware web authors (try before you buy) and commercial web authors.

One convenient -- and free -- author is available from Microsoft. The Internet Assistant is a Microsoft Word for Windows add-on that installs a rudimentary web browser into Word, along with some decent HTML tools. Best of all, it allows HTML tags to be applied using a mouse and the familiar Word styles paradigm -- simply swipe the mouse over the text, then click on a style.

The Internet Assistant doesn't have the horsepower to handle the most complex web pages -- it doesn't support the most advanced Netscape tags, for example -- but it does a decent job on standard pages, and, as always, there are a lot more of those than the fancy ones.

Best of all, the fact that the Internet Assistant becomes a seamless part of Word means that you can get your people up and running with minimal training.

The Internet Assistant can be obtained from Microsoft or downloaded from CompuServe, the Microsoft Network or Microsoft's World-Wide Web site: http://www.microsoft.com. (There are similar products available for other word processors -- Novell, for example, offers one for WordPerfect.)

Another unusual web author is AskSam, the free-form text database from AskSam Systems of Perry, Fla.

AskSam has always specialized in the type of hypertext links that form the basis of the Web, so it made sense to the company to add the ability to read and write web pages. Because AskSam was designed from the ground up to assemble and link data, it allows you to do things other web tools don't, such as doing complex Boolean searches with multiple criteria or assembling data using criteria outside the links.
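
For readers who haven't met Boolean searching, here is a rough Python sketch of the idea. It does not use AskSam's query language or reflect its internals; the three sample documents and the helper names are invented. Each word maps to the set of documents containing it, and AND, OR and NOT become set intersection, union and difference.

```python
import re

# Invented sample library: file name -> text.
DOCS = {
    "budget.txt": "Council approves city budget after long debate",
    "schools.txt": "School board debates budget cuts",
    "parks.txt": "City opens new park downtown",
}


def build_index(docs):
    """Map each lowercase word to the set of documents containing it."""
    index = {}
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index.setdefault(word, set()).add(name)
    return index


def having(index, word):
    """Documents that contain the word (empty set if none do)."""
    return index.get(word.lower(), set())


INDEX = build_index(DOCS)

# budget AND (city OR school) AND NOT cuts
HITS = (
    having(INDEX, "budget")
    & (having(INDEX, "city") | having(INDEX, "school"))
) - having(INDEX, "cuts")
```

With this sample data the query leaves only budget.txt: two stories mention the budget, but the school story is ruled out by the NOT term.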

It also has full-text indexing, and, unlike WebSite, it has a screaming engine. In a timed test, AskSam 3.0 for Windows indexed 1.5 megabytes of raw text in just under 45 seconds. AskSam 3.0 for Windows -- which shipped in November -- has a street price of about $150.
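
A back-of-the-envelope comparison, in Python, of the two indexing tests reported in this article. The figures are the ones quoted above; everything else is an assumption: that indexing time scales linearly with the amount of text, that "just under 45 seconds" can be rounded to 45, and that WebSite's run took at least the two days that had elapsed when we called. The article does not say the two tests ran on the same hardware, so treat the result as a rough indication, not a benchmark.

```python
ASKSAM_MB = 1.5
ASKSAM_SECONDS = 45                   # "just under 45 seconds": an upper bound
WEBSITE_MB = 60
WEBSITE_SECONDS = 2 * 24 * 60 * 60    # at least two days, and not yet finished


def mb_per_second(megabytes, seconds):
    return megabytes / seconds


asksam_rate = mb_per_second(ASKSAM_MB, ASKSAM_SECONDS)
website_rate = mb_per_second(WEBSITE_MB, WEBSITE_SECONDS)

# Minutes AskSam would need for the 60-megabyte pile at its measured rate,
# assuming the rate holds steady as the pile grows.
minutes_for_60mb = WEBSITE_MB / asksam_rate / 60

# How many times faster AskSam's measured rate is than WebSite's best case.
speed_ratio = asksam_rate / website_rate
```

Under those assumptions, AskSam's rate works out to about a thirtieth of a megabyte per second, or roughly half an hour for 60 megabytes, against two days and counting for WebSite.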

AskSam is releasing a beta version of a web server indexing engine this month. Pricing had not been announced at press time.

For advanced work there are many fine web authors on the market, such as HoTMetaL Pro from SoftQuad of Toronto, Canada. It's a good idea to try one of the free or shareware authors first, however. They'll let you learn what type of web pages you want to build before you buy a more specialized tool.

And some of them -- such as the Internet Assistant -- are worth keeping even if you move to a commercial author for the heavy lifting.

AskSam Systems Inc., (800) 800-1997;
Microsoft Corp., (206) 882-8080;
Netscape Communications Corp., (415) 528-2619;
O'Reilly & Associates Inc., (707) 829-0515;
SoftQuad Inc., (416) 239-4801.

-- Christopher J. Feola

See also: Robust, cheap Mac Web server

From THE COLE PAPERS, December 1995, Copyright © 1995, All Rights Reserved.
