A Practical Guide to Writing Clients and Servers
HTTP is the network protocol of the Web. It is both simple and powerful. Knowing HTTP enables you to write Web browsers, Web servers, automatic page downloaders, link-checkers, and other useful tools.
This tutorial explains the simple, English-based structure of HTTP communication, and teaches you the practical details of writing HTTP clients and servers. It assumes you know basic socket programming. HTTP is simple enough for a beginning sockets programmer, so this page might be a good followup to a sockets tutorial. This Sockets FAQ focuses on C, but the underlying concepts are language-independent.
Since you're reading this, you probably already use CGI. If not, it makes sense to learn that first.
The whole tutorial is about 15 printed pages long, including examples. The first half explains basic HTTP 1.0, and the second half explains the new requirements and features of HTTP 1.1. This tutorial doesn't cover everything about HTTP; it explains the basic framework, how to comply with the requirements, and where to find out more when you need it. If you plan to use HTTP extensively, you should read the specification as well-- see the end of this document for more details.
Before getting started, understand the following two paragraphs:
HTTP or other network programs requires more care than programming for a single machine. Of course, you have to follow standards, or no one will understand you. But even more important is the burden you place on other machines. Write a bad program for your own machine, and you waste your own resources (CPU time, bandwidth, memory). Write a bad network program, and you waste other people's resources. Write a really bad network program, and you waste many thousands of people's resources at the same time. Sloppy and malicious network programming forces network standards to be modified, made safer but less efficient. So be careful, respectful, and cooperative, for everyone's sake.
In particular, don't be tempted to write programs that automatically follow Web links (called robots or spiders ) before you really know what you're doing. They can be useful, but a badly-written robot is one of the worst kinds of programs on the Web, blindly following a rapidly increasing number of links and quickly draining server resources. If you plan to write anything like a robot, please read more about them. There may already be a working program to do what you want. If you really need to write your own, please support the robots.txt de-facto standard.
OK, enough of that. Let's get started.