1. Concept: Virtual Hosting
Generally speaking, virtual hosting is the ability to host a service under many different names, or for many different customers. What all virtual hosting has in common is the ability to provide a different site, version, or configuration of a particular service depending on the name the end user requests. The most common case, of course, is web hosting, where one user types http://nakedape.cc into a browser [1] and gets my site, while another types http://www.pdxlinux.org and gets a different site, even though both sites are running on one host, under one process group of Apache, at the same IP address and TCP port.
1.1. Virtual Web Hosting
When a user types in the URL or location for the web site (or clicks a link to it, or arrives by various other means), the browser looks at the hostname part of the URL, which is the part after the http:// and before the next /. In the URL by which you arrived at this page, nakedape.cc is the hostname. The browser then uses DNS to convert nakedape.cc into an IP address, 63.105.18.205 in this case. The browser assumes TCP port 80, the port reserved through Internet standards bodies for web traffic, and connects to that port on 63.105.18.205. [2]
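The URL handling described above can be sketched in a few lines of Python. This is only an illustration, not what any particular browser does internally: the function name `connection_target` is made up for this page, and the sketch deliberately stops before the DNS lookup so that it runs without network access.

```python
from urllib.parse import urlsplit

HTTP_DEFAULT_PORT = 80  # the port assumed when the URL does not name one


def connection_target(url):
    """Split an http:// URL into (hostname, port, path).

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    host = parts.hostname                    # text after http:// and before the next /
    port = parts.port or HTTP_DEFAULT_PORT   # an explicit :port in the URL wins
    path = parts.path or "/"                 # an empty path is requested as "/"
    return host, port, path


print(connection_target("http://nakedape.cc/wiki/ConceptVirtualHosting"))
```

A real client would next resolve the hostname to an IP address (for example with `socket.getaddrinfo`) and open a TCP connection to that address and port.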
The magic happens when the web browser connects to IP_address:TCP_port. First, the path part [3] is taken from the URL and sent with the text GET prepended. There are usually some other bits of information afterwards, called headers; these do things like tell the web server which languages the browser supports, in case there are pages in other languages that it can use. Among these headers is the Host header, which the web server uses to differentiate one site from another. The following verbose output from curl might make things a bit clearer:

    $ curl -v http://nakedape.cc/wiki/ConceptVirtualHosting > /dev/null
    * About to connect() to nakedape.cc port 80
    * Trying 63.105.18.205... * connected
    * Connected to nakedape.cc (63.105.18.205) port 80
    > GET /wiki/ConceptVirtualHosting HTTP/1.1
    User-Agent: curl/7.13.1 (i386-redhat-linux-gnu) libcurl/7.13.1 OpenSSL/0.9.7f zlib/1.2.2.2 libidn/0.5.15
    Host: nakedape.cc
    Pragma: no-cache
    Accept: */*

    < HTTP/1.1 200 OK
    < Date: Mon, 31 Oct 2005 00:58:42 GMT
    < Server: Apache
    < Connection: close
    < Transfer-Encoding: chunked
    < Content-Type: text/html;charset=iso-8859-1
The lines from > GET up to the blank line are what the browser sends, and the lines starting with < are what the web server sends back (the actual page HTML has been removed for brevity).
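To make the request format concrete, here is a minimal Python sketch that builds the same kind of request by hand and then reads the Host header back out of it, as a server would have to. The helper names `build_request` and `host_from_request` are hypothetical, and the header set is trimmed to the essentials; real browsers and servers handle many more cases.

```python
def build_request(host, path):
    """Return the bytes a minimal HTTP/1.1 client might send."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: GET prepended to the path part
        f"Host: {host}",          # names the site wanted at this address
        "Accept: */*",
        "Connection: close",
        "",                       # the blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def host_from_request(raw):
    """Server side: return the Host header value, or None if absent."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    for line in head.split("\r\n")[1:]:      # skip the request line
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":   # header names are case-insensitive
            return value.strip()
    return None


request = build_request("nakedape.cc", "/wiki/ConceptVirtualHosting")
print(request.decode("ascii"))
print(host_from_request(request))
```

Nothing here touches the network; sending `request` over a TCP connection to port 80 of the resolved address is what would produce a response like the one curl shows.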
When you are just setting up a site and haven't set up DNS, the only way to get to the site is by using the IP address in the URL. How, then, is the web browser to know which Host name to send? It doesn't, so it just sends the IP address. With multiple sites configured on the same IP address (and TCP port), how is the web server to decide which site to send you? It depends, really; it's usually the first one configured. If you want to reach any of the several sites you have configured, then you need to use the host's name and not just the IP address.

But what if you haven't set up DNS, or it hasn't taken effect yet? All operating systems (that I know of, at least) have an alternate means of mapping hostnames to addresses, generally called the hosts file. (On UNIX systems, this is part of the NameServiceSwitch, and the file is /etc/hosts.) On Windows, that file is C:\Windows\System32\drivers\etc\hosts. By default there is a hosts.sam [4] file, which is a sample.
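Both halves of this can be sketched in Python. The first function reads hosts-file text, where each line is an IP address followed by one or more names; the second mimics a server choosing a site by Host header and falling back to the first one configured. Everything specific here is an assumption for illustration: the function names, the document root paths, and the hosts entry itself. Actual web servers have their own matching rules and configuration syntax.

```python
def parse_hosts(text):
    """Map each name found in hosts-file text to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue                           # skip blank lines
        address, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), address)  # in this sketch, first entry wins
    return table


def select_site(host_header, sites):
    """Pick a document root by Host header, else the first site configured."""
    # Ignore any :port suffix; IPv6 literals are not handled in this sketch.
    wanted = (host_header or "").split(":", 1)[0].lower()
    for name, docroot in sites:
        if name == wanted:
            return docroot
    return sites[0][1]   # unknown name or bare IP address: first site wins


# Hypothetical hosts entry for testing before DNS is set up.
HOSTS_TEXT = """
# address        names
63.105.18.205    nakedape.cc www.pdxlinux.org
"""

# Hypothetical configuration, in the order the sites were configured.
SITES = [
    ("nakedape.cc", "/var/www/nakedape"),
    ("www.pdxlinux.org", "/var/www/pdxlinux"),
]

print(parse_hosts(HOSTS_TEXT)["www.pdxlinux.org"])
print(select_site("www.pdxlinux.org", SITES))
print(select_site("63.105.18.205", SITES))
```

The last call shows the situation described above: a request that carries only the IP address in its Host header lands on the first configured site, whichever site the user actually wanted.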
1.1.1. IP-based Hosting
1.1.2. References
- 1 Strictly called a User-Agent, but that's not a point we'll belabor.
- 2 Note that the combination of IP address and TCP port is a unique identifier: there may be many IP addresses with web servers listening on TCP port 80, and there may be other ports open on my IP address besides 80, but in the whole Internet there is only one 63.105.18.205:80, and only one server (here, one Apache process group) listening on it.
- 3 The part following the hostname, beginning with "/".
- 4 Not to be confused with the SAM, the Security Accounts Manager.
