How the Web Works

  • April 2, 2004
 

Introduction

For many of us, browsing the Web has become a daily activity. Whether it is for checking stock prices, buying food, doing work, ordering books and music, or just visiting a favorite site, web browsing has become an institution in our lives, much the way television is. Have you ever wondered how this whole web thing works, though? This tutorial explains the history and concepts of the Web and how it works technically. By the end, you will understand what actually happens when you browse to a site and how your computer retrieves the information. Our first stop is the history of the Web.

History of the Web

The Web finds its roots at CERN, the European Organization for Nuclear Research, where Tim Berners-Lee had built an early hypertext program called ENQUIRE in 1980. In 1989 he proposed, and later developed together with Robert Cailliau, a system that would allow documents to link to other pieces of data, whether they were files on the local computer or stored on a remote computer. The main motivation is said to have been the ability to access library information that was spread across multiple servers at CERN.

On November 12th, 1990, Tim Berners-Lee and Robert Cailliau published a formal proposal called "WorldWideWeb: Proposal for a HyperText Project", building on Berners-Lee's March 1989 document "Information Management: A Proposal". It outlined the World Wide Web as we know it today: a system that uses HyperText, an idea whose roots go back to a 1945 article by Vannevar Bush, to link documents into a large-scale information pool. The following day, November 13th, 1990, Tim Berners-Lee created the first web page, and that December he wrote the first web browser and web server. The browser was called WorldWideWeb, which gives us the name we use today.

As development of the WorldWideWeb continued, more people from around the world started to get involved, and in 1992 Pei-Yuan Wei introduced Viola, one of the first web browsers that supported graphics. This led to Marc Andreessen of NCSA releasing a program for UNIX called Mosaic in 1993. Mosaic was the spark that marked the rise in popularity of the World Wide Web and took it beyond academic circles. Marc Andreessen went on to form Mosaic Communications, which then evolved into Netscape Communications. Netscape was the first mainstream graphical web browser.

As time went on, more features started to be added to the browser, more companies got on the Internet, and personal homepages started springing up everywhere, and the Web as we know it was created.

The Technology behind the Web

The web works on three standards. These standards are generally adhered to by all companies that make products that work with the World Wide Web.

These standards are:

URL (Uniform Resource Locator): These are the addresses that you enter into your web browser to connect to a web site. A URL is broken up into four parts: the protocol, the hostname, the port number, and the path that you are requesting.

Protocol:
The protocol part of a URL is the short string of characters that you see before the hostname. Examples are http, ftp, and telnet. The protocol is separated from the hostname by a colon and two forward slashes ( :// ). It tells your browser what type of service to use when it connects to the hostname. If you leave the protocol off your address, the web browser will assume by default that you want a web protocol such as HTTP, which is for connecting to web sites, so there is no need to type http:// every time you go to a web site. If you specify another protocol, such as ftp, a browser that supports it will act as an FTP client and let you connect to an FTP server to download files.
Hostname:
The hostname is the address you are going to. For example, if you are going to the address https://www.bleepingcomputer.com, then www.bleepingcomputer.com is the hostname.
Port Number:
The port number is a number that you can append to the hostname with a colon ( : ) between them. For example, http://www.bleepingcomputer.com:80. If you leave the port number off, which almost everyone does, the browser automatically uses the default port for the protocol: port 80 for http and port 443 for https.
Path:
This is the path on the server, culminating with the filename you are trying to reach. For example, in the URL https://www.bleepingcomputer.com/examples/example1.html, the path is /examples/example1.html. In the simplest case, this path corresponds to an actual directory structure on the web server: there is a root directory, an examples directory underneath that root directory, and a file called example1.html inside it.
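The four parts of a URL described above can be pulled apart in a few lines of Python. This is only a minimal sketch using the standard library's urllib.parse module; the split_url helper and the DEFAULT_PORTS table are names made up for this illustration, not part of any browser.

```python
from urllib.parse import urlsplit

# Illustrative table of default ports, used when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def split_url(url):
    """Break a URL into protocol, hostname, port number, and path."""
    parts = urlsplit(url)
    # parts.port is None when the URL carries no explicit port number.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "port": port,
        "path": parts.path or "/",  # an empty path means the root of the site
    }


if __name__ == "__main__":
    print(split_url("http://www.bleepingcomputer.com:80/examples/example1.html"))
```

Leaving the port off, as in split_url("https://www.bleepingcomputer.com"), falls back to the default port for that protocol.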

HTTP (HyperText Transfer Protocol): This is the set of rules that defines how information is transferred between a web browser and a web server. All web browsers and web servers follow this process.
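To give a feel for what "a defined process" means, here is a rough sketch of the plain-text messages that HTTP uses. It only builds and reads strings and does not connect to anything. The function names build_get_request and parse_status_line are invented for this example, and a real browser sends many more headers than shown here.

```python
def build_get_request(hostname, path="/"):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {hostname}",     # which site on the server we are asking for
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # the blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response_text):
    """Pull the status code and reason phrase out of the first response line."""
    status_line = response_text.split("\r\n", 1)[0]
    _version, code, reason = status_line.split(" ", 2)
    return int(code), reason


if __name__ == "__main__":
    print(build_get_request("www.bleepingcomputer.com", "/examples/example1.html"))
    print(parse_status_line("HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"))
```

The server answers with a status line such as "HTTP/1.1 200 OK" when the document exists, or "HTTP/1.1 404 Not Found" when it does not, followed by its own headers and the document itself.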

HTML (HyperText Markup Language): This is the language used in web pages to format text, images, and page layout. It is plain text, entered into a file whose name ends in .html. It is possible to put HTML in documents that do not end in .html, but for the purpose of this tutorial, we are only focusing on pure HTML documents. The text in these documents contains special codes, called tags, that tell the web browser how to format the text when it reads the file. Let's try an example below.

If you were to create a file called helloworld.html and save it on your hard drive, you could then open this file with your browser and have it displayed. The contents of this file will have the following text:

<b>Hello World!!!!</b>

If you were to open up this document in your browser you would see the following:

Hello World!!!!

As you can see, the text Hello World!!!! is shown in bold print. This is because we enclosed the words in the tags <b> and </b>. The <b> tag means any text after it will be bold, and the ending </b> tag means this is the end of the bold formatting. Most tags in HTML have a beginning tag that starts the formatting and an ending tag that stops it. There are many, many more tags available in HTML, the bold ( <b> ) tag being just one of them.
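To illustrate how a program can read tags, the sketch below uses Python's standard html.parser module to find the text sitting between a beginning bold tag, written <b>, and its ending tag, written </b>. The BoldFinder class and find_bold function are hypothetical names for this example, and a real browser does far more work than this.

```python
from html.parser import HTMLParser


class BoldFinder(HTMLParser):
    """Collects the text found between <b> and </b> tags."""

    def __init__(self):
        super().__init__()
        self.depth = 0        # how many <b> tags are currently open
        self.bold_text = []   # pieces of text seen while inside <b>

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self.depth += 1   # beginning tag: bold formatting starts

    def handle_endtag(self, tag):
        if tag == "b" and self.depth > 0:
            self.depth -= 1   # ending tag: bold formatting stops

    def handle_data(self, data):
        if self.depth > 0:
            self.bold_text.append(data)


def find_bold(html):
    """Return all text that would be displayed in bold."""
    finder = BoldFinder()
    finder.feed(html)
    finder.close()
    return "".join(finder.bold_text)


if __name__ == "__main__":
    print(find_bold("<b>Hello World!!!!</b>"))
```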

Web Browsers and Web Servers

In order for the Web to work, you need web browsers and web servers, which work hand in hand. The web browser is a piece of software that interprets the information found in an HTML document and displays the content of that document based upon the HTML tags found within it. A web server is a computer that stores HTML documents, otherwise known as web pages, and waits for connections from web browsers. When a web browser connects to a web server, the web server sends the requested document, if it exists, back to the web browser for display.
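The server's side of that exchange comes down to a lookup: given a path, return the document if it exists, or an error if it does not. The following sketch shows that decision with an in-memory table standing in for the files on disk. DOCUMENTS and handle_request are made-up names for illustration only.

```python
# Hypothetical stand-in for the files stored on a web server: path -> HTML.
DOCUMENTS = {
    "/examples/example1.html": "<b>Hello World!!!!</b>",
}


def handle_request(path, documents=DOCUMENTS):
    """Return (status_code, body) the way a very simple web server might."""
    if path in documents:
        return 200, documents[path]   # the document exists: send it back
    return 404, "<b>Not Found</b>"    # no such document on this server


if __name__ == "__main__":
    print(handle_request("/examples/example1.html"))
    print(handle_request("/missing.html"))
```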

Actually Browsing a Web Site

Now that you understand the basics behind how the Web works, let's walk through the actual process of how your computer goes to a web site and displays it in your browser.

The first step, of course, is to open your web browser, whether that be Netscape, Internet Explorer, or Mozilla. When your browser opens, you have the option of connecting to a web site. In the address field, type the location you would like to go to. For this example, let's go to www.bleepingcomputer.com.

You type https://www.bleepingcomputer.com (or just www.bleepingcomputer.com, since the protocol prefix is optional) in the address field and press Enter or click Go. The diagram below explains what happens:

[Diagram: How the Web Works]

As you can see, when you try to connect to a site, your web browser opens an Internet connection and tries to connect to the web server specified in the host portion of the URL. If it connects, the web browser sends the web server the path portion of the URL. If that path exists on the web server, the web server sends the content of the HTML file back to your browser. Your browser reads through the HTML of the document, following the instructions found there as it displays the information on your screen.

That is all there is to retrieving a web page from a remote computer.
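The whole round trip can be tried on a single machine. The sketch below starts a tiny web server on the local computer using Python's standard http.server module, then plays the part of the browser with http.client: it connects, sends the path, and reads back the HTML. PAGES, PageHandler, and fetch_from_local_server are names invented for this example, and the server is only a toy, not something to run on the Internet.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical documents held by our toy server: path -> HTML bytes.
PAGES = {"/examples/example1.html": b"<b>Hello World!!!!</b>"}


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)  # the path does not exist on this server
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)    # send the HTML back to the client

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_from_local_server(path):
    """Start the toy server, request one path from it, and return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        conn = HTTPConnection(host, port, timeout=5)  # step 1: connect to the host
        conn.request("GET", path)                     # step 2: send the path
        response = conn.getresponse()                 # step 3: read the reply
        status, body = response.status, response.read().decode("utf-8")
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_from_local_server("/examples/example1.html"))
```

A real browser would then read through the returned HTML and display it, and it would first have to turn the hostname into a network address, a step this local example skips.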

Conclusion

I hope you have enjoyed this tutorial. As always, if you have any questions, do not hesitate to ask them in the forums.

--
Lawrence Abrams
Bleeping Computer Basic Internet Concepts Series
BleepingComputer.com: Computer Support & Tutorials for the beginning computer user.
