– What happens when you access a website?
– How does the browser know what content to show?
You probably have a basic understanding of what happens behind the scenes when you access a website and how information is sent to your computer via the internet, but let's take a deeper look at the sequence of events that actually takes place.
How does it work?
Let's start by typing a URL in the browser and pressing enter. For this example I’ll use
1. DNS Lookup
First of all, what is DNS?
DNS (Domain Name System) is the system that translates a familiar domain name (www.cubui.com) into an IP address your browser can use (188.8.131.52). The IP address belongs to the computer that hosts the website you are trying to access.
Now that you know what DNS is, a DNS lookup is the process of finding the IP address for the requested domain and returning it from a DNS server. This is like looking up a phone number in a phone book.
This process contains 5 steps:
- The first step is checking the browser cache which contains a repository of DNS records for websites you have previously visited. Depending on the browser, these records are usually cached for a fixed duration.
- If the DNS record was not found in the browser cache, the next step is to check the OS cache.
- The next check will be done in the router cache.
- If no record was found in the previous 3 caches, the last cache check will be done in the ISP cache.
- If all 4 cache checks fail, which is likely the first time you visit a website, the OS sends a recursive query to a DNS resolver.
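The lookup order above can be sketched as a chain of caches that is tried in sequence before falling back to the resolver. This is only a simplified model: the `lookup` and `fake_resolver` names are hypothetical, the caches are plain dictionaries, and real caches also expire entries after a time limit, which is left out here.

```python
from typing import Callable, Dict, List, Tuple

Cache = Dict[str, str]  # maps a hostname to an IP address


def lookup(
    hostname: str,
    caches: List[Tuple[str, Cache]],
    resolver: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (ip, source), where source names the layer that answered."""
    # Try each cache in order: browser, OS, router, ISP.
    for name, cache in caches:
        if hostname in cache:
            return cache[hostname], name
    # Every cache missed, so ask the resolver and remember the answer.
    ip = resolver(hostname)
    for _, cache in caches:
        cache[hostname] = ip
    return ip, "resolver"


def fake_resolver(hostname: str) -> str:
    # Hypothetical stand-in for a real DNS resolver (no network access).
    return {"www.cubui.com": "188.8.131.52"}[hostname]


caches = [("browser", {}), ("os", {}), ("router", {}), ("isp", {})]
print(lookup("www.cubui.com", caches, fake_resolver))  # ('188.8.131.52', 'resolver')
print(lookup("www.cubui.com", caches, fake_resolver))  # ('188.8.131.52', 'browser')
```

The second call is answered by the browser cache because the first call filled every layer on its way back.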
The operating system doesn’t know where www.cubui.com is, so it queries a DNS resolver. This query contains a special flag to mark it as a recursive query. When the DNS resolver completes the recursion, it will respond with an IP address or an error.
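To make the "special flag" concrete, here is a minimal sketch that builds the bytes of a DNS query by hand. In the DNS header the flag is the RD ("recursion desired") bit. The function name `build_dns_query` and the query ID `0x1234` are assumptions chosen for illustration, and in practice the OS builds this packet for you.

```python
import struct

RD_FLAG = 0x0100  # "recursion desired" bit in the 16-bit DNS header flags field


def build_dns_query(hostname: str, query_id: int = 0x1234, recursive: bool = True) -> bytes:
    """Build a DNS query for the A (IPv4 address) record of hostname."""
    flags = RD_FLAG if recursive else 0
    # Header: ID, flags, 1 question, 0 answer/authority/additional records.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


packet = build_dns_query("www.cubui.com")
flags = struct.unpack("!H", packet[2:4])[0]
print(bool(flags & RD_FLAG))  # True
```

Sending these bytes over UDP to port 53 of a resolver would start the lookup; the sketch stops before any network traffic.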
2. Browser starts TCP handshake with the server
After the OS finds the IP address for the website you’re trying to visit, it sends it to the browser which will then initiate the TCP connection to start loading the page. The process used to establish this connection is known as a TCP/IP three-way handshake.
How does the TCP/IP three-way handshake work?
- First, the client sends a SYN (synchronize) packet to the server over the internet, asking if it is open for new connections.
- The server will respond with a SYN/ACK packet if the server has open ports ready to accept and initiate a new connection.
- Lastly, the client will receive the SYN/ACK packet from the server and will acknowledge it by sending an ACK packet back to the server.
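The three steps above can be modeled in a few lines. This is a simulation only, not real networking: the `Segment` class and `three_way_handshake` function are hypothetical names, and the starting sequence numbers are arbitrary. What it shows is how each side acknowledges the other's sequence number plus one. In a real program the operating system performs the handshake for you when you call something like `socket.connect`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    flags: str                 # "SYN", "SYN/ACK" or "ACK"
    seq: int                   # sender's sequence number
    ack: Optional[int] = None  # next sequence number expected from the peer


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged, in order."""
    # 1. Client asks to open a connection.
    syn = Segment("SYN", seq=client_isn)
    # 2. Server accepts and acknowledges the client's SYN.
    syn_ack = Segment("SYN/ACK", seq=server_isn, ack=syn.seq + 1)
    # 3. Client acknowledges the server's SYN.
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=100, server_isn=300):
    print(segment)
```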
3. HTTP transaction
Once the TCP connection is established and ready for data transmission, the HTTP transaction begins.
There are 4 stages during an HTTP transaction:
- Connection – The client connects to the server.
- Request – The client requests information from the server.
- Response – The server will send a response back to the client. The response could be the information requested by the client or a rejection. Each response contains a status code which indicates the outcome of the request.
Status codes are three-digit numbers grouped into five classes by their first digit.
- 1xx indicates an informational message only.
- 2xx indicates success of some kind.
- 3xx redirects the client to another URL.
- 4xx indicates an error on the client’s part.
- 5xx indicates an error on the server’s part.
- Close – The transaction is terminated by the client, the server, or both.
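As a rough illustration of the request and response stages, the sketch below builds the text of a simple GET request, reads the first line of a response, and maps the status code to one of the five classes listed above. The helper names are hypothetical, and the example response line is made up rather than captured from a real server.

```python
from typing import Tuple

STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def build_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def parse_status_line(line: str) -> Tuple[str, int, str]:
    """Split a line such as 'HTTP/1.1 404 Not Found' into its parts."""
    version, code, *reason = line.strip().split(" ", 2)
    return version, int(code), reason[0] if reason else ""


def classify_status(code: int) -> str:
    """Map a status code to its class using the first digit."""
    return STATUS_CLASSES.get(code // 100, "unknown")


print(build_request("www.cubui.com").decode("ascii"))
version, code, reason = parse_status_line("HTTP/1.1 404 Not Found")
print(code, reason, "->", classify_status(code))  # 404 Not Found -> client error
```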
4. Rendering the HTML begins
The browser starts to render the HTML content in different phases.
- The bare-bones HTML skeleton will be rendered first.
- Depending on the HTTP headers returned by the server, the static files can be cached by the browser so it doesn’t have to fetch them again the next time you visit the page.
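One header that commonly controls this caching is `Cache-Control`. The sketch below shows, in a very simplified way, how a cached copy might be judged still usable from that header alone. The function names are hypothetical, and the logic is an assumption for illustration: real browsers also take other headers and rules into account, and `no-cache` is treated here simply as "do not reuse without asking the server again".

```python
from typing import Dict, Optional


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Turn 'public, max-age=3600' into {'public': None, 'max-age': '3600'}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        part = part.strip().lower()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip()] = value.strip().strip('"') or None
    return directives


def is_fresh(header: str, age_seconds: int) -> bool:
    """Decide whether a cached file of the given age can be reused as-is."""
    directives = parse_cache_control(header)
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    # Without a numeric max-age this sketch plays it safe and refetches.
    return max_age is not None and max_age.isdigit() and age_seconds < int(max_age)


print(is_fresh("public, max-age=3600", age_seconds=120))  # True
print(is_fresh("max-age=60", age_seconds=120))            # False
print(is_fresh("no-store", age_seconds=0))                # False
```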
Even though there are many things happening behind the scenes, in an ideal scenario all these 4 stages should take less than a second. I hope you have a better idea now of what happens when you visit a website.