
HTTP 1.0 vs HTTP 1.1 – Caching

Hi friends, as we work through the differences between HTTP 1.0 and HTTP 1.1, I am covering one difference per article so that each one is easy to grasp. HTTP 1.1 caching is an important topic to learn because it directly affects how the web behaves.

Previously we discussed the compatibility changes that were made as part of HTTP 1.1.

In this article we are going to discuss the caching changes in HTTP 1.1.

Caching

Caching is a technique for keeping copies of the most frequently requested resources on the server or the client. Once a resource is stored, any later request for the same resource can be answered with the stored copy instead of being processed again.

This technique has a few benefits.

  1. The server does not need to process the same resource again, so server performance improves.
  2. A response served from a cache at or near the client does not have to travel from the origin server again, which reduces network traffic and improves latency for other requests.
  3. With less network congestion, the server is able to handle more requests, which makes caching a cost-effective solution.

Caching existed in HTTP 1.0 as well, but it was not mature enough, and that caused a couple of issues.

Caching in HTTP 1.0

The HTTP 1.0 caching mechanism worked, but it had many shortcomings. It did not allow either servers or clients to give full and explicit instructions to caches, so caching behavior was not well specified. This led to the following issues.

  1. Some responses that should not have been cached were cached, so clients received unexpected responses.
  2. Some responses that could have been cached were not, which caused performance problems.

Expires Header:

HTTP/1.0 provided a simple caching mechanism. An origin server may mark a response, using the Expires header, with a time until which a cache may return the response.

After that date and time the cached response is considered expired and should not be served to the client as it is. The cache should first validate it with the origin server.
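The freshness check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is_fresh is made up for this example, and treating an unparseable date as "already expired" is a common convention that the article itself does not spell out.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_header: str, now: datetime) -> bool:
    """Return True while a cached response may still be served without validation."""
    try:
        expires_at = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # An invalid date (for example "0") is treated as already expired.
        return False
    if expires_at.tzinfo is None:
        # HTTP dates are always in GMT, so assume UTC when no zone was parsed.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at


if __name__ == "__main__":
    print(is_fresh("Wed, 21 Oct 2015 07:28:00 GMT", datetime.now(timezone.utc)))
```

A cache would call this before answering from its store, and fall back to validating with the origin server when the result is False.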

If-Modified-Since, Last-Modified headers:

A cache may check the current validity of a response using what is known as a conditional request: it may include an If-Modified-Since header in a request for the resource, specifying the value given in the cached response’s Last-Modified header.

Below is the syntax for this header.

If-Modified-Since: <day-name>, <day> <month> <year> <hour>:<minute>:<second> GMT

The If-Modified-Since request header makes the request conditional: the server sends back the requested resource, with a 200 status, only if it has been modified after the given date. If the resource has not been modified since then, the response is a 304 (Not Modified) without any body. The date to send comes from the Last-Modified header of the earlier response.
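Here is a minimal server-side sketch of that decision in Python. The function name respond and the tuple it returns are assumptions made for this example, not part of any framework. An invalid date is simply ignored, as if the header had not been sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond(last_modified: datetime, body: bytes, if_modified_since=None):
    """Return (status, headers, body) for a GET that may be conditional."""
    # HTTP dates have one-second resolution and are expressed in GMT.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
        except (TypeError, ValueError):
            since = None  # invalid date: behave as if the header were absent
        if since is not None and last_modified <= since:
            return 304, headers, b""  # not modified: no body is sent
    return 200, headers, body
```

The client side is the mirror image: it stores the Last-Modified value of a response and sends it back as If-Modified-Since on the next request for the same resource.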

Pragma: no-cache header:

With the Pragma: no-cache header, the client indicates that a request should not be satisfied from a cache.

The response is then fetched fresh from the origin server every time.

Caching in HTTP/1.1

HTTP 1.1 caching attempts to clarify the concepts behind caching and to provide more mature mechanisms for it. It retains the basic HTTP 1.0 design, adds new features, and specifies the existing features more carefully.

In HTTP 1.1, a cache entry is fresh until it reaches its expiration time. Once it expires it becomes stale, but it does not have to be deleted from the cache. Instead, the cache must normally revalidate it with the origin server before returning it in response to a subsequent request. However, the protocol allows both origin servers and end-user clients to override this basic rule.

ETag, If-None-Match Headers:

https://www.slideshare.net/RakeshChaudhary4/advanced-caching-concepts-velocity-ny-2015

The ETag or entity tag is part of HTTP caching, which allows a client to make conditional requests. This allows caches to be more efficient, and saves bandwidth, as a web server does not need to send a full response if the content has not changed. 

When a URL is retrieved, the web server will return the resource’s current representation along with its corresponding ETag value, which is placed in an HTTP response header “ETag” field:

ETag: "686897696a7c8745rt5"

The client may then decide to cache the representation, along with its ETag. Later, if the client wants to retrieve the same URL again, it first determines whether the locally cached version has expired (through the Cache-Control and Expires headers). If the cached copy has not expired, the client uses it. If the copy has expired (is stale), the client contacts the server and sends its previously saved ETag along with the request in an “If-None-Match” field.

If-None-Match: "686897696a7c8745rt5"

On this subsequent request, the server may now compare the client’s ETag with the ETag for the current version of the resource. If the ETag values match, meaning that the resource has not changed, then the server may send back a very short response with an HTTP 304 Not Modified status. The 304 status tells the client that its cached version is still good and that it should use that.

However, if the ETag values do not match, meaning the resource has likely changed, then a full response including the resource’s content is returned, just as if ETags were not being used. In this case the client may decide to replace its previously cached version with the newly returned representation of the resource and the new ETag.
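The whole exchange can be sketched on the server side as follows. This is an illustration only: deriving the ETag from a hash of the body is one possible choice (servers are free to generate ETags any way they like), and the names make_etag and respond are invented for this example.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption: the tag is a truncated SHA-256 of the body, wrapped in quotes.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    # If-None-Match uses weak comparison, so a leading W/ is ignored.
    return tag[2:] if tag.startswith("W/") else tag


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [_strip_weak(t.strip()) for t in if_none_match.split(",")]
        if "*" in candidates or etag in candidates:
            return 304, headers, b""  # the client's copy is still good
    return 200, headers, body  # full response with the current ETag
```

A first request gets a 200 with the ETag. Repeating the request with that value in If-None-Match gets a 304 as long as the body is unchanged, and a fresh 200 once it changes.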

The Cache-Control Header:

The Cache-Control header carries caching directives, and there are many of them. One set of standard directives can be used by the client in an HTTP request, and another by the server in a response. Below is the list of directives for a request.

  • Cache-Control: max-age=<seconds>
  • Cache-Control: max-stale[=<seconds>]
  • Cache-Control: min-fresh=<seconds>
  • Cache-Control: no-cache
  • Cache-Control: no-store
  • Cache-Control: no-transform
  • Cache-Control: only-if-cached

Below is the list of directives for a response.

  • Cache-Control: must-revalidate
  • Cache-Control: no-cache
  • Cache-Control: no-store
  • Cache-Control: no-transform
  • Cache-Control: public
  • Cache-Control: private
  • Cache-Control: proxy-revalidate
  • Cache-Control: max-age=<seconds>
  • Cache-Control: s-maxage=<seconds>

In upcoming sessions, we’ll discuss all these directives in detail.
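Since several directives can be combined in one header, separated by commas, a cache first has to split the header into its parts. The small Python sketch below shows one way to do that. The function name parse_cache_control is an assumption for this example, and the parser is deliberately simplified: it does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        name = name.strip().lower()  # directive names are case-insensitive
        arg = arg.strip().strip('"')
        if sep and arg.isdigit():
            directives[name] = int(arg)  # e.g. max-age=3600
        elif sep:
            directives[name] = arg
        else:
            directives[name] = True  # flag directives such as no-store
    return directives


if __name__ == "__main__":
    print(parse_cache_control("public, max-age=3600"))
```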

The Vary header:

A cache finds a cache entry by using a key value in a lookup algorithm. HTTP/1.0 uses just the requested URL as the cache key. This is not a perfect model, because a response may vary not only with the URL but also with one or more request headers (such as Accept-Language and Accept-Charset).

To support this type of caching, HTTP/1.1 includes the Vary response header. This header field lists the request header fields that participated in selecting the response variant. A cached response is returned only if the new request carries the same values for those headers as the request that produced the cached entry. Otherwise the server does its normal processing to return the resource.
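One way to picture this is a cache key that contains the URL plus the values of the headers named in Vary. The Python sketch below is only an illustration: the function name cache_key and the tuple layout are assumptions for this example, and real caches compare header values with more care (for instance, normalizing whitespace and ordering inside a value).

```python
def cache_key(url: str, request_headers: dict, vary: str):
    """Build a lookup key from the URL and the request headers listed in Vary.

    Returns None for "Vary: *", which means no stored response can be reused.
    """
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    if "*" in names:
        return None
    return (url,) + tuple((n, lowered.get(n, "")) for n in names)


if __name__ == "__main__":
    print(cache_key("/home", {"Accept-Language": "en"}, "Accept-Language"))
```

Two requests for the same URL with different Accept-Language values produce different keys, so the English and the French variants are stored and served separately.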

In addition to the headers above, there are a few more important headers, e.g. If-Unmodified-Since and If-Match, but those are generally used in more complex scenarios. We’ll cover those headers separately.

That’s all about caching. Please note that this is a very simple and basic overview of the caching changes in HTTP 1.1. There is a long list of further changes, some of which are complex. We’ll cover those in other articles.

 


HTTP 1.0 vs HTTP 1.1 – Compatibility

Hi friends, today we are going to discuss the differences between HTTP 1.0 and HTTP 1.1. Version 1.0 was successful and is still used by many websites, but it had a few shortcomings that were fixed in version 1.1. The new version brought major improvements in areas such as performance and bandwidth consumption.

As we know, HTTP is an application-level protocol that works on top of TCP. HTTP uses TCP connections to transfer data between client and server. If you are not very familiar with HTTP, you can check here.

Now let’s come to the differences between these two protocols. The list of differences between HTTP 1.0 and HTTP 1.1 is long, so I am breaking this study into multiple articles to make each difference easy to grasp.

Let’s discuss the first difference, i.e. compatibility with older versions.

HTTP 1.0 vs HTTP 1.1 – Compatibility

After version 1.0 was released, it took about another four years to release the final version 1.1. During those years many 1.1 drafts were released and people started using them. While working with these drafts, issues and areas for improvement were constantly reported and fixed. By the time the final version was released, many websites were already running on one draft version or another. These drafts had issues, but the final version could not simply ignore the implementations built on them.

The final version had to be compatible with HTTP 1.0 and with all the draft versions so that nothing would break.

In addition to this, HTTP 1.1 was designed to be compatible with future versions as well.

A couple of these changes are listed below.

Version Numbers

As we know, each HTTP message carries an HTTP version. These version numbers are hop-by-hop, not end-to-end. For example, a client on HTTP 1.0 sends a request to a server that is on HTTP 1.1, and the request passes through multiple hops in between. The client sent the request with HTTP 1.0 in the message, but the hop just before the server uses HTTP 1.1. So in this case, when the server receives the request, it sees HTTP 1.1 in the request line.


There is no way for the server to know the actual HTTP version of the client.

To resolve this issue, a new header named Via was introduced. Each intermediary adds an entry with the HTTP version of the message it received, so the header records the versions used along the path. From this header the server can work out the HTTP version used by the end client.


Below is an example of this header.

Via: 1.0 lazy, 1.1 p.example.net
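To show how a server could read this header, here is a small Python sketch that turns a Via value into a list of hops. The function name parse_via is made up for this example, and the parser is simplified: it ignores the optional comment part of an entry and would be confused by a comment containing a comma.

```python
def parse_via(value: str):
    """Return a list of (protocol, received_by) pairs, in the order hops were added."""
    hops = []
    for entry in value.split(","):
        fields = entry.split()
        if len(fields) < 2:
            continue  # skip empty or malformed entries
        protocol, received_by = fields[0], fields[1]
        if "/" not in protocol:
            # The protocol name may be omitted when it is HTTP.
            protocol = "HTTP/" + protocol
        hops.append((protocol, received_by))
    return hops


if __name__ == "__main__":
    for protocol, node in parse_via("1.0 lazy, 1.1 p.example.net"):
        print(protocol, "received by", node)
```

For the header above, the first entry says that the intermediary named lazy received an HTTP/1.0 message, which is the version the original sender used.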

HTTP OPTIONS method

HTTP/1.1 introduces the OPTIONS method, a way for a client to learn about the capabilities of a server without actually requesting a resource.

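An OPTIONS request names a resource, or * to ask about the server as a whole. The server lists the methods it supports in the Allow header of the response.

For example, a response carrying Allow: OPTIONS, GET, HEAD, POST tells us that the server supports the OPTIONS, GET, HEAD and POST methods.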
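As a rough sketch, the request and the reading of the response could look like this in Python. Everything here is illustrative: the host www.example.com, the sample response text and both function names are assumptions for this example. No network call is made, the code only builds and parses the message text.

```python
def build_options_request(host: str, target: str = "*") -> str:
    """Build the text of an HTTP/1.1 OPTIONS request ("*" asks about the whole server)."""
    return f"OPTIONS {target} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def allowed_methods(raw_response: str):
    """Extract the method names from the Allow header of a raw response."""
    head = raw_response.split("\r\n\r\n", 1)[0]
    for line in head.split("\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(":")
        if name.strip().lower() == "allow":
            return [m.strip() for m in value.split(",") if m.strip()]
    return []


if __name__ == "__main__":
    print(build_options_request("www.example.com"))
    sample = "HTTP/1.1 200 OK\r\nAllow: OPTIONS, GET, HEAD, POST\r\nContent-Length: 0\r\n\r\n"
    print(allowed_methods(sample))
```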

Upgrading to other protocols

In order to ease the deployment of incompatible future protocols, HTTP/1.1 includes the new Upgrade request-header. By sending the Upgrade header, a client can inform a server of the set of protocols it supports as an alternate means of communication. The server may choose to switch protocols, but this is not mandatory.

The Upgrade header field is an HTTP header field introduced in HTTP/1.1. In the exchange, the client begins by making a clear-text request, which is later upgraded to a newer HTTP protocol version or switched to a different protocol.

A connection upgrade must be requested by the client; if the server wants to enforce an upgrade it may send a 426 Upgrade Required response. The client can then send a new request with the appropriate upgrade headers while keeping the connection open.

This is how protocol switching happens.
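The server's side of this negotiation can be sketched as below. It is a simplified illustration: the function name negotiate_upgrade and its return shape are assumptions for this example. The sketch relies on two protocol facts: an upgrade request also carries "Connection: Upgrade", and a server that accepts answers with status 101 (Switching Protocols). A server that does not accept simply handles the request normally, shown here as 200.

```python
def negotiate_upgrade(request_headers: dict, server_protocols):
    """Return (status, response_headers) for a request that may ask to upgrade."""
    headers = {k.lower(): v for k, v in request_headers.items()}
    connection_tokens = [t.strip().lower() for t in headers.get("connection", "").split(",")]
    if "upgrade" not in connection_tokens:
        return 200, {}  # no upgrade requested on this connection
    supported = {p.lower() for p in server_protocols}
    # The client lists protocols in order of preference; take the first match.
    for offered in headers.get("upgrade", "").split(","):
        offered = offered.strip()
        if offered.lower() in supported:
            return 101, {"Connection": "Upgrade", "Upgrade": offered}
    return 200, {}  # switching is optional, so carry on with plain HTTP/1.1
```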

In the next session we’ll cover the caching improvements in HTTP 1.1.


Keep-Alive header, usage and benefits

Hi friends, today we are going to discuss keep-alive connections and their significance. First of all, let’s discuss a little about how a website works and how connections are maintained.

How does a website work?

In general, below is the flow of any web request.


  1. The user types an address into a browser, e.g. www.google.com.
  2. The browser sends a request to the web server.
  3. The web server creates a new process or assigns a thread to handle this request.
  4. The web server processes the request and generates the response.
  5. This response is sent back to the client.
  6. The assigned process or thread is now free to receive other requests.
  7. The browser displays the response to the user.

This is a very basic, high-level view of the process.

Overview of TCP connection

Whenever two machines communicate with each other (in the current example, the client and the web server), a connection is created, and the machines communicate over it. This connection is called a TCP connection, or TCP pipe.

A TCP connection is identified by the source and destination IP addresses and ports, so that data can be delivered between the right endpoints.

You can get more information about TCP connection and layers here.

As we know, a webpage can have multiple resources such as CSS files, JS files and images. Without Keep-Alive, opening a webpage creates a new connection for each resource. So if a web page contains 10 images, 4 CSS files and 2 JS files, a new connection is created for each of these 16 files and the resource is then loaded over it. Once the resource has been downloaded, the connection is closed.

What is Keep-Alive?

As we just read, a new connection is created for each web resource, and once the resource is transferred the connection is closed. This is a pretty simple process, but a little inefficient.

If the requests go to the same server, why can’t we keep the connection open so that all required resources can be transferred over the same connection, and close it once all resources are done?

Yes, you got me right. Keep-Alive does exactly that. When we give the instruction to keep the TCP connection alive, the connection is not terminated after a resource is transferred, and other files can be transferred over the same connection.


Benefits of using Keep-Alive

Now we know that with Keep-Alive we can transfer multiple resources over one connection instead of opening a new connection for every resource. This approach has the following benefits.

CPU Usage:

Creating a TCP connection is work that the CPU has to do for every resource. Keep-Alive reduces this load, so the CPU can be used more efficiently.

HTTPS Connection utilization:

When a site is served over HTTPS, reusing one connection is especially beneficial, because creating a new connection and performing the handshakes is an expensive process.

Webpage load speed:

Because more files can be transferred over a single connection, the connection setup cost is paid only once and the page loads faster. Creating a connection for every resource can slow down website loading.

How to enable Keep-Alive header?

Let’s see how we can enable this setting for a website.

Open IIS on your web server and select the site for which you want to change the setting.


Double-click on “HTTP Response Headers”.

Click on “Set Common Headers”. A dialog box will open. Check the setting “Enable HTTP Keep-Alive”.

If we uncheck this box, a “Connection: close” header will be sent to the client, which indicates that the connection will be closed once the response is sent.

Any disadvantages of using Keep-Alive?

Now we know that having a single connection for all resource files between a particular client and server is a good idea. But do you really think that we can use this setting blindly?

No, there are a few points to consider when enabling this setting. From the client side, it is fine if the connection remains open all the time. But this can be painful for the server.

In the normal configuration, the web server closes the connection once the resource is delivered. Because of this, the web server can handle lots of requests, as server resources are freed up after each file is done. But if the connection remains open, some server resources stay occupied to maintain that connection.

So if thousands of clients keep their connections open for a long time, the web server may run into serious issues, as lots of server resources will be occupied and its performance will drop.

So think twice before enabling this setting. If you think that enabling it will improve your overall experience, go ahead and enable it. But there are a few factors that influence how well it works.

Let’s discuss those factors so that you can use this setting efficiently.

Factors which impact Keep-Alive

These are the main factors that affect this setting.

MaxKeepAliveRequests:

It sets the maximum number of requests allowed per connection.

A value of 100 is normally good enough for almost any scenario. This value can, however, be increased depending on the number of files within a web page that the server is supposed to deliver. If a webpage contains more than 100 files, the value can be increased.

KeepAliveTimeout:

This setting tells the server how long an idle connection should be kept. Once a connection has stayed idle for longer than this timeout, the server closes it.

A value between 7 and 10 seconds is usually ideal. With higher traffic this value can be raised to avoid frequently re-creating TCP connections. If the value is set too low, Keep-Alive loses its purpose.
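The two settings work together: a connection is closed when either limit is reached. The Python sketch below models that rule. It is only an illustration of the idea, not how any particular web server implements it. The class name KeepAlivePolicy and its field names are assumptions for this example, and the defaults are taken from the values suggested above.

```python
from dataclasses import dataclass


@dataclass
class KeepAlivePolicy:
    max_requests: int = 100        # plays the role of MaxKeepAliveRequests
    timeout_seconds: float = 10.0  # plays the role of KeepAliveTimeout

    def should_close(self, requests_served: int, idle_seconds: float) -> bool:
        """Close the connection once either limit has been reached."""
        return (requests_served >= self.max_requests
                or idle_seconds >= self.timeout_seconds)


if __name__ == "__main__":
    policy = KeepAlivePolicy()
    print(policy.should_close(requests_served=3, idle_seconds=2.0))
    print(policy.should_close(requests_served=3, idle_seconds=12.0))
```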

Conclusion

All modern browsers use persistent connections as long as the server has no objection.

In HTTP/1.0 this behavior was off by default, and we could enable it. In HTTP/1.1, Keep-Alive is implemented differently: connections are kept open by default, that is, they are persistent unless stated otherwise.

To turn this off in HTTP/1.1 we need to send the header “Connection: close”. If we do not send this header, the connection remains open, but not forever: the server still closes it after its idle timeout.
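The default rules for the two versions can be condensed into one small function. This is a sketch for illustration, with the function name is_persistent chosen for this example. It looks only at the HTTP version and the Connection header and ignores everything else a real server would consider.

```python
def is_persistent(http_version: str, connection_header: str = "") -> bool:
    """Decide whether the connection stays open after the current response."""
    tokens = {t.strip().lower() for t in connection_header.split(",")}
    if "close" in tokens:
        return False  # an explicit close always wins
    if http_version == "HTTP/1.1":
        return True   # persistent by default in HTTP/1.1
    return "keep-alive" in tokens  # HTTP/1.0 must opt in explicitly


if __name__ == "__main__":
    print(is_persistent("HTTP/1.1"))
    print(is_persistent("HTTP/1.0"))
```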

 


TCP IP Layers and role in webpage request

Welcome readers. Today we are going to discuss the TCP/IP layers and the role these layers play in web communication. As we all know, we live in a world of the internet where every piece of information is part of one very big network. We connect to the network and get the desired information from the source to our devices. The TCP/IP layers are part of this whole system. Pretty simple process, isn’t it?

Not exactly. It seems like a simple process, but it involves many technologies, layers and combinations of hardware. All of these work hand in hand to deliver the information you requested.

Let’s first discuss how a website works.

Web Request Life Cycle

In general, below is the flow of any web request.


  1. The user types an address into a browser, e.g. www.google.com.
  2. The browser sends a request to the web server.
  3. The web server creates a new process or assigns a thread to handle this request.
  4. The web server processes the request and generates the response.
  5. The client gets this response.
  6. The assigned process or thread is now free to receive other requests.
  7. The browser displays the response to the user.

This is a very basic, high-level view of the process. But the aim of this article is to make you aware of the complete flow of information from source to destination.

OSI Layers vs TCP IP Layers

We have all studied the OSI model, which has seven layers. These layers give a complete picture of information flow from the top-level application down to low-level cables and signals. From top to bottom, the OSI layers are Application, Presentation, Session, Transport, Network, Data Link and Physical.

You can get more details about OSI model here.

http://www.omnisecu.com/tcpip/osi-model.php

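The OSI layers are the basis for thinking about communication, but we don’t use this model directly in the real world. The OSI model is now just a conceptual model that represents the separation of layers and functionalities.

We use the TCP/IP network model. The OSI layers are merged into the TCP/IP layers to create a real-world communication model. The OSI layers map to the TCP/IP layers as follows: the Application, Presentation and Session layers become the Application layer; the Transport layer stays the Transport layer; the Network layer becomes the Internet layer; and the Data Link and Physical layers become the Network Access layer.

From this mapping we can see that the TCP/IP layers are a refactored version of the OSI layers, merging several OSI layers into one.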

Web request and communication

Now let’s come to the communication flow when we request a webpage.

  • Whenever we type an address, e.g. google.com, into the address bar of a browser, processing starts. The web browser uses the Hypertext Transfer Protocol (HTTP), which is an Application layer protocol.
  • The web browser finds the IP address for the host name in the URL you typed using the Domain Name System (DNS). DNS is also an Application layer protocol. Once the DNS lookup is done, the browser has the IP address of a Google server.


  • The browser now creates an HTTP packet containing the request details for google.com.
  • The packet is still in our PC. Now the browser hands it to the lower layer. Every layer exposes an interface to the layers above and below it so that the layers can communicate with each other. So the browser gives the HTTP packet to the TCP (Transmission Control Protocol) process, which is a Transport layer protocol.
  • TCP’s main function is to split the request into multiple packets (called segments), each with its own sequence number. TCP ensures reliable transmission through handshakes and acknowledgements. TCP creates a pipe between source and destination so that data can be transferred. This pipe is called a TCP connection.
  • TCP now puts its own information (a header) on top of each of these packets. The system needs this information to maintain the session/connection.
  • The packet is still in our PC. TCP now hands these packets over, through the provided interface, to the next layer, i.e. the IP (Internet Protocol) process, which is an Internet layer protocol.
  • The main job of the IP layer is addressing and routing. This layer puts the source and destination IP addresses and related routing information in the packets so that they can be routed to the correct location.
  • IP now puts its own information on top of the TCP packet. The system needs this information for routing across the internet.
  • The packet is still in our PC. IP now hands the packet over to the network access (network interface) layer. The network access layer defines the protocols and hardware required to deliver data across a physical network. Most PCs use Ethernet.
  • Our PC now wraps the IP packet in an Ethernet header and an Ethernet trailer, creating an Ethernet frame. The Ethernet header contains MAC addresses, which are used to deliver the frame locally (within the local area network).
  • Now your PC physically transmits the bits of this Ethernet frame, using electrical signals over the Ethernet cabling.
  • The packet is now out of our PC and travels towards Google’s web server. Note that not all packets are necessarily transmitted over the same route. They may take different routes depending on the most efficient routing.
  • The web server physically receives the electrical signal over a cable, and re-creates the same bits by interpreting the meaning of the electrical signals.
  • The web server now de-encapsulates the IP packet from the Ethernet frame by removing and discarding the Ethernet header and trailer. After this it hands the packet over to the Internet Protocol layer.
  • The Internet Protocol layer then verifies the source and destination information and hands the data over to the TCP layer.
  • The TCP layer reads the TCP information. It merges all incoming packets and re-creates the HTTP packet that the source originally created.
  • This layer acknowledges each packet so that if any packet is missing or corrupt, the source can transmit it again. This ensures the reliability of the message.
  • Finally, TCP hands it over to the HTTP process, which understands the HTTP GET request.


  • The web server now processes the request through the running web process and generates the response.
  • The server sends this response back, in the same way, to the device that initiated the request.
  • Finally the information reaches the device’s application layer, and the browser displays it as a web page.

So this is how all the layers come into action when we request a webpage.
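The wrapping and unwrapping described in the steps above can be modelled with a toy Python example. This is a teaching sketch only, not real networking code: the class names, the port numbers, the IP addresses (taken from the ranges reserved for documentation), the MAC addresses and the tiny segment size are all assumptions made for this example, and real headers carry many more fields.

```python
from dataclasses import dataclass


@dataclass
class TcpSegment:          # Transport layer: ports and a sequence number
    src_port: int
    dst_port: int
    seq: int
    payload: bytes


@dataclass
class IpPacket:            # Internet layer: source and destination addresses
    src_ip: str
    dst_ip: str
    segment: TcpSegment


@dataclass
class EthernetFrame:       # Network access layer: MAC addresses for the local hop
    src_mac: str
    dst_mac: str
    packet: IpPacket


def send(http_message: bytes, mss: int = 16):
    """Split the HTTP message into segments and wrap each one layer by layer."""
    frames = []
    for offset in range(0, len(http_message), mss):
        segment = TcpSegment(50000, 80, offset, http_message[offset:offset + mss])
        packet = IpPacket("192.0.2.10", "198.51.100.20", segment)
        frames.append(EthernetFrame("aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb", packet))
    return frames


def receive(frames) -> bytes:
    """Unwrap each frame and rebuild the HTTP message in sequence order."""
    segments = sorted((f.packet.segment for f in frames), key=lambda s: s.seq)
    return b"".join(s.payload for s in segments)


if __name__ == "__main__":
    message = b"GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
    frames = send(message)
    # Frames may arrive out of order; the sequence numbers restore the order.
    print(receive(list(reversed(frames))) == message)
```

The sequence number is what lets the receiver put the pieces back together even when the frames arrive in a different order, which matches the note above that packets may travel over different routes.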