E-Tag — The most underrated HTTP cache
As a developer, I’m always looking for new ways to optimize my projects. I wanted a cache mechanism that was smart (one with some validation), and E-Tag (a type of HTTP cache) turned out to be a perfect choice.
It is a cache mechanism that doesn’t rely on a timeout but on the freshness of the resource.
A few opening remarks: I’ll discuss methods of tracking users via E-Tag. Please make sure that you follow all the privacy regulations that apply where you are based.
What is E-Tag?
An E-Tag is a way to uniquely identify a specific version of a response, so that a cached copy can be validated and reused.
The general idea of using E-Tag is to avoid unnecessary payloads in network requests. Looking at the browser compatibility of E-Tag, we can safely use it as a caching strategy for most use cases.
Before we really delve into E-Tag, let’s look at the various HTTP cache mechanisms available on the web.
How does the HTTP cache work?
The HTTP Cache’s behavior is controlled by a combination of request headers and response headers.
HTTP caching can be implemented with any of the following HTTP headers:
1. Cache-Control
2. E-Tag
3. Last-Modified
Cache-Control: The Cache-Control header allows caching of resources based on an expiration time.
A few of the expiration directives for Cache-Control are:
max-age=<seconds>, s-maxage=<seconds>, min-fresh=<seconds>, etc.
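As a rough illustration of expiration-based caching, here is a minimal Python sketch that parses a Cache-Control value and checks whether a cached response is still fresh under max-age. The helper names parse_cache_control and is_fresh are made up for this example, and real caches apply many more rules than this.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directives without an argument (e.g. "public") map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """Return True while the cached response is younger than its max-age."""
    directives = parse_cache_control(cache_control)
    # Simplification: no-store and no-cache are both treated as "not reusable
    # without going back to the server".
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if max_age is None or not max_age.isdigit():
        return False
    return age_seconds < int(max_age)


print(is_fresh("public, max-age=3600", age_seconds=120))
print(is_fresh("public, max-age=3600", age_seconds=7200))
```

The first call reports a fresh response and the second an expired one, which is the whole idea of timeout-based caching: the decision depends only on the clock, not on whether the resource actually changed.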
A full discussion of Cache-Control is beyond the scope of this article.
The Last-Modified response header contains the date and time at which the origin server believes the resource was last modified.
In future requests, the client sends that value back in the If-Modified-Since request header:
Last-Modified: Tue, 20 Oct 2015 07:28:20 GMT
If-Modified-Since: Tue, 20 Oct 2015 07:28:20 GMT
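On the server side, the check behind these two headers could look roughly like the Python sketch below. It uses only the standard library; the function name not_modified_since and the variable names are assumptions made for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def not_modified_since(if_modified_since: str, last_modified: datetime) -> bool:
    """Return True when the resource has not changed since the client's copy."""
    try:
        client_time = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop microseconds.
        return last_modified.replace(microsecond=0) <= client_time
    except (TypeError, ValueError):
        # Unparseable or timezone-less date: treat it as a normal request.
        return False


# Same timestamp as the header example above.
resource_mtime = datetime(2015, 10, 20, 7, 28, 20, tzinfo=timezone.utc)
header_value = format_datetime(resource_mtime, usegmt=True)

print(header_value)
print(not_modified_since(header_value, resource_mtime))
```

When the function returns True, the server can skip the body and answer 304 Not Modified; otherwise it sends the full response with a new Last-Modified value.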
The E-Tag response header (spelled ETag on the wire) contains an identifier, which is typically generated from the response data.
In future requests, the client sends that value back in the If-None-Match request header:
ETag: W/"8b-ntEGyXckz3Jon0kvN85y7Cx4GDA"
If-None-Match: W/"8b-ntEGyXckz3Jon0kvN85y7Cx4GDA"
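To make "generated from the response data" concrete, here is a minimal Python sketch that derives an E-Tag from a hash of the body. The choice of SHA-256, the truncation to 32 characters, and the name make_etag are assumptions for illustration; every server is free to pick its own scheme, and this is not the scheme behind the value shown above.

```python
import hashlib


def make_etag(body: bytes, weak: bool = False) -> str:
    """Build a quoted ETag value from a hash of the response body."""
    digest = hashlib.sha256(body).hexdigest()[:32]
    tag = f'"{digest}"'
    # The W/ prefix marks a weak validator.
    return f"W/{tag}" if weak else tag


first = make_etag(b'{"user": "alice"}')
same = make_etag(b'{"user": "alice"}')
changed = make_etag(b'{"user": "bob"}')

print(first == same)     # identical bodies give identical tags
print(first == changed)  # a changed body gives a different tag
```

Because the tag depends only on the bytes of the response, it changes exactly when the content changes, which is what makes it usable as a validator.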
A must-read about E-Tag:
E-Tag is an HTTP header; it’s an “identifier” for a specific “version of a resource”. E-Tags are like “fingerprints of a resource”: if the resource changes, the fingerprint (the E-Tag) changes too. This allows smart caching, as a web server does not need to resend a full response if the content has not changed.
An ETag value is opaque: it doesn’t provide any information about the resource itself. That means that an HTTP user agent, such as the browser, does not know what this string represents and can’t predict what its value would be. An ETag can be a strong or a weak validator; the W/ prefix in the example above marks a weak one.
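If the E-Tag header was part of the response for a resource, the client can send its value back in the If-None-Match header of future requests in order to validate the cached resource. If the value still matches, the server can answer with 304 Not Modified and an empty body, and the browser reuses its cached copy.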
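The validation step can be sketched in Python as follows. The function names etag_matches and respond are hypothetical, the request headers are modelled as a plain dict with exact-case keys, and splitting the header on commas is a simplification that assumes the tags themselves contain no commas.

```python
def etag_matches(if_none_match: str, current_etag: str) -> bool:
    """Weak comparison of an If-None-Match value against the current ETag."""
    def opaque(tag: str) -> str:
        # Weak comparison ignores the W/ prefix and compares the quoted part.
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    if if_none_match.strip() == "*":
        return True
    candidates = [opaque(tag) for tag in if_none_match.split(",")]
    return opaque(current_etag) in candidates


def respond(request_headers: dict, body: bytes, current_etag: str) -> tuple:
    """Return (status, headers, body) for a GET request on a cacheable resource."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match and etag_matches(if_none_match, current_etag):
        # The client's copy is still valid: no payload needed.
        return 304, {"ETag": current_etag}, b""
    return 200, {"ETag": current_etag}, body


tag = 'W/"8b-ntEGyXckz3Jon0kvN85y7Cx4GDA"'
print(respond({}, b"hello", tag)[0])
print(respond({"If-None-Match": tag}, b"hello", tag)[0])
```

The first request has nothing cached and gets the full response; the second one presents the tag it received earlier and gets an empty 304, which is where the bandwidth saving comes from.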
How can E-Tags be exploited?
E-Tags are persistent: they persist even when you close your browser or shut down your computer. The E-Tag is stored in the browser cache, and it remains there until you clear the cache. Opera recommends “This data should be cleared periodically.” but sadly most users don’t clear it that often.
A practical use case: tracking users via E-Tag
E-Tag tracking strategies:
1. Tagging the image.
2. Tagging the API response.
Tagging the image: The idea is to generate a unique image for every new visitor. If an E-Tag is present in If-None-Match, then we know that the user has already visited the site and we need not create a new one; otherwise we create a new unique image.
Alright, this looks quite easy to implement, and it is. The trickiest part is generating unique images. Actually, we won’t be generating any unique images. We will simply overwrite the E-Tag of the image with a random string. This works fine for most use cases.
Tagging the API response (JSON): Generating a unique “id” for every API call will do the job.
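Both strategies come down to the same server-side check, which could be sketched in Python like this. The names extract_visitor_id and tag_visitor, the 32-character hex format of the ID, and the dict-based headers are all assumptions made for this example; it is not tied to any particular framework.

```python
import secrets

HEX_DIGITS = set("0123456789abcdef")


def extract_visitor_id(if_none_match: str) -> str:
    """Pull a previously issued ID out of an If-None-Match value, or return ''."""
    value = if_none_match.strip()
    if value.startswith("W/"):
        value = value[2:]
    value = value.strip('"')
    # Only accept IDs in the format this sketch issues (32 hex characters).
    if len(value) == 32 and set(value) <= HEX_DIGITS:
        return value
    return ""


def tag_visitor(request_headers: dict) -> tuple:
    """Return (status, response_headers, visitor_id) for the tagged resource."""
    visitor_id = extract_visitor_id(request_headers.get("If-None-Match", ""))
    if visitor_id:
        # The browser echoed an E-Tag we issued earlier: a returning visitor.
        return 304, {"ETag": f'"{visitor_id}"'}, visitor_id
    # First visit (or cleared cache): overwrite the E-Tag with a random string.
    visitor_id = secrets.token_hex(16)
    return 200, {"ETag": f'"{visitor_id}"'}, visitor_id


status, headers, first_id = tag_visitor({})
status_again, _, second_id = tag_visitor({"If-None-Match": headers["ETag"]})
print(status, status_again, first_id == second_id)
```

The same function works whether the resource is an image or a JSON response, because the identifier lives in the header rather than in the body.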
E-Tags cannot identify users reliably on their own, but they are a great facilitator in device fingerprinting.
Device fingerprinting is a way to combine certain attributes of a device — like what operating system it is on, the type and version of web browser being used, the browser’s language setting and the device’s IP address — to identify it as a unique device.
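As a toy illustration of "combining attributes", the Python sketch below hashes a handful of values into one identifier. The attribute names and the function device_fingerprint are invented for this example; real fingerprinting combines far more signals and has to cope with values that change over time.

```python
import hashlib


def device_fingerprint(attributes: dict) -> str:
    """Hash a set of device attributes into one stable identifier."""
    # Sort the keys so the same attributes always produce the same string.
    canonical = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


device = {
    "os": "Linux",
    "browser": "Firefox 120",
    "language": "en-US",
    "ip": "203.0.113.7",
}
print(device_fingerprint(device))
```

Any change in one attribute, such as a new IP address, produces a completely different hash, which is one reason a stored value like an E-Tag is useful for tying those fingerprints together.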
E-Tags are technically not cookies, but they allow tracking in a similar way, which is why they are also known as “cookie-less cookies”.
Analytics people are constantly looking for methods to track users, and device fingerprinting is one of them. There are many other methods, such as favicon tracking. Here’s a link to the article.
I’m not against tracking users with their consent. But some sites do track users without their consent, which is wrong, and I’m against that.