Methods of HTTP Caching

Specific Caches

User Agent Caches

Modern web browsers maintain several client-side caches for different purposes, kept separately for each installation and user of the software. An HTTP resource cache obeying HTTP caching semantics as described above is one of them. Other caches that the client may use in addition, and that are subject to different semantics, are:

  • A cache for the values of Cookies, organized by the domain name (or domain name wildcard) of the authoritative server, by the name of the Cookie, by usability on secure connections and by expiry date. Management of state using Cookies is specified by RFC 6265. Maintenance of such data at the application level is implementation-specific, and efforts by the W3C to establish a normative reference for such mechanisms are currently a work in progress.
  • A cache for the data sent and received by service workers, a system of event-driven programmable modules available in popular web browsers. Service workers can implement complex, algorithmic and dynamic caching strategies on a per-application basis. Browsers offer separately addressable storage for data cached by service workers. This storage is compartmentalized by the origin of the application that registers the worker with the browser, defined by the scheme, domain and port of the URL of the application. The normative reference for service workers, including their caching semantics, is currently approaching W3C Recommendation status.
  • A cache called „web storage“, which can be further separated into „local storage“ and „session storage“. Both offer a programmable interface for web applications that behaves like a key-value store. Keys and values are strings of text; complex data structures can be stored as values by serializing them to JSON. Session storage expires and is deleted automatically at the end of a session, an event that occurs when the user agent terminates a specific display of a specific website (for example, a tab or window). Local storage does not have an explicit expiry. As with the service worker cache, access to the storage is compartmentalized on a per-origin basis.
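
The per-origin compartmentalization described for the service worker cache and web storage can be illustrated with a small model. This is only a sketch, not browser code: the names `origin_of` and `OriginStorage` are hypothetical, the origin is reduced to the scheme, host and port triple named above, and only the `http` and `https` schemes are handled.

```python
import json
from typing import Dict, Optional, Tuple
from urllib.parse import urlsplit

Origin = Tuple[str, str, int]

# Default ports, so that https://example.org and https://example.org:443
# are treated as the same origin. Other schemes raise KeyError in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str) -> Origin:
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS[scheme]
    return (scheme, host, port)


class OriginStorage:
    """Toy model of a string key-value store partitioned by origin."""

    def __init__(self) -> None:
        self._areas: Dict[Origin, Dict[str, str]] = {}

    def set_item(self, page_url: str, key: str, value: str) -> None:
        # Each origin gets its own independent dictionary.
        self._areas.setdefault(origin_of(page_url), {})[key] = value

    def get_item(self, page_url: str, key: str) -> Optional[str]:
        return self._areas.get(origin_of(page_url), {}).get(key)


store = OriginStorage()
# Values are strings, so a structured value is serialized to JSON first.
store.set_item("https://example.org/app", "prefs", json.dumps({"theme": "dark"}))
```

In this model, a page at `https://example.org/other` can read the stored value, while pages at `http://example.org/` or `https://example.org:8443/` cannot, because the scheme or port differs.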

Service workers and web storage are only accessible to web applications by programmatic means; using them involves executing JavaScript code inside a user agent.

There are limits on the maximum size of a single element and on the overall size of the elements that can be stored in each of the private caches of a user agent, but they are implementation-specific and, depending on the implementation, may be configurable by the user. There is no normative reference on size and element count restrictions of private user agent caches.

While the basic behavior of validation and expiration of Cookie storage is implicitly governed by HTTP semantics, there are no normative references on distributed caching involving web server and client for the service worker cache and web storage, and it is up to web applications to implement such mechanisms.
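
How the attributes of a Cookie govern its usability can be sketched with Python's standard `http.cookies` module. The function `cookie_is_usable` is a hypothetical helper written for this illustration. It only evaluates the `Secure` and `Max-Age` attributes and deliberately ignores `Expires` as well as domain and path matching, so it is far from a complete RFC 6265 implementation.

```python
from http.cookies import SimpleCookie


def cookie_is_usable(set_cookie: str, received_at: float, now: float,
                     secure_connection: bool) -> bool:
    """Decide whether a stored cookie may still be sent with a request.

    set_cookie is the value of a Set-Cookie header; received_at and now
    are timestamps in seconds on the same clock.
    """
    jar = SimpleCookie()
    jar.load(set_cookie)
    morsel = next(iter(jar.values()))

    # A cookie flagged Secure must not be sent over an insecure connection.
    if morsel["secure"] and not secure_connection:
        return False

    # Max-Age is a lifetime in seconds counted from the time of receipt.
    # Unset attributes are reported as empty strings by http.cookies.
    max_age = morsel["max-age"]
    if max_age != "" and now >= received_at + int(max_age):
        return False

    return True


header = "sid=abc123; Domain=example.org; Path=/; Secure; Max-Age=3600"
```

With this header, the cookie is usable on a secure connection shortly after receipt, but not on an insecure connection and not once 3600 seconds have passed.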

[Osmani 2017] mentions a „memory cache“ in the popular web browser „Chrome“ that caches prefetched and preloaded resources that are marked as non-cacheable.

The popular web browser „Firefox“ has a mode of operation called „private browsing“ that, among other features, modifies the validation of cached resources and disables their persistent storage in the private cache. See [Mozilla 2018] for a discussion of the goals and implementation details of „private browsing“.

Reverse Proxy Caches

A web proxy is said to operate „reversely“ if its purpose is to give its users access to a single authoritative source formed from the resources of a multitude of „back-end“ webservers, as opposed to a „forward“ proxy, which gives its users access to a multitude of authoritative sources. Larger installations may operate a multitude of reverse proxies and distribute users among them (facilitating load balancing and other session-aware or location-based distribution functions).
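
The idea of forming one authoritative source from several back-ends can be sketched as a routing table that maps path prefixes to back-end servers. The table contents, the host names and the function `select_backend` are assumptions made for this illustration; real reverse proxies offer much richer matching rules.

```python
from typing import Dict

# Hypothetical routing table: path prefix -> base URL of a back-end server.
BACKENDS: Dict[str, str] = {
    "/api/": "http://api.internal:8080",
    "/static/": "http://assets.internal:8081",
    "/": "http://web.internal:8000",
}


def select_backend(path: str, table: Dict[str, str] = BACKENDS) -> str:
    """Return the back-end URL for a request path (longest prefix wins)."""
    # Raises ValueError if no prefix matches; the "/" entry above acts as
    # a catch-all for every absolute path.
    prefix = max((p for p in table if path.startswith(p)), key=len)
    return table[prefix] + path
```

Clients only ever see the proxy's own origin. A request for `/api/users` would be forwarded to the API back-end and a request for `/index.html` to the general web back-end.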

The combination of many individual sources into one authoritative source, and thus into one consistently named origin, can make it easier for complex web applications to satisfy the same-origin access restrictions that modern user agents impose on resources in general, on Cookies and on specific private caches such as the service worker cache and local storage.

Operating the authoritative source as a reverse proxy can simplify the organisation of the network towards the back-end servers. It can also become easier to impose security- or privacy-motivated access restrictions on certain resource back-ends if back-end network traffic is organised towards a single reverse proxy or a few of them. However, the effect can also be detrimental if a careless implementation of reverse proxying obscures or removes relevant access restrictions of the back-end origins that would otherwise have been in place.

Modern reverse proxy software can provide tools for implementing elaborate, complex and dynamic policies of back-end retrieval. Proxies that implement such complex features become a logical component of the web application, even though they act transparently towards their clients.

Reverse proxies can also perform caching to reduce the load on the back-end webservers and on the network towards them, and they can filter, transform and redirect requests. These operations are often performed as part of dynamic policies, and there is a demand among web developers for programmatic control over these features of the reverse proxy.
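
The load-reducing effect of caching in a reverse proxy can be sketched as follows. The class `CachingFrontend` is hypothetical, and it uses a single fixed lifetime for every response as a simplifying assumption. An actual proxy cache would derive freshness from the HTTP caching semantics described above instead.

```python
import time
from typing import Callable, Dict, Tuple


class CachingFrontend:
    """Toy cache in front of a back-end fetch function (fixed TTL)."""

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # function that contacts the back-end
        self._ttl = ttl_seconds      # assumed lifetime of every response
        self._clock = clock          # injectable clock, useful for testing
        self._cache: Dict[str, Tuple[float, bytes]] = {}

    def get(self, path: str) -> bytes:
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and now < entry[0]:
            # Still fresh: answer from the cache, no back-end traffic.
            return entry[1]
        # Missing or expired: fetch from the back-end and store the result.
        body = self._fetch(path)
        self._cache[path] = (now + self._ttl, body)
        return body
```

Repeated requests for the same path within the lifetime reach the back-end only once. After the lifetime has elapsed, the next request triggers a new back-end fetch.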

„Varnish“ offers a procedural programming language, the Varnish Configuration Language (VCL), for implementing complex transformation schemes for requests and responses. [Forehand 2011] gives a detailed account of how VCL was used for an application-specific implementation of optimized HTTP range caching.

The web proxy software „Traefik“ can optionally expose a REST API on a dedicated TCP port, which allows management systems or web applications to reconfigure it dynamically.
