Abstract: A Toolkit for Caching and Prefetching in the Context of Web Application Platforms

In this thesis, I design and implement a toolkit for caching documents retrieved from URLs as well as arbitrary other content. It uses both file and memory caches, allows the integration of generic parsers, and can store documents in parsed form. Caching follows the expiration and validation rules of HTTP/1.1.
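As a rough illustration of the HTTP/1.1 expiration rule mentioned above, the following minimal Python sketch decides whether a cached response is still fresh. It is not the toolkit's code: the function names, the plain `dict` of headers with exact-case keys, and the simplified age calculation (current time minus the `Date` header, ignoring the `Age` header and transit delays) are all assumptions made for the example.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds, or None if the response states none.

    `headers` is a hypothetical plain dict with exact-case header names.
    """
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age takes precedence over Expires.
        return int(match.group(1))
    if "Expires" in headers and "Date" in headers:
        try:
            expires = parsedate_to_datetime(headers["Expires"])
            date = parsedate_to_datetime(headers["Date"])
        except (TypeError, ValueError):
            # An unparsable date is treated as already expired.
            return 0
        return max(0.0, (expires - date).total_seconds())
    return None


def is_fresh(headers, now):
    """Simplified check: age is approximated as now - Date (assumption)."""
    lifetime = freshness_lifetime(headers)
    if lifetime is None or "Date" not in headers:
        return False
    age = (now - parsedate_to_datetime(headers["Date"])).total_seconds()
    return age < lifetime
```

An entry that is no longer fresh would not simply be discarded; under the validation rule it would be revalidated with the origin server, for example through a conditional request, before being served again.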

The architecture of the toolkit is pluggable, so different replacement strategies can be integrated. The LRU (Least Recently Used) strategy is provided as a reference implementation.
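To make the idea of a pluggable replacement strategy concrete, here is a minimal Python sketch of a cache that delegates eviction decisions to a strategy object, with LRU as one possible strategy. The class and method names (`ReplacementStrategy`, `touch`, `victim`, `Cache`) are hypothetical and chosen for this example only; the toolkit's actual interfaces are not described in this abstract.

```python
from abc import ABC, abstractmethod
from collections import OrderedDict


class ReplacementStrategy(ABC):
    """Hypothetical plug-in interface: tracks keys and picks one to evict."""

    @abstractmethod
    def touch(self, key):
        """Record that `key` was inserted or accessed."""

    @abstractmethod
    def remove(self, key):
        """Forget `key` after it has been evicted."""

    @abstractmethod
    def victim(self):
        """Return the key that should be evicted next."""


class LRUStrategy(ReplacementStrategy):
    """Least Recently Used: evict the key whose last access is oldest."""

    def __init__(self):
        self._order = OrderedDict()  # oldest access first

    def touch(self, key):
        self._order[key] = None
        self._order.move_to_end(key)

    def remove(self, key):
        self._order.pop(key, None)

    def victim(self):
        return next(iter(self._order))


class Cache:
    """Fixed-capacity cache; the eviction policy is supplied by the caller."""

    def __init__(self, capacity, strategy):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.strategy = strategy
        self._items = {}

    def get(self, key):
        if key not in self._items:
            return None
        self.strategy.touch(key)
        return self._items[key]

    def put(self, key, value):
        if key not in self._items and len(self._items) >= self.capacity:
            old = self.strategy.victim()
            self.strategy.remove(old)
            del self._items[old]
        self._items[key] = value
        self.strategy.touch(key)
```

Swapping in a different policy would then only require another `ReplacementStrategy` subclass; the `Cache` class itself stays unchanged.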

The toolkit is integrated with the MUNDWERK voice gateway, a Web application platform for voice applications based on the VoiceXML markup language.

The attributes and properties defined by the VoiceXML 1.0/2.0 standard for controlling caching and prefetching are fully supported. The toolkit can be integrated with other platforms without changes to the basic architecture.

The toolkit also supports prefetching in a transparent way. The implemented prefetching strategy is based on the PPM (Prediction by Partial Match) strategy, which utilizes higher-order finite-context Markov models. It is extended to consider the time spans between requests. A blending algorithm is used to combine the results of Markov predictors of different orders.
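The following Python sketch illustrates the general idea of combining Markov predictors of different orders. It is a simplified stand-in, not the thesis's algorithm: it ignores the time spans between requests, and its blending rule (weighting the order-k predictor by k + 1 and normalizing) is an assumption chosen for brevity. The class name `BlendedMarkovPredictor` and its methods are hypothetical.

```python
from collections import Counter, defaultdict


class BlendedMarkovPredictor:
    """Predicts the next request from contexts of length 0..max_order."""

    def __init__(self, max_order=2):
        self.max_order = max_order
        # counts[k][context] counts what followed each context of length k.
        self.counts = [defaultdict(Counter) for _ in range(max_order + 1)]

    def train(self, session):
        """Update the counts from one ordered sequence of requested URLs."""
        for i, nxt in enumerate(session):
            for k in range(self.max_order + 1):
                if i - k < 0:
                    break
                context = tuple(session[i - k:i])
                self.counts[k][context][nxt] += 1

    def predict(self, history):
        """Return (url, probability) pairs, most likely first."""
        scores = Counter()
        total_weight = 0.0
        for k in range(self.max_order + 1):
            if k > len(history):
                break
            context = tuple(history[len(history) - k:])
            followers = self.counts[k].get(context)
            if not followers:
                continue
            weight = float(k + 1)  # assumption: longer contexts count more
            n = sum(followers.values())
            for url, count in followers.items():
                scores[url] += weight * count / n
            total_weight += weight
        if total_weight == 0:
            return []
        ranked = ((url, score / total_weight) for url, score in scores.items())
        return sorted(ranked, key=lambda pair: (-pair[1], pair[0]))
```

A prefetcher built on such a predictor might fetch only those candidates whose blended probability exceeds some threshold, which is one way to keep additional network traffic low.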

The pluggable architecture permits complementing or replacing the PPM algorithm with other strategies. Prefetching is handled asynchronously and primarily uses system resources that are not needed for other purposes.

In test runs, prefetching increased the PRR (perceived retrieval rate) by up to 175% without causing a significant increase in network traffic.

While refined prefetching strategies have so far been mainly a research topic, this toolkit brings them into production use.
