© 2019 by Santanu Bhattacharjee.

Predictive Web Caching

Travel & Tourism

Use Cases

Association Model

Replaced the LRU algorithm with an association-based model that predicts future search criteria and pre-populates the cache with the matching search results.

Summary

The travel and tourism company was facing a low cache hit ratio on its website and wanted to see whether machine learning could improve it. The project aimed to provide a more reliable solution that predicts, at regular intervals, the next search items from the company's online customers. We achieved a 44% improvement in hit ratio by the end of the project.
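As a point of reference for the metric above, the hit ratio is the number of requests served from the cache divided by the total number of requests. The following is a minimal sketch of how an LRU baseline could be measured by replaying a request log. The function name `lru_hit_ratio`, the capacity, and the sample search keys are illustrative assumptions, not details from the project.

```python
from collections import OrderedDict


def lru_hit_ratio(requests, capacity):
    """Replay requests through an LRU cache and return hits / total requests."""
    cache = OrderedDict()
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / len(requests) if requests else 0.0


# Hypothetical search keys, not data from the project.
requests = ["paris", "rome", "paris", "goa", "rome", "paris"]
ratio = lru_hit_ratio(requests, capacity=2)
print(f"LRU hit ratio: {ratio:.2f}")
```

Running the same replay against a cache that is pre-populated by a predictive model would give a comparable figure for the alternative approach.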

Solution

The project set out to find a better solution to the problem of a low hit ratio. LRU (least recently used) is a widely used policy for web caching. Initially, several machine learning algorithms were evaluated for predicting the search criteria that online customers were likely to use in the next few hours. One of these algorithms generated association-based rules that gave better results than the others. In addition, a window validation mechanism was used to determine the refresh window for the cache.
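The idea of association-based prefetching can be sketched as follows. This is a simplified illustration restricted to rules between pairs of searches, not the project's actual model. The function names (`mine_pair_rules`, `prefetch`), the thresholds, the sample sessions, and the `fetch_results` callback are all hypothetical.

```python
from collections import Counter
from itertools import permutations


def mine_pair_rules(sessions, min_support=0.3, min_confidence=0.6):
    """Return {antecedent: [(consequent, confidence), ...]} for pairs of searches."""
    n = len(sessions)
    item_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        items = set(session)  # count each search once per session
        item_counts.update(items)
        pair_counts.update(permutations(sorted(items), 2))  # directed pairs (a, b)
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n                  # share of sessions containing both
        confidence = count / item_counts[a]  # share of sessions with a that also have b
        if support >= min_support and confidence >= min_confidence:
            rules.setdefault(a, []).append((b, confidence))
    for a in rules:
        rules[a].sort(key=lambda r: (-r[1], r[0]))  # strongest rule first
    return rules


def prefetch(cache, rules, recent_searches, fetch_results):
    """Pre-populate the cache with results for searches predicted by the rules."""
    for search in recent_searches:
        for predicted, _ in rules.get(search, []):
            if predicted not in cache:
                cache[predicted] = fetch_results(predicted)
    return cache


# Hypothetical search sessions, not data from the project.
sessions = [
    ["paris", "rome"],
    ["paris", "rome", "goa"],
    ["paris", "goa"],
    ["rome", "paris"],
    ["goa"],
]
rules = mine_pair_rules(sessions)
cache = prefetch({}, rules, ["paris"], lambda q: f"results for {q}")
print(rules["paris"], cache)
```

In this sketch a recent search for "paris" causes the results for "rome" to be loaded ahead of time. How often such rules are regenerated and the cache refreshed would correspond to the refresh window mentioned above.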

Tools & Technologies

  • Anaconda Distribution

  • Python 3.x

  • TSV data format

  • NPM HTTP Server

  • Web Application

Python Libraries & Frameworks

  • Apriori

  • NumPy

  • Pandas

  • Pickle

  • Flask

  • JSON