PKM3: an optimal Markov model for predicting future navigation sequences of the web surfers

Abstract

Predicting users' browsing behavior on the web has gained significant importance, as it improves productivity for website owners and holds the interest of web users. The Markov model has been widely used to predict users' web navigation. To enhance the coverage and accuracy of the Markov model, higher order Markov models are integrated with lower order models. However, this integration results in large state-space complexity. To reduce that complexity, this paper proposes a novel technique, the Pruned all-Kth modified Markov model (PKM3). PKM3 eliminates irrelevant states from the higher order model, that is, states that contribute negligibly to prediction. The proposed model is evaluated on four standard weblogs: BMS, MSWEB, CTI and MSNBC. PKM3 performance was optimal for websites in which pages are closely placed and highly interlinked. This pruning-based model achieves a significant reduction in state-space complexity while maintaining comparable accuracy.
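As a rough illustration of the idea summarized above, the following Python sketch builds an all-Kth-order Markov model from toy sessions, prunes rarely seen higher order states, and predicts the next page by backing off from the longest matching context. The abstract does not state PKM3's actual pruning criterion, so the minimum-support threshold used here is an assumption, and the names `build_all_kth`, `prune`, `predict`, `min_support` and the toy sessions are hypothetical.

```python
from collections import Counter, defaultdict


def build_all_kth(sessions, max_order):
    """Count next-page frequencies for every context of length 1..max_order."""
    model = defaultdict(Counter)
    for session in sessions:
        for i in range(1, len(session)):
            for k in range(1, max_order + 1):
                if i - k < 0:
                    break  # not enough history for a context of length k
                context = tuple(session[i - k:i])
                model[context][session[i]] += 1
    return dict(model)


def prune(model, min_support):
    """Drop higher order states observed fewer than min_support times.

    Assumption: a plain frequency threshold stands in for the paper's
    pruning rule, which the abstract does not specify. First-order
    states are always kept so that coverage does not shrink.
    """
    return {
        context: counts
        for context, counts in model.items()
        if len(context) == 1 or sum(counts.values()) >= min_support
    }


def predict(model, history, max_order):
    """Use the longest context present in the model, backing off to shorter ones."""
    for k in range(min(max_order, len(history)), 0, -1):
        context = tuple(history[-k:])
        if context in model:
            return model[context].most_common(1)[0][0]
    return None  # no context of any order matches


# Toy sessions (hypothetical page identifiers).
sessions = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["X", "B", "D"],
    ["B", "D"],
]

full = build_all_kth(sessions, max_order=2)
pruned = prune(full, min_support=2)

print(len(full), "states before pruning,", len(pruned), "after")
print(predict(pruned, ["A", "B"], max_order=2))
print(predict(pruned, ["X", "B"], max_order=2))
```

In this toy run the second-order state `("X", "B")` is seen only once and is pruned, so a prediction for the history `["X", "B"]` falls back to the first-order state `("B",)`. The state count drops while a prediction is still available for that history, which is the kind of trade-off between state-space size and accuracy that the abstract describes.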




Author information

Corresponding author

Correspondence to Honey Jindal.

About this article

Cite this article

Jindal, H., Sardana, N. PKM3: an optimal Markov model for predicting future navigation sequences of the web surfers. Pattern Anal Applic (2020). https://doi.org/10.1007/s10044-020-00892-7

Keywords

  • Web
  • All-Kth modified
  • Markov model
  • Error
  • Pruned
  • State
  • Path
  • Accuracy
  • Navigation
  • Prediction