Abstract
A sequential dataset is a collection of records written and read in sequential order. Information from a sequential dataset is useful for understanding sequential patterns and, ultimately, for making appropriate decisions. However, generating a sequential dataset from a log file is complicated and difficult. In this study, we therefore propose a sequential preprocessing model (SPM) and a sequential preprocessing tool (SPT) to generate the sequential dataset. We evaluated the developed model against the log activities captured from myLearn, UMT’s e-Learning system. The results show that SPT can be used to generate the sequential dataset. With minimal modification, the dataset can be used by other data mining tools for further sequential pattern analysis.
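As a rough illustration of what generating a sequential dataset from a log file can involve, the sketch below groups log rows by user and orders each user's actions by time. It is not the SPM or SPT described in the paper; the function name `build_sequences`, the `(user, timestamp, action)` row layout, and the sample rows are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime


def build_sequences(log_rows):
    """Group (user, timestamp, action) rows into one time-ordered action list per user."""
    per_user = defaultdict(list)
    for user, timestamp, action in log_rows:
        # Parse ISO 8601 timestamps so events can be ordered chronologically.
        per_user[user].append((datetime.fromisoformat(timestamp), action))
    # Sort each user's events by time and keep only the action labels.
    return {
        user: [action for _, action in sorted(events, key=lambda e: e[0])]
        for user, events in per_user.items()
    }


# Hypothetical log rows (not taken from myLearn); note they are not in time order.
rows = [
    ("s01", "2014-01-05T09:00:00", "login"),
    ("s02", "2014-01-05T09:01:00", "login"),
    ("s01", "2014-01-05T09:03:00", "view_course"),
    ("s01", "2014-01-05T09:02:00", "view_forum"),
    ("s02", "2014-01-05T09:05:00", "logout"),
]
sequences = build_sequences(rows)
```

Each value in `sequences` is one user's ordered list of actions, which is the general shape of input that sequential pattern mining algorithms expect; a real log would likely also need cleaning and session splitting before this step.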
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Abdullah, Z., Herawan, T., Chiroma, H., Deris, M.M. (2014). A Sequential Data Preprocessing Tool for Data Mining. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2014. ICCSA 2014. Lecture Notes in Computer Science, vol 8581. Springer, Cham. https://doi.org/10.1007/978-3-319-09150-1_54
DOI: https://doi.org/10.1007/978-3-319-09150-1_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09149-5
Online ISBN: 978-3-319-09150-1
eBook Packages: Computer Science (R0)