A Machine Learning Algorithm for Analyzing String Patterns Helps to Discover Simple and Interpretable Business Rules from Purchase History

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2281)

Abstract

This paper presents a new application for discovering useful knowledge from purchase history that can help to create effective marketing strategies. It uses BONSAI, a machine learning algorithm proposed by Shimozono et al. in 1994, which was originally developed for analyzing string patterns in order to discover knowledge from amino acid sequences. To adapt BONSAI to our purpose, we translate the purchase history of each customer into a character string in which each symbol represents a brand purchased by that customer. We extend BONSAI in the following respects: 1) While the original BONSAI generates a decision tree over regular patterns that are limited to substrings, we extend it to subsequences. 2) We generate rules that contain not only regular patterns but also numerical attributes such as age, number of visits, and profit. 3) We extend the regular expressions so that we can consider whether a certain pattern occurs in some latter part of the whole string. 4) We implement majority voting based on 1-D and 2-D region rules on top of decision trees.

Applying BONSAI extended in this manner to the real purchase history of customers of a drugstore chain in Japan, we have succeeded in generating interesting business rules that practitioners have not yet recognized.
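
The string encoding and the first and third extensions described in the abstract can be illustrated with a minimal Python sketch. This is not the BONSAI implementation: the brand names, the one-character alphabet, the function names, and the reading of "latter part" as "the last `tail_len` symbols" are all assumptions made for illustration only.

```python
from typing import Dict, List


def encode_history(purchases: List[str], alphabet: Dict[str, str]) -> str:
    """Map each purchased brand to a one-character symbol, in purchase order."""
    return "".join(alphabet[brand] for brand in purchases)


def has_substring(s: str, pattern: str) -> bool:
    """True if pattern occurs as a contiguous block of s."""
    return pattern in s


def has_subsequence(s: str, pattern: str) -> bool:
    """True if the symbols of pattern occur in s in order, with gaps allowed."""
    it = iter(s)
    # `ch in it` advances the iterator, so matches must appear left to right.
    return all(ch in it for ch in pattern)


def has_subsequence_in_tail(s: str, pattern: str, tail_len: int) -> bool:
    """True if pattern is a subsequence of the last tail_len symbols of s.

    Assumption: "latter part" is modelled here as a fixed-length suffix.
    """
    start = max(0, len(s) - tail_len)
    return has_subsequence(s[start:], pattern)


# Hypothetical alphabet and purchase history, for illustration only.
alphabet = {"BrandA": "a", "BrandB": "b", "BrandC": "c"}
history = ["BrandA", "BrandC", "BrandB", "BrandC", "BrandA"]
s = encode_history(history, alphabet)           # "acbca"

print(has_substring(s, "ab"))                   # False: "a" and "b" are not adjacent
print(has_subsequence(s, "ab"))                 # True: "a" ... "b" occur in order
print(has_subsequence_in_tail(s, "ab", 2))      # False: the tail "ca" has no "b"
print(has_subsequence_in_tail(s, "ca", 2))      # True
```

In this example the pattern `ab` is missed by substring matching but found by subsequence matching, which is the kind of purchase behavior (brand A bought at some point before brand B) that the subsequence extension is meant to capture.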

References

  1. R. Agrawal, T. Imielinski, and A. Swami, Database mining: A performance perspective, IEEE Transactions on Knowledge and Data Engineering, Vol.5, pp. 914–925, 1993.

  2. S. Arikawa, S. Miyano, A. Shinohara, S. Kuhara, Y. Mukouchi and T. Shinohara, A machine discovery from amino acid sequences by decision trees over regular patterns, New Generation Computing, Vol.11, pp. 361–375, 1993.

  3. P.B. Chou, E. Grossman, D. Gunopulos, and P. Kamesam, Identifying Prospective Customers, Proc. KDD 2000, pp. 447–456, 2000.

  4. C. Fishman, This is a Marketing Revolution, Fast Company, pp. 206–218, 1999.

  5. Y. Hamuro, N. Katoh, Y. Matsuda and K. Yada, Mining Pharmacy Data Helps to Make Profits, Data Mining and Knowledge Discovery, Vol.2 No.4, pp. 391–398, 1998.

  6. M. Hirao, H. Hoshino, A. Shinohara, M. Takeda, and S. Arikawa, A Practical Algorithm to Find the Best Subsequence Patterns, Proc. of 3rd International Conference on Discovery Science, LNAI 1967, pp. 141–154, 2000.

  7. N. Horiguchi, K. Yada, Y. Hamuro, N. Katoh, and Y. Kambayashi, An Optimized Weighted Majority Decision, Proc. of INFORMS-KORMS Seoul 2000, pp. 1663–1669, 2000.

  8. E. Ip, K. Yada, Y. Hamuro, and N. Katoh, A Data Mining System for Managing Customer Relationship, Proc. of the 2000 Americas Conference on Information Systems, pp. 101–105, 2000.

  9. B. Kitts, D. Freed, and M. Vrieze, Cross-sell: A Fast Promotion-Tunable Customer-item Recommendation Method Based on Conditionally Independent Probabilities, Proc. KDD 2000, pp. 437–446, 2000.

  10. A. Nakaya, H. Furukawa, and S. Morishita, Weighted Majority Decision among Several Region Rules for Scientific Discovery, Proc. of Second International Conference on Discovery Science, LNAI 1721, Springer-Verlag, pp. 17–29, 1999.

  11. J. R. Quinlan, Induction of Decision Trees, Machine Learning, Vol.1, pp. 81–106, 1986.

  12. J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.

  13. J. R. Quinlan, See5/C5.0, http://www.rulequest.com, Rulequest Research, 1999.

  14. G. Piatetsky-Shapiro (Editor), Knowledge Discovery in Databases, AAAI Press, 1991.

  15. S. Shimozono, A. Shinohara, T. Shinohara, S. Miyano, S. Kuhara and S. Arikawa, Knowledge Acquisition from Amino Acid Sequences by Machine Learning System BONSAI, Trans. Information Processing Society of Japan, Vol.35, pp. 2009–2018, 1994.

  16. W. E. Spangler, J. H. May and L. G. Vargas, Choosing Data-mining Methods for Multiple Classification: Representational and Performance Measurement Implications for Decision Support, Journal of Management Information Systems, Vol.16 No.1, pp. 37–62, 1999.

  17. T. K. Sung, H. M. Chung and P. Gray, Special Section: Data Mining, Journal of Management Information Systems, Vol.16 No.1, pp. 11–16, 1999.

  18. R. Uthurusamy, U.M. Fayyad, and S. Spangler, Learning Useful Rules from Inconclusive Data, In [14], pp. 141–157, 1991.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Hamuro, Y., Kawata, H., Katoh, N., Yada, K. (2002). A Machine Learning Algorithm for Analyzing String Patterns Helps to Discover Simple and Interpretable Business Rules from Purchase History. In: Arikawa, S., Shinohara, A. (eds) Progress in Discovery Science. Lecture Notes in Computer Science, vol 2281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45884-0_43

  • DOI: https://doi.org/10.1007/3-540-45884-0_43

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43338-5

  • Online ISBN: 978-3-540-45884-5

  • eBook Packages: Springer Book Archive
