Robust learning in expert networks: a comparative analysis

Journal of Intelligent Information Systems

Abstract

Human experts as well as autonomous agents in a referral network must decide whether to accept a task or refer it to a more appropriate expert, and if so, to whom. In order for the referral network to improve over time, experts must learn to estimate the topical expertise of other experts. This article extends concepts from Multi-agent Reinforcement Learning and Active Learning to distributed learning in referral networks. Among the wide array of algorithms evaluated, Distributed Interval Estimation Learning (DIEL), based on Interval Estimation Learning, proved superior for learning appropriate referral choices, outperforming ε-Greedy, Q-learning, Thompson Sampling, and Upper Confidence Bound (UCB) methods. In addition to a synthetic data set, we compare the performance of the stronger learning-to-refer algorithms on a referral network of high-performance Stochastic Local Search (SLS) SAT solvers, where expertise does not obey any known parameterized distribution. An evaluation of overall network performance and a robustness analysis are conducted across the learning algorithms, with an emphasis on capacity constraints and evolving networks in which experts with known expertise drop off and new experts of unknown performance enter; such situations arise in real-world scenarios but were heretofore ignored.
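The referral-learning loop summarized above can be sketched compactly. The snippet below is a minimal, illustrative interpretation of interval-estimation-based referral in the DIEL style: each agent keeps a running mean and confidence interval for the reward observed from each colleague on a topic, refers to the colleague with the highest upper bound, and updates its estimates from the returned solution quality. The class name, method names, and the particular confidence-interval form are assumptions made for illustration, not the authors' implementation.

```python
# Illustrative DIEL-style referral sketch (hypothetical names, not the paper's code).
import math
import random
from collections import defaultdict


class ReferralAgent:
    """Tracks observed rewards per (topic, colleague) and refers to the
    colleague whose interval-estimation upper bound is highest."""

    def __init__(self, colleagues, z=1.96):
        self.colleagues = list(colleagues)
        self.z = z  # controls the width of the confidence interval
        # (topic, colleague) -> list of observed rewards
        self.rewards = defaultdict(list)

    def upper_bound(self, topic, colleague):
        obs = self.rewards[(topic, colleague)]
        if len(obs) < 2:
            return float("inf")  # force initial exploration of untried colleagues
        n = len(obs)
        mean = sum(obs) / n
        var = sum((x - mean) ** 2 for x in obs) / (n - 1)
        return mean + self.z * math.sqrt(var / n)

    def refer(self, topic):
        # Pick the colleague with the highest optimistic estimate; break ties at random.
        best = max(self.upper_bound(topic, c) for c in self.colleagues)
        candidates = [c for c in self.colleagues if self.upper_bound(topic, c) == best]
        return random.choice(candidates)

    def observe(self, topic, colleague, reward):
        self.rewards[(topic, colleague)].append(reward)


# Toy usage: colleague "b" is the true specialist on topic "sat".
agent = ReferralAgent(colleagues=["a", "b", "c"])
true_skill = {"a": 0.3, "b": 0.8, "c": 0.5}
for _ in range(500):
    choice = agent.refer("sat")
    reward = 1.0 if random.random() < true_skill[choice] else 0.0
    agent.observe("sat", choice, reward)
print(choice)  # after enough episodes, typically "b"
```

Under this sketch, swapping the upper-bound rule for a sampled posterior mean would give a Thompson Sampling variant, and replacing it with the sample mean plus an occasional random choice would give ε-Greedy, which is the sense in which the article compares these families of referral-learning algorithms.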




Author information


Correspondence to Ashiqur R. KhudaBukhsh.


Cite this article

KhudaBukhsh, A.R., Carbonell, J.G. & Jansen, P.J. Robust learning in expert networks: a comparative analysis. J Intell Inf Syst 51, 207–234 (2018). https://doi.org/10.1007/s10844-018-0515-6
