
Journal of Intelligent Information Systems, Volume 51, Issue 2, pp 207–234

Robust learning in expert networks: a comparative analysis

  • Ashiqur R. KhudaBukhsh
  • Jaime G. Carbonell
  • Peter J. Jansen

Abstract

Human experts as well as autonomous agents in a referral network must decide whether to accept a task or refer it to a more appropriate expert, and if so, to whom. For the referral network to improve over time, the experts must learn to estimate the topical expertise of other experts. This article extends concepts from multi-agent reinforcement learning and active learning to distributed learning in referral networks. Among a wide array of algorithms evaluated, Distributed Interval Estimation Learning (DIEL), based on Interval Estimation Learning, was found superior for learning appropriate referral choices, compared to ε-Greedy, Q-learning, Thompson Sampling and Upper Confidence Bound (UCB) methods. In addition to a synthetic data set, we compare the performance of the stronger learning-to-refer algorithms on a referral network of high-performance Stochastic Local Search (SLS) SAT solvers, where expertise does not obey any known parameterized distribution. An evaluation of overall network performance and a robustness analysis are conducted across the learning algorithms, with an emphasis on capacity constraints and evolving networks, in which experts with known expertise drop off and new experts of unknown performance enter, situations that arise in real-world scenarios but were heretofore ignored.
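
To make the interval-estimation idea behind DIEL concrete, the sketch below shows one way an expert might rank potential referral targets by the upper bound of an estimated per-topic reward interval. This is a minimal illustration under assumptions introduced here (the class name, the confidence multiplier z, and the optimistic initialization for little-observed colleagues), not the authors' implementation.

```python
import math
from collections import defaultdict
from statistics import mean, stdev


class IntervalEstimationReferrer:
    """Interval-estimation referral rule (DIEL-style sketch, not the paper's code):
    refer a task to the colleague whose estimated reward interval has the
    highest upper bound for the task's topic."""

    def __init__(self, colleagues, z=1.96):
        self.colleagues = list(colleagues)
        self.z = z                        # interval-width multiplier (assumed value)
        self.history = defaultdict(list)  # (topic, colleague) -> observed rewards

    def choose_referral(self, topic):
        def upper_bound(colleague):
            rewards = self.history[(topic, colleague)]
            if len(rewards) < 2:
                # Optimistic initialization: prefer colleagues with little data.
                return float("inf")
            return mean(rewards) + self.z * stdev(rewards) / math.sqrt(len(rewards))
        return max(self.colleagues, key=upper_bound)

    def record_reward(self, topic, colleague, reward):
        # Reward could be 1.0 if the referred expert solved the task, else 0.0.
        self.history[(topic, colleague)].append(reward)


# Example: one expert learning to whom it should refer SAT tasks.
referrer = IntervalEstimationReferrer(colleagues=["expert_B", "expert_C", "expert_D"])
target = referrer.choose_referral(topic="SAT")
referrer.record_reward(topic="SAT", colleague=target, reward=1.0)
```

Under such a rule, exploration is driven by the width of the reward interval, which shrinks as observations accumulate, rather than by a fixed random-exploration rate as in ε-Greedy.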

Keywords

Referral networks, Active learning, Reinforcement learning


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Carnegie Mellon University, Pittsburgh, USA
