Abstract
Taxonomies are well suited to organizing and searching web content, and many popular classes of web applications use them. However, their manual generation and maintenance by experts is time-consuming, which results in static taxonomies. On the other hand, mining and statistical approaches may produce low-quality taxonomies. We thus propose a drastically new approach, based on the proven and increasing human involvement in, and desire for, tagging and annotating web content. We define the required input from humans in the form of explicit structural relationships between concepts, e.g., supertype-subtype relationships. Hence we harvest, via common annotation practices, the collective wisdom of users with respect to the (categorization of) web content they share and access. We further define the principles upon which crowdsourced taxonomy construction algorithms should be based. The resulting problem is NP-hard. We thus provide and analyze heuristic algorithms that aggregate human input and resolve conflicts. We evaluate our approach with synthetic and real-world crowdsourcing experiments and on a real-world taxonomy.
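To make the idea of aggregating supertype-subtype input and resolving conflicts concrete, the following is a minimal illustrative sketch. It is not the heuristic analyzed in the paper, whose details the abstract does not give. It assumes each user contribution is a (subtype, supertype) pair, that every concept has at most one parent, and that conflicts are resolved greedily in favor of the relationship with more votes. The function name `build_taxonomy` and the sample votes are hypothetical.

```python
from collections import Counter


def build_taxonomy(votes):
    """Greedy sketch: turn (subtype, supertype) votes into a child -> parent map.

    Assumption (not from the paper): each concept keeps at most one parent,
    and stronger (more frequently voted) relationships win over weaker ones.
    """
    counts = Counter(votes)
    # Strongest relationships first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

    parent = {}
    for (child, sup), _ in ranked:
        if child == sup or child in parent:
            # Self-loop, or the child already has a stronger parent.
            continue
        # Walk up from the proposed supertype; reaching the child means
        # the new edge would close a cycle, so the vote is discarded.
        node = sup
        while node in parent and node != child:
            node = parent[node]
        if node == child:
            continue
        parent[child] = sup
    return parent


# Hypothetical crowd input with one contradictory vote ("animal" under "dog").
votes = (
    [("dog", "mammal")] * 3
    + [("mammal", "animal")] * 2
    + [("animal", "dog")]
    + [("dog", "animal")]
)
taxonomy = build_taxonomy(votes)
print(taxonomy)
```

In this sample, the vote placing "animal" under "dog" is dropped because it would create a cycle, and the vote placing "dog" directly under "animal" is dropped because "dog" already has a more strongly supported parent.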
This work was partially funded by the EIKOS research project, within the THALES framework, administered by the Greek Ministry for Education, Life Long Learning, and Religious Affairs.
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Karampinas, D., Triantafillou, P. (2012). Crowdsourcing Taxonomies. In: Simperl, E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V. (eds) The Semantic Web: Research and Applications. ESWC 2012. Lecture Notes in Computer Science, vol 7295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30284-8_43
DOI: https://doi.org/10.1007/978-3-642-30284-8_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30283-1
Online ISBN: 978-3-642-30284-8
eBook Packages: Computer Science, Computer Science (R0)