Abstract
We propose a lazy approach to building a content-based recommender system for parliamentary documents. Given a new document to be recommended, the system decides which Members of Parliament might find it interesting, so that it can be delivered to them. Our approach is lazy because we do not build an elaborate profile of each deputy; instead, we collect the full text of his/her speeches in the parliamentary debates and build a document collection that can be searched through queries. In this way we transform a recommender system problem into an information retrieval problem. Our proposals are tested using documents from the regional Parliament of Andalusia in Spain.
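The idea of turning recommendation into retrieval can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how retrieved speeches are combined into a per-deputy score, so the TF-IDF cosine scoring, the sum-of-scores aggregation, the names `SpeechIndex` and `recommend`, and the toy speeches are all assumptions made for illustration. Only the cutoff of 200 retrieved documents is taken from the notes of the paper.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; a real system would also stem and remove stopwords.
    return [t for t in text.lower().split() if t.isalpha()]


class SpeechIndex:
    """Toy in-memory index: one entry per speech, tagged with the MP who gave it."""

    def __init__(self, speeches):
        # speeches: list of (mp_name, speech_text) pairs (hypothetical data layout).
        self.owners = [mp for mp, _ in speeches]
        counts = [Counter(tokenize(text)) for _, text in speeches]
        doc_freq = Counter()
        for c in counts:
            doc_freq.update(c.keys())
        n_docs = len(speeches)
        # Assumed weighting: smoothed IDF. The paper evaluates several retrieval models.
        self.idf = {t: math.log(1 + n_docs / df) for t, df in doc_freq.items()}
        self.vectors = [self._weigh(c) for c in counts]

    def _weigh(self, counts):
        # Build a unit-length TF-IDF vector, ignoring terms unknown to the index.
        vec = {t: tf * self.idf[t] for t, tf in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def search(self, query_text, top_k):
        # Cosine similarity between the query and every indexed speech.
        q = self._weigh(Counter(tokenize(query_text)))
        scored = []
        for doc_id, vec in enumerate(self.vectors):
            score = sum(w * vec.get(t, 0.0) for t, w in q.items())
            if score > 0:
                scored.append((score, doc_id))
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return scored[:top_k]


def recommend(index, new_document, top_k=200):
    # Use the new document as the query, then add up the scores of the
    # retrieved speeches per MP (an assumed aggregation rule).
    totals = defaultdict(float)
    for score, doc_id in index.search(new_document, top_k):
        totals[index.owners[doc_id]] += score
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical speeches, invented for this example.
speeches = [
    ("MP_A", "employment of women in rural areas"),
    ("MP_A", "women and employment policy"),
    ("MP_B", "funding and employment for olive oil farmers"),
    ("MP_C", "health budget and hospital waiting lists"),
]
ranking = recommend(SpeechIndex(speeches), "proposal on the employment situation of women")
print(ranking)
```

No profile is trained here: adding a new speech only means adding one more entry to the index, which is what makes the approach lazy. In this toy run, MPs whose speeches share no terms with the new document never appear in the ranking.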
Notes
1. An initiative is the literal transcription of the parliamentary discussion of a petition presented by MPs or groups.
2. A type of binary classification problem in which we have a set of positive examples and a larger set of unlabeled examples, but no set of negative examples.
3.
4. An example of the title of an initiative is “Non-legislative proposal on social and employment situation of women in Andalusia”.
5. Using the MoreLikeThis facility in Lucene.
6.
7. Note that we have set the number of documents returned by the search engine to 200.
8. Although it is not important for our purposes, the best performing model is BM25.
Acknowledgements
This paper was supported by the Spanish “Ministerio de Ciencia e Innovación” and “Ministerio de Economía y Competitividad” under projects TIN2011-28538-C02-02 and TIN2013-42741-P.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
de Campos, L.M., Fernández-Luna, J.M., Huete, J.F. (2015). A Lazy Approach for Filtering Parliamentary Documents. In: Kő, A., Francesconi, E. (eds) Electronic Government and the Information Systems Perspective. EGOVIS 2015. Lecture Notes in Computer Science, vol 9265. Springer, Cham. https://doi.org/10.1007/978-3-319-22389-6_26
DOI: https://doi.org/10.1007/978-3-319-22389-6_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22388-9
Online ISBN: 978-3-319-22389-6
eBook Packages: Computer Science, Computer Science (R0)