Calibrating for Specific Domains

Chapter in: Learning Analytics in R with SNA, LSA, and MPIA

Abstract

Eigenspace-based models have been shown to be more effective than their simple vector space counterparts in settings that benefit from fuzziness, such as information retrieval or recommender systems. In settings that require precision in representation structure, such as essay scoring or conceptual relationship mining, however, better means of predicting model behaviour from parameter settings could ease applicability and increase efficiency by reducing tuning times.

This chapter reports experience and experimental results from a systematic investigation of tuning parameters, their potential settings, and the interdependencies between them. This includes studying the influence of sanitising operations, sampling, dimensionality changes, and degrees of specialisation. Trends indicate that the smaller the corpus, the more domain-specific documents are required. Moreover, recommendations for vocabulary filtering can be derived, depending on the size of the corpus.
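To make the idea of vocabulary filtering as a tuning parameter concrete, the following is a minimal sketch, not the chapter's procedure. It is written in Python with the standard library only, and every name in it (`tokenise`, `filter_vocabulary`, `term_document_matrix`, `min_doc_freq`, `min_word_length`), the toy corpus, the stopword list, and the grid values are assumptions made for illustration. It sweeps a small grid of filter settings and records how many terms survive each combination.

```python
import re
from collections import Counter
from itertools import product


def tokenise(text, min_word_length=1):
    """Lowercase, split on non-letters, drop tokens shorter than min_word_length."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if len(t) >= min_word_length]


def filter_vocabulary(docs, stopwords=frozenset(), min_doc_freq=1, min_word_length=1):
    """Return the sorted vocabulary that survives the sanitising filters."""
    doc_freq = Counter()
    for doc in docs:
        # Count each term at most once per document (document frequency).
        doc_freq.update(set(tokenise(doc, min_word_length)))
    return sorted(
        term for term, df in doc_freq.items()
        if df >= min_doc_freq and term not in stopwords
    )


def term_document_matrix(docs, vocabulary):
    """Raw term counts: one row per vocabulary term, one column per document."""
    counts = [Counter(tokenise(doc)) for doc in docs]
    return [[c[term] for c in counts] for term in vocabulary]


# Toy corpus and stopword list, invented for this example.
docs = [
    "The cat sat on the mat",
    "The dog sat on the log",
    "Cats and dogs are pets",
]
stopwords = frozenset({"the", "on", "and", "are"})

# Hypothetical parameter grid: every combination of the two filters is tried.
grid = {"min_doc_freq": [1, 2], "min_word_length": [1, 4]}
vocab_sizes = {}
for min_df, min_len in product(grid["min_doc_freq"], grid["min_word_length"]):
    vocab = filter_vocabulary(docs, stopwords, min_df, min_len)
    vocab_sizes[(min_df, min_len)] = len(vocab)

for setting, size in sorted(vocab_sizes.items()):
    print(setting, size)
```

On a corpus this small, raising the minimum document frequency removes almost every term, which gives an intuition for why filter thresholds might need to depend on corpus size. The matrix returned by `term_document_matrix` would be the input to a subsequent weighting and decomposition step, which this sketch leaves out.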

Notes

  1. It turned out that the collected data was too big to analyse on a machine with 32 GB memory.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Wild, F. (2016). Calibrating for Specific Domains. In: Learning Analytics in R with SNA, LSA, and MPIA. Springer, Cham. https://doi.org/10.1007/978-3-319-28791-1_7

  • DOI: https://doi.org/10.1007/978-3-319-28791-1_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-28789-8

  • Online ISBN: 978-3-319-28791-1

  • eBook Packages: Computer Science, Computer Science (R0)