Mining for the Evil, or Poking in Shades of Grey?
The paper deals with the removal of unwanted comments from social networks and online portals. It is purely conceptual and discusses the following subproblems from a data-analytic point of view: Which targets should be aimed at, and which not? How can the relevant information be extracted from a comment, and how can it be fitted into the framework of supervised classification? Which ternary classifiers are suitable, and how should their merits be measured? First, the paper stresses the need to treat the problem in sober linguistic terms and advocates the juridical aspect as the peg on which to hang the analysis. Secondly, the approach is in the spirit of sentiment analysis. For the classification inherent in that approach, a multi-layer filter is suggested to overcome complexity, each layer specialised in detecting one specific kind of evil. Thirdly, the paper discusses classification risks and suggests stability and transparency as subordinate selection criteria. Many authors in the field relate the questions raised here to the information retrieval problem; that idea is discussed as well. The paper's structure corresponds to the above subproblems.
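The multi-layer filter mentioned above can be illustrated with a minimal sketch. The following is not the paper's implementation but a hypothetical reading of the idea: each layer is a ternary classifier specialised in one kind of infringement and returns one of three verdicts (remove, keep, unsure); a comment whose verdict is unsure is handed on to the next layer. All layer names and heuristics below are invented placeholders for trained classifiers.

```python
# Hypothetical sketch of a multi-layer ternary filter: each layer is a
# stand-in for a specialised classifier; "unsure" defers to the next layer.

def hate_speech_layer(comment: str) -> str:
    """Toy stand-in for a hate-speech classifier (keyword heuristic)."""
    if "<slur>" in comment:          # placeholder pattern, not a real lexicon
        return "remove"
    if len(comment.split()) < 3:     # very short comments: assume harmless
        return "keep"
    return "unsure"

def spam_layer(comment: str) -> str:
    """Toy stand-in for a spam classifier."""
    if "buy now" in comment.lower():
        return "remove"
    return "keep"

# Layers are ordered; each handles one specific kind of infringement.
LAYERS = [hate_speech_layer, spam_layer]

def moderate(comment: str) -> str:
    """Run the layers in sequence; the first decisive verdict wins."""
    for layer in LAYERS:
        verdict = layer(comment)
        if verdict != "unsure":
            return verdict
    return "keep"  # default: do not delete when no layer is decisive
```

The ternary output is what distinguishes this from a plain binary cascade: a layer may explicitly decline to decide, so borderline comments in the "shades of grey" are passed on rather than forced into remove/keep at the first stage.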
Keywords: Media mediation · Sentiment analysis · Infringement mining · Ternary classification