Table 8 The Cora Research Paper Classification dataset. Top: The good (r_p) and bad (r_fp) ratios (for τ = 2), computed with two linear SVM classifiers, one trained on Citations only and one on Authors only, for each of the 11 top-level classes. The percentage of positive instances is shown in parentheses for each class. Bottom: Ranking performance (recall at two precision thresholds, and Max F1) of the SVM classifiers, averaged over the 11 problems.

From: On using nearly-independent feature families for high precision and confidence

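| Class (% positive) | r_p | r_fp | Class (% positive) | r_p | r_fp |
|---|---|---|---|---|---|
| AI (35 %) | 1.7 | 3.3 | HW Arch (4 %) | 10 | 11 |
| IR (2 %) | 17 | 9.2 | Theory (10 %) | 5.5 | 8 |
| DB (4 %) | 11 | 15 | Prog. (13 %) | 4.5 | 6.7 |
| Encr. (4 %) | 12 | 10 | HCI (5 %) | 11.1 | 11.7 |
| OS (8 %) | 5.9 | 7 | Data (8 %) | 8.1 | 8.7 |
| Netw. (5 %) | 6.1 | 6.9 | | | |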
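
| Method | Recall at 99 % precision | Recall at 95 % precision | Max F1 |
|---|---|---|---|
| Author only | 0.02 | 0.03 | 0.54 |
| Citations only | 0.03 | 0.13 | 0.71 |
| Append | 0.04 | 0.19 | 0.73 |
| AVG | 0.09 | 0.19 | 0.73 |
| NoisyOR | 0.08 | 0.18 | 0.72 |
| NoisyOR+AVG | 0.09 | 0.21 | 0.73 |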
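
To make the bottom-table metrics concrete, the following is a minimal sketch of how recall at a precision threshold and Max F1 can be computed from a ranked list of scores, together with two simple ways of combining the outputs of two classifiers. It is an illustration under stated assumptions, not the procedure used to produce the table: the function names are hypothetical, `noisy_or` is the textbook noisy-OR formula, `avg` is a plain arithmetic mean, and the inputs are assumed to be calibrated probabilities with no tied scores.

```python
from typing import List, Sequence, Tuple


def avg(p1: float, p2: float) -> float:
    """Arithmetic mean of two classifier probabilities (assumed form of AVG)."""
    return (p1 + p2) / 2.0


def noisy_or(p1: float, p2: float) -> float:
    """Textbook noisy-OR: 1 minus the probability that both sources say 'negative'."""
    return 1.0 - (1.0 - p1) * (1.0 - p2)


def pr_points(scores: Sequence[float], labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (precision, recall) at every cutoff of the ranking by descending score.

    labels are 1 for positive and 0 for negative. Ties are broken arbitrarily.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive instance is required")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    points = []
    true_pos = 0
    for rank, i in enumerate(order, start=1):
        true_pos += labels[i]
        points.append((true_pos / rank, true_pos / total_pos))
    return points


def recall_at_precision(points: Sequence[Tuple[float, float]], threshold: float) -> float:
    """Highest recall among cutoffs whose precision meets the threshold (0 if none)."""
    return max((r for p, r in points if p >= threshold), default=0.0)


def max_f1(points: Sequence[Tuple[float, float]]) -> float:
    """Largest F1 (harmonic mean of precision and recall) over all cutoffs."""
    return max((2 * p * r / (p + r) for p, r in points if p + r > 0), default=0.0)


if __name__ == "__main__":
    # Toy probabilities from two hypothetical classifiers (not data from the table).
    p_author = [0.9, 0.2, 0.6, 0.3]
    p_citation = [0.7, 0.5, 0.4, 0.8]
    labels = [1, 0, 1, 1]
    combined = [noisy_or(a, c) for a, c in zip(p_author, p_citation)]
    points = pr_points(combined, labels)
    print(recall_at_precision(points, 0.99), recall_at_precision(points, 0.95), max_f1(points))
```

Reading recall at a fixed high precision, rather than only Max F1, highlights the regime the table focuses on: how many positives can be retrieved while keeping false positives very rare.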