Causality in Databases
A causal rule between two variables, X → Y, captures the relationship that the presence of X causes the appearance of Y. Because causal rules are more useful than association rules, techniques for mining them are beginning to be developed. However, existing methods, such as the LCD and CU-path algorithms, are effective only for mining causal rules among invariable items. They are not adequate for discovering and representing causal rules among multi-value variables. In this chapter, we propose techniques for mining causality between the variables X and Y by partitioning, where causality is represented in the form X → Y, with the conditional probability matrix M_{Y|X}. These techniques are also applied to find causal rules in probabilistic databases. The chapter begins by stating the problems faced in Section 4.1. Some necessary basic concepts are defined in Section 4.2. In Section 4.3 we first define a 'good partition' for generating item variables from items, and we then present a method of mining causality of interest from large databases. In Section 4.4 we advocate an approach for finding dependencies among variables. In Section 4.5 we apply the proposed causality mining techniques to mining probabilistic databases. Finally, we conclude in Section 4.6.
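To make the representation X → Y with a conditional probability matrix more concrete, the following is a minimal sketch, not the chapter's algorithm. It assumes that entry (i, j) of M_{Y|X} is the relative-frequency estimate of P(Y = y_j | X = x_i); the abstract does not state the exact definition, and the function name, variable names, and toy data are all hypothetical.

```python
from collections import Counter


def conditional_probability_matrix(pairs):
    """Estimate M[i][j] = P(Y = y_j | X = x_i) from observed (x, y) pairs.

    Assumption: the matrix is built from simple relative frequencies;
    the chapter may define or estimate M_{Y|X} differently.
    """
    x_values = sorted({x for x, _ in pairs})
    y_values = sorted({y for _, y in pairs})
    joint = Counter(pairs)                    # counts of each (x, y) pair
    x_totals = Counter(x for x, _ in pairs)   # counts of each x value
    matrix = [
        [joint[(x, y)] / x_totals[x] for y in y_values]
        for x in x_values
    ]
    return x_values, y_values, matrix


# Hypothetical toy data: X and Y are multi-value variables, one pair per record.
observations = [
    ("low", "no"), ("low", "no"), ("low", "yes"), ("low", "no"),
    ("high", "yes"), ("high", "yes"), ("high", "yes"), ("high", "no"),
]

x_values, y_values, m_y_given_x = conditional_probability_matrix(observations)
for x, row in zip(x_values, m_y_given_x):
    print(x, dict(zip(y_values, row)))
```

Each row of the resulting matrix corresponds to one value of X and sums to 1, so a row can be read as the distribution of Y given that value of X.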
Keywords: Association Rule, Good Partition, Item Variable, Valid Rule, Probabilistic Database