Bayesian Theory of Decision

Chapter in Machine Learning for Audio, Image and Video Analysis

Abstract

What the reader should know to understand this chapter:

  • Basic notions of statistics and probability theory (see Appendix A).
  • Calculus notions are an advantage.


Notes

  1. \(\mathbf {I}_{\alpha (\mathbf {x}) = 1}\) is 1 if \(\alpha (\mathbf {x}) = 1\); 0 otherwise.

  2. Since \(\mathbf {I}_{\alpha ^{\star }(\mathbf {x}) = 1}\) is 1, the term must be nonnegative.

  3. Since \(\mathbf {I}_{\alpha ^{\star }(\mathbf {x}) = 1}\) is 0, the term must be nonpositive.

References

  1. T. Bayes. An essay towards solving a problem in the doctrine of chances. Philosophical Transactions of the Royal Society, 1763.

  2. J. O. Berger. Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, 1985.

  3. J. M. Bernardo and A. F. M. Smith. Bayesian Theory. John Wiley, 1986.

  4. P. Comon. Independent component analysis: A new concept? Signal Processing, 36(1):287–314, 1994.

  5. M. H. De Groot. Optimal Statistical Decisions. McGraw-Hill, 1970.

  6. L. Devroye, L. Györfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer-Verlag, 1996.

  7. R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley, 2001.

  8. T. S. Ferguson. Mathematical Statistics: A Decision-Theoretic Approach. Academic Press, 1967.

  9. R. A. Fisher. The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7(2):179–188, 1936.

  10. K. Fukunaga. Introduction to Statistical Pattern Recognition. Academic Press, 1990.

  11. D. Green and J. A. Swets. Signal Detection Theory and Psychophysics. Wiley, 1974.

  12. A. Hyvärinen. Survey on independent component analysis. Neural Computing Surveys, 2(1):94–128, 1999.

  13. I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, 1986.

  14. G. A. Korn and T. M. Korn. Mathematical Handbook for Scientists and Engineers. Dover, 1961.

  15. P. M. Lee. Bayesian Statistics: An Introduction. Edward Arnold, 1989.

  16. D. V. Lindley. Making Decisions. John Wiley, 1991.


Author information

Correspondence to Francesco Camastra.

Problems

5.1

Given a normal distribution \(\mathcal {N}(\sigma ,\mu )\), show that the percentage of samples taking values in \([\mu -3\sigma , \mu +3\sigma ]\) exceeds 99%.
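
For reference, the mass of a Gaussian within three standard deviations of its mean can be expressed with the error function; a quick check (not part of the proof itself) gives

$$ P(\mu -3\sigma \le x \le \mu +3\sigma ) = \mathrm {erf}\left( \frac{3}{\sqrt{2}} \right) \approx 0.9973 , $$

which is the value that the requested percentage must exceed.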

5.2

Consider the function \(f(x)= \frac{a}{1+x^2}\), where \(a \in \mathbb {R}\). Find the value of a such that f(x) is a probability density. Then compute the expected value of x.
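
One possible starting point (a sketch of the normalization condition, not the full solution):

$$ \int _{-\infty }^{\infty } \frac{a}{1+x^2}\, dx = a \big [ \arctan x \big ]_{-\infty }^{\infty } = a\pi , $$

which must equal 1 for f to be a density; the expected value is then studied through the integral \(\int x f(x)\, dx\), paying attention to its convergence.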

5.3

Consider the Geometric distribution [14] defined by:

$$ p(x)= \theta (1-\theta )^x \quad (x=0,1,2,\dots , 0\le \theta \le 1). $$

Prove that its mean is \(\mathcal {E}[x]= \frac{1-\theta }{\theta }\).
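
One possible route, sketched here for orientation, uses the derivative of the geometric series \(\sum _{x \ge 1} x q^{x-1} = \frac{1}{(1-q)^2}\) with \(q = 1-\theta \):

$$ \mathcal {E}[x] = \sum _{x=0}^{\infty } x\, \theta (1-\theta )^x = \theta (1-\theta ) \sum _{x=1}^{\infty } x (1-\theta )^{x-1} = \theta (1-\theta ) \frac{1}{\theta ^2} = \frac{1-\theta }{\theta }. $$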

5.4

Given a probability density f(x), the fourth-order moment [14] is defined by

$$ \frac{1}{\sigma ^4} \int _{-\infty }^{\infty } f(x) (x-\mu )^4 \, dx $$

where \(\mu \) and \(\sigma ^2\) are, respectively, the mean and the variance.

Prove that the fourth-order moment of a normal distribution \(\mathcal {N}(\mu ,\sigma )\) is 3.
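
A sketch of the key fact (the intermediate integration-by-parts steps are omitted): for a Gaussian the fourth central moment is

$$ \int _{-\infty }^{\infty } \frac{1}{\sqrt{2\pi }\, \sigma } e^{-\frac{(x-\mu )^2}{2\sigma ^2}} (x-\mu )^4 \, dx = 3\sigma ^4 , $$

so dividing by \(\sigma ^4\) yields the claimed value 3.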

5.5

Let \(x=(x_1,\dots ,x_{\ell })\) and \(y=(y_1,\dots ,y_{\ell })\) be two random variables. Prove that if they are statistically independent, then their covariance is zero.
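
The core step, sketched: under statistical independence the joint density factorizes, so expectations of products factor as well,

$$ \mathrm {cov}(x_i, y_j) = \mathcal {E}[x_i y_j] - \mathcal {E}[x_i]\, \mathcal {E}[y_j] = \mathcal {E}[x_i]\, \mathcal {E}[y_j] - \mathcal {E}[x_i]\, \mathcal {E}[y_j] = 0 . $$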

5.6

Suppose we have two classes \(\mathcal {C}_1\) and \(\mathcal {C}_2\) with a priori probabilities \(p(\mathcal {C}_1)= \frac{1}{3}\) and \(p(\mathcal {C}_2)= \frac{2}{3}\). Suppose that their likelihoods are \(p(x|\mathcal {C}_1)= \mathcal {N}(1,1)\) and \(p(x|\mathcal {C}_2)= \mathcal {N}(1,0)\). Find numerically the value of x such that the posterior probabilities \(p(\mathcal {C}_1|x)\), \(p(\mathcal {C}_2|x)\) are equal.
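
A minimal numerical sketch in Python, assuming the convention \(\mathcal {N}(\sigma ,\mu )\) used in the other problems (so \(p(x|\mathcal {C}_1)\) has mean 1 and \(p(x|\mathcal {C}_2)\) has mean 0, both with unit variance); SciPy is used for the Gaussian density and the root finder:

    from scipy.stats import norm
    from scipy.optimize import brentq

    # Priors; the convention N(sigma, mu) is an assumption inferred from the text,
    # so C1 has mean 1 and C2 has mean 0, both with standard deviation 1.
    p1, p2 = 1.0 / 3.0, 2.0 / 3.0

    def posterior_difference(x):
        # Posteriors are equal where the prior-weighted likelihoods are equal.
        return p1 * norm.pdf(x, loc=1.0, scale=1.0) - p2 * norm.pdf(x, loc=0.0, scale=1.0)

    x_star = brentq(posterior_difference, -10.0, 10.0)  # numerical root
    print(x_star)  # about 1.1931

The closed form obtained by equating the two prior-weighted exponentials, \(x = \frac{1}{2} + \ln 2\), can be used to check the numerical result.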

5.7

Suppose we have two classes \(\mathcal {C}_1\) and \(\mathcal {C}_2\) with a priori probabilities \(p(\mathcal {C}_1)= \frac{2}{5}\) and \(p(\mathcal {C}_2)= \frac{3}{5}\). Suppose that their likelihoods are \(p(x|\mathcal {C}_1)= \mathcal {N}(1,0)\) and \(p(x|\mathcal {C}_2)= \mathcal {N}(1,1)\). Compute the joint probability that both points \(x_1= -0.1\) and \(x_2= 0.2\) belong to \(\mathcal {C}_1\).
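
Assuming the two observations are drawn independently (an assumption, since the problem does not state it explicitly), the requested quantity factorizes into per-point posteriors obtained from Bayes' theorem:

$$ p(\mathcal {C}_1 | x_1)\, p(\mathcal {C}_1 | x_2), \qquad p(\mathcal {C}_1 | x_i) = \frac{p(\mathcal {C}_1)\, p(x_i|\mathcal {C}_1)}{p(\mathcal {C}_1)\, p(x_i|\mathcal {C}_1) + p(\mathcal {C}_2)\, p(x_i|\mathcal {C}_2)} . $$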

5.8

Suppose we have two classes \(\mathcal {C}_1\) and \(\mathcal {C}_2\) with a priori probabilities \(p(\mathcal {C}_1)= \frac{1}{4}\) and \(p(\mathcal {C}_2)= \frac{3}{4}\). Suppose that their likelihoods are \(p(x|\mathcal {C}_1)= \mathcal {N}(2,0)\) and \(p(x|\mathcal {C}_2)= \mathcal {N}(0.5,1)\). Compute the likelihood ratio and write the discriminant function.
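
For orientation, the quantities involved (under the usual minimum-error-rate rule) are

$$ \Lambda (x) = \frac{p(x|\mathcal {C}_1)}{p(x|\mathcal {C}_2)}, \qquad \text {decide } \mathcal {C}_1 \text { if } \Lambda (x) > \frac{p(\mathcal {C}_2)}{p(\mathcal {C}_1)} = 3 . $$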

5.9

Suppose we have three classes \(\mathcal {C}_1\), \(\mathcal {C}_2\) and \(\mathcal {C}_3\) with a priori probabilities \(p(\mathcal {C}_1)= \frac{1}{6}\), \(p(\mathcal {C}_2)= \frac{1}{3}\) and \(p(\mathcal {C}_3)= \frac{1}{2}\). Suppose that their likelihoods are, respectively, \(p(x|\mathcal {C}_1)= \mathcal {N}(0.25,0)\), \(p(x|\mathcal {C}_2)= \frac{a}{1+x^2}\) and \(p(x|\mathcal {C}_3)= \frac{1}{b+(x-1)^2}\). Find the values of a and b such that the likelihoods are density functions, and write the three discriminant functions.
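
A sketch of the two normalization conditions (the same Cauchy-type integral as in Problem 5.2 appears, once shifted):

$$ \int _{-\infty }^{\infty } \frac{a}{1+x^2}\, dx = a\pi = 1, \qquad \int _{-\infty }^{\infty } \frac{dx}{b+(x-1)^2} = \frac{\pi }{\sqrt{b}} = 1 . $$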

5.10

Implement the whitening transform. Test your implementation by transforming the Iris data [9], which can be downloaded from ftp.ics.uci.edu/pub/machine-learning-databases/iris. Verify that the covariance matrix of the transformed data is the identity matrix.
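
A minimal sketch of one possible implementation in Python/NumPy. The Iris file name and column layout in the commented-out loading line are assumptions; synthetic data is used so the snippet runs on its own.

    import numpy as np

    def whiten(X):
        # Whitening transform: y = Lambda^{-1/2} Phi^T (x - mean), where Phi and
        # Lambda are the eigenvectors and eigenvalues of the covariance of X.
        Xc = X - X.mean(axis=0)                       # center the data
        cov = np.cov(Xc, rowvar=False)                # sample covariance matrix
        eigval, eigvec = np.linalg.eigh(cov)          # symmetric eigendecomposition
        return Xc @ eigvec @ np.diag(1.0 / np.sqrt(eigval))

    # Example usage with synthetic data; for the exercise, replace X with the four
    # numeric columns of the Iris file (file name and format are assumptions here):
    # X = np.loadtxt("iris.data", delimiter=",", usecols=(0, 1, 2, 3))
    rng = np.random.default_rng(0)
    X = rng.normal(size=(150, 4)) @ rng.normal(size=(4, 4))
    Y = whiten(X)
    print(np.round(np.cov(Y, rowvar=False), 6))       # should be close to the identity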

5.11

Suppose that the features are statistically independent and that they have the same variance \(\sigma ^2\). In this case the discriminant function is a linear classifier. Given two adjacent decision regions \(\mathcal {D}_1\) and \(\mathcal {D}_2\), show that their separating hyperplane is orthogonal to the line connecting the means \(\mu _1\) and \(\mu _2\).
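
A sketch of the structure the proof can exploit (the standard Gaussian discriminant form with \(\Sigma = \sigma ^2 I\)):

$$ g_i(\mathbf {x}) = -\frac{\Vert \mathbf {x} - \mu _i \Vert ^2}{2\sigma ^2} + \ln p(\mathcal {C}_i), \qquad g_1(\mathbf {x}) = g_2(\mathbf {x}) \; \Longleftrightarrow \; (\mu _1 - \mu _2)^T \mathbf {x} = c $$

for a constant c, so the boundary is a hyperplane whose normal is \(\mu _1 - \mu _2\).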

5.12

Suppose that the covariance matrix is the same for all the classes. In this case the discriminant function is a linear classifier. Given two adjacent decision regions \(\mathcal {D}_1\) and \(\mathcal {D}_2\), show that their separating hyperplane is in general not orthogonal to the line connecting the means \(\mu _1\) and \(\mu _2\).
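
Similarly, a sketch of the shared-covariance case (again using the standard Gaussian discriminant form):

$$ g_i(\mathbf {x}) = -\tfrac{1}{2} (\mathbf {x}-\mu _i)^T \Sigma ^{-1} (\mathbf {x}-\mu _i) + \ln p(\mathcal {C}_i), \qquad g_1(\mathbf {x}) = g_2(\mathbf {x}) \; \Longleftrightarrow \; \big ( \Sigma ^{-1}(\mu _1 - \mu _2) \big )^T \mathbf {x} = c , $$

so the normal of the separating hyperplane is \(\Sigma ^{-1}(\mu _1 - \mu _2)\), which in general is not parallel to \(\mu _1 - \mu _2\).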


Copyright information

© 2015 Springer-Verlag London

About this chapter

Cite this chapter

Camastra, F., Vinciarelli, A. (2015). Bayesian Theory of Decision. In: Machine Learning for Audio, Image and Video Analysis. Advanced Information and Knowledge Processing. Springer, London. https://doi.org/10.1007/978-1-4471-6735-8_5


  • Publisher Name: Springer, London

  • Print ISBN: 978-1-4471-6734-1

  • Online ISBN: 978-1-4471-6735-8

