Abstract
We attempted to predict activity/dominance for soccer games, where activity is defined as the degree of activity of the game as perceived by the viewer, and dominance is the degree to which the viewer perceives one team to dominate the other. Such activity/dominance information would help a lay viewer understand the game. It would also enable construction of an automatic digest-creation system that extracts scenes with high activity/dominance. There are two facets of this study: 1. The core of the underlying prediction model is a Stick-Breaking Hidden Markov Model, in which the number of states of the Markov process behind the data is estimated automatically from the data. 2. The data used in this paper is vector time-series data consisting of player, referee, and ball positions, together with team information, acquired by a set of fixed cameras. The problem was approached within a Bayesian framework, where learning and prediction were implemented by three different methods: Markov Chain Monte Carlo, Expectation-Maximization, and Variational Bayes. The proposed method was tested on a dataset of 10 professional soccer games and compared against standard regression methods.
Notes
There is one instance, however, in which we partly use broadcast TV video to obtain activity/dominance training data from evaluators, as described in Section 4.5. Such video data, however, is not necessary in the prediction phase.
A preliminary experiment showed that an HMM with the Stick-Breaking prior performed better than an ordinary Bayesian HMM.
There would be no impact on the reliability of the final model.
The cameras are set so that all parts of the field can be viewed from at least two cameras. However, the details of the image tracking system cannot be presented due to a non-disclosure agreement.
Typically in MCMC, the initial samples are discarded because the chain has not yet reached its stationary distribution. This initial phase is called the "burn-in" period. After the burn-in, a designated number of samples is used for prediction. There are no standard methods for optimally setting the sample size and burn-in period, so they are usually set empirically.
Note that λ presented here is different from λ in the main text.
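The burn-in procedure described in the notes above can be illustrated with a toy example. The sketch below uses a hypothetical Metropolis sampler targeting a standard normal density (not the Gibbs sampler used in the paper); samples drawn before the burn-in cutoff are simply discarded, and only the remaining ones would be used for prediction.

```python
import math
import random

def metropolis_sample(n_samples, burn_in, step=1.0):
    """Toy Metropolis sampler targeting a standard normal density.
    Samples produced during the burn-in phase are discarded."""
    x = 10.0                      # deliberately poor initial state
    kept = []
    for i in range(burn_in + n_samples):
        proposal = x + random.uniform(-step, step)
        # accept with probability min(1, target(proposal) / target(x))
        log_ratio = (x * x - proposal * proposal) / 2.0
        if math.log(random.random() + 1e-300) < log_ratio:
            x = proposal
        if i >= burn_in:          # keep only post-burn-in samples
            kept.append(x)
    return kept

random.seed(0)
samples = metropolis_sample(n_samples=2000, burn_in=500)
```

Both `n_samples` and `burn_in` are set empirically here, mirroring the note: there is no principled rule for choosing them.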
Acknowledgements
We thank Professor A. Doucet at Oxford University and Mr. F. Akazawa for discussions. We also thank the editor and reviewers for their comments.
Appendices
Appendix I: Details of Gibbs Sampling
For simplicity, we present the case where the training data is a single sequence; the extension to multiple sequences is straightforward. First, let us recall the function \({\Psi}_{t}(\cdot)\) used in (42). This function can be set as
where \(y_{-\{t\}}=(y_{1:t-1}, y_{t+2:T})\), \(z_{-\{t\}}=(z_{1:t-1}, z_{t+2:T})\), and \(z_{-\{1,t,t+1\}}=(z_{2:t-1}, z_{t+2:T})\). With this setting, we can drastically decrease the computational cost of evaluating the full conditional distribution \(P(z_{t}|z_{-\{t\}},y,\phi)\).
More concretely, we can write (42) as
The computations of these three terms \(\frac {P(y|z, \beta )}{{{\Psi }^{1}_{t}}(y_{-\{t\}},z_{-\{t\}};\beta )}\), \(\frac {P(z_{-\{1\}}|z_{1},\alpha )}{{{\Psi }^{2}_{t}}(z_{-\{t\}}; \alpha )}\), and \(\frac {P(z_{1}|\gamma )}{{{\Psi }^{3}_{t}}(z_{1};\gamma )}\) are much cheaper than those of the terms P(y|z,β), P(z −{1}|z 1,α), and P(z 1|γ) for an evaluation of P(y,z|ϕ).
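To make the cost saving concrete, the following toy sketch (hypothetical names; scalar symmetric pseudo-counts `alpha`/`beta` stand in for the paper's stick-breaking hyperparameters) resamples each interior hidden state from an unnormalized full conditional built from cached transition/emission counts. Only the count entries involving position t are updated, so each step costs O(U) rather than a full O(T) likelihood evaluation.

```python
import random
from collections import defaultdict

def gibbs_sweep(z, y, U, alpha, beta):
    """One Gibbs sweep over the interior hidden states of a discrete HMM.
    A simplified sketch of the idea behind Psi_t: transition counts (nu_ij),
    emission co-occurrence counts (n_ik), and occupancy counts (n_i) are
    cached once and then updated locally around position t."""
    T = len(z)
    trans = defaultdict(int)   # nu_ij: transition counts
    emit = defaultdict(int)    # n_ik: state/observation co-occurrence counts
    n = defaultdict(int)       # n_i: state occupancy counts
    for t in range(T):
        n[z[t]] += 1
        emit[(z[t], y[t])] += 1
        if t + 1 < T:
            trans[(z[t], z[t + 1])] += 1
    for t in range(1, T - 1):
        # remove position t's contribution from the cached counts
        trans[(z[t - 1], z[t])] -= 1
        trans[(z[t], z[t + 1])] -= 1
        emit[(z[t], y[t])] -= 1
        n[z[t]] -= 1
        # unnormalized full conditional over the U candidate states
        w = []
        for i in range(U):
            p = (trans[(z[t - 1], i)] + alpha) * (trans[(i, z[t + 1])] + alpha)
            p *= (emit[(i, y[t])] + beta) / (n[i] + beta * U)
            w.append(p)
        z[t] = random.choices(range(U), weights=w)[0]
        # restore the counts with the new value of z[t]
        trans[(z[t - 1], z[t])] += 1
        trans[(z[t], z[t + 1])] += 1
        emit[(z[t], y[t])] += 1
        n[z[t]] += 1
    return z
```

The decrement/recompute/increment pattern is exactly what makes the ratios above cheap: the numerators and denominators differ from the cached quantities only in the handful of counts touched by position t.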
For the later arguments, let the function \(n_{i}(\cdot)\) be defined as

$$n_{i}(\zeta) = \sum\limits_{\tau \in \mathcal{T}} I(\zeta_{\tau}=i)$$

for a discrete sequence \(\zeta =(\zeta _{\tau })_{\tau \in {\mathcal T}}\). The function \(n_{ik}(\cdot)\) is defined as

$$n_{ik}(\zeta^{1}, \zeta^{2}) = \sum\limits_{\tau \in \mathcal{T}} I(\zeta^{1}_{\tau}=i,\, \zeta^{2}_{\tau}=k)$$

for two discrete sequences \(\zeta ^{1}=(\zeta ^{1}_{\tau })_{\tau \in \mathcal T}\) and \(\zeta ^{2}=(\zeta ^{2}_{\tau })_{\tau \in \mathcal T}\). The function \(\nu_{ij}(\cdot)\) is defined by

$$\nu_{ij}(\zeta) = \sum\limits_{\tau:\, \tau,\tau+1 \in \mathcal{T}} I(\zeta_{\tau}=i,\, \zeta_{\tau+1}=j)$$

for a discrete sequence \(\zeta =(\zeta _{\tau })_{\tau \in \mathcal T}\).
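These three count functions translate directly into code. The sketch below uses hypothetical names, with plain Python lists standing in for the sequences: occupancy counts, position-wise pair counts, and consecutive-transition counts.

```python
def n_count(zeta, i):
    """n_i(zeta): number of positions where the sequence takes value i."""
    return sum(1 for z in zeta if z == i)

def n_pair_count(zeta1, zeta2, i, k):
    """n_ik(zeta1, zeta2): number of positions where zeta1 is i and zeta2 is k."""
    return sum(1 for a, b in zip(zeta1, zeta2) if a == i and b == k)

def nu_count(zeta, i, j):
    """nu_ij(zeta): number of consecutive pairs (i, j) in the sequence."""
    return sum(1 for a, b in zip(zeta, zeta[1:]) if a == i and b == j)
```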
1.
When \(t=1\),
$$\begin{array}{llll} &P\left(z_{1}|z_{-\{1\}},y, \phi \right) \propto \frac{\gamma^{1}_{z_{1}}}{\gamma^{1}_{z_{1}} + \gamma^{2}_{z_{1}}} \cdot \left( \prod\limits_{i=1}^{z_{1}-1} \frac{{\gamma_{i}^{2}} }{{\gamma_{i}^{1}} + {\gamma_{i}^{2}} }\right) \\ & \quad\times \frac{\nu_{z_{1}z_{2}}(z_{-\{1\}})+\alpha_{z_{1}z_{2}}^{1}}{{\sum}_{j^{\prime}=z_{2}}^{U} \nu_{z_{1}j^{\prime}}(z_{-\{1\}})+\alpha_{z_{1}z_{2}}^{1} + \alpha_{z_{1}z_{2}}^{2}} \cdot \left(\prod\limits_{j=1}^{z_{2}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{1}j^{\prime}}(z_{-\{1\}})+\alpha_{z_{1}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{1}j^{\prime}}(z_{-\{1\}})+\alpha_{z_{1}j}^{1} + \alpha_{z_{1}j}^{2} } \right)\\ & \quad\times \frac{n_{z_{1}e_{1}}(z_{-\{1\}}, e_{-\{1\}})+\beta_{e,z_{1}e_{1}}} {n_{z_{1}}(z_{-\{1\}})+{\sum}_{m=1}^{M+1} \beta_{e,z_{1}m}} \cdot \left(\prod\limits_{l=1}^{L} \frac{n_{z_{1}f_{l,1}}(z_{-\{1\}},f_{l,-\{1\}})+\beta_{f_{l},z_{1}f_{l,1}}} { n_{z_{1}}(z_{-\{1\}})+ {\sum}_{k=1}^{K_{l}} \beta_{f_{l},z_{1}k} }\right). \end{array} $$
2.
When \(1<t<T\) and \(z_{t} \neq z_{t-1}\),
$$\begin{array}{lllll} &P\left( z_{t}|z_{-\{t\}},y, \phi \right) \\ &\quad\propto \frac{\nu_{z_{t-1}z_{t}}(z_{-\{t\}})+\alpha_{z_{t-1}z_{t}}^{1}} {{\sum}_{j^{\prime}=z_{t}}^{U} \nu_{z_{t-1}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t-1}z_{t}}^{1} + \alpha_{z_{t-1}z_{t}}^{2}} \cdot \left( \prod\limits_{j=1}^{z_{t}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t-1}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t-1}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t-1}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t-1}j}^{1} + \alpha_{z_{t-1}j}^{2}} \right) \\ & \quad\times \frac{\nu_{z_{t}z_{t+1}}(z_{-\{t\}})+\alpha_{z_{t}z_{t+1}}^{1}} {{\sum}_{j^{\prime}=z_{t+1}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}z_{t+1}}^{1} + \alpha_{z_{t}z_{t+1}}^{2}} \cdot \left( \prod\limits_{j=1}^{z_{t+1}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2}} \right)\\ & \quad\times \frac{n_{z_{t}e_{t}}(z_{-\{t\}},e_{-\{t\}})+\beta_{e,z_{t}e_{t}}} {n_{z_{t}}(z_{-\{t\}})+{\sum}_{m=1}^{M+1} \beta_{e,z_{t}m}} \cdot \left( \prod\limits_{l=1}^{L} \frac{n_{z_{t}f_{l,t}}(z_{-\{t\}},f_{l,-\{t\}})+\beta_{f_{l},z_{t}f_{l,t}}} {n_{z_{t}}(z_{-\{t\}})+{\sum}_{k=1}^{K_{l}} \beta_{f_{l},z_{t}k}} \right). \end{array} $$
3.
When \(1<t<T\), \(z_{t}=z_{t-1}\), and \(z_{t+1}<z_{t}\),
$$\begin{array}{lllll} &P\left( z_{t}|z_{-\{t\}},y, \phi \right) \\ &\quad\propto\left(\prod\limits_{j=1}^{z_{t+1}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+ \alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+ \alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2}} \cdot \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+ 1 +\alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+1+\alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2} } \right)\\ & \quad\times \frac{\nu_{z_{t}z_{t+1}}(z_{-\{t\}})+\alpha_{z_{t}z_{t+1}}^{1}} {{\sum}_{j^{\prime}=z_{t+1}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}z_{t+1}}^{1} + \alpha_{z_{t}z_{t+1}}^{2}} \cdot \frac{{\sum}_{j^{\prime}=z_{t+1}+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}z_{t+1}}^{2}} {{\sum}_{j^{\prime}=z_{t+1}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+ 1+\alpha_{z_{t}z_{t+1}}^{1} + \alpha_{z_{t}z_{t+1}}^{2}}\\ & \quad\times \left(\prod\limits_{j=z_{t+1}+1}^{z_{t}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+ \alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2}} \right) \cdot \frac{\nu_{z_{t}z_{t}}(z_{-\{t\}})+\alpha_{z_{t}z_{t}}^{1}} {{\sum}_{j^{\prime}=z_{t}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}z_{t}}^{1} + \alpha_{z_{t}z_{t}}^{2}} \\ & \quad\times \frac{n_{z_{t}e_{t}}(z_{-\{t\}},e_{-\{t\}})+\beta_{e,z_{t}e_{t}}} {n_{z_{t}}(z_{-\{t\}})+{\sum}_{m=1}^{M+1} \beta_{e,z_{t}m}} \cdot \left( \prod\limits_{l=1}^{L} \frac{n_{z_{t}f_{l,t}}(z_{-\{t\}},f_{l,-\{t\}})+\beta_{f_{l},z_{t}f_{l,t}}} {n_{z_{t}}(z_{-\{t\}})+{\sum}_{k=1}^{K_{l}} \beta_{f_{l},z_{t}k}} \right). \end{array} $$
4.
When \(1<t<T\), \(z_{t}=z_{t-1}\), and \(z_{t+1}>z_{t}\),
$$\begin{array}{lll} &P\left( z_{t}|z_{-\{t\}},y, \phi \right) \\ &\quad\propto\left( \prod\limits_{j=1}^{z_{t}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}}) + \alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+ \alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2}} \cdot \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+1 +\alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+1+\alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2}} \right) \\ & \quad\times \frac{\nu_{z_{t}z_{t}}(z_{-\{t\}})+\alpha_{z_{t}z_{t}}^{1}} {{\sum}_{j^{\prime}=z_{t}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}z_{t}}^{1} + \alpha_{z_{t}z_{t}}^{2}} \cdot \frac{{\sum}_{j^{\prime}=z_{t}+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}z_{t}}^{2}} {{\sum}_{j^{\prime}=z_{t}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+1+\alpha_{z_{t}z_{t}}^{1} + \alpha_{z_{t}z_{t}}^{2}} \\ & \quad\times \left( \prod\limits_{j=z_{t}+1}^{z_{t+1}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2}}\right) \cdot \frac{\nu_{z_{t}z_{t+1}}(z_{-\{t\}})+\alpha_{z_{t}z_{t+1}}^{1}} {{\sum}_{j^{\prime}=z_{t+1}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}z_{t+1}}^{1} + \alpha_{z_{t}z_{t+1}}^{2}} \\ & \quad\times \frac{n_{z_{t}e_{t}}(z_{-\{t\}},e_{-\{t\}})+\beta_{e,z_{t}e_{t}}} {n_{z_{t}}(z_{-\{t\}})+{\sum}_{m=1}^{M+1} \beta_{e,z_{t}m}} \cdot \left( \prod\limits_{l=1}^{L} \frac{n_{z_{t}f_{l,t}}(z_{-\{t\}},f_{l,-\{t\}})+\beta_{f_{l},z_{t}f_{l,t}}} {n_{z_{t}}(z_{-\{t\}})+{\sum}_{k=1}^{K_{l}} \beta_{f_{l},z_{t}k}} \right). \end{array} $$
5.
When \(1<t<T\), \(z_{t}=z_{t-1}\), and \(z_{t+1}=z_{t}\),
$$\begin{array}{lllll} &P\left( z_{t}|z_{-\{t\}},y, \phi \right) \\ & \quad\propto\left(\prod\limits_{j=1}^{z_{t}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}\left(z_{-\{t\}}\right)+\alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U}\nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2}} \right.\\ &\qquad\left.\times \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+1+\alpha_{z_{t}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+1+\alpha_{z_{t}j}^{1} + \alpha_{z_{t}j}^{2}} \right) \\ &\qquad \times \frac{\nu_{z_{t}z_{t}}(z_{-\{t\}})+\alpha_{z_{t}z_{t}}^{1}} { {\sum}_{j^{\prime}=z_{t}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+\alpha_{z_{t}z_{t}}^{1} + \alpha_{z_{t}z_{t}}^{2}} \cdot \frac{\nu_{z_{t}z_{t}}(z_{-\{t\}})+1+\alpha_{z_{t}z_{t}}^{1}} {{\sum}_{j^{\prime}=z_{t}}^{U} \nu_{z_{t}j^{\prime}}(z_{-\{t\}})+1+\alpha_{z_{t}z_{t}}^{1} + \alpha_{z_{t}z_{t}}^{2}} \\ & \qquad\times \frac{n_{z_{t}e_{t}}(z_{-\{t\}},e_{-\{t\}})+\beta_{e,z_{t}e_{t}}} {n_{z_{t}}(z_{-\{t\}})+{\sum}_{m=1}^{M+1} \beta_{e,z_{t}m}} \cdot \left( \prod\limits_{l=1}^{L} \frac{n_{z_{t}f_{l,t}}(z_{-\{t\}},f_{l,-\{t\}})+\beta_{f_{l},z_{t}f_{l,t}}} {n_{z_{t}}(z_{-\{t\}})+{\sum}_{k=1}^{K_{l}} \beta_{f_{l},z_{t}k}} \right). \end{array} $$
6.
When \(t=T\),
$$\begin{array}{lllll} &P\left( z_{T}|z_{-\{T\}},y, \phi \right) \\ &\propto \left( \prod\limits_{j=1}^{z_{T}-1} \frac{{\sum}_{j^{\prime}=j+1}^{U} \nu_{z_{T-1}j^{\prime}}(z_{-\{T\}})+\alpha_{z_{T-1}j}^{2}} {{\sum}_{j^{\prime}=j}^{U} \nu_{z_{T-1}j^{\prime}}(z_{-\{T\}})+\alpha_{z_{T-1}j}^{1} + \alpha_{z_{T-1}j}^{2}} \cdot \frac{\nu_{z_{T-1}z_{T}}(z_{-\{T\}})+\alpha_{z_{T-1}z_{T}}^{1}} {{\sum}_{j^{\prime}=z_{T}}^{U} \nu_{z_{T-1}j^{\prime}}(z_{-\{T\}})+\alpha_{z_{T-1}z_{T}}^{1} + \alpha_{z_{T-1}z_{T}}^{2}} \right) \\ & \quad\times \frac{n_{z_{T}e_{T}}(z_{-\{T\}},e_{-\{T\}})+\beta_{e,z_{T}e_{T}}} {n_{z_{T}}(z_{-\{T\}})+{\sum}_{m=1}^{M+1} \beta_{e,z_{T}m}} \cdot \left( \prod\limits_{l=1}^{L} \frac{n_{z_{T}f_{l,T}}(z_{-\{T\}},f_{l,-\{T\}})+\beta_{f_{l},z_{T}f_{l,T}}} {n_{z_{T}}(z_{-\{T\}})+{\sum}_{k=1}^{K_{l}} \beta_{f_{l},z_{T}k}} \right). \end{array} $$
Appendix II: Derivation of MAP EM for SB-HMM
For simplicity, we present the case where the training data is a single sequence; the extension to multiple sequences is straightforward. We first note that if the state space is truncated at U, the Q-function for the Stick-Breaking HMM is given by

$$\begin{array}{@{}rcl@{}} Q\left(\theta, \theta^{(r)}\right) &=& \sum\limits_{i=1}^{U} \mu_{1}^{(r)}(i) \log \pi_{i} + \sum\limits_{t=1}^{T-1} \sum\limits_{i=1}^{U} \sum\limits_{j=1}^{U} \epsilon_{t}^{(r)}(i,j) \log a_{ij} \\ && + \sum\limits_{t=1}^{T} \sum\limits_{i=1}^{U} \mu_{t}^{(r)}(i) \log b_{i}(y_{t}) + \log P(\theta), \end{array} $$

where

$$\mu_{t}^{(r)}(i) = P\left(z_{t}=i \,|\, y, \theta^{(r)}\right), \qquad \epsilon_{t}^{(r)}(i,j) = P\left(z_{t}=i,\, z_{t+1}=j \,|\, y, \theta^{(r)}\right).$$
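The posteriors \(\mu_t^{(r)}(i)\) and \(\epsilon_t^{(r)}(i,j)\) are the usual forward-backward quantities of the HMM E-step. The following is a minimal unscaled sketch (hypothetical function and variable names; `b_obs[t][i]` holds the likelihood of observation t under state i, which suffices for short sequences but would need scaling in practice):

```python
def e_step(pi, a, b_obs):
    """Forward-backward sketch computing mu_t(i) = P(z_t=i | y) and
    eps_t(i,j) = P(z_t=i, z_{t+1}=j | y) for a discrete-state HMM."""
    T, U = len(b_obs), len(pi)
    fwd = [[0.0] * U for _ in range(T)]   # forward messages alpha_t(i)
    bwd = [[1.0] * U for _ in range(T)]   # backward messages beta_t(i)
    for i in range(U):
        fwd[0][i] = pi[i] * b_obs[0][i]
    for t in range(1, T):
        for j in range(U):
            fwd[t][j] = b_obs[t][j] * sum(fwd[t - 1][i] * a[i][j] for i in range(U))
    for t in range(T - 2, -1, -1):
        for i in range(U):
            bwd[t][i] = sum(a[i][j] * b_obs[t + 1][j] * bwd[t + 1][j] for j in range(U))
    like = sum(fwd[T - 1])                # P(y | current parameters)
    mu = [[fwd[t][i] * bwd[t][i] / like for i in range(U)] for t in range(T)]
    eps = [[[fwd[t][i] * a[i][j] * b_obs[t + 1][j] * bwd[t + 1][j] / like
             for j in range(U)] for i in range(U)] for t in range(T - 1)]
    return mu, eps
```

Each row of `mu` and each slice of `eps` sums to one, as required of posterior probabilities.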
We discuss the state transition parameter a, the output probability b, and the initial state probability π separately.
Parameters π
In the Q-function, the terms that depend on π read:
$$\begin{array}{@{}rcl@{}} F(\pi) = \sum\limits_{i=1}^{U}\mu_{1}^{(r)}(i)\log \pi_{i} + \log P(\pi). \end{array} $$ (66)

We rewrite (66) in terms of \(V_{\pi}\). By noting that \(V_{\pi,U}=1\), we have
$$\begin{array}{@{}rcl@{}} F(V_{\pi}) &=& \mu^{(r)}_{1}(1) \log V_{\pi,1} + \sum\limits_{i=2}^{U} \mu^{(r)}_{1}(i) \left( \log V_{\pi,i} + \sum\limits_{i^{\prime}=1}^{i-1} \log (1-V_{\pi,i^{\prime}})\right) \\ && \hspace{2cm} + \sum\limits_{i=1}^{U-1} \left({\gamma^{1}_{i}} -1\right)\log V_{\pi,i} + \sum\limits_{i=1}^{U-1} \left({\gamma^{2}_{i}} -1\right) \log(1-V_{\pi,i}) \\ &=& \sum\limits_{i=1}^{U-1} \left(\mu^{(r)}_{1}(i)+{\gamma^{1}_{i}}-1 \right)\log V_{\pi,i} \\ && \hspace{2cm} + \sum\limits_{i=1}^{U-1} \left( \sum\limits_{i^{\prime}=i+1}^{U}\mu^{(r)}_{1}(i^{\prime}) + {\gamma^{2}_{i}}-1 \right) \log(1-V_{\pi,i}). \end{array} $$

Let
$$\begin{array}{@{}rcl@{}} A_{i}^{(r)}=\mu_{1}^{(r)}(i)+{\gamma^{1}_{i}}-1,\hspace{0.5cm} B_{i}^{(r)}=\sum\limits_{i^{\prime}=i+1}^{U}\mu_{1}^{(r)}(i^{\prime})+{\gamma^{2}_{i}}-1 \end{array} $$

for i=1, ⋯, U−1. Then we have
$$\begin{array}{@{}rcl@{}} \frac{\partial}{\partial V_{\pi,i} } F(V_{\pi}) &=& \sum\limits_{i^{\prime}=1}^{U-1} \frac{\partial}{\partial V_{\pi,i}} \left(A_{i^{\prime}}^{(r)}\log V_{\pi,i^{\prime}}+B_{i^{\prime}}^{(r)} \log (1- V_{\pi,i^{\prime}}) \right)\\ &=&\frac{A_{i}^{(r)}}{V_{\pi,i}}-\frac{B_{i}^{(r)}}{1-V_{\pi,i}} =\frac{A_{i}^{(r)}-\left(A_{i}^{(r)}+B_{i}^{(r)}\right)V_{\pi,i}}{V_{\pi,i}(1-V_{\pi,i})} \end{array} $$ (67)

for i=1, ⋯, U−1. By setting \(\frac {\partial }{\partial V_{\pi ,i} } F(V_{\pi })=0\), we have
$$\begin{array}{@{}rcl@{}} V_{\pi,i}&=&\frac{A_{i}^{(r)}}{A_{i}^{(r)}+B_{i}^{(r)}}, \end{array} $$

which implies
$$\begin{array}{@{}rcl@{}} V_{\pi,i} = \frac{\mu_{1}^{(r)}(i)+{\gamma^{1}_{i}}-1}{{\sum}_{i^{\prime}=i}^{U}\mu^{(r)}_{1}(i^{\prime})+{\gamma^{1}_{i}}+{\gamma^{2}_{i}}-2} \end{array} $$ (68)

for i=1, ⋯, U−1. Note that \(V_{\pi,i}\) satisfies \(0<V_{\pi,i}<1\), since we selected hyperparameters \({\gamma ^{1}_{i}}\) and \({\gamma ^{2}_{i}}\) to satisfy \({\gamma ^{1}_{i}}+{\gamma ^{2}_{i}} > 2\) for i=1, ⋯, U−1.
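As a sanity check on the update (68) and on the stick-breaking construction \(\pi_i = V_{\pi,i} \prod_{i'<i}(1-V_{\pi,i'})\), here is a small sketch with hypothetical names:

```python
def stick_break(V):
    """Convert stick-breaking fractions V_1..V_{U-1} (with V_U = 1 appended)
    into probabilities pi_i = V_i * prod_{i'<i} (1 - V_i')."""
    pi, remaining = [], 1.0
    for v in list(V) + [1.0]:
        pi.append(v * remaining)     # take a fraction v of the remaining stick
        remaining *= (1.0 - v)
    return pi

def v_pi_update(mu1, gamma1, gamma2):
    """MAP update (68) for V_pi given first-step posteriors mu1 and Beta
    hyperparameters gamma1, gamma2 chosen so that gamma1_i + gamma2_i > 2."""
    U = len(mu1)
    return [(mu1[i] + gamma1[i] - 1) /
            (sum(mu1[i:]) + gamma1[i] + gamma2[i] - 2)
            for i in range(U - 1)]
```

By construction the resulting π sums to one, and each updated fraction stays strictly inside (0, 1) under the stated hyperparameter condition.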
Parameters a

An update formula for the state transition parameter a can be derived in a manner similar to that for π:
$$\begin{array}{@{}rcl@{}} V_{a,ij} = \frac{{\sum}_{t=1}^{T-1}\epsilon_{t}^{(r)}(i,j)+\alpha^{1}_{ij}-1}{ {\sum}_{t=1}^{T-1}{\sum}_{j^{\prime}=j}^{U}\epsilon_{t}^{(r)}(i,j^{\prime})+\alpha^{1}_{ij}+\alpha^{2}_{ij}-2} \end{array} $$ (69)

for i=1, ⋯, U and j=1, ⋯, U−1. In this paper, we selected hyperparameters \(\alpha ^{1}_{ij}\) and \(\alpha ^{2}_{ij}\) to satisfy \(\alpha ^{1}_{ij}+\alpha ^{2}_{ij} > 2\) for i=1, ⋯, U and j=1, ⋯, U−1.
Parameters b
Noticing that the prior for b is a finite-dimensional Dirichlet, we consider the terms containing \(b_{f_{l}}\):
$$\begin{array}{@{}rcl@{}} \begin{array}{lll} F(b_{f_{l}}, \lambda) &=& \sum\limits_{t=1}^{T} \sum\limits_{i=1}^{U} \sum\limits_{k=1}^{K_{l}} \mu_{t}^{(r)}(i) I(f_{l,t}=k) \log b_{f_{l}, ik} \\ &&+ \sum\limits_{i=1}^{U}\sum\limits_{k=1}^{K_{l}} (\beta_{f_{l}, ik}-1) \log b_{f_{l}, ik} - \lambda \left( \sum\limits_{k=1}^{K_{l}} b_{f_{l}, ik}-1\right) , \end{array} \end{array} $$ (70)

where λ is a Lagrange multiplier (see Note 6).
Taking the derivative of this function with respect to \(b_{f_{l}, ik}\) yields

$$\frac{\partial}{\partial b_{f_{l},ik}} F(b_{f_{l}}, \lambda) = \frac{{\sum}_{t=1}^{T} \mu_{t}^{(r)}(i) I(f_{l,t}=k) + \beta_{f_{l},ik}-1}{b_{f_{l},ik}} - \lambda$$

for i=1, ⋯, U, k=1, ⋯, K_l, and l=1, ⋯, L. By setting \(\frac {\partial }{\partial b_{f_{l},ik}} F(b_{f_{l}}, \lambda )=0\), we have

$$b_{f_{l},ik} = \frac{{\sum}_{t=1}^{T} \mu_{t}^{(r)}(i) I(f_{l,t}=k) + \beta_{f_{l},ik}-1}{\lambda}$$

for i=1, ⋯, U, k=1, ⋯, K_l, and l=1, ⋯, L. It follows from the constraint \({\sum }_{k=1}^{K_{l}} b_{f_{l}, ik}=1\) that

$$\lambda = \sum\limits_{k=1}^{K_{l}} \left( \sum\limits_{t=1}^{T} \mu_{t}^{(r)}(i) I(f_{l,t}=k) + \beta_{f_{l},ik}-1 \right)$$

for i=1, ⋯, U and l=1, ⋯, L. Then we have

$$\lambda = \sum\limits_{t=1}^{T} \mu_{t}^{(r)}(i) + \sum\limits_{k=1}^{K_{l}} \beta_{f_{l},ik} - K_{l}$$

for i=1, ⋯, U and l=1, ⋯, L. Therefore, the update formula for \(b_{f_{l},ik}\) is given by

$$b_{f_{l},ik} = \frac{{\sum}_{t=1}^{T} \mu_{t}^{(r)}(i) I(f_{l,t}=k) + \beta_{f_{l},ik}-1} {{\sum}_{t=1}^{T} \mu_{t}^{(r)}(i) + {\sum}_{k^{\prime}=1}^{K_{l}} \beta_{f_{l},ik^{\prime}} - K_{l}}$$

for i=1, ⋯, U, k=1, ⋯, K_l, and l=1, ⋯, L.
An update formula for b_e is derived in a similar manner as

$$b_{e,im} = \frac{{\sum}_{t=1}^{T} \mu_{t}^{(r)}(i) I(e_{t}=m) + \beta_{e,im}-1} {{\sum}_{t=1}^{T} \mu_{t}^{(r)}(i) + {\sum}_{m^{\prime}=1}^{M+1} \beta_{e,im^{\prime}} - (M+1)}$$

for i=1, ⋯, U and m=1, ⋯, M+1.
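The resulting M-step update is simply a posterior-weighted count plus Dirichlet pseudo-counts, normalized per state. A small sketch (hypothetical names; assumes every entry of `beta_prior` exceeds 1 so the numerators stay positive, mirroring the hyperparameter conditions above):

```python
def b_update(mu, f_l, K_l, beta_prior):
    """MAP update for discrete emission parameters b_{f_l, ik}:
    mu[t][i] are state posteriors, f_l[t] the observed symbol at time t,
    beta_prior[i][k] the Dirichlet hyperparameters."""
    T, U = len(mu), len(mu[0])
    b = [[0.0] * K_l for _ in range(U)]
    for i in range(U):
        # denominator: total posterior mass in state i plus pseudo-counts
        denom = sum(mu[t][i] for t in range(T)) + sum(beta_prior[i]) - K_l
        for k in range(K_l):
            num = (sum(mu[t][i] for t in range(T) if f_l[t] == k)
                   + beta_prior[i][k] - 1)
            b[i][k] = num / denom
    return b
```

Each row of the returned matrix sums to one, as the Lagrange-multiplier derivation guarantees.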
Kobayashi, G., Hatakeyama, H., Ota, K. et al. Predicting viewer-perceived activity/dominance in soccer games with stick-breaking HMM using data from a fixed set of cameras. Multimed Tools Appl 75, 3081–3119 (2016). https://doi.org/10.1007/s11042-014-2425-0