URI | http://purl.tuc.gr/dl/dias/8BD8DCF4-6F56-4CA0-819D-1608076900A2 | - |
Identifier | http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=242489&url=http%3A%2F%2Fieeexplore.ieee.org%2Fiel4%2F89%2F6235%2F00242489.pdf%3Farnumber%3D242489 | - |
Identifier | https://doi.org/10.1109/89.242489 | - |
Language | en | - |
Extent | 12 pages | en |
Title | ML estimation of a stochastic linear system with the EM algorithm and its application to speech recognition | en |
Creator | Digalakis Vasilis | en |
Creator | Διγαλακης Βασιλης | el |
Creator | Rohlicek J. R. | en |
Creator | Ostendorf M. | en |
Publisher | Institute of Electrical and Electronics Engineers | en |
Content Summary | A nontraditional approach to the problem of estimating the parameters of a stochastic linear system is presented. The method is based on the expectation-maximization algorithm and can be considered as the continuous analog of the Baum-Welch estimation algorithm for hidden Markov models. The algorithm is used for training the parameters of a dynamical system model that is proposed for better representing the spectral dynamics of speech for recognition. It is assumed that the observed feature vectors of a phone segment are the output of a stochastic linear dynamical system, and it is shown how the evolution of the dynamics as a function of the segment length can be modeled using alternative assumptions. A phoneme classification task using the TIMIT database demonstrates that the approach is the first effective use of an explicit model for statistical dependence between frames of speech. | en |
Type of Item | Peer-Reviewed Journal Publication | en |
Type of Item | Δημοσίευση σε Περιοδικό με Κριτές | el |
License | http://creativecommons.org/licenses/by/4.0/ | en |
Date of Item | 2015-11-02 | - |
Date of Publication | 1993 | - |
Subject | Speech recognition | en |
Bibliographic Citation | V. Digalakis, J. R. Rohlicek and M. Ostendorf, "ML estimation of a stochastic linear system with the EM algorithm and its application to speech recognition," IEEE Trans. Speech Audio Process., vol. 1, no. 4, pp. 431-442, Oct. 1993. doi:10.1109/89.242489 | en |