Psychoacoustic testing of Searle's model based on human audition

Sureshchandran, Gnanasi Romoni (1984) Psychoacoustic testing of Searle's model based on human audition. Masters thesis, Memorial University of Newfoundland.

PDF (English), Accepted Version; migrated (PDF/A conversion) from original format application/pdf
Available under License - The author retains copyright ownership and moral rights in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission.


Abstract

A software model was built to investigate the feasibility of reducing the dimensionality of the representation of speech for speech recognition. The model was based on a hardware model of human audition developed by Searle, whose characteristics closely match physiological characteristics measured experimentally in the human auditory system. The model consisted of 16 one-third-octave bandpass filters followed by envelope detectors, the outputs of which were then subjected to the linear discrete cosine transform in an attempt to reduce the data to a small number of perceptually important dimensions. Several signal processing techniques were investigated for reconstructing the filtered, detected and transformed speech to permit effective intelligibility testing. Most of the work was qualitative, and the methodology concentrated on testing the model rather than the theory. A test sentence spoken by a male and a female speaker was transformed using the model, and the transformed speech was reconstructed. Informal testing with a few experienced listeners suggests that it might be possible to recognize speech with as few as three of the sixteen channels. This study confirms the value of realizing such models in software. Although the model does not run in real time, it was far easier to modify than comparable hardware models.
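The data-reduction step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes an orthonormal DCT-II taken across the 16 channel envelope values of a single frame, and it assumes that "three out of the sixteen" can be read as keeping the first three transform coefficients. The function names, the 100 Hz lowest centre frequency and the sample envelope values are hypothetical, chosen only for illustration; the bandpass filtering and envelope detection stages are not modelled.

```python
import math

NUM_CHANNELS = 16  # number of filter channels stated in the abstract


def third_octave_centres(f_lowest_hz, count=NUM_CHANNELS):
    """Centre frequencies spaced one third of an octave apart.

    The lowest centre frequency is an assumption; the abstract does not give it.
    """
    return [f_lowest_hz * 2.0 ** (k / 3.0) for k in range(count)]


def _scale(k, n):
    """Orthonormal DCT scaling factor for coefficient index k."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(frame):
    """Orthonormal DCT-II of one frame of channel envelope values."""
    n = len(frame)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(frame))
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct(): reconstructs the channel values from the coefficients."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def reduce_frame(frame, keep):
    """Keep the first `keep` DCT coefficients, zero the rest, and reconstruct."""
    coeffs = dct(frame)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    return idct(truncated)


if __name__ == "__main__":
    # Hypothetical envelope values for one frame (one value per channel).
    envelope = [0.1, 0.3, 0.8, 1.0, 0.7, 0.4, 0.3, 0.5,
                0.9, 0.6, 0.3, 0.2, 0.15, 0.1, 0.05, 0.02]
    approx = reduce_frame(envelope, keep=3)
    error = math.sqrt(sum((a - b) ** 2 for a, b in zip(envelope, approx)))
    print("centre frequencies (Hz):", [round(f) for f in third_octave_centres(100.0)])
    print("reconstruction error with 3 coefficients:", error)
```

Keeping all 16 coefficients reproduces the frame exactly (up to rounding), so the reconstruction error in this sketch only measures what is lost by truncation. Whether the discarded detail matters perceptually is the question the thesis addresses through listening tests.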

Item Type: Thesis (Masters)
URI: http://research.library.mun.ca/id/eprint/1656
Item ID: 1656
Additional Information: Bibliography: leaves 148-151
Department(s): Engineering and Applied Science, Faculty of
Date: 1984
Date Type: Submission
Library of Congress Subject Heading: Automatic speech recognition; Data reduction; Psychoacoustics; Signal processing--Digital techniques; Speech synthesis
