TuttoCongressi

INTERSPEECH 2011

Palazzo Congressi and Palazzo Affari (Florence)
26/08/2011 - 31/08/2011
 
Event Program
27/08/2011
Botticelli
14:30 - 18:00
T-A5: AUTOMATIC SUMMARIZATION
Nenkova A.
Automatic Summarization
 
Leonardo
14:30 - 18:00
T-A1: FUNCTIONAL DATA ANALYSIS FOR SPEECH RESEARCH
Gubian M.
T-A1: Functional Data Analysis for Speech Research
 
Michelangelo
14:30 - 18:00
T-A2: REGISTERS AND RESONANCES IN SINGING
Wolfe J.
Registers and Resonances in Singing
 
28/08/2011
Auditorium
10:00 - 11:00
OPENING CEREMONY
Opening Ceremony 
Isabel Trancoso
11:00 - 12:00
ISCA SCIENTIFIC ACHIEVEMENT MEDALLIST FOR 2011
Julia Hirschberg 
Speaking More Like You: Entrainment in Conversational Speech
Logistics 
13:30 - 15:30
SPEAKER RECOGNITION - MODELING
Y. Bistritz 
Skew Gaussian Mixture Models for Speaker Recognition
Orith Toledo-Ronen 
Towards Goat Detection in Text-Dependent Speaker Verification
Jean-François Bonastre 
Speaker Modeling Using Local Binary Decisions
Hagai Aronowitz 
New Developments in Voice Biometrics for User Authentication
Miranti Indar Mandasari 
Evaluation of i-vector Speaker Recognition Systems for Forensic Application
16:00 - 18:00
SPEAKER RECOGNITION - MODELING, AUTOMATIC PROCEDURES, ANALYSIS I
Rong Zheng 
Restoring the Residual Speaker Information in Total Variability Modeling for Speaker Verification
Hagai Aronowitz 
New Developments in Joint Factor Analysis for Speaker Verification
Joaquin Gonzalez-Rodriguez 
Speaker Recognition Using Temporal Contours in Linguistic Units: The Case of Formant and Formant-Bandwidth Trajectories
Ondřej Glembek
Discriminatively Trained i-vector Extractor for Speaker Verification
Michelle Hewlett Sanchez 
Constrained Cepstral Speaker Recognition Using Matched UBM and JFA Training
Alan McCree 
A New Perspective on GMM Subspace Compensation Based on PPCA and Wiener Filtering
 
Botticelli
13:30 - 15:30
SECOND LANGUAGE ACQUISITION, DEVELOPMENT AND LEARNING I
Mikhail Ordin 
Acquisition of Timing Patterns in Second Language
Hongyan Li 
Context-Dependent Duration Modeling with Backoff Strategy and Look-Up Tables for Pronunciation Assessment and Mispronunciation Detection
Mee Sonu 
Perceptual Training of Vowel Length Contrast of Japanese by L2 Listeners: Effects of an Isolated Word versus a Word Embedded in Sentences
E-Chin Wu 
Similar Vowels in L1/L2 Production: Confused or Discerned in Early L2 English Learners with Different Amount of Exposure
Lya Meister 
Production and Perception of Estonian Vowels by Native and Non-Native Speakers
Hiroshi Kibishi 
New Feature Parameters for Pronunciation Evaluation in English Presentations at International Conferences
Gérard Bailly 
Synchronous Reading: Learning French Orthography by Audiovisual Training
Christos Koniaris 
Phoneme Level Non-Native Pronunciation Analysis by an Auditory Model-Based Native Assessment Scheme
Pavel Šturm
The Open Front Vowel /æ/ in the Production and Perception of Czech Students of English
Catia Cucchiarini 
Error Selection for ASR-Based English Pronunciation Training in 'My Pronunciation Coach'
Tomoko Nariai 
An Experimental Analysis of Pitch Patterns in Japanese Speakers of English with Verification by Speech Re-Synthesis
Tomoko Nariai 
An Analysis of Word Duration in Native Speakers and Japanese Speakers of English
16:00 - 18:00
PROSODIC STRUCTURE
Joseph Tepperman 
Where Should Pitch Accents and Phrase Breaks Go? A Syntax Tree Transducer Solution
Giuliano Bocci 
Phrasal Prominences do not need Pitch Movements: Postfocal Phrasal Heads in Italian
David Le Gac 
Intonation of Left Dislocated Topics in Modern Greek
Laura Thompson 
Phrases, Pitch and Perceived Prominence in Maori
Tomáš Duběda
Perceptual Sensitivity to Prenuclear and Nuclear Intonational Patterns
Raya Kalaldeh 
Tonal Alignment Defined: The Case of Southern Irish English
Andrew Rosenberg 
Using Mutual Information to Identify Regions of Analysis for Prosodic Analysis
Chiu-yu Tseng 
Prosodic Highlights in Mandarin Continuous Speech - Cross-Genre Attributes and Implications
Simone Sulpizio 
When Two Newly-Acquired Words are One: New Words Differing in Stress Alone are not Automatically Represented Differently
Shehui Bu 
Automatic Determination of the Standard Chinese Prosodic Phrase Boundaries by F0 Generation Model
Céline De Looze 
Measuring Speakers' Similarity in Speech by Means of Prosodic Cues: Methods and Potential
Li-chiung Yang 
Tonal Variations in Mandarin: New Evidence from Spontaneous and Read Speech
 
Brunelleschi
13:30 - 15:30
SPEECH REPRESENTATION AND MODELLING
Faten Ben Ali 
A Long-Term Harmonic Plus Noise Model for Speech Signals
Alan Ó Cinnéide 
A Frequency Domain Approach to ARX-LF Voiced Speech Parameterization and Synthesis
Vikram Ramanarayanan 
Automatic Data-Driven Learning of Articulatory Primitives from Real-Time MRI Data Using Convolutive NMF with Sparseness Constraints
Ravi Vipperla 
Online Pattern Learning for Non-Negative Convolutive Sparse Coding
Nicolas Malyska 
Sinewave Representations of Nonmodality
Chandra Seelamantula 
Time-Varying Signal Adaptive Transform and IHT Recovery of Compressive Sensed Speech
16:00 - 18:00
SPEECH ANALYSIS
C.F. Pedersen 
Adaptive Estimation of Zeros of Time-Varying Z-Transforms
John Kane 
Identifying Regions of Non-Modal Phonation Using Features of the Wavelet Transform
Keith Godin 
Acoustic Analysis of Whispered Speech for Phoneme and Speaker Dependency
Afsaneh Asaei 
Multi-Party Speech Recovery Exploiting Structured Sparsity Models
Sri Harish Mallidi 
Modulation Spectrum Analysis for Recognition of Reverberant Speech
Petko N. Petkov 
Discrete Choice Models for Non-Intrusive Quality Assessment
 
Caravaggio
13:30 - 14:30
SPEECH AND LANGUAGE PROCESSING-BASED ASSISTIVE TECHNOLOGIES AND HEALTH APPLICATIONS I
Douglas Sturim 
Automatic Detection of Depression in Speech Using Gaussian Mixture Modeling with Factor Analysis
H. Timothy Bunnell 
Utterance Verification for Automating the Hearing in Noise Test (HINT)
Chi-Chun Lee 
Analyzing the Nature of ECA Interactions in Children with Autism
14:30 - 15:30
SPEECH AND LANGUAGE PROCESSING-BASED ASSISTIVE TECHNOLOGIES AND HEALTH APPLICATIONS II
Theologos Athanaselis 
Incorporating Speech Recognition Engine into an Intelligent Assistive Reading System for Dyslexic Students
Nicholas Cummins 
An Investigation of Depressed Speech Detection: Features and Normalization
Michelle Hewlett Sanchez 
Using Prosodic and Spectral Features in Detecting Depression in Elderly Males
Catherine Middag 
Combining Phonological and Acoustic ASR-Free Features for Pathological Speech Intelligibility Assessment
Robin Hofe 
Speech Synthesis Parameter Generation for the Assistive Silent Speech Interface MVOCA
Peter A. Heeman 
Computer-Assisted Disfluency Counts for Stuttered Speech
Richard Hummel 
Spectral Features for Automatic Blind Intelligibility Estimation of Spastic Dysarthric Speech
Emily T. Prud'hommeaux 
Extraction of Narrative Recall Patterns for Neuropsychological Assessment
Aki Kunikoshi 
Gesture Design of Hand-to-Speech Converter Derived from Speech-to-Hand Converter Based on Probabilistic Integration Model
Akira Sasou 
Powered Wheelchair Control Using Acoustic-Based Recognition of Head Gesture Accompanying Speech
José Luis Blanco 
Analyzing Training Dependencies and Posterior Fusion in Discriminant Classification of Apnea Patients Based on Sustained and Connected Speech
16:00 - 17:00
CROWDSOURCING FOR SPEECH PROCESSING I
Gabriel Parent - M. Eskenazi
Speaking to the Crowd: Looking at Past Achievements in Using Crowdsourcing for Speech and Predicting Future Challenges
17:00 - 18:00
CROWDSOURCING FOR SPEECH PROCESSING II
Chia-ying Lee 
A Transcription Task for Crowdsourcing with Automatic Quality Control
Kartik Audhkhasi 
Reliability-Weighted Acoustic Model Adaptation Using Crowd Sourced Transcriptions
Martin Cooke 
Crowdsourcing for Word Recognition in Noise
Sabine Buchholz 
Crowdsourcing Preference Tests, and How to Detect Cheating
Ian McGraw 
Growing a Spoken Language Interface on Amazon Mechanical Turk
F. Jurčíček
Real User Evaluation of Spoken Dialogue Systems Using Amazon Mechanical Turk
Hadrien Gelas 
Quality Assessment of Crowdsourcing Transcriptions for African Languages
Keelan Evanini 
Using Crowdsourcing to Provide Prosodic Annotations for Non-Native Speech
Masataka Goto 
PodCastle: Recent Advances of a Spoken Document Retrieval Service Improved by Anonymous User Contributions
 
Leonardo
13:30 - 15:30
SPEECH PERCEPTION - SPEECH INTELLIGIBILITY
Nandini Iyer 
Segregation of Whispered Speech Interleaved with Noise or Speech Maskers
Roi Kliper 
Monaural Azimuth Localization Using Spectral Dynamics of Speech
Jan Rennies 
Prediction of Binaural Intelligibility Level Differences in Reverberation
Aurore Gautreau 
Let's All Speak Together! Exploring the Impact of Various Languages on the Comprehension of Speech in Multi-Linguistic Babble
Valeriy Shafiro 
Cross-Rate Variation in the Intelligibility of Dual-Rate Gated Speech in Older Listeners
Chia-ying Lee 
An Efferent-Inspired Auditory Model Front-End for Speech Recognition
16:00 - 18:00
SPEECH PERCEPTION - PERCEPTUAL LEARNING AND CROSS-LANGUAGE PERCEPTION
Odette Scharenborg 
Perceptual Learning of Liquids
Anne Cutler 
The Efficiency of Cross-Dialectal Word Recognition
Minoru Tsuzaki 
Estimation of Perceptual Spaces for Speaker Identities Based on the Cross-Lingual Discrimination Task
Sharon Peperkamp 
The Relation Between Perception and Production in L2 Phonological Processing
Jan Volin 
The Role of Word-Initial Glottal Stops in Recognizing English Words
Caicai Zhang 
Effect of Language Experience on the Categorical Perception of Cantonese Vowel Duration
 
Michelangelo
13:30 - 15:30
EMOTION, SPEAKING STYLE, AND SOCIAL BEHAVIOR
Martin Wöllmer 
Acoustic-Linguistic Recognition of Interest in Speech with Bottleneck-BLSTM Nets
Mustafa Erden 
Automatic Detection of Anger in Human-Human Call Center Dialogs
Keng-hao Chang 
Improved Classification of Speaking Styles for Mental Health Monitoring Using Phoneme Dynamics
Matthew P. Black 
"You made me do it": Classification of Blame in Married Couples' Interactions by Fusing Automatically Derived Speech and Language Information
Martijn Goudbeek 
Context and Priming Effects in the Recognition of Emotion of Old and Young Listeners
Agustín Gravano 
Acoustic and Prosodic Correlates of Social Behavior
16:00 - 18:00
SPEECH ENHANCEMENT AND DEREVERBERATION
Keisuke Kinoshita 
Single Channel Dereverberation Using Example-Based Speech Enhancement with Uncertainty Decoding Technique
Jan S. Erkelens 
A Statistical Room Impulse Response Model with Frequency Dependent Reverberation Time for Single-Microphone Late Reverberation Suppression
Chan W.
An Assessment of the Improvement Potential of Time-Frequency Masking for Speech Dereverberation
A. Lima 
Perceptual Improvement of a Two-Stage Algorithm for Speech Dereverberation
Friedrich Faubel 
A Model-Based Spectral Envelope Wiener Filter for Perceptually Motivated Speech Enhancement
Jorge I. Marin-Hurtado 
Binaural Noise-Reduction Method Based on Blind Source Separation and Perceptual Post Processing
 
Raffaello
13:30 - 15:30
HMM-BASED SPEECH SYNTHESIS I
Kyung Hwan Oh 
Decision Tree-Based Clustering with Outlier Detection for HMM-Based Speech Synthesis
Hanna Silén 
Prediction of Voice Aperiodicity Based on Spectral Representations in HMM Speech Synthesis
Takashi Nose 
A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM
Kei Hashimoto 
Multi-Speaker Modeling with Shared Prior Distributions and Model Structures for Bayesian Speech Synthesis
Zhen-Hua Ling 
Feature-Space Transform Tying in Unified Acoustic-Articulatory Modelling for Articulatory Control of HMM-Based Speech Synthesis
Matt Shannon 
The Effect of Using Normalized Models in Statistical Speech Synthesis
16:00 - 18:00
ASR - FEATURE EXTRACTION II
Tim Ng 
Region Dependent Transform on MLP Features for Speech Recognition
Martin Heckmann 
Discriminant Sub-Space Projection of Spectro-Temporal Speech Features Based on Maximizing Mutual Information
Takashi Fukuda 
Combining Feature Space Discriminative Training with Long-Term Spectro-Temporal Features for Noise-Robust Speech Recognition
Patrick Haffner 
Combining Frame and Segment Level Processing via Temporal Pooling for Phonetic Classification
M. Seltzer 
Improved Bottleneck Features Using Pretrained Deep Neural Networks
Yuan-Fu Liao 
Minimum Classification Error Based Spectro-Temporal Feature Extraction for Robust Audio Classification
 
29/08/2011
Auditorium
08:30 - 09:30
NEURAL REPRESENTATIONS OF WORD MEANINGS
Eat Italian in Tuscany 
Tom M. Mitchell 
Neural Representations of Word Meanings
Logistics 
10:00 - 12:00
SPEAKER RECOGNITION - MODELING, AUTOMATIC PROCEDURES, ANALYSIS II
Rong Zheng 
Data-Driven Gaussian Component Selection for Fast GMM-Based Speaker Verification
Daniel Garcia-Romero 
Analysis of i-vector Length Normalization in Speaker Recognition Systems
Weiwu Jiang 
An Analysis Framework Based on Random Subspace Sampling for Speaker Verification
Nicolas Scheffer 
Factor Analysis Back Ends for MLLR Transforms in Speaker Recognition
Craig S. Greenberg 
Report on Performance Results in the NIST 2010 Speaker Recognition Evaluation
Marcel Kockmann 
iVector Fusion of Prosodic and Cepstral Features for Speaker Verification
13:30 - 15:30
SPEAKER RECOGNITION - ANALYSIS AND STATISTICS I
Frank Soong 
Improvements in Speaker Characterization With Spectral Subband Energy Based on Harmonic plus Noise Model
Yosef A. Solewicz 
Implicit Segmentation in Two-Wire Speaker Recognition
Omar Mohamed
Boosting Speaker Recognition Performance with Compact Representations
Carlos Vaquero 
Partitioning of Two-Speaker Conversation Datasets
16:00 - 18:00
SPEAKER RECOGNITION - ANALYSIS AND STATISTICS II
Pierre-Michel Bousquet 
Intersession Compensation and Scoring Methods in the i-vectors Space for Speaker Recognition
Szymon Drgas 
Kernel Alignment Maximization for Speaker Recognition Based on High-Level Features
Dmitry Zotkin
Kernel Partial Least Squares for Speaker Recognition
Mohamed Kamal Omar 
Conversational-Side-Specific Inter-Session Variability Compensation
David A. van Leeuwen 
A Speaker Line-Up for the Likelihood Ratio
Jesús Villalba 
Towards Fully Bayesian Speaker Recognition: Integrating Out the Between-Speaker Covariance
 
Brunelleschi
10:00 - 12:00
ACOUSTIC EVENT DETECTION
Jürgen T. Geiger 
Learning New Acoustic Events in an HMM-Based System Using MAP Adaptation
Yi Ren Leng 
Alternative Frequency Scale Cepstral Coefficient for Robust Sound Event Recognition
Akinori Ito 
Evaluation of Abnormal Sound Detection using Multi-Stage GMM in Various Environments
Joerg Schmalenstroeer 
Unsupervised Learning of Acoustic Events Using Dynamic Time Warping and Hierarchical K-Means++ Clustering
Ascension Gallardo Antolin 
Feature Extraction Assessment for an Acoustic-Event Classification Task Using the Entropy Triangle
Stavros Tsakalidis 
Unsupervised Audio Analysis for Categorizing Heterogeneous Consumer Domain Videos
13:30 - 15:30
SPEECH SEGMENTATION
Yih-Ru Wang 
A Two-Stage Sample-Based Phone Boundary Detector Using Segmental Similarity Features
Qiang Huang 
Iterative Improvement of Speaker Segmentation in a Noisy Environment Using High-Level Knowledge
Diego Castán 
Hierarchical Audio Segmentation with HMM and Factor Analysis in Broadcast News Domain
Ozlem Kalinli 
Syllable Segmentation of Continuous Speech Using Auditory Attention Cues
Alan W Black 
Exploiting Phone-Class Specific Landmarks for Refinement of Segment Boundaries in TTS Databases
Juan José Burred 
Phoneme-Level Text to Audio Synchronization on Speech Signals with Background Music
16:00 - 18:00
ASR - LEXICAL, PROSODIC AND MULTI-LINGUAL MODELS
Sravana Reddy 
Learning from Mistakes: Expanding Pronunciation Lexicons Using Word Recognition Errors
David Imseng 
Improving Non-Native ASR Through Stochastic Multilingual Phoneme Space Transformations
Scott Novotney 
Unsupervised Arabic Dialect Adaptation with Self-Training
Van Compernolle D.
Template-Based Automatic Speech Recognition Meets Prosody
Ian McGraw 
Pronunciation Learning from Continuous Speech
Yanmin Qian 
State-Level Data Borrowing for Low-Resource Speech Recognition Based on Subspace GMMs
Liao Y.
 
Donatello
13:30 - 15:30
SHOW & TELL DEMONSTRATION - SPEECH SYSTEMS AND APPLICATIONS
Felix Burkhardt 
An Affective Spoken Storyteller
Lijuan Wang 
Text Driven 3D Photo-Realistic Talking Head
Takayuki Arai 
Physical Models Producing Vowels with Pitch Variation
Margot Mieskes 
An Engine-Independent Text-to-Speech Workplace
Simone Carcone 
An Application to Test the Emotion Conveyed by Vocal and Musical Signals
Mariusz Ziółko
Automatic Speech Recognition System Dedicated for Polish
Kong Aik Lee 
Joint Application of Speech and Speaker Recognition for Automation and Security in Smart Home
Staffan Larsson 
Adding a Speech Cursor to a Multimodal Dialogue System
S. Thomas Christie 
Prosody Toolkit: Integrating HTK, Praat and WEKA
F. Francesconi 
Collecting Life Logs for Experience-Based Corpora
16:00 - 18:00
SPEECH SYNTHESIS - SELECTED TOPICS
Alok Parlikar 
A Grammar Based Approach to Style Specific Phrase Prediction
Oliver Watts 
Unsupervised Features from Text for Speech Synthesis in a Speech-to-Speech Translation System
Oliver Watts 
Unsupervised Continuous-Valued Word Features for Phrase-Break Prediction without a Part-of-Speech Tagger
Francisco Campillo 
Albayzín 2010: A Spanish Text to Speech Evaluation
Binbin Shen 
Combining Active and Semi-Supervised Learning for Homograph Disambiguation in Mandarin Text-to-Speech Synthesis
Thomas Ewender 
Automatically Creating a Diphone Set from a Speech Database
Wesley Mattheyses 
Automatic Viseme Clustering for Audiovisual Speech Synthesis
Florian Hinterleitner 
Perceptual Quality Dimensions of Text-to-Speech Systems
Shinsuke Mori 
A Pointwise Approach to Pronunciation Estimation for a TTS Front-End
Mohamed Abou-Zleikha 
Correlating Text with Prosody
Andrew Rosenberg
"What is. . . Dengue Fever?" - Modeling and Predicting Pronunciation Errors in a Text-to-Speech System
Christoph Norrenbrock 
Aperiodicity Analysis for Quality Estimation of Text-to-Speech Signals
 
Leonardo
10:00 - 12:00
SPEECH PRODUCTION - ARTICULATORY MEASUREMENTS
Yoon-Chul Kim 
Visualization of Vocal Tract Shape Using Interleaved Real-Time MRI of Multiple Scan Planes
Pascal Perrier
Biomechanical Tongue Models: An Approach to Studying Inter-Speaker Variability
Jun Wang 
Quantifying Articulatory Distinctiveness of Vowels
Michael Proctor 
Direct Estimation of Articulatory Kinematics from Real-Time Magnetic Resonance Image Sequences
Peter Birkholz 
Combined Optical Distance Sensing and Electropalatography to Measure Articulation
Santitham Prom-on 
Simulating Post-L F0 Bouncing by Modeling Articulatory Dynamics
13:30 - 15:30
SPEECH PRODUCTION - COARTICULATION AND SPEECH TIMING
Štefan Beňuš
Jaw Movement in Vowels and Liquids Forming the Syllable Nucleus
Antonio Stella
Coarticulation Across Prosodic Domains in Italian: An Ultrasound Investigation
Juraj Šimko
Investigating the Stability of Intergestural Timing Relations
Claudio Zmarich 
Speech Timing Organization for the Phonological Length Contrast in Italian Consonants
Chiara Celata 
Timing in Italian VNC Sequences at Different Speech Rates
Christina Hagedorn 
Automatic Analysis of Singleton and Geminate Consonant Articulation Using Real-Time Magnetic Resonance Imaging
16:00 - 18:00
PHYSIOLOGY AND PATHOLOGY OF SPOKEN LANGUAGE
Hemant A. Patil 
Novel VTEO Based Mel Cepstral Features for Classification of Normal and Pathological Voices
Eiji Shimura 
Temporal Performance of Dysarthric Patients in Speech and Tapping Tasks
Xinhui Zhou 
A Comparative Acoustic Study on Speech of Glossectomy Patients and Normal Subjects
Francis Grenez 
Dysperiodicity Analysis of Perceptually Assessed Synthetic Speech Stimuli
Alain Ghio 
Is the Perception of Voice Quality Language-Dependant? A Comparison of French and Italian Listeners and Dysphonic Speakers
J.R. Orozco-Arroyave 
Automatic Selection of Acoustic and Non-Linear Dynamic Features in Voice Signals for Hypernasality Detection
 
Michelangelo
10:00 - 12:00
SPEECH SYNTHESIS - UNIT SELECTION AND HYBRID APPROACHES
Vivek Kumar Rangarajan Sridhar 
Enriching Text-to-Speech Synthesis Using Automatic Dialog Act Tags
Lukas Latacz 
Joint Target and Join Cost Weight Training for Unit Selection Synthesis
Andreas Windmann 
Prominence-Based Prosody Prediction for Unit Selection Speech Synthesis
Marc Schröder
Evaluating the Meaning of Synthesized Listener Vocalizations
Iñaki Sainz 
A Hybrid TTS Approach for Prosody and Acoustic Modules
Alexander Sorin 
Uniform Speech Parameterization for Multi-Form Segment Synthesis
13:30 - 15:30
ASR - ACOUSTIC MODELS II
Frank Seide 
Conversational Speech Transcription Using Context-Dependent Deep Neural Networks
Guangsen Wang 
Sequential Classification Criteria for NNs in Automatic Speech Recognition
Herve Bourlard 
Grapheme-Based Automatic Speech Recognition Using KL-HMM
Joseph Keshet 
Direct Error Rate Minimization of Hidden Markov Models
Xie Sun 
On the Effectiveness of Statistical Modeling Based Template Matching Approach for Continuous Speech Recognition
Guangsen Wang 
Comparison of Smoothing Techniques for Robust Context Dependent Acoustic Modelling in Hybrid NN/HMM Systems
16:00 - 18:00
SOURCE SEPARATION
Douglas O'Shaughnessy
Blind Speech Separation in Multiple Environments Using a Frequency Oriented PCA Method for Convolutive Mixtures
Zbyněk Koldovský
Blind Speech Separation in Time-Domain Using Block-Toeplitz Structure of Reconstructed Signal Matrices
Auxiliadora Sarmiento 
Generalized Method for Solving the Permutation Problem in Frequency-Domain Blind Source Separation of Convolved Speech Signals
Emad M. Grais 
Adaptation of Speaker-Specific Bases in Non-Negative Matrix Factorization for Single Channel Speech-Music Separation
Laurent Girin
An Informed Source Separation System for Speech Signals
Ngoc Thuy Tran 
Adaptive Blocking Beamformer for Speech Separation
 
Raffaello
10:00 - 12:00
SPEECH ENHANCEMENT ANALYSIS AND EVALUATION
Ryoichi Miyazaki 
Theoretical Analysis of Musical Noise and Speech Distortion in Structure-Generalized Parametric Blind Spatial Subtraction Array
Yan Tang 
Subjective and Objective Evaluation of Speech Intelligibility Enhancement Under Constant Energy and Duration Constraints
Chandra Sekhar Seelamantula 
A Risk-Estimation-Based Comparison of Mean Square Error and Itakura-Saito Distortion Measures for Speech Enhancement
Mahdi Triki 
On Noise Tracking for Noise Floor Estimation
Ben Milner 
Maximum a posteriori Estimation of Noise from Non-Acoustic Reference Signals in Very Low Signal-to-Noise Ratio Environments
Ryo Wakisaka 
Blind Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator
13:30 - 15:30
ROBUST SPEECH RECOGNITION II
Ramón Fernandez Astudillo 
Propagation of Uncertainty Through Multilayer Perceptrons for Robust Automatic Speech Recognition
Katariina Mahkonen 
Mapping Sparse Representation to State Likelihoods in Noise-Robust Automatic Speech Recognition
Heikki Kallasjoki 
Uncertainty Measures for Improving Exemplar-Based Source Separation
Hsien-Cheng Liao 
Maximum Confidence Measure Based Interaural Phase Difference Estimation for Noise Masking in Dual-Microphone Robust Speech Recognition
Richard Rose
A Performance Monitoring Approach to Fusing Enhanced Spectrogram Channels in Robust Speech Recognition
Xunying Liu
Generalized Variable Parameter HMMs for Noise Robust Speech Recognition
16:00 - 18:00
MULTIMODAL SIGNAL PROCESSING
Per Ola Kristensson 
Asynchronous Multimodal Text Entry Using Speech and Gesture Keyboards
Niall McLaughlin 
Robust Bimodal Person Identification Using Face and Speech with Limited Training Data and Corruption of Both Modalities
Atef Ben Youssef 
Toward a Multi-Speaker Visual Articulatory Feedback System
Thomas Hueber 
Statistical Mapping Between Articulatory and Acoustic Data for an Ultrasound-Based Silent Speech Interface
Joerg Schmalenstroeer 
Unsupervised Geometry Calibration of Acoustic Sensor Networks Using Source Correspondences
Michael Wand 
Investigations on Speaking Mode Discrepancies in EMG-Based Speech Recognition
Liao Y.
 
30/08/2011
Auditorium
08:30 - 09:30
HONEST SIGNALS
Pascale Fung
Alex 'Sandy' Pentland 
Honest Signals
Logistics 
10:00 - 12:00
ASR - LANGUAGE MODELS II
Tomáš Mikolov
Empirical Evaluation and Combination of Advanced Language Modeling Techniques
Geoffrey Zweig 
Personalizing Model M for Voice-Search
Takahiro Shinozaki 
Sentence Selection by Direct Likelihood Maximization for Language Model Adaptation
Ebru Arisoy 
Feature Combination Approaches for Discriminative Language Models
Sankaranarayanan Ananthakrishnan 
On-Line Language Model Biasing for Multi-Pass Automatic Speech Recognition
Moonyoung Kang 
Mandarin Word-Character Hybrid-Input Neural Network Language Model
13:30 - 15:30
DIALECT AND ACCENT IDENTIFICATION
Philippe Boula de Mareüil 
In Search of Cues Discriminating West-African Accents in French
Abualsoud Hanani 
Computer and Human Recognition of Regional Accents of British English
Bin Ma 
Target-Aware Lattice Rescoring for Dialect Recognition
Murat Akbacak 
Effective Arabic Dialect Classification Using Diverse Phonotactic Models
Nancy F. Chen 
Characterizing Deletion Transformations Across Dialects Using a Sophisticated Tying Mechanism
Fadi Biadsy 
Dialect and Accent Recognition Using Phonetic-Segmentation Supervectors
16:00 - 18:00
LANGUAGE IDENTIFICATION
Rong Zheng 
Data-Driven UBM Generation via Tied Gaussians for GMM-Supervector Based Accent Identification
David Martínez 
I3A Language Recognition System for Albayzin 2010 LRE
Mikel Penagarikano 
Dimensionality Reduction for Using High-Order n-Grams in SVM-Based Phonotactic Language Recognition
Pedro A. Torres-Carrasquillo 
Language Recognition via i-vectors and Dimensionality Reduction
David Martínez 
Language Recognition in iVectors Space
 
Brunelleschi
10:00 - 12:00
VOICE CONVERSION
Daisuke Saito 
One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space
Yu Qiao 
A Study on Bag of Gaussian Model with Application to Voice Conversion
Lei Li 
A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures
Mahdi Eslami 
Quality Improvement of Voice Conversion Systems Based on Trellis Structured Vector Quantization
Hadas Benisty 
Voice Conversion Using GMM with Enhanced Global Variance
Elizabeth Godoy 
Spectral Envelope Transformation Using DFW and Amplitude Scaling for Voice Conversion with Parallel or Nonparallel Corpora
13:30 - 15:30
ASR - ACOUSTIC MODELS III
Roger W Hsiao 
Generalized Baum-Welch Algorithm and its Implication to a New Extended Baum-Welch Algorithm
Phil Woodland 
Word Boundary Modelling and Full Covariance Gaussians for Arabic Speech-to-Text Systems
Brian Mak
A Fully Automated Derivation of State-Based Eigentriphones for Triphone Modeling with No Tied States Using Regularization
Nahamoo D.
Reducing Computational Complexities of Exemplar-Based Sparse Representations with Applications to Large Vocabulary Speech Recognition
Zhi-Jie Yan
An i-vector Based Approach to Training Data Clustering for Improved Speech Recognition
Senaka Buthpitiya 
Rapid Training of Acoustic Models Using Graphics Processing Unit
16:00 - 18:00
ASR - SEARCH, KEYWORD SPOTTING AND CONFIDENCE MEASURES II
Evelyn Kurniawati 
A Template Based Voice Trigger System Using Bhattacharyya Edit Distance
D. Nolden 
Acoustic Look-Ahead for More Efficient Decoding in LVCSR
Frank Duckhorn 
A New Epsilon Filter for Efficient Composition of Weighted Finite-State Transducers
Sabato Marco Siniscalchi 
A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines
M.S. Seigel 
Combining Information Sources for Confidence Estimation with CRF Models
Kouichi Katsurada 
Evaluation of Fast Spoken Term Detection Using a Suffix Array
 
Caravaggio
13:30 - 14:30
SPOKEN LANGUAGE PROCESSING OF HUMAN-HUMAN CONVERSATIONS I
Fabio Valente 
Language-Independent Socio-Emotional Role Recognition in the AMI Meetings Corpus
Rivka Levitan 
Measuring Acoustic-Prosodic Entrainment with Respect to Multiple Levels and Dimensions
Youngja Park 
Automatic Call Quality Monitoring Using Cost-Sensitive Classification
14:30 - 15:30
SPOKEN LANGUAGE PROCESSING OF HUMAN-HUMAN CONVERSATIONS II
Tomoharu Iwata 
Learning Influences from Word Use in Polylogue
Wen Wang 
Identifying Agreement/Disagreement in Conversational Speech: A Cross-Lingual Study
Daniel Neiberg 
A Dual Channel Coupled Decoder for Fillers and Feedback
Chi-Chun Lee 
An Analysis of PCA-Based Vocal Entrainment Measures in Married Couples' Affective Spoken Interactions
16:00 - 17:00
SPEECH AND AUDIO PROCESSING FOR HUMAN-ROBOT INTERACTION I
Lars Schillingmann 
Using Prominence Detection to Generate Acoustic Feedback in Tutoring Scenarios
Takuma Otsuka 
Bayesian Extension of MUSIC for Sound Source Localization and Tracking
Martin Wöllmer 
Speech-Based Non-Prototypical Affect Recognition for Child-Robot Interaction in Reverberated Environments
17:00 - 18:00
SPEECH AND AUDIO PROCESSING FOR HUMAN-ROBOT INTERACTION II
Mounira Maazaoui 
Blind Source Separation for Robot Audition Using Fixed Beamforming with HRTFs
Marie Tahon 
Real-Life Emotion Detection from Speech in Human-Robot Interaction: Experiments Across Diverse Corpora with Child and Adult Voices
Yazid Attabi 
Weighted Ordered Classes - Nearest Neighbors: A New Framework for Automatic Emotion Recognition from Speech
David Doukhan 
Prosodic Analysis of a Corpus of Tales
Carlos T. Ishi 
Analysis of Acoustic-Prosodic Features Related to Paralinguistic Information Carried by Interjections in Dialogue Speech
Martin Heckmann 
Robust Intonation Pattern Classification in Human Robot Interaction
Takashi Sumiyoshi 
ASR for Human-Symbiotic Robot "EMIEW2" with Mechanical Noise and Floor-Level Noise Reduction
 
Donatello
10:00 - 12:00
SPEECH AUDIO ANALYSIS AND CLASSIFICATION
Seppo Fagerlund 
Stop Consonant Recognition by Temporal Fine Structure of Burst
Katrin Kirchhoff 
Phonetic Classification Using Controlled Random Walks
Luís Marujo 
Keyphrase Cloud Generation of Broadcast News
Alfonso M. 
Optimized Feature Extraction and HMMs in Subword Detectors
Ziqiang Shi 
Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs
Manas A. 
Privacy Preserving Speaker Verification Using Adapted GMMs
Éva Székely 
Clustering Expressive Speech Styles in Audiobooks Using Glottal Source Parameters
Bogdan Ludusan 
On the Use of the Rhythmogram for Automatic Syllabic Prominence Detection
Sethserey Sam 
Speech Modulation Features for Robust Nonnative Speech Accent Detection
Chi Zhang 
Frame-Level Vocal Effort Likelihood Space Modeling for Improved Whisper-Island Detection
Xing Fan 
Speaker Identification for Whispered Speech Using a Training Feature Transformation from Neutral to Whisper
Andrea DeMarco 
An Accurate and Robust Gender Identification Algorithm
Xiaohong Yang 
Deep Belief Networks for Automatic Music Genre Classification
Jonathan Dennis 
Image Representation of the Subband Power Distribution for Robust Sound Classification
Bo Xiao 
Acoustic and Visual Cues of Turn-Taking Dynamics in Dyadic Interactions
13:30 - 15:30
SHOW & TELL DEMONSTRATION - MOBILITY AND WEB-SERVICES
Stuart N. Wrigley 
Making an Automatic Speech Recognition Service Freely Available on the Web
Yeon-Jun Kim 
AT&T VoiceBuilder: A Cloud-Based Text-to-Speech Voice Builder Tool
Roger Tucker 
Extending Audio Notetaker to Browse WebASR Transcriptions
Samantha Ainsley 
A Web-Based Tool for Developing Multilingual Pronunciation Lexicons
Michael Johnston 
Speak4it and the Multimodal Semantic Interpretation System
Tanel Alumäe 
TSAB - Web Interface for Transcribed Speech Collections
Andrej Ljolje 
Visual Voice Mail to Text on the iPhone/iPad
Christoph Draxler 
Percy - An HTML5 Framework for Media Rich Web Experiments on Mobile Devices
Mark Huckvale 
The KLAIR Toolkit for Recording Interactive Dialogues with a Virtual Infant
Francesco Nesta 
Real-Time Prototype for Integration of Blind Source Extraction and Robust Automatic Speech Recognition
 
Leonardo
10:00 - 12:00
PHONOLOGY AND PHONETICS
Vahid Sadeghi 
Laryngealization and Breathiness in Persian
Felicitas Kleber 
Age-Dependent Differences in the Neutralization of the Intervocalic Voicing Contrast: Evidence from an Apparent-Time Study on East Franconian
Barbara Samlowski 
Comparing Syllable Frequencies in Corpora of Written and Spoken Language
Luca Iacoponi 
Sylli: Automatic Phonological Syllabification for Italian
André N. Xavier 
A Preliminary Study on the Production of Signs in Brazilian Sign Language when One of the Manual Articulators is Unavailable
Ho-hsien Pan 
Electroglottograph and Acoustic Cues for Phonation Contrasts in Taiwan Min Falling Tones
13:30 - 15:30
FIRST LANGUAGE ACQUISITION
Kouki Miyazawa 
The Multi Timescale Phoneme Acquisition Model of the Self-Organizing Based on the Dynamic Features
Helen Brown 
The Time-Course of Talker-Specificity Effects for Newly-Learned Pseudowords: Evidence for a Hybrid Model of Lexical Representation
Antje Schweitzer 
A Parametric Approach to Intonation Acquisition Research: Validation on Child-Directed Speech Data
Maarten Versteegh 
Modelling Novelty Preference in Word Learning
G. Ananthakrishnan 
Using Imitation to Learn Infant-Adult Acoustic Mappings
Christina Bergmann 
Thresholding Word Activations for Response Scoring - Modelling Psycholinguistic Data
16:00 - 18:00
SECOND LANGUAGE ACQUISITION, DEVELOPMENT AND LEARNING II
Xiaojun Qian 
On Mispronunciation Lexicon Generation Using Joint-Sequence Multigrams in Computer-Aided Pronunciation Training (CAPT)
Bianca Sisinni 
Validating a Second Language Perception Model for Classroom Context - A Longitudinal Study within the Perceptual Assimilation Model
Makiko Sadakata 
The Role of Variability in Non-Native Perceptual Learning of a Japanese Geminate-Singleton Fricative Contrast
Jared Bernstein 
Fluency Changes with General Progress in L2 Proficiency
Slim Ouni 
Tongue Gestures Awareness and Pronunciation Training
Wim A. van Dommelen 
Impact of Speaker Variability on Speech Perception in Non-Native Listeners
 
Michelangelo
10:00 - 12:00
ROBUST SPEECH RECOGNITION III
P. Mowlaee 
Sinusoidal Approach for the Single-Channel Speech Separation and Recognition Challenge
Cemil Demir 
Semi-Supervised Single-Channel Speech-Music Separation for Automatic Speech Recognition
Marco Matassoni 
A Level-Dependent Auditory Filter-Bank for Speech Recognition in Reverberant Environments
Mehrez Souden 
A Multichannel Feature-Based Processing for Robust Speech Recognition
Xiong Xiao 
Feature Normalization Using Structured Full Transforms for Robust Speech Recognition
Masakiyo Fujimoto 
A Robust Estimation Method of Noise Mixture Model for Noise Suppression
13:30 - 15:30
SPOKEN DIALOGUE SYSTEMS I
Teruhisa Misu 
User Study of Spoken Decision Support System
Antoine Raux 
Efficient Probabilistic Tracking of User Goal and Dialog History for Spoken Dialog Systems
Alexander Schmitt 
Tackling a Shilly-Shally Classifier for Predicting Task Success in Spoken Dialogue Interaction
Toyomi Meguro 
Evaluation of Listening-Oriented Dialogue Control Rules Based on the Analysis of HMMs
D. Suendermann 
Large-Scale Experiments on Data-Driven Design of Commercial Spoken Dialog Systems
Jessica Villing 
Comparing System-Driven and Free Dialogue in In-Vehicle Interaction
16:00 - 18:00
SLP FOR INFORMATION EXTRACTION AND RETRIEVAL I
Timothy J. Hazen 
Latent Topic Modeling for Audio Corpus Summarization
Esteve Yannick 
Investigation of Spontaneous Speech Characterization Applied to Speaker Role Recognition
Armando Muscariello 
Zero-Resource Audio-Only Spoken Term Detection Based on a Combination of Template Matching Techniques
Yeon-Jun Kim 
Automatic Learning in Content Indexing Service Using Phonetic Alignment
Kuan-Yu Chen 
Leveraging Relevance Cues for Improved Spoken Document Retrieval
Yun-Nung Chen 
Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key Terms
 
Raffaello
10:00 - 12:00
SPOKEN LANGUAGE UNDERSTANDING
Xiao Li 
Multi-Task Learning for Spoken Language Understanding with Shared Slots
Dustin Hillard 
Learning Weighted Entity Lists from Web Click Logs for Spoken Language Understanding
Dilek Hakkani-Tür 
Bootstrapping Domain Detection Using Query Click Logs for New Domains
Asli Celikyilmaz 
Approximate Inference for Domain Detection in Spoken Language Understanding
Chien-Lin Huang 
Speech Indexing Using Semantic Context Inference
Yun-Cheng Ju 
Automatically Optimizing Utterance Classification Performance without Human in the Loop
13:30 - 15:30
SPOKEN LANGUAGE RESOURCES, EVALUATION AND STANDARDIZATION II
Michael A. Carlin 
Rapid Evaluation of Speech Representations for Spoken Term Discovery
Ben Hixon 
Phonemic Similarity Metrics to Compare Pronunciation Methods
Janto Skowronek 
Investigating the Effect of Number of Interlocutors on the Quality of Experience for Multi-Party Audio Conferencing
Jáchym Kolář 
On Development of Consistently Punctuated Speech Corpora
Shrikanth Narayanan 
A Multimodal Real-Time MRI Articulatory Corpus for Speech Research
Denis Burnham 
Building an Audio-Visual Corpus of Australian English: Large Corpus Collection with an Economical Portable and Replicable Black Box
 
31/08/2011
Auditorium
08:30 - 09:30
FUTURE AND APPLICATIONS OF SPEECH AND LANGUAGE TECHNOLOGIES FOR THE GOOD HEALTH OF SOCIETY
Not-so-surprise speaker 
Surprise talk
Riccardi Giuseppe
Presentation
Gabriele Miceli 
Language disorders: viewpoints on a complex object
Björn Granström 
Speech technology in (re)habilitation of persons with communication disabilities
Hiroshi Ishiguro 
From teleoperated androids to cellphones as surrogates
Logistics 
10:00 - 12:00
SPEAKER DIARIZATION I
Hagai Aronowitz 
Speaker Diarization Using a priori Acoustic Information
Kofi Boakye 
Improved Overlapped Speech Handling for Speaker Diarization
Stephen Shum 
Exploiting Intra-Conversation Variability for Speaker Diarization
Masafumi Nishida 
Speaker Clustering Based on Non-Negative Matrix Factorization
Sree Harsha Yella 
Information Bottleneck Features for HMM/GMM Speaker Diarization of Meetings Recordings
D. Dean 
Cross Likelihood Ratio Based Speaker Clustering Using Eigenvoice Models
13:30 - 15:30
SPEAKER DIARIZATION II
Janez Žibert 
Prosodic and Phonetic Features for Speaker Clustering in Speaker Diarization Systems
David Van Leeuwen 
Diarization-Based Speaker Retrieval for Broadcast Television Archives
Martin Zelenák 
The Detection of Overlapping Speech with Prosodic Features for Speaker Diarization
Sree Hari Krishnan Parthasarathi 
LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization
Houman Ghaemmaghami 
Extending the Task of Diarization to Speaker Attribution
Barras Claude
Comparing Multi-Stage Approaches for Cross-Show Speaker Diarization
16:00 - 17:30
CLOSING CEREMONY
Cosi Piero
Closing: Awards
Perrier Pascal
Christian Benoit Award
Trancoso Isabel
ISCA Awards
Klabbers Esther
InterSpeech 2012
Bimbot Frédéric
Interspeech 2013 - Lyon, France
Cosi Piero
Closing: Thanks
 
Brunelleschi
10:00 - 12:00
ASR - NEW PARADIGMS
Yunxin Zhao 
New Methods for Template Selection and Compression in Continuous Speech Recognition
Shi-Xiong Zhang 
Structured Support Vector Machines for Noise Robust Continuous Speech Recognition
Masayuki Suzuki 
Continuous Digits Recognition Leveraging Invariant Structure
Nahamoo David
Convergence of Line Search A-Function Methods
Yasuhisa Fujii 
Hidden Boosted MMI and Hierarchical State Posterior Feature for Automatic Speech Recognition Based on Hidden Conditional Neural Fields
Jun Cai 
Recognition and Real Time Performances of a Lightweight Ultrasound Based Silent Speech Interface Employing a Language Model
13:30 - 15:30
ADAPTATION FOR ASR
Shinji Watanabe 
Model Adaptation for Automatic Speech Recognition Based on Multiple Time Scale Evolution
K. Chin 
Integrated Online Speaker Clustering and Adaptation
Zoltán Tüske 
A Study on Speaker Normalized MLP Features in LVCSR
Yongwon Jeong 
Matrix-Variate Distribution of Training Models for Robust Speaker Adaptation
Michael L. Seltzer 
Separating Speaker and Environmental Variability Using Factored Transforms
Wang Chao
Your Mobile Virtual Assistant Just Got Smarter
 
Caravaggio
10:00 - 11:00
SPEECH TECHNOLOGY FOR UNDER-RESOURCED LANGUAGES I
Ngoc Thang Vu 
Rapid Building of an ASR system for Under-Resourced Languages Based on Multilingual Unsupervised Training
Shyamal Kr. Das Mandal 
Places and Manner of Articulation of Bangla Consonants: A EPG Based Study
Marelie H. Davel 
Efficient Harvesting of Internet Audio for Resource-Scarce ASR
11:00 - 12:00
SPEECH TECHNOLOGY FOR UNDER-RESOURCED LANGUAGES II
Milan Sečujski 
Automatic Prosody Generation for Serbo-Croatian Speech Synthesis Based on Regression Trees
Alexey Karpov 
Very Large Vocabulary ASR for Spoken Russian with Syntactic and Morphemic Analysis
Timothy Kempton 
Cross-Language Phone Recognition when the Target Language Phoneme Inventory is not Known
Sourish Chaudhuri 
A Paradigm for Limited Vocabulary Speech Recognition Based on Redundant Spectro-Temporal Feature Sets
N. Barroso 
GorUp: An Ontology-Driven Audio Information Retrieval System that Suits the Requirements of Under-Resourced Languages
Nic J. de Vries 
Woefzela - An Open-Source Platform for ASR Data Collection in the Developing World
Hansjörg Mixdorff 
A Study on the Perception of Tone and Intonation in Sesotho
Febe de Wet 
Developing a Broadband Automatic Speech Recognition System for Afrikaans
Herman Kamper 
Multi-Accent Speech Recognition of Afrikaans, Black and White Varieties of South African English
C. Tantibundhit 
Perceptual Representation of Consonant Sounds in Thai
Mumtaz B. Mustafa 
A Cross-Lingual Approach to the Development of an HMM-Based Speech Synthesis System for Malay
 
Donatello
10:00 - 12:00
SPEECH PROCESSING TOOLS
Christoph Draxler 
Speech Processing Tools - An Introduction to Interoperability
Jean-Philippe Goldman 
EasyAlign: An Automatic Phonetic Alignment Tool Under Praat
Julián Villegas 
Mtrans: A Multi-Channel, Multi-Tier Speech Annotation Tool
Christophe Cerisara 
The JSafran Platform for Semi-Automatic Speech Processing
Johannes Wagner 
The Social Signal Interpretation Framework (SSI) for Real Time Signal Processing and Recognition
Han Sloetjes 
ELAN - Aspects of Interoperability and Functionality
Marc Schröder 
Open Source Voice Creation Toolkit for the MARY TTS Platform
Stefan Steidl 
Java Visual Speech Components for Rapid Application Development of GUI Based Speech Processing Applications
Michael Johnston 
mTalk - A Multimodal Browser for Mobile Services
Stuart N. Wrigley 
Web-Based Automatic Speech Recognition Service - webASR
Markus Klehr 
A Web Based Speech Transcription Workplace
Philippe Martin 
WinPitch: A Multimodal Tool for Speech Analysis of Endangered Languages
Mark Huckvale 
Recording Caregiver Interactions for Machine Acquisition of Spoken Language Using the KLAIR Virtual Infant
 
Leonardo
10:00 - 12:00
PROSODY I
Michele Gubian 
A Quantitative Investigation of the Prosody of Verum Focus in Italian
Amelie Dorn 
Effects of Focus on f0 and Duration in Irish (Gaelic) Declaratives
Jennifer Cole 
The Phonology and Phonetics of Perceived Prosody: What do Listeners Imitate?
Amandine Michelas 
Uncovering the Effect of Imitation on Tonal Patterns of French Accentual Phrases
Pilar Prieto 
Crossmodal Prosodic and Gestural Contribution to the Perception of Contrastive Focus
Erin Cvejic 
Temporal Relationship Between Auditory and Visual Prosodic Cues
13:30 - 15:30
PROSODY II
György Szaszák 
Analysing the Correspondence Between Automatic Prosodic Segmentation and Syntactic Structure
Joseph Tepperman 
Long-Distance Rhythmic Dependencies and their Application to Automatic Language Identification
Andrew Rosenberg 
Symbolic and Direct Sequential Modeling of Prosody for Classification of Speaking-Style and Nativeness
Wentao Gu 
Prosodic Analysis and Perception of Mandarin Utterances Conveying Attitudes
Michele Gubian 
Predicting Taiwan Mandarin Tone Shapes from their Duration
Charlotte Wollermann 
Variation of Accent Type and of Context - Influences on Pragmatic Focus Interpretation
 
Michelangelo
10:00 - 12:00
SPOKEN DIALOGUE SYSTEMS II
Heriberto Cuayáhuitl 
Optimizing Situated Dialogue Management in Unknown Environments
Om D. Deshmukh 
Acoustic-Similarity Based Technique to Improve Concept Recognition
Doug Peters 
Dialog Methods for Improved Alphanumeric String Capture
David DeVault 
Detecting the Status of a Predictive Incremental Speech Understanding Model for Real-Time Decision-Making in a Spoken Dialogue System
Senthilkumar Chandramohan 
User Simulation in Dialogue Systems Using Inverse Reinforcement Learning
Paul A. Crook 
Lossless Value Directed Compression of Complex User Goal States for Statistical Spoken Dialogue Systems
13:30 - 15:30
SLP FOR INFORMATION EXTRACTION AND RETRIEVAL II
Vincent Claveau 
Topic Segmentation of TV-Streams by Mathematical Morphology and Vectorization
Cheung-Chi Leung 
Probabilistic Latent Semantic Analysis for Broadcast News Story Segmentation
Evandro Gouvêa 
Hybrid Speech Recognition for Voice Search: A Comparative Study
Yao Qian 
A New Phonetic Candidate Generator for Improving Search Query Efficiency
Kiyoaki Aikawa 
Towards Voice-Input Symbolic Pattern Retrieval Using Parameter-Based Search
J. Ajmera 
A Language Independent Approach to Audio Search
 
Raffaello
10:00 - 12:00
SPEAKER STATE CHALLENGE - INTOXICATION AND SLEEPINESS I
Björn Schuller 
The INTERSPEECH 2011 Speaker State Challenge
Claude Montacié 
Combining Multiple Phoneme-Based Classifiers with Audio Feature-Based Classifier for the Detection of Alcohol Intoxication
Fadi Biadsy 
Intoxication Detection Using Phonetic, Phonotactic and Prosodic Cues
Tobias Bocklet 
Drink and Speak: On the Automatic Classification of Alcohol Intoxication by Acoustic, Prosodic and Text-Based Features
Daniel Bone 
Intoxicated Speech Detection by Fusion of Speaker Normalized Hierarchical Features and GMM Supervectors
Stefan Ultes 
Attention, Sobriety Checkpoint! Can Humans Determine by Means of Voice, if Someone is Drunk... and Can Automatic Classifiers Compete?
Florian Hönig 
Does it Groove or does it Stumble - Automatic Classification of Alcoholic Intoxication using Prosodic Features
13:30 - 15:30
SPEAKER STATE CHALLENGE - INTOXICATION AND SLEEPINESS II
Florian Schiel 
Perception of Alcoholic Intoxication in Speech
Carlos Busso 
Detecting Sleepiness by Fusing Classifiers Trained with Novel Acoustic Features
Albino Nogueiras Rodríguez 
An HMM-Based Approach to the INTERSPEECH 2011 Speaker State Challenge
Elif Bozkurt 
RANSAC-Based Training Data Selection for Speaker State Recognition
Rok Gajšek 
University of Ljubljana System for Interspeech 2011 Speaker State Challenge
Dong-Yan Huang 
Speaker State Classification Based on Fusion of Asymmetric SIMPLS and Support Vector Machines
Stefan Steidl 
Summary of the INTERSPEECH 2011 Speaker State Challenge
 

