Apple sponsored the International Conference on Acoustics, Speech and Signal Processing (ICASSP), which took place in person from April 14 to 19 in Seoul, South Korea. ICASSP is the IEEE Signal Processing Society's flagship conference on signal processing and its applications.
Schedule
Below is the schedule of Apple-sponsored workshops and events at ICASSP 2024. Stop by the Apple booth from April 16 to 19 from 8:20 AM to 6:00 PM UTC at Booth D1 in the COEX Convention Center Exhibition Hall.
Monday, April 15
- Towards a World-English Language Model
- 1:30 PM - 3:30 PM UTC, Poster Zone 1B
- Rricha Jalota, Lyan Verwimp, Markus Nussbaum-Thom, Amr Mousa, Arturo Argueta, Youssef Oualil
- Workshop on Hands Free Communication and Microphone Arrays (HSCMA)
- 4:00 PM UTC, Room 205
- Multichannel Voice Trigger Detection based on Transform-average-concatenate
- Takuya Higuchi, Avamarie Brueggeman (The University of Texas at Dallas), Masood Delfarah, Stephen Shum
Wednesday, April 17
- Women in Signal Processing (WiSP)
- 11:40 AM - 1:40 PM UTC, Room 402
- Panos Georgiou and Clara Borrelli will be representing Apple at the Women in Signal Processing Luncheon.
- Leveraging Large Language Models for Exploiting ASR Uncertainty
- 1:10 PM - 3:10 PM UTC, Poster Zone 1A
- Pranay Dighe, Yi Su, Daniel Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed Tewfik
- A Multimodal Approach to Device-Directed Speech Detection with Large Language Models
- 1:10 PM - 3:10 PM UTC, Poster Zone 1B
- Dominik Wagner (FAU), Alex Churchill, Siddharth Sigtia, Panos Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi
- Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
- 1:10 PM - 1:30 PM UTC, Room 102
- Zhihong Lei, Ernie Pusateri, Michael Han, Leo Liu, Mingbin Xu, Tim Ng, Zhen Huang, Ruchir Travadi, Darien Zhang, Mirko Hannemann, Man-Hung Siu
- Resource-constrained stereo singing voice cancellation
- 1:30 PM - 3:30 PM UTC, Poster Zone 4A
- Clara Borrelli, Dogac Basaran, Matthias Mauch, Matthew McVicar, James Rae, Mehrez Souden
Thursday, April 18
- Corpus Synthesis for Zero-shot ASR Domain Adaptation using Large Language Models
- 8:20 - 10:20 AM UTC, Poster Zone 1A
- Hsuan Su (National Taiwan University), Ting-Yao Hu, Hema Koppula, Raviteja Vemulapalli, Rick Chang, Karren Yang, Gautam Varma Mantena, Oncel Tuzel
- Modality Drop-out for Multimodal Device Directed Speech Detection Using Verbal and Non-Verbal Features
- 8:30 - 10:30 AM UTC, Poster Zone 5C
- Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed Tewfik
- Student and Young Professionals Luncheon
- 12:00 - 2:00 PM UTC, Room E5 - E6
- Kisun You, Evan Yamasaki, and Alex Acero will be representing Apple at the Student Job Fair and Luncheon.
- Investigating Salient Representations and Label Variance in Dimensional Speech Emotion Analysis
- 1:10 - 3:10 PM UTC, Poster Zone 5A
- Vikramjit Mitra, Jingping Nie, Erdrin Azemi
- Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding
- 4:30 - 6:30 PM UTC, Poster Zone 1B
- Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik
- Improving vision-inspired keyword spotting using a streaming conformer encoder with input-dependent dynamic depth
- 4:30 - 6:30 PM UTC, Poster Zone 3C
- Alexandre Bittar (Ecole Polytechnique Fédérale de Lausanne, Switzerland), Paul Dixon, Mohammad Samragh Razlighi, Kumari Nishu, Devang Naik
Friday, April 19
- Streaming Anchor Loss: Augmenting Supervision with Temporal Significance
- 1:10 - 1:30 PM UTC, Room 103
- Oggy Sarawgi, Jack Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed Tewfik
- Dialog modeling in audiobook synthesis
- 2:10 - 2:30 PM UTC, Room 104
- Cheng-Chieh Yeh, Reza Shirani, Weicheng Zhang, Tuomo Raitio, Ramya Rasipuram, Ladan Golipour, David Winarsky
Accepted Papers
Corpus Synthesis for Zero-shot ASR Domain Adaptation using Large Language Models
Hsuan Su (National Taiwan University), Ting-Yao Hu, Hema Koppula, Raviteja Vemulapalli, Rick Chang, Karren Yang, Gautam Varma Mantena, Oncel Tuzel
Dialog modeling in audiobook synthesis
Cheng-Chieh Yeh, Reza Shirani, Weicheng Zhang, Tuomo Raitio, Ramya Rasipuram, Ladan Golipour, David Winarsky
Flexible Keyword Spotting based on Homogeneous Audio-Text Embedding
Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik
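Improving vision-inspired keyword spotting using a streaming conformer encoder with input-dependent dynamic depth
Alexandre Bittar (Ecole Polytechnique Fédérale de Lausanne, Switzerland), Paul Dixon, Mohammad Samragh Razlighi, Kumari Nishu, Devang Naik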
Leveraging Large Language Models for Exploiting ASR Uncertainty
Pranay Dighe, Yi Su, Daniel Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed Tewfik
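Modality Drop-out for Multimodal Device Directed Speech Detection Using Verbal and Non-Verbal Features
Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed Tewfik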
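Personalization of CTC-based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization
Zhihong Lei, Ernie Pusateri, Michael Han, Leo Liu, Mingbin Xu, Tim Ng, Zhen Huang, Ruchir Travadi, Darien Zhang, Mirko Hannemann, Man-Hung Siu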
Streaming Anchor Loss: Augmenting Supervision with Temporal Significance
Oggy Sarawgi, Jack Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed Tewfik
Towards a World-English Language Model
Rricha Jalota, Lyan Verwimp, Markus Nussbaum-Thom, Amr Mousa, Arturo Argueta, Youssef Oualil
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models
Dominik Wagner (FAU), Alex Churchill, Siddharth Sigtia, Panos Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi
Investigating Salient Representations and Label Variance in Dimensional Speech Emotion Analysis
Vikramjit Mitra, Jingping Nie, Erdrin Azemi
Resource-constrained stereo singing voice cancellation
Clara Borrelli, Dogac Basaran, Matthias Mauch, Matthew McVicar, James Rae, Mehrez Souden
Workshop Accepted Papers
Multichannel Voice Trigger Detection based on Transform-average-concatenate
Takuya Higuchi, Avamarie Brueggeman (The University of Texas at Dallas), Masood Delfarah, Stephen Shum
Acknowledgements
Daniele Giacobello is a member of the ICASSP 2024 Organizing Committee.
Takaaki Hori, Daniele Giacobello, and Yi Su are ICASSP 2024 session chairs.
Vikram Mitra is an Affiliate SLTC Member.
Yi Su, Aswin Sivaraman, Takaaki Hori, Daniele Giacobello, Vineet Garg, Jack Berkowitz, and Vikram Mitra are reviewers for ICASSP 2024.