Global information
- Repository: https://github.com/georgesterpu/AVSR-tf1
- Contact:
- License:
- Reference:
@article{Sterpu2020Mar,
author = {Sterpu, George and Saam, Christian and Harte, Naomi},
title = {{How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition}},
journal = {IEEE/ACM Trans. Audio Speech Lang. Process.},
volume = {28},
pages = {1052--1064},
year = {2020},
month = {Mar},
issn = {2329-9304},
publisher = {IEEE},
doi = {10.1109/TASLP.2020.2980436}
}
Description
AVSR-tf1 is an open-source research system for speech recognition.
Written entirely in Python, AVSR-tf1 aims to provide a simple and reproducible way to train and evaluate speech recognition models based on sequence-to-sequence neural networks. It can exploit the auditory and visual speech modalities either independently, as audio-only (ASR) or visual-only (VSR) speech recognition, or jointly, as audio-visual speech recognition (AVSR).