Flying Together with Audio and Video: Enhancing Communication for the Hearing-Impaired Through an Emerging Closed Captioning Standard

Indexed by：

EI Scopus

Abstract：

As the text-based visual representation of a program’s audio elements, Closed Captioning primarily serves as a technology to enhance communication for the hearing impaired. Since text is much simpler than audio and video, Closed Captioning is traditionally transmitted as supplementary or auxiliary information as part of the image or in an extended or private data field of an encoded video bitstream called video elementary stream, usually accompanied by one or more audio elementary streams. Since Closed Captioning is extremely important for the accessibility of the audio content of a program to the hearing impaired, we propose to encode the closed caption into a bitstream called caption elementary stream, which can fly together with audio and video elementary streams. In other words, closed caption can be stored and transmitted in a manner similar to how audio and video are handled. We have drafted a national standard for Closed Captioning in China, which is now in its final stage of approval and publication. In this paper, the main technical content of the emerging Closed Captioning standard will be introduced. Specifically, the encoding, storage, and transmission of Closed Captioning will be described. Moreover, the decoding and presentation of Closed Captioning under the two scenarios of on demand streaming and live streaming will also be designed and discussed. The AI technology of Speech-to-Text enables Closed Captioning to be implemented efficiently with the help of manual proofreading. Positively, the emergence of the Closed Captioning standard will enhance accessibility to audio-visual programs on both the broadcasting network and the Internet for the hearing-impaired in China and worldwide. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Keyword：

Image coding; Image enhancement; Video on demand; Video streaming; Encoding (symbols); Audio acoustics; Audio streaming; Signal encoding; Speech enhancement; Energy security

Author Community：

[ 1 ] [Mou, Luntian] Beijing University of Technology, Beijing, China
[ 2 ] [Mou, Luntian] Beijing Institute of Artificial Intelligence, Beijing, China
[ 3 ] [Li, Peize] Beijing University of Technology, Beijing, China
[ 4 ] [Zhao, Haiwu] Shanghai University of Engineering Science, Shanghai, China
[ 5 ] [Fu, Qiang] Photosynthetic AI Tech Co., Ltd., Hangzhou, China
[ 6 ] [Luo, Hong] China Mobile Information Technology Co., Ltd., Hangzhou, China
[ 7 ] [Liu, Cong] IFLYTEK Research, Hefei, China
[ 8 ] [Ma, Nan] Beijing University of Technology, Beijing, China
[ 9 ] [Ma, Nan] Beijing Institute of Artificial Intelligence, Beijing, China
[ 10 ] [Huang, Tiejun] Peking University, Beijing, China
[ 11 ] [Gao, Wen] Peking University, Beijing, China
[ 12 ] [Gao, Wen] Peng Cheng Laboratory, Shenzhen, China

Related Keywords：

Implementation of Multichannel Speech Coding Based on the Opus Codec and Spatial Parameters
2024, 14th IEEE International Conference on Signal Processing, Communications and Computing, ICSPCC 2024
A fast H.264 inter-frame prediction algorithm for special mode
2011, Acta Armamentarii
A fast motion estimation algorithm based on motion vector distribution prediction
2013, Journal of Software
Fast motion estimation method based on the motion characteristics of macro block
2011

Source ：

ISSN： 0302-9743

Year： 2025

Volume： 15170 LNAI

Page： 282-292

Language： English

Cited Count：

WoS CC Cited Count：

SCOPUS Cited Count：

ESI Highly Cited Papers on the List： 0

30 Days PV： 8

Affiliated Colleges：
