Research and Development of a Near Real-time Captioning Service

Authors

  • โยธิน สิทธิบดีกุล, Public Broadcasting Organization of Thailand
  • Dr. ณัฐนันท์ ทัดพิทักษ์กุล, National Electronics and Computer Technology Center
  • Dr. อนันต์ลดา โชติมงคล, National Electronics and Computer Technology Center

Keywords:

Closed caption, Real-time, Broadcasting system, Digital television, Bit rate

Abstract

To broadcast a live television program with a real-time closed captioning service, two subsystems must be added to the existing broadcasting system: 1) a real-time speech transcription module and 2) a live text stream receiving module. In this study, a technique called simulated typing, developed by the National Electronics and Computer Technology Center (NECTEC), is used to produce real-time transcription for Thai. The PIMmala subtitle gateway and the VT3 PEACH DVB-sub generator are then used together to produce a closed captioning service in DVB-T2 format from the live text stream. Five technical experiments and two user experiments show that the television broadcasting system developed in this study can produce and broadcast a real-time closed captioning service that is acceptable to people with hearing disabilities, the target group of this study. Captioning accuracy for live news programs is around 80-90%. In terms of delay, the caption appears on the television screen within 6.6 seconds of the corresponding speech. Because the regulator allocated only 100 kbps to the closed caption service, the caption has to be displayed in blocks of characters every 500 milliseconds, which increases the delay.
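As a rough illustration of the bandwidth constraint described in the abstract, the following minimal sketch computes how much caption data fits into one 500 ms display interval at 100 kbps, and how much waiting time block-wise display can add. The function names are hypothetical, 1 kbps is assumed to mean 1000 bits per second, and the average delay assumes text becomes ready at a uniformly random moment within an interval; none of this is taken from the study's implementation, and the byte budget is not a character count because the encoding overhead is not described here.

```python
def caption_budget_bytes(bitrate_kbps: float, interval_ms: float) -> float:
    """Bytes that can be sent in one display interval at the given bit rate.

    Assumes 1 kbps = 1000 bits per second (hypothetical helper).
    """
    bits = bitrate_kbps * 1000 * (interval_ms / 1000)
    return bits / 8


def batching_delay_ms(interval_ms: float) -> tuple:
    """Average and worst-case extra wait caused by block-wise display.

    Assumes text becomes ready at a uniformly random time within an
    interval, so the average wait is half the interval.
    """
    return interval_ms / 2, interval_ms


budget = caption_budget_bytes(100, 500)
average_wait, worst_wait = batching_delay_ms(500)
print(f"Budget per block: {budget:.0f} bytes")
print(f"Added delay: about {average_wait:.0f} ms on average, up to {worst_wait:.0f} ms")
```

Under these assumptions, the block interval accounts for only a small part of the reported 6.6-second delay.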

Published

12-12-2018

How to Cite

สิทธิบดีกุล โ., ทัดพิทักษ์กุล ด., & โชติมงคล ด. (2018). Research and Development of a Near Real-time Captioning Service. Journal of Digital Communications, 2(2), 91–108. Retrieved from https://so04.tci-thaijo.org/index.php/NBTC_Journal/article/view/131273

Section

Research article