ARTICLE
TITLE

Text to speech using Mel-Spectrogram with deep learning algorithms

SUMMARY

The purpose of text to speech (TTS), sometimes called speech synthesis, is to synthesize natural and intelligible speech from a given text. TTS technologies are used in a wide range of applications in media, chatbots, and entertainment, among other fields, which makes TTS an active topic in the research community. Recent progress in artificial intelligence, especially in deep learning and neural networks, has enabled TTS systems to produce high-quality synthesized speech. Despite this success, currently available approaches require very long training and inference times, which leaves the field dominated by large technology companies. This paper proposes a model based on convolutional neural networks (CNN) and gated recurrent units (GRU). The proposed model can run even in low-resource computational environments and requires little training time. It achieves a mean opinion score (MOS) of 4.26, higher than the MOS reported for state-of-the-art methods.
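
For readers unfamiliar with the metric, a MOS is the arithmetic mean of listener ratings, conventionally given on a 1 to 5 scale. The following is a minimal sketch of that calculation, not the paper's evaluation code: the function name `mean_opinion_score`, the normal-approximation confidence interval, and the ratings themselves are illustrative assumptions rather than data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for listener ratings on a 1-5 scale.

    Hypothetical helper for illustration; the paper does not describe
    how its confidence intervals, if any, were computed.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    mos = mean(ratings)
    # Normal-approximation 95% interval around the sample mean.
    half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return mos, half_width


# Made-up ratings from eight hypothetical listeners (not the paper's data).
ratings = [5, 4, 4, 5, 3, 4, 5, 4]
mos, ci = mean_opinion_score(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With so few listeners the interval is wide, which is why MOS comparisons between systems usually rely on many raters and utterances.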
