For a given text in English or any other language, the solution should speak in the voice of a specific person. Ideally, when voice personalization happens, the spoken words and phrases also carry the emotion associated with the text. The emotions are limited to four: happy, angry, sad, and neutral.

Technical design

Proposed Model:- As we need emotions in the synthesized speech, we will take the embedding vector generated by the emotion analysis module and feed it as an input to a Generative Adversarial Network (GAN)-based Text-to-Speech (TTS) model. The architecture is as follows:

The language…

With historical text, image, or speech data, we can build an application that helps users understand historical terms more effectively and also outlines the visuals where needed. Using Natural Language Processing techniques such as named entity recognition and part-of-speech tagging, we can aim for text summarization with a clear focus on explaining the historical terms. A report can be generated and used for further analysis of a specific incident or event. While learning history, I found it hard to pronounce the names of kingdoms and rulers. Thus, we can add listen and speak buttons for…

Understanding speech in the presence of multiple talkers is one of the most challenging problems in automatic speech recognition. In this demo our system separates and recognizes the speech of mixtures of up to four speakers recorded in a single channel: can you separate and recognize speech as well as our machine? Try it with and without looking at the transcripts.

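from math import ceil

import matplotlib.pyplot as plt
import numpy as np
from scipy import signal
from scipy.io import wavfile

# Paths to the two single-channel recordings
file1 = "./signal1.wav"
file2 = "./signal2.wav"

#%% Play wav file
import winsound  # Windows-only standard-library module

# Start playback asynchronously so the script keeps running while the file plays
winsound.PlaySound(file1, winsound.SND_FILENAME | winsound.SND_ASYNC)

#%% Join 2 wav files…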

Caption generation is a challenging artificial intelligence problem in which a textual description must be produced for a given image. It requires both computer vision methods to understand the image content and a language model from the Natural Language Processing (NLP) field to turn that understanding into words in the right order.

Python Environment:- A Python SciPy environment with Keras (TensorFlow backend) and libraries such as scikit-learn, Pandas, NumPy, and Matplotlib is needed for the implementation.

Caption Dataset:- Based on the research paper, I have selected two datasets, i.e. …

Statistical Learning: Stroke Prediction Using Logistic Regression

Machine Learning is one of the fastest-growing technologies in many sectors, and the healthcare sector is no exception. Machine Learning algorithms play a crucial role in predicting the presence or absence of heart disease, cancers, and more. Such information, if predicted well in advance, offers valuable guidance to physicians, who can then adjust their diagnosis and treat the patient appropriately.
The World Health Organization estimates that 12 million deaths occur worldwide every year due to heart disease. Half of the fatalities in the United States and other developed nations were attributed to…
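As a rough illustration of how logistic regression predicts the presence or absence of a condition, here is a minimal sketch in plain Python. The data is synthetic and made up for this example (it is not the dataset used in the article), and the helper names `sigmoid` and `fit_logistic` are hypothetical; a real project would more likely use a library such as scikit-learn.

```python
import math

def sigmoid(z):
    """Map any real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(y=1 | x) = sigmoid(w*x + b) by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every sample under the current parameters
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic data (an assumption for illustration only):
# x is one standardized risk factor, y = 1 marks the positive class.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
print("P(positive | x=2.0)  =", sigmoid(w * 2.0 + b))
print("P(positive | x=-2.0) =", sigmoid(w * -2.0 + b))
```

The model outputs a probability, so a threshold (commonly 0.5) is still needed to turn it into a presence/absence decision.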

Text-to-Speech (TTS) synthesis converts input written text to speech signals. The task of TTS is broken down into two sub-problems. The text is first converted by the frontend into a linguistic specification. This specification is then used to generate a speech waveform. Over the last two decades, statistical parametric speech synthesis (SPSS) has risen in prominence and is now a mainstream method used to produce speech of comparable quality to the well-established method called “unit selection” in which recorded speech is segmented, re-arranged and concatenated to produce novel utterances. The SPSS approach has a number of overwhelming advantages over the…

Recent work at Google and Facebook has focused on the behind-the-scenes mechanisms of text prediction. In addition to Recurrent Neural Networks and Long Short-Term Memory networks as the motivation, two word2vec models for generating word embeddings are also discussed.

framework for LSTM

Exploring the evolution of deep learning, I illustrate some of the areas where it is applied, using only Google, one of the major players in the field, as the example. So, thanks to deep learning.

There are two ways of developing the word2vec model: continuous bag-of-words (CBOW) and skip-gram.
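The difference between the two can be sketched through the training examples each one builds from a sentence: skip-gram predicts each context word from the center word, while CBOW predicts the center word from its surrounding context. The function names `skipgram_pairs` and `cbow_pairs` and the `window` parameter below are illustrative assumptions, not part of any word2vec library, and no neural network is trained here.

```python
def skipgram_pairs(tokens, window=1):
    """Return (center, context) pairs: one training example per context word."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

def cbow_pairs(tokens, window=1):
    """Return (context_words, center) pairs: one training example per position."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        pairs.append((context, center))
    return pairs

sentence = "the quick brown fox".split()
print(skipgram_pairs(sentence))
print(cbow_pairs(sentence))
```

In an actual word2vec model, these pairs would be fed to a shallow network whose learned weights become the word embeddings.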

To understand this in more detail, let’s first look at the CBOW and skip-gram models.

Continuous skip-gram & bag-of-words

In terms…

A virtual assistant is a software program that can perform tasks or particular services for an individual. In the coming era, technology is expected to be simple for the end user to use and operate. Artificial intelligence and mobile technology also help visually impaired people to overcome their disability and live a healthy, balanced life. Nowadays, speech recognition has taken root as a way of performing and triggering various activities in daily life, and remarkable work on it has been done for English and many other languages.

The study reveals that companies like Google…

Federated learning is a response to the question: can a model be trained without moving the training data to, and storing it in, a central location? Federated learning is a form of decentralized AI that works without storing user data in the cloud. It relies on a cloud architecture in which local model updates are exchanged and verified. This enables on-device machine learning without any centralized training data. Federated learning is a new framework for Artificial Intelligence (AI) model development that is distributed over millions of mobile devices. …
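The idea of exchanging local model updates instead of raw data can be sketched with a toy version of federated averaging. This is a minimal illustration under stated assumptions, not the article's implementation: the model is a single weight for y = w * x, the per-device datasets are made up, and the helpers `local_update` and `federated_average` are hypothetical names.

```python
def local_update(weight, data, lr=0.1):
    """Run one pass of gradient descent on a device's private data for y = w * x."""
    w = weight
    for x, y in data:
        grad = 2 * (w * x - y) * x  # derivative of the squared error (w*x - y)^2
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Combine client weights on the server, weighted by local dataset size."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total

# Made-up per-device datasets that all follow y = 3x; they never leave the device.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (1.0, 3.0), (0.5, 1.5)],
]

global_w = 0.0
for _ in range(20):
    # Each device starts from the current global weight and trains locally
    updates = [local_update(global_w, data) for data in clients]
    # Only the updated weights are sent back and averaged
    global_w = federated_average(updates, [len(d) for d in clients])

print("global weight after 20 rounds:", global_w)
```

Only the scalar weights cross the device boundary in this sketch; production systems typically add further steps, such as secure aggregation, that are not shown here.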

Sangramsing Kayte

I am a Machine Learning Scientist with over 9 years of experience in both the industrial and Research & Development domains.
