
Python - Desktop Assistant

A MINI PROJECT REPORT

Submitted by
V CASBRO (310818104015)

W DANIEL ALFRED VISUVASAM (310818104016)

S GANESH (310818104026)

in partial fulfillment for the award of the degree

of
BACHELOR OF ENGINEERING

in

COMPUTER SCIENCE AND ENGINEERING

JEPPIAAR ENGINEERING COLLEGE


ANNA UNIVERSITY: CHENNAI 600 025
APRIL 2021
CERTIFICATE OF EVALUATION

S.No   Name of the Student(s) who have done the project   Title of the Project

1.     V. CASBRO                                           PYTHON - DESKTOP ASSISTANT
2.     W. DANIEL ALFRED VISUVASAM                          PYTHON - DESKTOP ASSISTANT
3.     S. GANESH                                           PYTHON - DESKTOP ASSISTANT

The reports of the project work submitted by the above students in partial
fulfillment for the award of the Bachelor of Engineering degree in Computer
Science and Engineering of Anna University, Chennai were evaluated and confirmed
to be reports of the work done by the above students.

Submitted on 30/07/2021

INTERNAL EXAMINER EXTERNAL EXAMINER


ABSTRACT

Voice assistants are programs in our digital world that listen and respond to
verbal commands. A user can ask, “What’s the weather?” and the voice assistant
will answer with the weather report for that day and location. A user can also
listen, on request, to the current leading news headlines retrieved via
newsapi.org. Voice assistants are so easy to use that many people forget to stop
and wonder how they work.

How do voice assistants understand us? Is it magic? A complex system of codes?
An actual person listening on the other end?

The answer is less complicated than you might think. The application works
like Siri, Google Assistant, etc. The UI of the application is self-explanatory
and minimal. It takes voice as input. The system is designed so that many
services provided by web services are accessible to the end user through the
user's voice commands.

TABLE OF CONTENTS

S No   TITLE                                          PAGE NO

       ABSTRACT                                        4

01     1.1 INTRODUCTION                                7
       1.2 WHAT IS A VOICE ASSISTANT?                  8
       1.3 USES OF VOICE ASSISTANTS                    10

02     2.1 LITERATURE SURVEY                           11

03     3.1 REQUIREMENT ANALYSIS                        13

04     4.1 DESIGN SYSTEM                               16

05     5.1 DRAWBACKS OF VOICE ASSISTANTS               18
       5.2 ADVANTAGES AND DISADVANTAGES                21

06     6.1 FUTURE OF VOICE ASSISTANT TECHNOLOGY        25
       6.2 7 PREDICTIONS-2021                          26

07     CONCLUSIONS                                     31
       REFERENCES                                      34
       APPENDIX 1- SOURCE CODE                         36
       APPENDIX 2- RESULTS                             42

LIST OF FIGURES

FIGURE NO. NAME PAGE NO.

1.1 Process Flow 7

4.1 System Design 17

CHAPTER 1

1.1 INTRODUCTION

I had a similar thought before I started making my very own “digital” personal
assistant, though it is not as capable as Amazon’s Alexa, Google Assistant and
Google Home, Apple’s Siri, or JARVIS from Iron Man. Nowadays, people find typing
commands into the computer troublesome, be it because of procrastination or a
busy schedule. Typing is becoming an obsolete process. The solution is to switch
over to an assistant that understands us and does the initial work for us. An
assistant is the best replacement for typing commands.

It is a voice recognition intelligence that takes the user’s input in the form of
the user’s voice, processes it, and returns the output in various ways, such as
an action being performed or a search result being spoken out to the end user.

The flow of the user interface is shown below.

Fig 1.1 : Process flow

1.2 What Is a Voice Assistant?

For most of us, the ultimate luxury would be an assistant who always listens for

your call, anticipates your every need, and takes action when necessary. That

luxury is now available thanks to artificial intelligence assistants, aka voice

assistants.

Voice assistants come in somewhat small packages and can perform a variety

of actions after hearing a wake word or command. They can turn on lights,

answer questions, play music, place online orders, etc.

Voice assistants are not to be confused with virtual assistants, which are people
who work remotely and can therefore handle all kinds of tasks. Rather, voice

assistants are technology based. As voice assistants become more robust, their

utility in both the personal and business realms will grow as well.

To call any technology that makes our lives easier by one name is almost

impossible. There are a variety of terms that refer to agents that can perform

tasks or services for an individual, and they are almost interchangeable, but not

quite. They differ mainly based on how we interact with the technology, the app,

or a combination of both.

Here are some basic definitions, similarities, and differences:

Intelligent Personal Assistant: This is software that can assist people with basic

tasks, usually using natural language. Intelligent personal assistants can go online

and search for an answer to a user’s question. Either text or voice can trigger an

action.

Automated Personal Assistant: This term is synonymous with intelligent

personal assistant.

Smart Assistant: This term usually refers to the types of physical items that can

provide various services by using smart speakers that listen for a wake word to

become active and perform certain tasks. Amazon’s Echo, Google’s Home, and

Apple’s HomePod are types of smart assistants.

Virtual Digital Assistants: These are automated software applications or
platforms that assist the user by understanding natural language in either
written or spoken form.

Chatbot: Text is the main way to get assistance from a chatbot. Chatbots can

simulate a conversation with a human user. Many companies use them in the

customer service sector to answer basic questions and connect with a live person

if necessary.

Voice Assistant: The key here is voice. A voice assistant is a digital assistant

that uses voice recognition, speech synthesis, and natural language processing

(NLP) to provide a service through a particular application.

1.3 The Uses of Voice Assistants

Many devices we use every day utilize voice assistants. They’re on our
smartphones and inside smart speakers in our homes. Many mobile apps and
operating systems use them. Additionally, certain technology in cars, as well as
in retail, education, healthcare, and telecommunications environments, can be
operated by voice.

CHAPTER 2

2.1 LITERATURE SURVEY

Abhay Dekate et al. (2016) observed that, in the modern era of fast-moving
technology, we can do things we never thought we could do before, but achieving
them requires a platform that can automate all our tasks with ease and comfort.
They therefore proposed a personal assistant with brilliant powers of deduction
and the ability to interact with its surroundings through one of the basic forms
of human interaction, the human voice. The hardware device captures the audio
request through a microphone and processes it so that the device can respond to
the individual through a built-in speaker module. For example, if you ask the
device “What’s the weather?”, it uses its built-in skills to look up the weather
and returns the response to the user through the connected speaker.

Rutuja V. Kukade et al. (2018) noted that people who are blind face various
communication barriers and challenges. Their paper discusses the implementation
of a personal virtual assistant that takes human voice commands to perform tasks
that would otherwise require dependence on others. It enables the user to send
and receive emails, get the weather forecast, maintain a personal diary or online
blog, recognize images, etc., using a speech-to-text engine, a text-to-speech
engine, and OCR (optical character recognition), with a microphone for input and
speakers for output.

M. A. Jawale et al. (2019) observed that, in today’s world, many artificial
intelligence applications are developed using programming languages like Python,
R, and so on. Each language comes with its own programming structure and
syntactical forms. Programmers are broadly classified into three categories,
namely novice users, knowledge-intermittent users, and experts. For novice users,
it is always a challenge to write code without typographic errors, even though
they know the theory of the programming language, its structure and syntax, and
the logic of the program. Therefore, their paper explores the use of voice
recognition techniques in this field.

According to Global Market Insights, Inc., between 2016 and 2024, the market

share for the technology will grow at an annual rate of almost 35 percent. More

and more sectors of the economy, like healthcare and the automotive industry,

are finding uses for the speech recognition technology in addition to those found

in devices like smart speakers and phones.

CHAPTER 3
3.1 REQUIREMENT ANALYSIS

The basic requirement for this project is Python 3.8.8. We’ll be using the
pyttsx3 package, which is a text-to-speech library for Python; Selenium for web
automation; the datetime module; the requests module; randfacts; pyjokes;
PyAudio; subprocess; OpenWeatherMap for weather forecasts; and newsapi.org for
news results. The basic reason we use pyttsx3 is that it works offline. Another
basic requirement of this project is Python’s SpeechRecognition library.

Pyttsx3: This module is used for the conversion of text to speech in a program,
and it works offline. To install this module, type the below command in the
terminal.

pip install pyttsx3
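A minimal sketch of how pyttsx3 can be used to speak a reply is given below. The helper name `speak` and the speaking rate of 150 are illustrative assumptions and are not taken from the project's source code. The engine is passed in as a parameter so that the helper can also be tried out without audio hardware.

```python
def speak(text, engine=None, rate=150):
    """Speak `text` aloud with a pyttsx3 engine (hypothetical helper)."""
    if engine is None:
        # Imported lazily so this file still loads where pyttsx3 is not installed.
        import pyttsx3
        engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # speaking rate; 150 is an assumed value
    engine.say(text)                  # queue the sentence
    engine.runAndWait()               # block until the speech has finished
    return text
```

Calling `speak("Hello, I am your desktop assistant.")` on a machine with pyttsx3 installed should read the sentence out through the default speech driver.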

Selenium web automation: Selenium is a free (open-source) automated testing
framework used to validate web applications across different browsers and
platforms. You can use multiple programming languages like Java, C#, Python,
etc. to create Selenium test scripts.

Pyjokes: Pyjokes is used to get jokes for programmers. To install this module,
type the below command in the terminal.

pip install pyjokes

Datetime: The datetime module is used for showing the date and time. This
module comes built in with Python.
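As an illustration, the sketch below turns the current date and time into a sentence that the assistant could speak. The function name `spoken_time` and the wording of the sentence are assumptions made for this example.

```python
from datetime import datetime


def spoken_time(now=None):
    """Return a sentence announcing the date and time (hypothetical helper)."""
    now = now or datetime.now()
    # %A = weekday name, %d %B %Y = day month year, %I:%M %p = 12-hour clock
    return now.strftime("Today is %A, %d %B %Y and the time is %I:%M %p")
```

The optional `now` argument makes the helper easy to check with a fixed date instead of the system clock.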

Speech Recognition: Since we’re building a voice assistant application, one of
the most important things is that the assistant recognizes your voice (that is,
what you want to say or ask). To install this module, type the below command in
the terminal.

pip install SpeechRecognition
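The sketch below shows one way the SpeechRecognition library could be used to capture a single spoken command. The helper names `clean_transcript` and `listen_once` are hypothetical, and the choice of Google's web speech recognizer (which needs an internet connection) is an assumption, since the report does not state which recognizer the project uses.

```python
def clean_transcript(text):
    """Trim and lowercase recognised text; return None if nothing was heard."""
    if not text:
        return None
    text = text.strip().lower()
    return text or None


def listen_once():
    """Capture one utterance from the microphone and return it as text."""
    # Imported lazily: needs the SpeechRecognition and PyAudio packages.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)  # calibrate for background noise
        audio = recognizer.listen(source)            # record until a pause is detected
    try:
        return clean_transcript(recognizer.recognize_google(audio))
    except sr.UnknownValueError:
        return None  # the speech could not be understood
    except sr.RequestError:
        return None  # the recognition service could not be reached
```

Returning `None` for both failure cases keeps the calling code simple: the assistant can just ask the user to repeat the command.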

Subprocess: This module is used for running system commands as subprocesses,
which various commands such as Shutdown, Sleep, etc. rely on. This module comes
built in with Python.
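A hedged sketch of how a shutdown command could be issued through subprocess follows. The helper names and the `dry_run` switch are assumptions added for safety in this example; the exact commands used by the project are not shown in this part of the report.

```python
import subprocess
import sys


def shutdown_command(platform=sys.platform):
    """Return the shutdown command for the given platform as an argument list."""
    if platform.startswith("win"):
        return ["shutdown", "/s", "/t", "0"]  # Windows: shut down with no delay
    return ["shutdown", "-h", "now"]          # Linux/macOS: halt now (needs privileges)


def run_command(command, dry_run=True):
    """Run `command`; with dry_run=True it is only returned, not executed."""
    if not dry_run:
        subprocess.run(command, check=True)
    return command
```

Keeping `dry_run=True` as the default means the machine is never shut down by accident while the voice commands are being tested.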

Requests: Requests is used for making GET and POST requests. To install this
module, type the below command in the terminal.

pip install requests
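As an example of a GET request, the sketch below queries OpenWeatherMap for the current weather of a city. The endpoint URL, the parameter names (`q`, `appid`, `units`) and the helper names are assumptions based on OpenWeatherMap's public current-weather API and should be checked against the provider's documentation; the API key is a placeholder.

```python
from urllib.parse import urlencode

# Assumed endpoint of OpenWeatherMap's current-weather API.
WEATHER_URL = "https://api.openweathermap.org/data/2.5/weather"


def weather_params(city, api_key):
    """Query parameters for a current-weather lookup in metric units."""
    return {"q": city, "appid": api_key, "units": "metric"}


def weather_request_url(city, api_key):
    """Full request URL, useful for logging or debugging."""
    return WEATHER_URL + "?" + urlencode(weather_params(city, api_key))


def fetch_weather(city, api_key):
    """Send the GET request and return the decoded JSON response."""
    import requests  # imported lazily; requires `pip install requests`

    response = requests.get(WEATHER_URL, params=weather_params(city, api_key), timeout=10)
    response.raise_for_status()  # raise an error for 4xx/5xx replies
    return response.json()
```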

PyAudio: PyAudio provides Python bindings for PortAudio, the cross-platform
audio I/O library. With PyAudio, you can easily use Python to play and record
audio on a variety of platforms.

getFacts Function: It is a function that accepts a boolean value. This Boolean

value determines whether the explicit content filter is on or not. If we pass True

to the function, the filter will be on (this is the default) and if we pass False to

the function, the filter will be turned off.

JSON or JavaScript Object Notation is a format for structuring data.
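For instance, a news service typically replies with a JSON document. The sketch below extracts headline titles from such a reply using Python's built-in json module. The sample payload and its field names (`articles`, `title`) are illustrative assumptions modelled on the kind of response newsapi.org returns, not data from the project.

```python
import json

# Made-up sample reply, used only to demonstrate the parsing step.
SAMPLE_NEWS_JSON = (
    '{"status": "ok", "articles": ['
    '{"title": "First headline"}, {"title": "Second headline"}]}'
)


def top_headlines(payload, limit=5):
    """Return up to `limit` article titles from a JSON news reply."""
    data = json.loads(payload)               # JSON text -> Python dict
    articles = data.get("articles", [])      # missing key -> no articles
    return [article["title"] for article in articles[:limit]]
```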

In Python, the webbrowser module provides a high-level interface for displaying
Web-based documents to users. It can be used to launch a browser in a
platform-independent manner.
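A short sketch of opening a web search from a spoken query is given below. The use of a Google search URL and the helper names are assumptions for illustration; `opener` defaults to `webbrowser.open` but can be swapped out when no browser is available.

```python
import webbrowser
from urllib.parse import quote_plus


def search_url(query):
    """Build a search URL for the given query text."""
    return "https://www.google.com/search?q=" + quote_plus(query)


def open_search(query, opener=webbrowser.open):
    """Open the search results in the default browser and return the URL."""
    url = search_url(query)
    opener(url)  # webbrowser.open picks the platform's default browser
    return url
```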

Yt-Auto-Search-Python is a Python library that searches for a keyword
automatically on YouTube and gets the search results using browser automation.
It currently runs only on Windows.

pip install yt-auto-search-python

CHAPTER 4

System Design:

What is voice assisted technology?

Simply put, a voice or smart home assistant is a piece of software that
communicates with the user audibly and responds to spoken commands.

It's technology like Google Home, Siri and Alexa that can be used to literally talk

to a computer, a smartphone, or another device.

4.1 DESIGN SYSTEM

The project has been carried out in the Jupyter Notebook environment of Anaconda
Navigator using Python 3.8.8, since many packages are supported under it. This
environment is user-friendly for carrying out various tests on voice
recognition.

The overall system design consists of the following phases:

• Data collection in the form of the user’s voice

• Voice analysis and conversion to text

• Data processing

• Generating the task to be done from the processed text output
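The four phases above can be sketched as a small pipeline. Everything in this example (the keyword table, the handler functions and the replies) is a hypothetical illustration of the design, not the project's actual source code. The `listen` and `speak` arguments stand for the speech-to-text and text-to-speech steps and are passed in as plain functions.

```python
from datetime import datetime


def handle_time(_command):
    """Task: report the current time."""
    return "The time is " + datetime.now().strftime("%H:%M")


def handle_greeting(_command):
    """Task: greet the user."""
    return "Hello, how can I help you?"


# Keyword -> task table; the first keyword found in the command wins.
HANDLERS = [("time", handle_time), ("hello", handle_greeting)]


def process(command):
    """Data processing: map the recognised text to a task and run it."""
    text = command.strip().lower()
    for keyword, handler in HANDLERS:
        if keyword in text:
            return handler(text)
    return "Sorry, I did not understand that."


def run_once(listen, speak):
    """One full cycle: collect voice as text, process it, speak the reply."""
    command = listen()  # phases 1 and 2: voice collection and conversion to text
    reply = process(command) if command else "Sorry, I did not catch that."
    speak(reply)        # output of the generated task
    return reply
```

New voice commands can be supported by adding a handler function and one entry to the `HANDLERS` table.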

Fig 4.1 : Design System

CHAPTER 5

DISCUSSIONS

5.1 The Drawbacks of Voice Assistants

As the acceptance and usage of voice assistants continues to grow, it is only natural
for some people to have reservations about using them. Below, we discuss some of
the major issues regarding voice assistants.
Privacy: Privacy is a concern, especially involving smart speakers. While
waiting for a wake word, smart speakers are always listening. On a smartphone,
pressing a button or opening an app activates the assistant. Once you wake it up, it
begins recording audio clips of what you say. These clips represent the files that go
to a server to process the audio and formulate a response. The real brains are not in
the little speakers in our homes: They’re on a massive server somewhere else.
What the speaker sends is on an encrypted connection. Speakers do not record
anything prior to the wake words.

“People confuse ‘always listening’ with ‘always recording,’” says Mutchler of
Voicebot.ai. “The genius of [smart speakers] is they can remove background noise
and single out the wake word,” she continues. Only then do they begin recording.

Smart speakers and other AI assistants, like those on a smartphone, save these
recordings and allow the user to go into their account and delete them.

There are also questions about what can happen with those recordings. One
situation that raises these privacy concerns is using the recordings as possible

evidence in a criminal investigation. Back in 2016, detectives in an Arkansas
murder case found an Amazon Echo linked to many smart home devices at a
murder scene. Police seized the Echo and tried to get information from it by
serving a warrant to Amazon for records of any recordings on the device. Amazon
did not release the information, and it is unclear what law enforcement expected to
get from the smart speaker and its files.

Laws surrounding information on our phones and devices are struggling to keep up
with the ever-changing technology and how we use it. There are even questions
about whether smart speakers and other devices should have a mechanism to report
dangerous words, search patterns, or activity to authorities. What happens if
someone asks a voice assistant to do something illegal? Should it be able to
override our commands? These issues and others are sure to be the subjects of new
laws as the technology continues to change.

Even though smart speaker usage is growing amongst all age groups, younger
people don’t seem to have as much of a problem with privacy as older ones. “I
think we’re so used to having our privacy invaded for convenience,” Mutchler
says.

Lucas points out that twenty years ago, the thought of having an ever-listening
speaker in our homes wouldn’t have sat well with consumers. But our lines of
concern are different than they once were. “You’ll accept all of these things
because they are helpful,” he says.
Accuracy: Voice assistants don’t always understand what we are asking.
Sometimes, it’s how we speak. Other times, it is simply because the artificial
intelligence hasn’t yet learned how to do something.
There is also a question concerning answer sources. During an online search, a user
can select results, note the source, and click for more information. When asking a
voice assistant a question, the answer usually comes back as fact, often without
stating the source.

The “conversations” people have with their voice assistants are not really two way
at all. To ask a follow-up question, you need to wake the assistant up again. Also,
real people need to monitor the artificial intelligence in order for it to “learn” new
things.

Hackability and Security: Even though voice assistants communicate with their
servers using encrypted connections, there is still a concern about hackability and
security.

In early 2018, some users of Amazon’s Echo reported it would suddenly emit an
evil laugh for no reason. In the beginning, people thought someone had hacked
into their smart speakers. Amazon investigated the problem and later announced
that the Echo had been hearing words similar to “Alexa laugh,” so it began
laughing. As a response, Amazon disabled the reaction and changed Alexa’s
response to a user’s request that it laugh to “Sure, I can laugh,” followed by
laughter.

Since some smart speakers can recognize and respond to any nearby voice, a guest
can check or alter your calendar or your contacts. Also, an annoyed neighbor can
set an alarm for an early morning wake-up call by yelling through your door.

With regard to this capability, be careful not to link door locks and security
systems to voice assistants. If you do, a burglar could just as easily say “unlock the
front door” or “disable security cameras” as you could.

Someone could also use your device to make purchases without you knowing
about it. In order to avoid this possibility, Alexa allows you to set a PIN
confirmation option for voice purchases.

For business, the security concerns are a bit different. Lucas uses the following
example: In the past, burglars would break into the CEO’s office and steal
documents, like sales data and earnings reports. Businesses adapted security
procedures. Then, companies began storing information on computers, so hacking
into files became the crime. Businesses adapted security procedures. Now, a thief
can simply ask a voice assistant for critical data. Security procedures need to adapt.
“I think it’s quite likely you will see enterprise solutions coming out. They might
be based on consumer technology but with add-ons,” Lucas says.

5.2 Advantages and Disadvantages of Voice Assistant

Voice assistants have been part of life since Apple introduced Siri on the iPhone.
From there, Amazon gave us Amazon Echo and Alexa smart speakers followed by
Google Assistant. There is also Samsung Bixby and Microsoft Cortana.
According to eMarketer, 2019 saw 111.8 million people in the US using voice
assistants at least once a month. Statista has projected that 2021 will see 132
million people using voice assistants at least once a month in the US.
With so many people expressing an obvious interest in them, voice assistants
provide an opportunity for marketers to better reach, engage, and understand
customers and prospects.
Let’s start with the advantages they can give digital marketers.

• Reach elusive prospects


By marketing through a virtual assistant on mobile phones or smart speaker
devices, you have a greater chance of reaching your target audience. It gives you
an option of doing so besides the Internet and mobile. According to Voicebot.ai,
87.7 million U.S. adults now use smart speakers as of January 2020, which is 32
percent more than in January 2019 and is 85 percent greater than January 2018.

• Generate personal conversations


Voice assistants are a chance for marketers to begin conversations in a much more
personalized way than ever before. Users generally share exactly what they want
and what they are thinking with voice assistants. Thus, the channel allows
marketers to answer back with what they need and then continue reaching out for a
personalized customer experience.

• Reach multiple users at once


Voice assistants give marketers access to multiple users in a single household.
These consumers all make unique purchase decisions because they have their own
brand preferences, product interests, and music playlists. Marketers can achieve
greater results through one voice assistant, as it is a hub to collect more insights
and sell through one segmented campaign.

• Go beyond the usual devices


Another advantage is that voice assistants are becoming more popular outside of
our homes and cell phones. They are popping up in our cars, in smart TVs,
wearable devices, and home appliances. These provide new opportunities to reach
even more targets as well as provide additional value for existing customers.

• Drive new purchases


Through voice assistants, marketers can reach customers at a point in the shopping
journey where they are ready to buy. To entice customers that have used their
voice assistants to ask about a particular product or service, marketers can deliver
promotional campaigns like instant digital coupons. Instead of the customer
having to locate coupon codes, the discount is ready to redeem, potentially
pushing the customer to complete the purchase.

Now that we’ve covered the advantages of leveraging voice assistant channels in
your marketing campaigns, we should cover the disadvantages:

• Data security concerns


Although consumers are using voice assistants more often, there is still great
concern over the data these devices collect and the companies behind the apps on
those devices. Consumers are wary of how the data is stored, who looks at it, and
what happens to that information. Marketers will have to address those data and
privacy concerns, or they will not get access to these prospects and their
information.

• Disconnected interaction
Another disadvantage is that voice assistants as a channel provide less enriching
interactions than other platforms. The options are voice content only, which
typically involves repurposing existing content, versus visual interactions. This
may diminish some of the more meaningful engagements that marketers can have
elsewhere.

• Reliance on device makers

As a marketer, you are at the mercy of device makers, such as smart speaker
brands, wearable device companies, vehicle manufacturers, and smart appliance
producers. That means carefully researching which device makers you want to
work with for sustainable results before jumping in.

• Investment in voice app and skill set

It can be costly for a marketing budget to develop the voice app to use for this
channel. Participating in this channel may also be time intensive in terms of
building an internal skill set geared toward the nuances of voice assistants.
Therefore, it’s important to assess the benefits and costs involved in participating
in voice assistant channels.
Curious about integrating voice assistant channels into your marketing strategy?
One of the first places to start with voice marketing is to try Voice Engine
Optimization (VEO), which is the process of optimizing content so that it turns up
in voice searches. It can help you gain a position on these devices by focusing on
the most voiced keywords, which tends to involve longer keywords and questions
versus statements.

• Voice assistants call for a voice marketing plan


If you’re ready to dive in, it’s important to create a voice marketing plan. It should
include the voice marketing potential among your targeted audience segments.
From this research, you can develop a short-term and long-term set of marketing
strategies, including investment in a voice app and strategic partnerships with
voice assistant devices and channels.
And don’t give up. You may find that you need to experiment for some time to
better understand what works for your brand and to acknowledge the ongoing
evolution of voice assistant devices, technology advancements like artificial
intelligence, applications, and user segments.

CHAPTER 6

6.1 The Future of Voice Assistants

The number of people using voice assistants is expected to grow. According to the
Voicebot Smart Speaker Consumer Adoption Report 2018, almost ten percent of
people who do not own a smart speaker plan to purchase one. If this holds true, the
user base of smart speaker users will grow 50 percent, meaning a quarter of adults
in the United States will own a smart speaker.

Smart speaker sales are expanding in other parts of the world, meaning they need
to “learn” how to “understand” languages, accents, dialects, slang, and nuances in
each country in which they are sold. Chinese companies are developing their own
smart speakers. “The rest of the world is behind the U.S. and will catch up pretty
quickly,” Mutchler says.

Voice assistants are always improving and “learning.” AI companies use data from
existing systems to improve what assistants can do. Lucas believes that ultimately,
the voice assistant might get so smart that it will automatically order a pizza if you
say you’re hungry. It will use existing data from your previous purchases to come
to the conclusion that saying you’re hungry equals ordering a pizza.

The experts predict that voice assistants will improve in many other ways. As
described in a 2017 article for The Atlantic, “A subfield of AI called computational
creativity forges algorithms that can write music, paint portraits, and tell jokes.”
These capabilities will help smart speakers “show emotion” and “think” for
themselves without being scripted. Systems that explain why they did what they
did and what they’re going to do next are also on the horizon.

Voice assistants are not going anywhere. “I think people thought of it as a fad, but
it’s not. It’s changing what people do in their homes. Voice assistants will grow
and are here to stay,” Mutchler says. “I think they [voice assistants] will be in
everything, and the smart speaker might fade away in a few years because many
technologies, like televisions and refrigerators, will have their own voice assistants.
The kids today won’t understand that there was a world where you couldn’t talk to
things,” she concludes.

6.2 7 key predictions for future of voice assistant

When voice technology began to emerge in 2011 with the introduction of Siri, no
one could have predicted that this novelty would become a driver for tech
innovation. Now, a decade later, it’s estimated that 1 in 4 U.S. adults owns a
smart speaker (e.g., Google Home, Amazon Echo), and eMarketer forecasts that
nearly 92.3 percent of smartphone users will be using voice assistants by 2023.

Brands such as Amazon and Google are continuing to fuel this trend as they
compete for market share. Voice interfaces are advancing at an exponential rate in
all industries, with notable growth from healthcare to banking, as companies
race to release their own voice technology integrations to keep pace with
consumer demand.

What’s Causing The Shift Towards Voice?


The main driver for this shift towards voice user interfaces is changing user
demands. There is an increased overall awareness and a higher level of comfort,
demonstrated specifically by millennial consumers, in this ever-evolving digital
world where speed, efficiency, and convenience are constantly being optimized.

The mass adoption of artificial intelligence in users’ everyday lives is also fueling
the shift towards voice applications. The growing number of IoT devices such as
smart thermostats, appliances, and speakers is giving voice assistants more
utility in a connected user’s life. Smart speakers are the number one way we are
seeing voice being used; however, it only starts there. Many industry experts even
predict that nearly every application will integrate voice technology in some way
in the next 5 years.

Applications of this technology are seen everywhere, so where will it take us in
2021 and beyond? We provide a high-level overview of the potential that voice has
and 7 key predictions we think will take off in the coming years.

7 Key Predictions For Voice In 2021

1. Mobile App Integration


Integrating voice-tech into mobile apps has become the hottest trend right now, and
will remain so because voice is a natural user interface (NUI).

Voice-powered apps increase functionality, and save users from complicated app
navigation. Voice-activated apps make it easier for the end-user to navigate an app
— even if they don’t know the exact name of the item they’re looking for or where
to find it in the app’s menu. While at this stage, voice integration may be seen as a
nice-to-have by users, this will soon become a requirement that users will expect.

2. Voice-Tech In Healthcare
In 2020, AI-powered chatbots and virtual assistants played a vital role in the fight
against COVID-19. Chatbots helped screen and triage patients, and Apple’s Siri
now walks users through CDC COVID-19 assessment questions and then
recommends telehealth apps.
Voice and conversational AI have made health services more accessible to
everyone who was unable to leave their home during COVID-19 restrictions. Now
that patients have a taste for what is possible with voice and healthcare, behaviors
are not likely to go back to pre-pandemic norms. Be prepared to see more
investment in voice-tech integration in the healthcare industry in the years to come.

3. Search Behaviors Will Change


Voice search has been a hot topic of discussion. Visibility of voice will
undoubtedly be a challenge. This is because the visual interface with voice
assistants is missing. Users simply cannot see or touch a voice interface unless it is
connected to the Alexa or Google Assistant app. Search behaviors, in turn, will see
a big change. In fact, if tech research firm Juniper Research is correct, voice-based
ad revenue could reach $19 billion by 2022, thanks in large part to the growth of
voice search apps on mobile devices.

Brands are now experiencing a shift in which touchpoints are transforming to
listening points, and organic search will be the main way in which brands have
visibility. As voice search grows in popularity, advertising agencies and marketers
expect Google and Amazon will open their platforms to additional forms of paid
messages.

4. Individualized Experiences
Voice assistants will also continue to offer more individualized experiences as they
get better at differentiating between voices. Google Home is able to support up to
six user accounts and detect unique voices, which allows Google Home users to
customize many features. Users can ask “What’s on my calendar today?” or “tell
me about my day?” and the assistant will dictate commute times, weather, and
news information for individual users. It also includes features such as nicknames,
work locations, payment information, and linked accounts such as Google Play,
Spotify, and Netflix. Similarly, for those using Alexa, simply saying “learn my
voice” will allow users to create separate voice profiles so the technology can
detect who is speaking for more individualized experiences.

5. Voice Cloning
Advances in machine learning and GPU power are commoditizing custom voice
creation and making synthesized speech more emotional, so that a computer-
generated voice becomes indistinguishable from the real one. You simply provide
recorded speech, and voice conversion technology transforms your voice into
another. Voice cloning is becoming an indispensable tool for advertisers,
filmmakers, game developers, and other content creators.

6. Smart Displays
Last year, smart displays were on the rise as they expanded voice tech's
functionality. Now the demand for these devices is even higher, with consumers
showing a preference for smart displays over regular smart speakers. In the third
quarter of 2020, sales of smart displays rose 21 percent year-on-year to 9.5
million units, while basic smart speakers fell by three percent. In 2021, we expect
a lot of innovation in smart displays, integrating more advanced technology and
more customization. Smart displays such as the Russian Sber portal or the Chinese
Xiaodu smart screen are already equipped with a suite of upgraded AI-powered
functions, including far-field voice interaction, facial recognition, hand gesture
control, and eye gesture detection.

7. Voice In The Gaming Industry


It takes a lot of time and effort to record spoken dialogue for every character in
a game. In the upcoming year, developers will be able to use sophisticated neural
networks to mimic human voices. Looking a little further ahead, neural networks
may even be able to generate appropriate NPC responses. Some game design studios
and developers are working hard to embed this kind of dialogue block into their
tools, so games with dynamic dialogue are not far off.

CHAPTER 7

CONCLUSIONS

Why Adopt A Mobile Voice Strategy?

Mobile phones are already personalized, more so than any website. Additionally,
there is very little screen space on mobile, making it more difficult for users to
search, or navigate. With larger product directories and more information, voice
applications enable consumers to use natural language to eliminate or reduce
manual effort, making it a lot faster to accomplish tasks.

Rogers has introduced voice commands to its remotes, allowing customers to
quickly browse and find their favorite shows or the latest movies using certain
keywords, for example an actor's name. Brands need to focus on better mobile
experiences for their consumers, and voice is the way to do so. Users are searching
for quicker and more efficient ways of accomplishing tasks, and voice is quickly
becoming the ideal channel for this.

Whether that means finding information, making a purchase, or completing a task,
voice is the new mobile experience. With the voice and speech recognition market
expected to grow at a 17.2 percent CAGR to reach $26.8 billion by 2025, it is
clear why brands are racing to figure out their voice strategy.

Voice User Interface (VUI) Will Continue To Advance


Even with just that handful of simple scenarios, it’s easy to see why voice
assistants are shaping up to become the hubs of our connected homes and
increasingly connected lives.

Voice technology is becoming increasingly accessible to developers. For example,
Amazon offers Transcribe, an automatic speech recognition (ASR) service that
enables developers to add speech-to-text capability to their applications. Once the
voice capability is integrated into the application, users can submit audio files
and receive a text file of the transcribed speech in return.

Google has made moves in making Assistant more ubiquitous by opening the
software development kit through Actions, which allows developers to build voice
into their own products that support artificial intelligence. Another one of Google’s
speech-recognition products is the AI-driven Cloud Speech-to-Text tool which
enables developers to convert audio to text through deep learning neural network
algorithms.

This is only the beginning of voice technology as we will see major advancements
in the user interface in the years to come. With the advancements in VUI,
companies need to start educating themselves on how they can best leverage voice
to better interact with their customers. It’s important to ask what the value of
adding voice will be as it doesn’t always make sense for every brand to adopt.
How can you provide value to your customers? How are you solving their pain
points with voice? Will voice enhance the user experience or frustrate the user?

In 2021, voice-enabled apps will accurately understand not only what we are
saying, but also how we are saying it and the context in which the inquiry is made.

However, a number of barriers still need to be overcome before voice
applications see mass adoption. Technological advances, particularly in AI,
natural language processing (NLP), and machine learning, are making voice
assistants more capable. To build a robust speech recognition experience, the
artificial intelligence behind it has to become better at handling challenges such
as accents and background noise.
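
One simple way to cope with background noise is to measure the energy of a short stretch of ambient sound and only treat louder audio as speech. The source code in Appendix 1 relies on the speech_recognition library for this through its `energy_threshold` setting and `adjust_for_ambient_noise` call. The sketch below only illustrates the general idea and is not that library's implementation; the function names, the list-of-integers sample format, and the 1.5 margin are assumptions made for illustration.

```python
import math
from typing import Sequence


def rms(samples: Sequence[int]) -> float:
    """Root-mean-square energy of a block of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def calibrate_threshold(ambient: Sequence[int], margin: float = 1.5) -> float:
    """Place the speech threshold a fixed margin above the ambient energy.

    `margin` is an assumed tuning value, not a recommended setting.
    """
    return rms(ambient) * margin


def is_speech(block: Sequence[int], threshold: float) -> bool:
    """Treat a block as speech only if it is louder than the threshold."""
    return rms(block) > threshold


if __name__ == "__main__":
    quiet_room = [100, -100, 100, -100]    # hypothetical ambient samples
    threshold = calibrate_threshold(quiet_room)
    print(is_speech([1000, -1000, 1000, -1000], threshold))  # loud block
    print(is_speech([50, -50, 50, -50], threshold))          # quiet block
```

A fixed margin is easy to reason about but does not adapt when the noise level changes during a session, which is one reason noisy environments remain a challenge.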

As consumers become more comfortable with, and reliant upon, using voice to
talk to their phones, cars, smart home devices, and so on, voice technology will
become a primary interface to the digital world. With it, expertise in voice
interface design and voice app development will be in greater demand.

Voice Is The Future Of Brand Interaction And Customer Experience


Advancements in a number of industries are helping digital voice assistants
become more sophisticated and useful for everyday use. Voice has now established
itself as the ultimate mobile experience. A lack of skills and knowledge makes it
particularly hard for companies to adopt a voice strategy, yet there is a lot of
opportunity for much deeper and more conversational experiences with
customers. The question is whether your brand is willing to seize this opportunity.

References:

[1] https://ieeexplore.ieee.org/document/9210344

[2] https://www.analyticsvidhya.com/blog/2020/11/build-your-own-desktop-voice-assistant-in-python/

[3] https://www.geeksforgeeks.org/voice-assistant-using-python/

APPENDIX
Appendix 1 – Source code
import pyttsx3 as p
import speech_recognition as sr
import pyaudio
import time
import datetime
from datetime import date
import calendar
from selenium import webdriver
import randfacts
import pyjokes
import json
from ss import *
from subprocess import run
import requests
from yt_auto_search_python import *
import webbrowser
 
engine=p.init()
rate=engine.getProperty('rate')
engine.setProperty('rate',130)
voices=engine.getProperty('voices')
engine.setProperty('voice',voices[1].id)

listener=sr.Recognizer()

def talk(text):
    engine.say(text)
    engine.runAndWait()

class wiki():
    def __init__(self):  # constructor
        self.driver=webdriver.Chrome()  # start a Chrome driver

    def get(self,query):  # open Wikipedia and search for the query
        self.query=query  # keep the query on the instance
        self.driver.get(url="https://www.wikipedia.org")
        # locate the search box by XPath and store it in `search`
        search=self.driver.find_element("xpath",'//*[@id="searchInput"]')
        search.click()
        search.send_keys(query)
        # locate and click the search button
        enter=self.driver.find_element("xpath",'//*[@id="search-form"]/fieldset/button/i')
        enter.click()
        talk("here is what i found for "+query)

class music():
    def __init__(self):
        self.driver=webdriver.Chrome()

    def play(self,query):
        self.query=query
        # open the YouTube results page for the query and click the first title
        self.driver.get(url="https://www.youtube.com/results?search_query="+query)
        vid=self.driver.find_element("xpath",'//*[@id="video-title"]/yt-formatted-string')
        vid.click()

def wishme():
    hours=int(datetime.datetime.now().hour)
    if hours>=0 and hours<12:
        return ("good Morning")
    elif hours>=12 and hours<18:
        return("good Afternoon")
    else:
        return("good evening")

def tell_time():  # print and return the current time as HH:MM:SS
    current_time= datetime.datetime.now().strftime("%H:%M:%S")
    print(current_time)
    return current_time

talk("welcome buddy,"+" "+wishme()+" "+" happy to see you")

def present_date():
    current_date=date.today()
    print(current_date)
    talk("today's date is")
    talk(str(current_date))  # pass text to talk(), not a date object
present_date()

talk("how are you?")

command=""  # fallback so the checks below still work if recognition fails
try:
    with sr.Microphone() as source:  # use the microphone as the audio source
        listener.energy_threshold=15000
        listener.adjust_for_ambient_noise(source,duration=1.5)
        print("listening...")
        voice=listener.listen(source)
        command=listener.recognize_google(voice,language='en-US')
        print(command)
except sr.UnknownValueError:
    print("sorry I did not get you")
    talk("sorry I did not get you")
except sr.RequestError:
    print('Request error')
    talk("request error")

# every keyword must be present; `"a" and "b" in text` only tests the last one
if all(word in command for word in ("what","about","you")):
    talk("i am doing so good")
    talk("how can i help you")
else:
    talk("how can i help you")

try:
    with sr.Microphone() as source:
        listener.energy_threshold=15000
        listener.adjust_for_ambient_noise(source,duration=1.5)
        print("listening...")
        voice=listener.listen(source)
        command=listener.recognize_google(voice,language='en-US')
        print(command)
except sr.UnknownValueError:
    print("sorry I did not get you")
    talk("sorry I did not get you")
except sr.RequestError:
    print('Request error')
    talk("request error")

def respond(command):

    if 'date' in command:
        print('Sorry, i am a little busy with my work stuff')
        talk('Sorry, i am a little busy with my work stuff')

    # all(...) requires every keyword to appear in the command
    elif all(word in command for word in ('are','you','single')):
        print('Sorry, i have a boyfriend. Better luck next time')
        talk('Sorry, i have a boyfriend. Better luck next time')

    elif all(word in command for word in ('what','is','your','name')):
        print('myself your friend')
        talk('myself your friend')

    elif all(word in command for word in ('who','are','you')):
        print('i am virtual assistant')
        talk('i am virtual assistant')
        talk("who are you")

    elif all(word in command for word in ('what','is','your','age')):
        print('i am sweet 16')
        talk('i am sweet 16')

    elif all(word in command for word in ('how','are','you')):
        print('i am super fine')
        talk('i am super fine')
    elif all(word in command for word in ('can','you','help','me')):
        print('my pleasure')
        talk('my pleasure')
    elif "i love you" in command:
        print("It's hard to understand")
        talk("It's hard to understand")
    elif "good bye" in command or "ok bye" in command or "stop" in command:
        print("ok bye see you later")
        talk("ok bye see you later")

respond(command)

try:
    with sr.Microphone() as source:
        listener.energy_threshold=15000
        listener.adjust_for_ambient_noise(source,duration=1.5)
        print("listening...")
        voice=listener.listen(source)
        command=listener.recognize_google(voice,language='en-US')
        print(command)
except sr.UnknownValueError:
    print("sorry I did not get you")
    talk("sorry I did not get you")
except sr.RequestError:
    print('Request error')
    talk("request error")

if "information" in command:

    talk('you need information related to what topic:')

    try:
        with sr.Microphone() as source:
            listener.energy_threshold=15000
            listener.adjust_for_ambient_noise(source,duration=1.0)
            print("listening...")
            voice=listener.listen(source)
            informations=listener.recognize_google(voice,language='en-US')
        print("searching {} in wikipedia result will be shown in minutes".format(informations))
        talk("searching {} in wikipedia result will be shown in minutes".format(informations))
        assist=wiki()
        assist.get(informations)
    except sr.UnknownValueError:
        print("sorry I did not get you")
        talk("sorry I did not get you")
    except sr.RequestError:
        print('Request error')
        talk("request error")
      
try:
    with sr.Microphone() as source:
        listener.energy_threshold=15000
        listener.adjust_for_ambient_noise(source,duration=1.5)
        print("listening...")
        voice=listener.listen(source)
        command=listener.recognize_google(voice,language='en-US')
        print(command)
except sr.UnknownValueError:
    print("sorry I did not get you")
    talk("sorry I did not get you")
except sr.RequestError:
    print('Request error')
    talk("request error")

if "search" in command:

    talk("what do you want to search for")
    print("what do you want to search for?")
    with sr.Microphone() as source:
        listener.energy_threshold=12000
        listener.adjust_for_ambient_noise(source,duration=1.0)
        print("listening...")

        voice=listener.listen(source)
        find=listener.recognize_google(voice,language='en-US')
        print("searching {} in google result will be shown in minutes".format(find))
        talk("searching {} in google result will be shown in minutes".format(find))

    # open the Google results page for the spoken query in the default browser
    url="https://google.com/search?q="+find
    webbrowser.get().open(url)
    print("here is what i found for "+find)
    talk("here is what i found for "+find)

try:
    with sr.Microphone() as source:
        listener.energy_threshold=15000
        listener.adjust_for_ambient_noise(source,duration=1.5)
        print("listening...")
        voice=listener.listen(source)
        command=listener.recognize_google(voice,language='en-US')
        print(command)
except sr.UnknownValueError:
    print("sorry I did not get you")
    talk("sorry I did not get you")
except sr.RequestError:
    print('Request error')
    talk("request error")

# both keywords must be present
if "find" in command and "location" in command:
    talk("what is the location you want to find")
    print("what is the location you want to find?")

    with sr.Microphone() as source:
        listener.energy_threshold=12000
        listener.adjust_for_ambient_noise(source,duration=1.0)
        print("listening...")

        voice=listener.listen(source)
        destination=listener.recognize_google(voice,language='en-US')
        print("searching {} location in google map".format(destination))
        talk("searching {} location in google map".format(destination))

    # open the Google Maps page for the spoken place name
    url="https://google.nl/maps/place/"+destination+"/"
    webbrowser.get().open(url)
    print("here is what i found for "+destination)
    talk("here is what i found for "+destination)

try:
    with sr.Microphone() as source:
        listener.energy_threshold=15000
        listener.adjust_for_ambient_noise(source,duration=1.5)
        print("listening...")
        voice=listener.listen(source)
        command=listener.recognize_google(voice,language='en-US')
        print(command)
except sr.UnknownValueError:
    print("sorry I did not get you")
    talk("sorry I did not get you")
except sr.RequestError:
    print('Request error')
    talk("request error")

# both keywords must be present
if "play" in command and "video" in command:
    talk("which video do you want to play")
    with sr.Microphone() as source:
        listener.energy_threshold=12000
        listener.adjust_for_ambient_noise(source,duration=1.0)
        print("listening...")
        voice=listener.listen(source)
        vd=listener.recognize_google(voice,language='en-US')
    print("playing"+" "+ vd +" "+ "video in youtube result will be shown in minutes")
    talk("playing " + vd + " video in youtube result will be shown in minutes")

    a=music()
    a.play(vd)

try:
    with sr.Microphone() as source:
        listener.energy_threshold=15000
        listener.adjust_for_ambient_noise(source,duration=1.5)
        print("listening...")
        voice=listener.listen(source)
        command=listener.recognize_google(voice,language='en-US')
        print(command)
except sr.UnknownValueError:
    print("sorry I did not get you")
    talk("sorry I did not get you")
except sr.RequestError:
    print('Request error')
    talk("request error")

if "joke" in command:  # also matches "jokes"
    y=pyjokes.get_joke()  # one joke as a string; get_jokes() returns a list
    print(y)
    talk(y)

try:
    with sr.Microphone() as source:

        listener.energy_threshold=15000
        listener.adjust_for_ambient_noise(source,duration=1.5)
        print("listening...")
        voice=listener.listen(source)
        command=listener.recognize_google(voice,language='en-US')
        print(command)
except sr.UnknownValueError:
    print("sorry I did not get you")
    talk("sorry I did not get you")
except sr.RequestError:
    print('Request error')
    talk("request error")

if "fact" in command:  # also matches "facts"
    talk("your fact of the day is here")
    x=randfacts.getFact()
    print("Did you know that: "+x)
    talk("did you know that: "+x)
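
Keyword checks like the ones in the listing above are easy to get subtly wrong in Python: an expression such as `"find" and "location" in command` only tests the last word, and `"joke" or "jokes" in command` is always true. A small helper can make the intent explicit. The following is a minimal sketch and not part of the project code; the names `contains_all`, `contains_any`, `RULES`, and `reply_for` are hypothetical, matching is done on whole words, and the replies are copied from the listing only as sample data.

```python
from typing import Iterable, List, Optional, Tuple


def contains_all(command: str, words: Iterable[str]) -> bool:
    """True only if every word occurs in the command as a whole word."""
    tokens = set(command.lower().split())
    return all(word.lower() in tokens for word in words)


def contains_any(command: str, words: Iterable[str]) -> bool:
    """True if at least one word occurs in the command as a whole word."""
    tokens = set(command.lower().split())
    return any(word.lower() in tokens for word in words)


# Hypothetical routing table: (required words, spoken reply).
RULES: List[Tuple[Tuple[str, ...], str]] = [
    (("what", "is", "your", "name"), "myself friend"),
    (("who", "are", "you"), "i am virtual assistant"),
    (("how", "are", "you"), "i am super fine"),
]


def reply_for(command: str) -> Optional[str]:
    """Return the reply of the first rule whose words all appear, else None."""
    for words, reply in RULES:
        if contains_all(command, words):
            return reply
    return None


if __name__ == "__main__":
    print(reply_for("how are you"))
    print(contains_any("tell me some jokes", ("joke", "jokes")))
```

Keeping the rules in a table means a new question only needs a new entry rather than another `elif` branch, and the matching logic can be checked without a microphone.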

Appendix 2 – Screenshots of result

The screenshots above show, in order, information retrieved from Wikipedia, a
video played on YouTube, a Google search, and a location lookup, as produced by
running the program in Python.
