
A NOVEL APPROACH TO DATA SECURITY USING CRYPTOGRAPHY ALONG WITH STT AND TTS THROUGH PYTHON

A MINI PROJECT REPORT Submitted by

NAME ROLL NO. REG NO.

SREEPARNA DUTTA 430216020103 161270110293

SAYAN DAM 430216010078 161270100267

DYUTI SOME 430216020035 161270100224

ADRITA CHAKRABORTY 430216020006 161270100195

Under the guidance of

MRS. ARPITA BARMAN SANTRA

in partial fulfillment for the award of the degree of

Bachelor of Technology

in

Electronics and Communication Engineering


of

NARULA INSTITUTE OF TECHNOLOGY

81, NILGUNJ ROAD AGARPARA

KOLKATA-109

MAY, 2019
ACKNOWLEDGEMENT

We would like to express our sincere thanks to our respected teacher, Mrs. Arpita Barman Santra, who gave us the opportunity to do this project, and to the respected faculty members and other individuals of the ECE department who co-operated with us.

Signature of the students :

PROBLEM STATEMENT

Data is indispensable in today’s digital world. Every system takes data as an input, and data is valuable only if it can be interpreted and analysed. Data security is therefore one of the vital issues in the world today, and the importance of cryptography, an essential security tool, increases with every new attack on the internet. We therefore secure data using cryptography (encryption and decryption) together with speech to text (STT) and text to speech (TTS) conversion, which helps end users to ensure the integrity and confidentiality of their data.

METHODOLOGY

Speech to text conversion : To convert speech into text, the speech must first be converted from physical sound to an electrical signal with a microphone, and then to digital data with an analog-to-digital converter. Once digitized, several models can be used to transcribe the audio to text.
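
The analog-to-digital step can be illustrated with a minimal sketch. It assumes a pure sine tone as a stand-in for the microphone voltage, a sample rate of 8000 Hz and 16-bit samples; the function name sample_and_quantize and these values are chosen for illustration only and are not part of the project's program.

```python
import math


def sample_and_quantize(freq_hz, duration_ms, sample_rate=8000, bits=16):
    """Sample a sine tone and round each sample to a signed integer code."""
    max_code = 2 ** (bits - 1) - 1               # 32767 for 16-bit samples
    n_samples = sample_rate * duration_ms // 1000
    codes = []
    for n in range(n_samples):
        t = n / sample_rate                      # time of the n-th sample in seconds
        # Stand-in for the microphone voltage, scaled to the range [-1, 1].
        voltage = math.sin(2 * math.pi * freq_hz * t)
        codes.append(round(voltage * max_code))  # quantization step
    return codes


codes = sample_and_quantize(freq_hz=440, duration_ms=10)
print(len(codes), "samples, first five:", codes[:5])
```

A real recording differs only in the source of the voltage values: the sound card performs the same sampling and rounding on the microphone signal.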

Many speech recognition systems rely on what is known as a Hidden Markov Model (HMM)[1]. This approach works on the assumption that a speech signal, when viewed on a short enough timescale (say, ten milliseconds), can be reasonably approximated as a stationary process, that is, a process in which statistical properties do not change over time.

In a typical HMM-based system, the speech signal is divided into 10-millisecond fragments. The power spectrum of each fragment, which is essentially a plot of the signal’s power as a function of frequency, is mapped to a vector of real numbers known as cepstral coefficients. The dimension of this vector is usually small, sometimes as low as 10, although more accurate systems may have dimension 32 or more. The output of this stage is a sequence of these vectors.
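
The fragment-to-vector step can be sketched as follows. This is a simplified illustration using only the standard library: it takes the cepstrum as the inverse discrete Fourier transform of the log power spectrum, whereas practical recognisers usually apply further steps (windowing, mel filter banks) that are omitted here. The function names, the 8000 Hz sample rate and the 400 Hz test tone are assumptions made for the example.

```python
import cmath
import math


def split_into_frames(samples, sample_rate, frame_ms=10):
    """Split the samples into consecutive, non-overlapping fragments."""
    size = sample_rate * frame_ms // 1000
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def power_spectrum(frame):
    """Signal power at each of the len(frame) DFT frequency bins."""
    n = len(frame)
    powers = []
    for k in range(n):
        x_k = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        powers.append(abs(x_k) ** 2 / n)
    return powers


def cepstral_coefficients(frame, count=10):
    """First `count` values of the inverse DFT of the log power spectrum."""
    # The small constant avoids log(0) for empty frequency bins.
    log_power = [math.log(p + 1e-12) for p in power_spectrum(frame)]
    n = len(log_power)
    return [
        sum(lp * math.cos(2 * math.pi * k * q / n) for k, lp in enumerate(log_power)) / n
        for q in range(count)
    ]


sample_rate = 8000
signal = [math.sin(2 * math.pi * 400 * t / sample_rate) for t in range(800)]
frames = split_into_frames(signal, sample_rate)
features = [cepstral_coefficients(frame) for frame in frames]
print(len(frames), "fragments,", len(features[0]), "coefficients each")
```

Each 10 ms fragment of 80 samples is reduced to a vector of 10 numbers, which is the kind of low-dimensional sequence the recognition stage works on.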

To decode the speech into text, groups of vectors are matched to one or more phonemes, the fundamental units of speech. This calculation requires training, since the sound of a phoneme varies from speaker to speaker, and even varies from one utterance to another by the same speaker. A special algorithm is then applied to determine the most likely word (or words) that produce the given sequence of phonemes.
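
The report does not name the decoding algorithm. For HMMs the standard choice is the Viterbi algorithm, which finds the most likely sequence of hidden states for a sequence of observations. The sketch below is a toy illustration only: the three phoneme states, the observation labels "A", "B", "C" (standing in for clusters of cepstral vectors) and all probabilities are invented for the example.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for the given observations."""
    # best[t][s] = (probability of the best path ending in s at step t, previous state)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p][0] * trans_p[p][s])
            column[s] = (best[-1][prev][0] * trans_p[prev][s] * emit_p[s][obs], prev)
        best.append(column)
    # Start from the most probable final state and follow the stored back-pointers.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path


# Hypothetical model for the word "cat" (phonemes k, ae, t).
states = ["k", "ae", "t"]
start_p = {"k": 0.6, "ae": 0.2, "t": 0.2}
trans_p = {
    "k": {"k": 0.5, "ae": 0.4, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.5, "t": 0.4},
    "t": {"k": 0.1, "ae": 0.1, "t": 0.8},
}
emit_p = {
    "k": {"A": 0.8, "B": 0.1, "C": 0.1},
    "ae": {"A": 0.1, "B": 0.8, "C": 0.1},
    "t": {"A": 0.1, "B": 0.1, "C": 0.8},
}

path = viterbi(["A", "A", "B", "C"], states, start_p, trans_p, emit_p)
print(path)
```

In a trained system the probabilities come from training data, which is why the training step mentioned above is required.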

People today are more selective about the websites they visit, as most are inundated with information daily. Text to speech (TTS) is a technology that addresses this challenge in an easy and inexpensive way, as websites, mobile apps, digital books, e-learning tools and online documents can literally have their own voice. The following are three reasons why text to speech is an essential technology to offer with digital content and why it is beneficial for end users:

1. Extend the reach of your content

2. Accessibility is relevant

3. Populations are evolving

Encryption & Decryption : The word ‘cryptography’ comes from Greek words meaning ‘secret writing’. Encryption is the process of translating plain text data (plaintext) into something that appears to be random and meaningless (ciphertext)[1]. Decryption is the process of converting ciphertext back to plaintext. Encryption helps protect personal data by using a “secret code” to scramble it so that it cannot be read by anyone who does not have the key. The data is jumbled up so that it is completely unreadable while it travels through the internet. This stops hackers who intercept the data from seeing its content, as all they receive is a seemingly random bunch of letters, numbers and symbols.

The goal of every encryption algorithm is to make it as difficult as possible to decrypt the generated ciphertext without using the key. Decryption without the correct key is very difficult, and in some cases impossible for all practical purposes.

Encryption schemes are of two types : Symmetric Key Scheme and Asymmetric Key Scheme.

Symmetric Key Scheme : Symmetric encryption is a type of encryption where only one key (a secret key) is used to both encrypt and decrypt electronic information. By using symmetric encryption algorithms, data is converted to a form that cannot be understood by anyone who does not possess the secret key to decrypt it. Once the intended recipient who possesses the key has the message, the algorithm reverses its action so that the message is returned to its original and understandable form[2]. The secret key that the sender and recipient both use can be a specific password or code, or it can be a random string of letters or numbers.
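
The defining property of a symmetric scheme, that the same key both encrypts and decrypts, can be shown with a toy sketch. It uses a repeating-key XOR, which is chosen only because it is short; it is not a secure cipher and is not the scheme used in this project. The function name xor_bytes and the sample message are illustrative.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every data byte with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = secrets.token_bytes(16)             # random secret shared by sender and recipient
message = "meet at noon".encode("utf-8")

ciphertext = xor_bytes(message, key)      # sender encrypts with the key
recovered = xor_bytes(ciphertext, key)    # recipient decrypts with the same key

print(recovered.decode("utf-8"))
```

Applying the same operation with the same key a second time undoes the first, which is the "algorithm reverses its action" behaviour described above.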

Some examples of where symmetric cryptography is used are:

1. Payment applications, such as card transactions where personally identifiable information (PII) needs to be protected to prevent identity theft or fraudulent charges.

2. Validations to confirm that the sender of a message is who he claims to be.

3. Random number generation or hashing.

Asymmetric Key Scheme : Asymmetric encryption is a type of encryption where a pair of keys, one public and one private, is used to encrypt and decrypt messages. A message that is encrypted using a public key can only be decrypted using the matching private key, while a message encrypted using a private key can be decrypted using the matching public key[4]. The public key does not need to be kept secret, because it is publicly available and can be passed over the internet. Because the private key never has to be shared, an asymmetric scheme avoids the problem of distributing a secret key and is therefore better suited to securing information transmitted during communication.
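
The key-pair idea can be illustrated with textbook RSA on very small numbers. This is a toy sketch only: the primes 61 and 53 and the message value 65 are chosen for readability, real keys are thousands of bits long and use padding, and RSA is not used anywhere in this project's programs.

```python
# Key generation with toy primes.
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent: (n, e) is the public key
d = pow(e, -1, phi)            # private exponent: (n, d) is the private key (Python 3.8+)

message = 65                   # a message encoded as a number smaller than n

# Encrypt with the public key, decrypt with the private key.
ciphertext = pow(message, e, n)
recovered = pow(ciphertext, d, n)

# The reverse direction: transform with the private key, undo with the public key.
signed = pow(message, d, n)
checked = pow(signed, e, n)

print(ciphertext, recovered, checked)
```

Both directions recover the original value 65, matching the two cases described in the paragraph above.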

ASCII Table :

Flow chart for speech to text conversion and encryption :

1. Recording starts automatically at the time set in the program by the programmer.

2. Audio is recorded for the duration set in the program by the programmer.

3. The audio file is converted into text.

4. The original text file is converted into an encrypted text file using the symmetric key scheme.

5. The audio file and the original text file are deleted automatically.

Methodology for encryption :

We use a list of eight integers as a key for the encryption. We only encrypt the digits and lower case letters; every other character is left unchanged.

The list is [-1,2,0,-3,1,-4,-2,-9].

The characters '0' to '9' followed by 'a' to 'z' are taken in order. The first character, '0', is converted into its ASCII value, shifted by -1 and converted back into a character. The next character, '1', is converted into its ASCII value, shifted by +2 and converted back into a character, and so on. After every eight characters the key repeats from the start. This gives a fixed substitution for each of the 36 characters, which is then applied to every character of the text.

So after encryption, the original character and the encrypted character pairs are :

{'0': '/', '1': '3', '2': '2', '3': '0', '4': '5', '5': '1', '6': '4', '7': '.', '8': '7', '9': ';',
'a': 'a', 'b': '_', 'c': 'd', 'd': '`', 'e': 'c', 'f': ']', 'g': 'f', 'h': 'j', 'i': 'i', 'j': 'g',
'k': 'l', 'l': 'h', 'm': 'k', 'n': 'e', 'o': 'n', 'p': 'r', 'q': 'q', 'r': 'o', 's': 't', 't': 'p',
'u': 's', 'v': 'm', 'w': 'v', 'x': 'z', 'y': 'y', 'z': 'w'}

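Example : We take the text ‘Encryption 123’. The upper case ‘E’ and the space are not encrypted and stay as they are.

Original text  : E n c r y p t i o n   1 2 3

Encrypted text : E e d o y r p i n e   3 2 0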
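
The character pairs and the example above can be reproduced with a short sketch. The names ALPHABET, KEY, build_table and encrypt are introduced here for illustration; the project's program builds the same table inside its own encrypt() function and works on files rather than strings.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
KEY = [-1, 2, 0, -3, 1, -4, -2, -9]


def build_table(alphabet=ALPHABET, key=KEY):
    """Map each alphabet character to its shifted character; the key repeats every 8 entries."""
    return {ch: chr(ord(ch) + key[i % len(key)]) for i, ch in enumerate(alphabet)}


def encrypt(text, table):
    """Substitute every character found in the table; copy all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


table = build_table()
print(encrypt("Encryption 123", table))   # prints: Eedoyrpine 320
```

For instance, 'n' is the 24th entry of the alphabet, so it receives the eighth key value: ord('n') is 110, and 110 - 9 = 101, which is 'e'.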

Flow chart for decryption and text to speech conversion :

1. The encrypted text file is converted back into the original text file using the symmetric key scheme in reverse order.

2. The decrypted text file is converted into an audio file.

Methodology for decryption :

We use the same list of eight integers as a key for the decryption. We only decrypt the characters that correspond to encrypted digits and lower case letters; every other character is left unchanged.

The list is [-1,2,0,-3,1,-4,-2,-9].

The 36 encrypted characters are taken in the same order as the original characters '0' to '9' and 'a' to 'z'. The first encrypted character is converted into its ASCII value, shifted by - (-1) and converted back into a character. The next encrypted character is converted into its ASCII value, shifted by - (+2) and converted back into a character, and so on. After every eight characters the key repeats from the start. This gives the reverse substitution, which is then applied to every character of the encrypted text.

So after decryption, the encrypted character and the original (decrypted) character pairs are :

{'/': '0', '3': '1', '2': '2', '0': '3', '5': '4', '1': '5', '4': '6', '.': '7', '7': '8', ';': '9',
'a': 'a', '_': 'b', 'd': 'c', '`': 'd', 'c': 'e', ']': 'f', 'f': 'g', 'j': 'h', 'i': 'i', 'g': 'j',
'l': 'k', 'h': 'l', 'k': 'm', 'e': 'n', 'n': 'o', 'r': 'p', 'q': 'q', 'o': 'r', 't': 's', 'p': 't',
's': 'u', 'm': 'v', 'v': 'w', 'z': 'x', 'y': 'y', 'w': 'z'}

Example : We take the encrypted text ‘Eedoyrpine 320’.

Encrypted text : E e d o y r p i n e   3 2 0

Decrypted text : E n c r y p t i o n   1 2 3
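
The decryption example can be checked with a sketch that inverts the encryption table. The names ENCRYPT, DECRYPT and substitute are introduced here for illustration and do not appear in the project's program.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
KEY = [-1, 2, 0, -3, 1, -4, -2, -9]

# Encryption table, and its inverse obtained by swapping each pair.
ENCRYPT = {ch: chr(ord(ch) + KEY[i % len(KEY)]) for i, ch in enumerate(ALPHABET)}
DECRYPT = {cipher: plain for plain, cipher in ENCRYPT.items()}


def substitute(text, table):
    """Replace every character found in the table; copy all others unchanged."""
    return "".join(table.get(ch, ch) for ch in text)


print(substitute("Eedoyrpine 320", DECRYPT))   # prints: Encryption 123

# A character that is not encrypted but is also an encrypted value is not recovered:
round_trip = substitute(substitute("a.b", ENCRYPT), DECRYPT)
print(round_trip)                              # prints: a7b
```

The second case shows a limitation that follows from the tables: the characters '/', '.', ';', '_', '`' and ']' pass through encryption unchanged but are changed on decryption, so the scheme only restores text made of digits, lower case letters and characters outside those six.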

We use Spyder 3.2.8 to write the code.

We use the ‘sounddevice’ module for recording, the ‘speech_recognition’ module for converting speech into text, the ‘soundfile’ module to save the recorded audio in ‘.wav’ format, the ‘schedule’ and ‘time’ modules for scheduling the program, the ‘os’ module for deleting the audio file and the original text file after generating the encrypted text file, the ‘gTTS’ module for converting text into speech, and the ‘tkinter’ module for making the GUI.

GUI for the encryption :

GUI for the decryption :

Program for speech to text conversion and encryption with GUI :

import os
import time
import tkinter as tk

import schedule as sc
import sounddevice as sd
import soundfile as sf
import speech_recognition as sr


def recording():
    # Record d seconds of mono audio at Fs Hz and save it as a .wav file.
    Fs = 44100
    d = 7
    print("Start")
    a = sd.rec(int(d * Fs), samplerate=Fs, channels=1, blocking=True)
    print("stop")
    sf.write('D:/Python Project/Encrypt_File/audio_file.wav', a, Fs)
    gui()


def recognise():
    # Transcribe the recorded audio and save the text to a file.
    r = sr.Recognizer()
    with sr.AudioFile('D:/Python Project/Encrypt_File/audio_file.wav') as source:
        a = r.record(source)
    try:
        text = r.recognize_google(a)
    except (sr.UnknownValueError, sr.RequestError) as e:
        # The speech was not understood or the recognition service failed.
        print(e)
        return
    print("You said : " + text)
    with open('D:/Python Project/Encrypt_File/textfile.txt', 'w') as savefile:
        savefile.write(text)


def encrypt():
    lis = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f',
           'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
           'w', 'x', 'y', 'z']
    seq = [-1, 2, 0, -3, 1, -4, -2, -9]
    # Build the substitution table; the key repeats after every eight characters of lis.
    dic = {}
    for i in range(36):
        dic[lis[i]] = chr(ord(lis[i]) + seq[i % 8])
    # Substitute character by character; characters not in dic are copied unchanged.
    with open("D:/Python Project/Encrypt_File/textfile.txt") as f, \
            open('D:/Python Project/Encrypt_File/en_textfile.txt', 'w') as file:
        while True:
            c = f.read(1)
            if not c:
                break
            file.write(dic.get(c, c))
    # Delete the audio file and the original text file.
    os.remove('D:/Python Project/Encrypt_File/audio_file.wav')
    os.remove("D:/Python Project/Encrypt_File/textfile.txt")


def gui():
    root = tk.Tk()
    root.geometry("540x420")
    root.minsize(540, 420)
    root.maxsize(800, 500)
    root.title("Encryption GUI")
    l = tk.Label(root, text="ENCRYPTION ENCRYPTION ENCRYPTION", bg="black", fg="white",
                 padx=7, pady=7, font="timesnewroman 15 bold", borderwidth=5,
                 relief=tk.SUNKEN)
    l.pack(fill=tk.X)
    r_c = tk.Label(root, text="Recording Completed...", bg="red", fg="white",
                   font="timesnewroman 15 bold")
    r_c.pack(fill=tk.X)
    pic = tk.PhotoImage(file="E:/Program/Python/Image_1.png")
    pic_l = tk.Label(root, image=pic)
    pic_l.pack(side="top", fill="both", expand="no")
    frame = tk.Frame(root, bg="grey", borderwidth=3, relief=tk.SUNKEN)
    frame.pack(side=tk.LEFT, anchor="nw")
    b1 = tk.Button(frame, fg="black", bg="white", activebackground="red",
                   activeforeground="orange", borderwidth=3,
                   text="Start Converting Speech to Text",
                   font="timesnewroman 16 bold", command=recognise)
    b1.pack(side=tk.LEFT)
    b2 = tk.Button(frame, fg="black", bg="white", activebackground="blue",
                   activeforeground="orange", borderwidth=3,
                   text="Start Encryption",
                   font="timesnewroman 16 bold", command=encrypt)
    b2.pack(side=tk.LEFT)
    root.mainloop()


# Start the recording every day at the given time.
sc.every().day.at("13:20:50").do(recording)

while True:
    sc.run_pending()
    time.sleep(1)

Program for decryption and text to speech conversion with GUI :

from gtts import gTTS
import tkinter as tk


def decryption():
    lis = ['/', '3', '2', '0', '5', '1', '4', '.', '7', ';', 'a', '_', 'd', '`', 'c', ']',
           'f', 'j', 'i', 'g', 'l', 'h', 'k', 'e', 'n', 'r', 'q', 'o', 't', 'p', 's', 'm',
           'v', 'z', 'y', 'w']
    seq = [-1, 2, 0, -3, 1, -4, -2, -9]
    # Build the reverse table by subtracting the key from each encrypted character.
    dic = {}
    for i in range(36):
        dic[lis[i]] = chr(ord(lis[i]) - seq[i % 8])
    # Substitute character by character; characters not in dic are copied unchanged.
    with open("D:/Python Project/Encrypt_File/en_textfile.txt") as f, \
            open('D:/Python Project/Decrypt_File/de_textfile.txt', 'w') as file:
        while True:
            c = f.read(1)
            if not c:
                break
            file.write(dic.get(c, c))


def output():
    # Read the decrypted text and save it as speech in an .mp3 file.
    language = 'en'
    with open('D:/Python Project/Decrypt_File/de_textfile.txt', 'r') as file:
        mytext = file.read()
    audio = gTTS(text=mytext, lang=language, slow=False)
    audio.save("D:/Python Project/Decrypt_File/de_audio.mp3")


root = tk.Tk()
root.geometry("490x370")
root.minsize(490, 370)
root.maxsize(490, 370)
root.title("Decryption GUI")

l = tk.Label(root, text="DECRYPTION DECRYPTION DECRYPTION", fg="black", bg="white",
             padx=7, pady=7, font="timesnewroman 15 bold", borderwidth=5,
             relief=tk.SUNKEN)
l.pack(fill=tk.X)

pic = tk.PhotoImage(file="E:/Program/Python/Image_2.png")
pic_l = tk.Label(root, image=pic)
pic_l.pack(side="top", fill="both", expand="no")

frame = tk.Frame(root, bg="grey", borderwidth=3, relief=tk.SUNKEN)
frame.pack(side=tk.LEFT, anchor="nw")

b1 = tk.Button(frame, height=20, width=13, activebackground="blue",
               activeforeground="orange", bg="black", fg="white", anchor=tk.CENTER,
               borderwidth=3, text="Start Decryption",
               font="timesnewroman 14 bold", command=decryption)
b1.pack(side=tk.LEFT)

b2 = tk.Button(frame, height=20, width=28, activebackground="red",
               activeforeground="green", bg="black", fg="white", anchor=tk.CENTER,
               borderwidth=3, text="Start Converting Text to Speech",
               font="timesnewroman 14 bold", command=output)
b2.pack(side=tk.LEFT)

root.mainloop()

RESULT ANALYSIS

Python Console - Speech to Text Conversion :

Encrypted Text File :

Encrypted Text :

Decrypted Text File and Audio File :

Decrypted Text :

Audio – Text to Speech Conversion :

CONCLUSIONS

In this project we have carried out speech to text (STT) and text to speech (TTS) conversion along with encryption and decryption of a text file. We have learnt a lot about cryptography, which covers essential areas such as authentication, integrity and confidentiality. Data security is one of the most essential aspects of this digital era, as it provides privacy measures that protect data from unauthorized access and corruption. Encryption provides end-to-end data protection. This project helped us to learn how to secure data against unauthorized disclosure and access, and how to make the process more efficient and powerful by doing speech to text conversion before encryption and text to speech conversion after decryption using Python.

REFERENCES

[1] Prashanth Kannan, Saai Krishnan Udayakumar, K. Ruwaid Ahmed, ‘Voice recognition with python’, IEEE, Bali, Indonesia, 30 Aug. 2014, ISBN: 978-1-4799-4909-0.

[2] A. Desoky, ‘Cryptography: algorithms and standards’, IEEE, Athens, Greece, 21 Dec. 2005,
ISBN: 0-7803-9313-9.

[3] Dan Boneh and Victor Shoup, ‘Principles of Modern Cryptography’, Openlibra, August 17,
2015.

[4] T. Rajani Devi, ‘Importance of Cryptography in Network Security’, IEEE, Gwalior, India, 6
April 2013, ISBN: 978-1-4673-5603-9.
