Abstract. Machine learning (ML) is a category of algorithms that allow computer systems to learn from data without being explicitly programmed. Much ML code is implemented in Python, which is the fastest growing and most popular programming language. I was given the opportunity to learn about Python and machine learning in an internship under Dr. Roy and Professor Swaminathan. This report details what I learned on those topics through the internship. The following sections compare programming languages and explain why Python is powerful. The general meaning of ML, the advantages of deep learning (DL), and feasible approaches and applications of both are also included, followed by a conclusion and future plans at the end of this report.

Keywords. Machine Learning, Deep Learning, Pattern Recognition

1 Introduction

Machine learning, named by Arthur Samuel in 1959, is a field within artificial intelligence in which a large data set of examples is given to the computer, and by recognizing patterns within that information, the computer becomes capable of predicting the solution to a new problem. Deep learning is a type of machine learning distinguished by the many hidden layers within its neural network. We focused on deep learning because it is capable of deciphering information in many layers and executing multiple transformations to identify pictures. This ability makes deep learning useful in areas such as protein classification, where the structure plays a significant role in identification.

2 Internship

2.1 Python

Python [1] is widely used for programming because it is easier to read, with a structure similar to that of a human language, and it takes less time to learn than other high-level programming languages such as Java or C++. Python, just like any other programming language, has three major types of control statements. Sequential control statements read the code from top to bottom and from left to right. Repetition control statements, also known as loops, repeat the same lines of code multiple times and prevent redundant lines in programs. Conditional control statements, or if statements, decide whether to run or ignore lines of code depending on whether a condition is true or false [2].

Fig. 1. 2018 Programming Language Ranking [3]

The increased popularity of Python is evidenced by its first-place rank in IEEE's list of programming languages in 2018 (shown in Figure 1). Two reasons given in the article are its growing use in embedded applications and in machine learning. Python is not only very handy in certain applications with attached hardware but also powerful, with high-quality libraries for both statistics and machine learning [3]. Python is also listed in first place among AI development languages because of its simplicity and object-oriented structure. It takes shorter development time than other languages such as Java, C++, or Ruby [4]. Another advantage of Python is the large number of libraries that allow the programmer to execute functions that are not among the built-in functions. Of the different libraries, the one commonly used for machine learning and neural networks is TensorFlow [5].

2.2 Machine Learning

Fig. 2. Artificial Intelligence, Neural Network, and Deep Learning [3]

The basic structure in which information is processed in machine learning is called a neural network. Neural networks are composed of layers of interconnected nodes that are meant to imitate the neurons of the human brain. A network can be represented in a diagram similar to that of Figure 3. The leftmost column is the input layer, which receives the data entered into the machine, and the rightmost column is the output layer, which returns the result. Any layers between the input and output layers are called hidden layers, each of which transforms the input in some way to give the desired result [6].

Fig. 3. Neural Network

Because machine learning allows computers to solve problems and conduct simulations without the need to explicitly state the desired outcome, it can be applied to just about any field.

2.3 Deep learning

Fig. 4. Differences in Neural Networks

Deep learning is a type of machine learning distinguished by the structure of its neural network. (The broader term artificial intelligence was coined by the computer scientist John McCarthy in the 1950s.) A deep network has multiple hidden layers, hence the name deep learning, which allow numerous transformations to be made to the input data and make it well suited for dealing with large data sets. By processing the data from lower-level, more general categories to higher-level, more specific categories, deep learning holds much potential in fields such as computer vision, speech recognition, and robotics.

2.4 Applications of ML and DL
Fig. 5. Levels of Image transformations [7]
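The layered structure described in Sections 2.2 and 2.3, in which each layer transforms the output of the layer before it, can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the program used in the internship and not how TensorFlow is implemented: the function names (make_layer, forward), the layer sizes, and the sigmoid transformation are hypothetical choices, and the weights are random rather than learned from data.

```python
import math
import random


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_inputs, n_nodes, rng):
    """Build one layer as a list of (weights, bias) pairs, one pair per node.

    The values are random, i.e. untrained; a real network would learn them.
    """
    return [
        ([rng.uniform(-1.0, 1.0) for _ in range(n_inputs)], rng.uniform(-1.0, 1.0))
        for _ in range(n_nodes)
    ]


def forward(layers, inputs):
    """Pass the inputs through every layer in order and return the final values."""
    values = inputs
    for layer in layers:
        # Each node combines all values from the previous layer, then transforms them.
        values = [
            sigmoid(sum(w * v for w, v in zip(weights, values)) + bias)
            for weights, bias in layer
        ]
    return values


# Hypothetical sizes: 4 input values, two hidden layers of 3 nodes, 2 output nodes.
rng = random.Random(0)
sizes = [4, 3, 3, 2]
network = [make_layer(sizes[i], sizes[i + 1], rng) for i in range(len(sizes) - 1)]

outputs = forward(network, [0.5, 0.1, 0.9, 0.3])
print(outputs)  # one number between 0 and 1 for each output node
```

Adding more entries to the middle of sizes adds more hidden layers, which is the sense in which a network becomes "deep". Training, which this sketch leaves out, would adjust the weights and biases so that the outputs match known examples.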
The different hidden layers consider the given image in varying degrees of detail, ranging from the image as a whole to individual pixels. Figure 5, for example, breaks the image down into different categories depending on the characteristics of the number in order to identify the number 4. When programming a neural network for image recognition, there must be a set of qualifiers that are unique to the type of image being identified or categorized. After establishing those qualifiers and directing the neural network to the features of the image to search for, a large set of data is necessary to allow the computer to learn.

A data set of images whose outputs have already been determined is given to the computer as a set of examples and guidelines to follow when it is actually given the task of predicting the outcome. Then, another set of data with predetermined results is used to test the program's accuracy in distinguishing and categorizing different types of images. Should there be any errors, modifications and additional data sets would be needed to improve the accuracy, but once errors on the testing data set are minimal, the program can give accurate outputs for images it has never seen before [7].

Fig. 6. Protein Categorization [8]

Image recognition is often used to identify different protein molecules, as the image, or the shape, of the protein is crucial to its function and therefore to its identification. With multiple protein molecules combining to make a polymer, and up to thousands of amino acids each with many chemical bonds that affect the structure and the interaction of atoms within the macromolecule, bioinformatics is a perfect example of our need to use deep learning for image recognition. Because of the sheer size of the data, programs must be developed to even identify different types at this scale.

Fig. 7. Venomous Snakes in Georgia [9]

Another potential application of image recognition using deep neural networks is the identification of animals by their specific taxa from just an image. As an easy way to distinguish similar animals without any prior knowledge in the area, it has real-life uses in detecting potential dangers from venomous snakes. With thousands of cases of venomous bites and several fatal cases each year, proper identification of the type of snake is essential in avoiding fatalities, and even in an emergency, information on the type of venom can be useful in determining the degree of emergency and the appropriate anti-venom.

One of the most distinguishing features of a venomous snake is its head shape. Most people assume that distinguishing a triangular head shape from a rounded head shape is enough, but in an emergency, even different types of rattlesnakes differ in the amount of venom injected, the symptoms, the anti-venom, and the amount of time one has before the bite becomes fatal. Information about a snake's head shape, the shape of the scales on its head, the patterns on its back, and its colors is all necessary for proper identification. This is where the need for image recognition may arise: with it, even in an emergency, an average person with no expert knowledge on this topic can accurately diagnose the severity of the situation.

3 Conclusion and Future Work

Through this internship, I was able to learn about machine learning, a topic I would never have had access to at school, and this experience has helped me decide what I would like to pursue in college and beyond.

Using what I learned through this program, I am planning on submitting an app to the Congressional App Challenge this coming October [10]. The app is focused on preventing the dangers of encountering venomous snakes in the wild. Users will be able to take a picture of the snake, and a machine learning program that identifies the key features of a venomous snake will inform them how dangerous the snake they have encountered is. This app will be helpful because it reduces the need for the user to have any background knowledge for a quick identification of the type of snake.

In terms of college and beyond, I am planning to pursue a major in bio-medical engineering. This internship has also taught me that machine learning and statistical analysis are very applicable in bio-medical engineering and confirmed my decision to study further in this area.

Acknowledgements

We would like to thank Professor Swaminathan for his guidance and support for this internship. This work is funded by the Center for Co-Design of Chip, Package, System (C3PS) at Georgia Institute of Technology.

References

1. "Python." https://www.python.org/.
2. A. Sweigart, Automate the Boring Stuff with Python. No Starch Press, 2015.
3. S. Cass, "The 2018 top programming languages: Python stays on top, and assembly enters the top ten," IEEE Spectrum, 2018.
4. "Top 5 best programming languages for artificial intelligence field." https://www.geeksforgeeks.org/top-5-best-programming-languages-for-artificial-intelligence-field/. Accessed: 2018-09-01.
5. G. v. Rossum, The Python Library Reference. Python Software Foundation, 2013.
6. N. Buduma, Fundamentals of Deep Learning. O'Reilly, 2017.
7. F. Chollet and J. Allaire, Deep Learning with R. Manning, 2017.
8. "Protein data bank." https://www.rcsb.org/structure/6AXT. Accessed: 2018-09-16.
9. "Watch out for snakes." https://georgia.gov/blog/2018-05-22/watch-out-snakes. Accessed: 2018-09-01.
10. "Congressional app challenge." https://www.congressionalappchallenge.us/. 2018.

Seohyun Park is a twelfth grade student at Chattahoochee High School, Johns Creek, GA. Her research interest is bio-medical engineering. She was ranked 1st and 3rd place in Dynamic Planet and Herpetology, respectively, at the Georgia State Science Olympiads in 2018. She is the recipient of the President's Award for Educational Excellence in 2012 and 2015.

Kallol Roy completed his BS in the Department of Electrical Engineering at Indian Institute of Technology Kanpur. After his BS, he worked for some time as a Research Assistant in the Theoretical Sciences Division at SN Bose National Center For Basic Science, Kolkata, India. He then joined the Electrical Communication Engineering Department at Indian Institute of Science Bangalore for his Masters and Ph.D. program and completed it in 2014. His research topic during his Ph.D. was Quantum Algorithms and Quantum Cryptography. After his Ph.D., he joined the Department of Mathematics, Indian Institute of Science Bangalore, as a Postdoctoral Researcher. On August 19, 2015, he joined the Statistical Artificial Intelligence Lab of Prof. Jaesik Choi, UNIST, as a Postdoctoral Researcher. He is the recipient of APS-IUSSTF Physics Student Visitation Award, 2012 Microsoft Travel Award, Sterlite Best Paper Award at Photonics 2010, IIT Guwahati, MHRD Scholarship, Government of India 2007, Jawaharlal Nehru Scholarship Steel Authority of India Limited, 2000.

Madhavan Swaminathan received the M.S. and Ph.D. degrees in electrical engineering from Syracuse University, Syracuse, NY, USA, in 1989 and 1991, respectively. He was with IBM, E. Fishkill, NY, USA, where he was involved in packaging for supercomputers. He was the Joseph M. Pettit Professor of Electronics in Electrical and Computer Engineering and the Deputy Director of the NSF Microsystems Packaging Research Center with the Georgia Institute of Technology (Georgia Tech), Atlanta, GA, USA. He has been the Founder and Co-Founder of two start-up companies, E-System Design, Johns Creek, GA, USA, and Jacket Micro Devices, Decatur, GA, USA. He is currently the John Pippin Chair in Electromagnetics with the School of Electrical and Computer Engineering and the Director of the Interconnect and Packaging Center with Georgia Tech. He has authored over 400 refereed technical publications, holds 29 patents, and has authored and co-edited three books. Dr. Swaminathan was the Founder of the International IEEE Electrical Design of Advanced Packaging and Systems Conference. He has served as the Distinguished Lecturer of the IEEE Electromagnetic Compatibility Society.