
Hello, everyone. Welcome to the University of Illinois Coursera course on Heterogeneous Parallel Programming.

My name is Wen-mei Hwu. I'm a professor at the university, in both the Electrical and Computer Engineering Department and the Computer Science Department, and on behalf of the teaching staff, I'd like to welcome you to this course. The purpose of this course is for you to learn how to program heterogeneous parallel computing systems and achieve high performance and high energy efficiency. While you do so, we'd also like to help you understand how to achieve full functionality and maintainability for your code. We'd like to teach you a programming style in which you can expect your code to be scalable across future hardware generations. And toward the end of the course, we will also teach the programming interfaces and concepts that will help you make sure your code is portable across vendor devices. In terms of technical subjects, we're going to start with the basics: parallel programming APIs, or application programming interfaces, tools, and techniques. Once we cover the basic material, we'll go into the core of the course, which is the principles and patterns of parallel algorithms. Throughout the course, we'll also introduce processor architecture features and constraints that you need to understand well in order to achieve your performance and energy efficiency goals. Again, my name is Wen-mei Hwu, and if you want to send me e-mail, here's my e-mail address. Please use [Coursera HPP] to start your e-mail subject line; I'll be looking for e-mails with this subject line on a daily basis.

I'd like to acknowledge that there is a very strong team of teaching assistants for this offering. We have Abdul Dakkak, Izzat El Hajj, Tom Jablin, and Andy Schuh, who are University of Illinois staff and have been working on the infrastructure and the contents for this course. We also have a very strong team of community TAs: outstanding students who achieved a certificate of distinction in the previous offering and have kindly offered their time to help you learn in this offering. This course contains an enormous amount of intellectual content, and many, many people have contributed to it over the years. In particular, I'd like to acknowledge my co-author, Dr. David Kirk, who has also been my collaborator and co-conspirator on many initiatives over the past several years. My former students John Stratton and Isaac Gelado have taught various forms of this course with me, and they have added so much to it. John Stone, the lead developer of VMD, has been adding application content and examples to this course. Javier Cabezas, my current PhD student, has been helping me update the contents of the course. And Dr. Michael Garland at NVIDIA Research has been helping me with many algorithm contents and important feedback over time. Of course, all the TAs and many other people have been adding to and refining these contents for us.

Since you're watching this video, you already know how to get to the Coursera HPP website. You should also know that you will find handouts, quizzes, labs, and lecture slides on that website. In this offering, we're going to give you the option of a Weekly view versus the Classic view for all your work. In the Weekly view, you'll be able to find everything you need to do for this course during each week, and this is a very easy way for you to ensure that you have completed all your work

for that week. But if you strongly prefer the Coursera web page layout, then you will also be able to use the Classic view. On the website, you will also find sample book chapters, documentation, and software resources to help you with your learning. We will be making periodic electronic announcements, and you can subscribe to e-mail notices of these announcements. As for the forum, I'd like to encourage you to bring all your questions to the electronic discussion forum. The community TAs will be reading and answering these postings, and many of your classmates will often have very good answers to your questions as well, on a very timely basis.

When it comes to grading, this course is divided 50/50 between quizzes and labs. The quizzes are weekly and are designed to help you fully understand the lecture contents; you can retake these quizzes until you are satisfied with your performance. The labs are also weekly, and they are designed to help you master all the concepts you have learned in this course. At the beginning, there will be a single-track lab, but toward the end you will have the option to choose between alternative languages and concepts, so that you can tailor your experience through this course with those choices.

There is a very simple academic honesty code in this course. Basically, this course encourages collaboration and discussion, so you are allowed and encouraged to discuss the assignments with other students in the class. You are also more than welcome to get verbal advice and help from people who have already taken this course.

On the other hand, any sharing of code is unacceptable. This includes posting your own code on the discussion forum, or reading someone else's code and then going off to write your own. The reason is that we'd really like you to have a complete, top-to-bottom experience in developing every piece of code in this course, so that you can truly, thoroughly understand all the concepts. As far as quizzes are concerned, giving and receiving help on a quiz is unacceptable; the quizzes should be solely your personal work.

The course is designed to be self-contained; that is, you should be able to learn all the concepts by watching the lecture videos and by reading through all the material that we provide to you online. However, for those of you who would like a richer learning experience, with more depth and more breadth, I would recommend you read the textbook by David Kirk and myself: Programming Massively Parallel Processors: A Hands-on Approach. We are at our second edition, published by Morgan Kaufmann in 2013. For those of you doing the lab assignments, I would highly encourage you to read all the accompanying instructions and notes for these assignments. Over the years, we have learned that there are quite a few pitfalls and details that students need to be aware of in order to be productive in the lab, so we have added all of this material to those accompanying instructions and notes. I would highly recommend that you read them before you start each assignment. And of course, as you write your CUDA C code, you are more than welcome to refer

to the NVIDIA CUDA C Programming Guide, version 5. NVIDIA just recently released version 6; either version will work just fine for the purposes of this course.

Now I'd like to say a few words about the history of this course and how it came about. In 2006, David Kirk, then the chief scientist of NVIDIA, gave a guest lecture in my University of Illinois class. At dinner, David asked my department head, Richard Blahut, and myself about the fact that the whole industry was in the process of introducing parallel computing devices in 2005 and 2006. The question was whether academia was ready to educate the enormous number of students who would need to know parallel programming in the years to come. Richard, being a good department head, turned the table around and asked David: hey, David, if people like Wen-mei would like to teach such a high-quality parallel programming class, would you be willing to come and help us plan and teach such a course? And David said, of course I would. So in July of 2006, David finished touring the top universities, and after some evaluation at NVIDIA, NVIDIA announced that Illinois would be the primary partner for developing such a high-quality parallel programming class. David and I have collaborated on that course and many, many other projects ever since. In November 2006, NVIDIA announced the G80, and they also had the first version of CUDA available in their labs. So I took four of my grad students to a training workshop at NVIDIA, and we spent two days working on training projects, what we called the kitchen projects, at NVIDIA. And all four

students managed to finish their training projects ahead of me. So I would highly encourage everyone who wants to learn parallel programming to learn it while you are young. In December 2006, David and I were in panic mode, producing all the lectures for ECE498AL, which is the course number that my department assigned to our experimental course. This lasted through the entire Christmas season, and our families were not exactly happy with us during that particular Christmas season. In January 2007 we started the course, and about 40 students registered. But when we started the course, CUDA had not been officially released and the hardware was not yet officially on the market, so we actually had to have all the students sign an NDA so that they could get access to the software and the hardware. In February 2007, NVIDIA officially released the first version of CUDA, and the lab and lecture materials of the course began to go online. Ever since then, various versions of the lab and lecture contents have been used by hundreds of thousands of students and faculty worldwide, and you will be learning a version of the material based on those original contents. In March 2007, several application teams, including the NAMD team at the University of Illinois, posted ECE498AL class projects. Even to this day, the class projects have been a major component of the course that we offer at the University of Illinois. In June 2008, we partnered with the Virtual School of Computational Science and Engineering, the Barcelona Supercomputing Center, the Chinese Academy of Sciences, and the Pan American Institute of Advanced Studies, and we started to offer week-long summer schools based on

the contents of ECE498AL. In August 2011, ECE498AL became an official University of Illinois course. It is listed as ECE408 for Electrical and Computer Engineering students and CS483 for Computer Science students. Every semester we offer the course, we typically also have about ten students from other science, engineering, business, and finance areas in the class. In November 2012, the University of Illinois decided to offer a version of ECE408 through Coursera. That course had about half the contents of ECE408, which we managed to transform from on-campus, in-person delivery to a MOOC platform. We had 25,000 students registered; about 10,000 of them actually finished all the labs and quizzes, and nearly 3,000 students received a certificate of achievement or, in many cases even better, a certificate of distinction. This turned out to be one of the highest completion rates in the MOOC community, and we are very proud of the students who put so much good work into this course in that first offering. Over the past year, we took the time to work additional material into the course, so for this second offering of the HPP course, we will have between 60% and 70% of the ECE408 material. Also, drawing on our experience in the first offering and on student feedback, we made several major enhancements in the way the course is organized and delivered on the Coursera platform. We hope this will give you an even better, more supported, and more friendly environment in which to learn this material.

So here comes our tentative schedule. During week one, we will have an introduction to heterogeneous computing, and we will

give an overview of CUDA C, and we will introduce the basic concepts of kernel-based parallel programming. In the lab, we'll give you a tour through our online cloud-based lab environment, and you will be doing a programming assignment of vector addition in CUDA C. During week two, we're going to cover the memory model for locality, and we'll be teaching tiling techniques for conserving memory bandwidth. We'll also be teaching how you can handle boundary conditions in parallel tiled algorithms, and finally we'll cover performance considerations. In the lab, you will be doing a programming assignment of simple matrix-matrix multiplication in CUDA C. During week three, we are going to cover the parallel convolution pattern, arguably one of the most important parallel algorithm patterns for many, many applications. In the lab, you'll be doing a programming assignment of tiled matrix-matrix multiplication in CUDA C. During week four, we will be teaching the parallel scan pattern, a very important parallel algorithm pattern for converting many sequential algorithms into parallel algorithms. In the lab, you will be doing a programming assignment of parallel convolution in CUDA C. During week five, we'll be covering the parallel histogram pattern and the concepts of critical sections and atomic operations. In the lab, you will be doing a programming assignment of parallel scan in CUDA C. During week six, we'll be teaching data transfer considerations and streams, to overlap communication with computation, and we will also be teaching the basic concepts of task parallelism. In the lab, you will be doing a programming assignment of parallel histogram in CUDA C. Week seven is the beginning of broadening your knowledge base for parallel programming, so we will be introducing you to OpenCL, to C++ AMP, and to OpenACC.
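To give you a feel for the kernel-based parallel programming that week one builds toward, here is a minimal sketch of a CUDA C vector-addition program. This is not the official lab solution, just an illustrative example: each thread computes one output element, and the boundary check inside the kernel previews the boundary-condition handling covered in week two.

```cuda
// vecadd.cu -- an illustrative sketch, not the course's lab solution.
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// Kernel: each thread computes one element of C = A + B.
__global__ void vecAdd(const float *A, const float *B, float *C, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)                // boundary check: the grid may cover more than n threads
        C[i] = A[i] + B[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t size = n * sizeof(float);

    // Allocate and initialize host vectors.
    float *h_A = (float *)malloc(size);
    float *h_B = (float *)malloc(size);
    float *h_C = (float *)malloc(size);
    for (int i = 0; i < n; ++i) { h_A[i] = (float)i; h_B[i] = 2.0f * i; }

    // Allocate device vectors and copy the inputs over.
    float *d_A, *d_B, *d_C;
    cudaMalloc(&d_A, size);
    cudaMalloc(&d_B, size);
    cudaMalloc(&d_C, size);
    cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice);
    cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements (ceiling division).
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vecAdd<<<blocks, threads>>>(d_A, d_B, d_C, n);

    // Copy the result back and spot-check one element.
    cudaMemcpy(h_C, d_C, size, cudaMemcpyDeviceToHost);
    printf("C[1] = %f\n", h_C[1]);

    cudaFree(d_A); cudaFree(d_B); cudaFree(d_C);
    free(h_A); free(h_B); free(h_C);
    return 0;
}
```

Compiled with nvcc, this follows the standard pattern you will use throughout the course: allocate on the device, copy in, launch the kernel over a grid of thread blocks, and copy the result out.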

These programming interfaces are close relatives of CUDA C, but they will also allow your code to be portable to other vendors' hardware. In the lab, you will be doing a programming assignment of vector addition using streams in CUDA C. In week eight, we will do a course summary, and we will also cover some related programming models: Thrust, which is a productive parallel library for CUDA programmers; Bolt, which is a productive parallel primitive library for OpenCL programmers; and CUDA Fortran, for those of you who would like to apply your knowledge to scientific computing. In the lab, we'll have a programming assignment of simple matrix-matrix multiplication, with your choice of OpenCL, C++ AMP, or OpenACC. During week nine, we will not have any more lectures; it is a week where you will be completing any remaining lab assignments, with an optional bonus programming assignment in your choice of OpenCL, C++ AMP, or OpenACC.

At this point, I'd like to acknowledge the generous support from Personify, whose video technology has enabled the recording of these lectures. I'd like to very much thank them for their generous donation of time and effort. So, again, I'd like to welcome you aboard, and I look forward to working with all of you in the next nine weeks. Thank you.