Submission deadline: September 30th, 2011 (11:59pm Toronto, Canada time)

Participation details
1. Ensure you fully complete your profile in BigDataUniversity.com (or DB2University.com). Should you be selected for the trip, this information will be used for validation purposes. To update your profile, visit http://www.db2university.com/courses/user/view.php

2. Choose a dataset from the link provided below, or search the web for a dataset of your choice. Ensure you follow any licensing requirements for using the dataset.
   http://www.delicious.com/pskomoroch/redistributable+dataset

3. With the dataset selected, use IBM InfoSphere BigInsights software to run a Hadoop MapReduce job that discovers something interesting about the dataset. This challenge requires you to be innovative and creative! Here are some examples:

   B.C. federal prison seizures, 2008-2010 dataset
   http://buzzdata.com/mariusbutuc/b-c-federal-prison-seizures-2008-2010
   Using this dataset, you can find interesting facts about things that were seized from prisoners at British Columbia (Canada) prisons from 2008 to 2010. For example, the wordcount sample provided in the course could be used to count which items were seized the most.

   Distribution of Venture Capital in the United States in 2011 dataset
   http://buzzdata.com/azad2002/the-united-states-of-venture-capital-2011
   Using this dataset, you can determine the city and the industry in which the most venture capital was raised in the United States during the first six months of 2011.

   Validate the "Halloween Effect" phenomenon in the stock market
   The Halloween Effect is a stock market phenomenon in which returns are significantly higher during the November-April period (after Halloween) than during the May-October period. Using Hadoop and a stock market dataset, you can confirm or reject this phenomenon.

4. The dataset can be small, so that the job can be run with Hadoop in pseudo-distributed mode on a single node. If you need to work on a larger dataset, you may want to use a Hadoop cluster on the Cloud, but first develop your program on a single node in pseudo-distributed mode and ensure it works on a subset of the dataset. The selection process will be based mainly on creativity and interesting results rather than on the size of the dataset.

5. You must use IBM InfoSphere BigInsights software. You can use the VMware image provided in the course, install BigInsights Basic locally, or run it on the Cloud. To analyze or graph the results, you may use any other software for which you have a license, such as SPSS or Excel spreadsheets, or you may write an application in any language that presents the results clearly.

6. By submitting your solution to this challenge, you agree to have your entire solution (including your code) added to examples for courses in BigDataUniversity.com or used in any other promotional avenues.

If your submission is very interesting, you may be invited to present it in person at the IOD conference.
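The wordcount idea suggested for the prison seizures example can be tried out locally before writing the actual MapReduce job. The following is only a minimal sketch in plain Python that imitates the map, shuffle, and reduce steps in memory; it is not BigInsights or Hadoop code. The column name `item` and the sample rows are invented for illustration and do not come from the real dataset.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows for illustration only; the real dataset's
# column names and values will differ.
SAMPLE_CSV = """institution,item
A,tobacco
B,cell phone
A,tobacco
C,weapon
B,tobacco
C,cell phone
"""


def map_phase(lines, column="item"):
    """Emit (key, 1) for every record, like the wordcount mapper."""
    for row in csv.DictReader(lines):
        key = row[column].strip().lower()
        if key:
            yield key, 1


def reduce_phase(pairs):
    """Sum the counts per key; sorting stands in for Hadoop's shuffle."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, sum(count for _, count in group)


def count_items(text, column="item"):
    """Run both phases and return (item, count) pairs, most frequent first."""
    counts = reduce_phase(map_phase(io.StringIO(text), column))
    # Ties are broken alphabetically so the output order is stable.
    return sorted(counts, key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    for item, count in count_items(SAMPLE_CSV):
        print(f"{item}\t{count}")
```

Keeping the mapper and reducer as separate functions mirrors the structure of a MapReduce job, so the same logic can later be ported to the job you run on BigInsights once it gives sensible results on a small subset.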
On October 3rd, we will announce at BigDataUniversity.com who has been selected for this trip. If you are selected, you will be notified by email and telephone. As soon as you are confirmed, you will need to apply for any travel visa you require to enter the United States.
Good luck!