
Technical Elective 2

Artificial Intelligence

Midterm Exam

Name:__________________________________________ Date:__________________
Year and Section:________________________________ Score:__________________

Instructions: Read each problem carefully and answer the questions appropriately. Submit your code together with
your answers on A4-sized bond paper with your NAME and YEAR & SECTION. The points for each item are indicated
beside it, and your work will be graded accordingly. Similar code and algorithms will result in the points for that
item being divided among the students involved (e.g., 3 students with the same code on a 20-point item each
receive 20/3 ≈ 6.7 points).

Feature Selection and Reduction

Principal Component Analysis (PCA)

The X variable, which contains the features, is an N x 9 dataset. To reduce the computation needed to build a
model, you can use Principal Component Analysis to reduce the dimensions of the original dataset with minimal loss
of information.

The outcome of principal component analysis (PCA) is a projection of the feature space (our dataset containing
d-dimensional samples) onto a smaller subspace that still represents our data well.
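
In symbols, the projection can be written as below. The notation is an assumption for illustration and does not appear in the original handout: x is one d-dimensional sample, μ is the mean of the samples, W is a d x k matrix whose columns are the chosen axes, and z is the reduced k-dimensional sample. For the dataset described above, d = 9.

```latex
\[
\mathbf{z} = W^{\top}(\mathbf{x} - \boldsymbol{\mu}),
\qquad W \in \mathbb{R}^{d \times k},
\quad W^{\top} W = I_k,
\quad k < d
\]
```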

This is performed to reduce the computational costs and, often but not always, the error in pattern estimation by
reducing the number of dimensions of the feature space. In PCA, the entire dataset is projected onto a different
subspace by finding the axes of maximum variance, along which the data is most spread out. PCA ignores class
labels and treats the whole dataset as one class.
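
The idea of finding the axes of maximum variance can be sketched with NumPy as follows. This is a minimal illustration, not a required solution: the function name `pca`, the random seed, and the synthetic N x 9 array are all assumptions made for the example, and the eigendecomposition of the covariance matrix is only one common way to compute the axes.

```python
import numpy as np

def pca(X, n_components):
    """Project X (N x d) onto its top n_components axes of maximum variance."""
    # Center the data so that variance is measured around the mean.
    X_centered = X - X.mean(axis=0)
    # d x d covariance matrix of the features (columns are variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the eigenvectors with the largest eigenvalues (largest variance).
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]                      # d x k
    explained_ratio = eigvals[order] / eigvals.sum()    # share of total variance
    return X_centered @ components, components, explained_ratio

# Synthetic stand-in for an N x 9 feature matrix (hypothetical, not real data).
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
X = rng.normal(size=(200, 9)) * scales

Z, W, ratio = pca(X, n_components=2)
print(Z.shape)   # shape of the reduced dataset
print(ratio)     # fraction of variance kept by each retained axis
```

The explained-variance ratio gives a rough measure of how much information is retained, which can guide the choice of how many components to keep.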

Problem 1:

Apply PCA on the Movie dataset (use all the features available), then use the reduced features to predict whether
a movie will have low sales, average sales, or high sales.

1. Load dataset and separate the features from labels (10pts)

2. Pre-process the dataset (normalization, standardization of data etc.) (10pts)

3. Implement PCA on the data. (10pts)

4. Choose a supervised learning algorithm to create a model. (10pts)

5. Evaluate the performance of the model. (10pts)
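
The five steps above can be sketched end to end as follows. Everything specific here is an assumption for illustration: the data is a synthetic stand-in (the Movie dataset's columns are not given in this handout), the labels 0, 1, 2 stand for low, average, and high sales, the 80/20 split and two components are arbitrary choices, and a nearest-centroid classifier is used only as one simple example of a supervised algorithm.

```python
import numpy as np

rng = np.random.default_rng(42)

# Step 1: features and labels. Synthetic stand-in, NOT the real Movie dataset.
n_per_class, n_features = 100, 9
centers = rng.normal(scale=3.0, size=(3, n_features))
X = np.vstack([c + rng.normal(size=(n_per_class, n_features)) for c in centers])
y = np.repeat([0, 1, 2], n_per_class)   # 0 = low, 1 = average, 2 = high

# Hold out 20% of the samples for evaluation.
idx = rng.permutation(len(y))
split = int(0.8 * len(y))
train, test = idx[:split], idx[split:]

# Step 2: standardize using statistics from the training set only.
mu = X[train].mean(axis=0)
sigma = X[train].std(axis=0)
X_train = (X[train] - mu) / sigma
X_test = (X[test] - mu) / sigma

# Step 3: PCA fitted on the training set, keeping 2 components.
eigvals, eigvecs = np.linalg.eigh(np.cov(X_train, rowvar=False))
W = eigvecs[:, np.argsort(eigvals)[::-1][:2]]
Z_train, Z_test = X_train @ W, X_test @ W

# Step 4: nearest-centroid classifier in the reduced space.
centroids = np.array([Z_train[y[train] == k].mean(axis=0) for k in range(3)])
dists = np.linalg.norm(Z_test[:, None, :] - centroids[None, :, :], axis=2)
y_pred = dists.argmin(axis=1)

# Step 5: accuracy on the held-out samples.
accuracy = (y_pred == y[test]).mean()
print(f"test accuracy: {accuracy:.2f}")
```

Fitting the scaler and the PCA axes on the training split only avoids leaking information from the test samples into the model.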

Problem 2:

Given the Dengue Cases in the Philippines dataset, answer the following questions. You may use any
machine learning algorithm that you deem suitable for this type of data. The dataset contains the recorded
number of dengue cases per 100,000 population per region of the Philippines from 2008 to 2016.

1. What is the trend of dengue cases in the Philippines?

2. What region/s recorded the highest prevalence of dengue cases?

3. In what specific years do we observe the highest dengue cases?

4. When and where will a possible dengue outbreak occur?
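
As a starting point for the trend question, one possible approach is to fit a straight line to the yearly rates and inspect the sign of the slope. The sketch below assumes this approach and uses synthetic placeholder values generated in the code; they are not figures from the Dengue Cases dataset, and the variable names are hypothetical.

```python
import numpy as np

years = np.arange(2008, 2017)            # 2008 to 2016 inclusive
t = years - years[0]                     # years since 2008, keeps the fit well conditioned

# Synthetic placeholder rates per 100,000 population (NOT real dengue data).
rng = np.random.default_rng(1)
rates = 10.0 + 2.0 * t + rng.normal(scale=1.0, size=t.size)

# Least-squares line: rates ≈ slope * t + intercept.
slope, intercept = np.polyfit(t, rates, deg=1)
fitted = slope * t + intercept

trend = "increasing" if slope > 0 else "decreasing or flat"
print(f"estimated change per year: {slope:.2f} ({trend})")
print(f"year with highest rate: {years[rates.argmax()]}")
```

The same fit can be repeated per region to compare regional trends, and the year of the highest rate can be read off with `argmax` as shown.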
