Cluster Analysis

© All Rights Reserved

The SAS Enterprise Miner Filter tool enables you to remove unwanted records from an analysis. Use

these steps to build a diagram that will read a data source and filter records. Create a new diagram called

Segmentation Analysis.

1. Drag the CENSUS2000 data source to the Segmentation Analysis workspace window.

If you explore the data in CENSUS2000, you will notice that there are a number of records that

have a value of 0 for Median Household Income and Average Household Size. The Explore

window below shows the histogram windows for each variable, with the data that has a Median

Household Income of 0 highlighted. Clicking any of the bars in one of the histograms highlights

the same set of records in all the other histograms. The zero average household size seems to be

evenly distributed across the longitude (LOCX), latitude (LOCY), and density percentile

(RegDens) variables. It also seems concentrated on low incomes and populations.

A portion of the CENSUS2000 window is shown below. Records 28 and 33 (among others) have

a value of 0 for Average Household Size. These records also have unusual values in the

remaining non-geographic fields. For example, the Median Household Income is listed as $0,

the Region Density Percentile is missing, and the Region Population is 0.

If you sort the records in the CENSUS2000 window by ascending order of the Average

Household Size, you can examine these records together. (You click the Average Household

Size column header once to sort by descending order, and click it again to sort by ascending

order.) With these records grouped together, it is easy to see that most of the cases with

an Average Household Size of 0 have a value of 0 or missing on the remaining non-geographic

attributes. There are some exceptions, but you might decide that cases such as these are not of

interest for analyzing household demographics.

3. Drag the Filter tool (fourth from the left) from the tools palette into the Segmentation Analysis

workspace window.

You can use filters to exclude certain observations, such as extreme outliers and errant data

that you do not want to include in your mining analysis. Filtering extreme values from the

training data tends to produce better models because the parameter estimates are more stable.

You might also want to filter your data to focus on a particular subset of the original data

source.

4. Connect the CENSUS2000 data source to the Filter node.

You have just created a process flow. The process flow, at this point, reads the raw CENSUS2000

data and filters unwanted observations. However, you must specify which observations are unwanted.

To do this, you must change the settings of the Filter node.

The Properties panel displays the analysis methods used by the node when run. By default, the node

will filter cases in rare levels in any class input variable and cases exceeding three standard deviations

from the mean on any interval input variable.

You can control the number of standard deviations by using the Advanced Properties sheet,

which is discussed later.

Because the CENSUS2000 data source only contains interval inputs, only the Interval Variables

criterion is considered.
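
As a rough illustration of the default interval rule, the following Python sketch drops cases that lie more than three standard deviations from the mean. It is not the Filter node's implementation: the function name `filter_by_std`, the sample values, and the use of the population standard deviation are all assumptions made for the example.

```python
from statistics import mean, pstdev

def filter_by_std(values, n_std=3.0):
    """Keep only values within n_std standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    lower, upper = mu - n_std * sigma, mu + n_std * sigma
    return [v for v in values if lower <= v <= upper]

# Hypothetical interval input: 20 typical cases and one extreme case.
sizes = [2.5] * 10 + [2.7] * 10 + [40.0]
kept = filter_by_std(sizes)
print(len(sizes), len(kept))
```

With these made-up values, only the single extreme case falls outside the three-standard-deviation band.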

6. Change the Default Filtering Method property (under the Interval Variables grouping) to

User-Specified Limits.

7. Select the Interval Variables ellipsis (...). SAS Enterprise Miner then informs you that it is updating

the path. When the update is complete, the Interactive Interval Filter window opens.

You are warned at the top of the window that the train or raw data set does not exist. This indicates

that you are restricted from the interactive filtering elements of the node (which are available after a

node has been run). Nevertheless, you can enter filtering information.

8. Type 0.1 in the Filter Lower Limit field for the input variable MeanHHSz.

9. Select OK to close the Interactive Interval Filter window. You are returned to the SAS Enterprise

Miner interface window.

When the diagram is run, all cases with an average household size less than 0.1 are filtered from

subsequent analysis steps.
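
The effect of a user-specified lower limit can be pictured with a small Python sketch. The records, the `ID` field, and the helper `apply_lower_limit` are hypothetical; only the variable name MeanHHSz and the limit of 0.1 come from the steps above.

```python
# Hypothetical records standing in for a few CENSUS2000 rows.
records = [
    {"ID": "A", "MeanHHSz": 2.6, "MedHHInc": 41000},
    {"ID": "B", "MeanHHSz": 0.0, "MedHHInc": 0},
    {"ID": "C", "MeanHHSz": 3.1, "MedHHInc": 58000},
    {"ID": "D", "MeanHHSz": 0.0, "MedHHInc": 0},
]

def apply_lower_limit(rows, variable, lower_limit):
    """Split rows into those at or above the limit and those below it."""
    kept = [row for row in rows if row[variable] >= lower_limit]
    excluded = [row for row in rows if row[variable] < lower_limit]
    return kept, excluded

kept, excluded = apply_lower_limit(records, "MeanHHSz", 0.1)
print([row["ID"] for row in kept], len(excluded))
```

Only the rows with a nonzero average household size are passed on; the excluded rows are counted rather than silently dropped, which mirrors the exclusion count reported by the node.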

10. Right-click the Filter node and click Run from the shortcut menu.

11. Select Results in the window. The Filter node's Results window opens.

The Output window indicates that 1081 observations were excluded from the TRAIN

(CENSUS2000) data.

12. Close the Results window. The CENSUS2000 data is ready for pattern discovery analyses.

Before you run a cluster analysis, you must select and evaluate the inputs that you want to use. In general,

you should seek inputs that have the following attributes:

- relatively independent

- limited in number

Using the Variables window, you can explore the inputs you have selected for the analysis. Histograms for

each input variable enable you to see problems with your data that might need to be fixed, such as skewed

data or data that needs standardization, before you run the cluster analysis.

Not all the inputs in the CENSUS2000 data set will be used for the segmentation analysis. The variables

LocX, LocY, and (arguably) RegPop describe characteristics of the geographic regions and not the

people who live there. Because this analysis is an analysis of demographic and not geographic

characteristics, you should reject these variables. Using the Input Data node, change the role for the three

variables to rejected.

Only the variables with a role of Input will be used in subsequent steps of the segmentation

analysis. Explore these variables using the Input Data node. The default setting of Sample

Method is Top, but you can specify a different method in the Sample Properties window. You

can also use the Preferences window to change the default setting of Sample

Method to Random so that you do not need to change the method in the Sample Properties

window each time that you use the Explore window.

By default, the Explore window selects a sample of 10,000 observations. You can change

the Fetch Size to Max to increase the sample size to 60,000 observations. If your data source

contains fewer than 60,000 observations and the Fetch Size is set to Max, the Explore window

uses the full set of observations for exploration.

The histograms reveal two issues that must be resolved before attempting a meaningful segmentation

analysis.

The distribution of MedHHInc is highly skewed. Unless this problem is resolved, cases in the tail of

the distribution will be isolated into orphan segments. It is common practice to transform highly

skewed inputs to regularize the shape of their distribution.

The input ranges for the inputs MeanHHSz, MedHHInc, and RegDens differ by several orders

of magnitude. Unless this problem is resolved, the input with the largest range will dominate in the

k-means algorithm used by the Cluster tool. In a later demonstration, you use an option in the Cluster

tool to standardize the range of the inputs.
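
To see why the input with the largest range dominates, consider the following Python sketch. The three cases are invented for illustration, with values ordered as (MeanHHSz, MedHHInc, RegDens); nothing here is taken from the actual CENSUS2000 data.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two cases."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical cases: (MeanHHSz, MedHHInc, RegDens)
case_a = (2.0, 40000.0, 10.0)
case_b = (5.0, 40500.0, 90.0)  # very different size and density, similar income
case_c = (2.0, 45000.0, 10.0)  # same size and density, higher income

d_ab = euclidean(case_a, case_b)
d_ac = euclidean(case_a, case_c)
print(d_ab < d_ac)
```

On the raw scales, case_b looks closer to case_a than case_c does, even though it differs sharply on two of the three inputs. The income difference alone decides the comparison.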

The k-means clustering algorithm is sensitive to distributions with outlying cases. As was noted above, a

handful of cases have large values for MedHHInc. To avoid creating several segments with a small

number of cases, you should consider transforming this input to have a less extreme distribution. The

Transform Variables tool enables you to perform such data regularizations.

1. Close the Explore window.

2. Select the Modify tab.

3. Drag a Transform Variables tool into the diagram workspace.

4. Connect the Filter node to the Transform Variables node.

5. Select Formulas from the Properties panel for the Transform Variables node.

6. Select OK to update the path. The Formulas window opens.

The Formulas window lets you interactively create customized transformations of analysis variables.

The top half of the window displays plots of new and existing variables. The lower half displays the

names of new variables, existing variables, or other information, depending on the selected tab.

Operations are controlled by five icons at the lower left of the window.

Use the Formulas window to create an appropriate transformation of the MedHHInc variable.

7. Select the Create icon. The Add Transformation dialog box opens.

The top half of the Add Transformation dialog box shows metadata information about the new

variable; the bottom half shows the formula for the transformation.

8. Type LogMedHHInc for the Name property.

You can either type the formula directly in the lower half of the Add Transformation dialog box or use

the Expression Builder.

9. Select Build. The Expression Builder opens.

The Expression Builder lets you interactively pick transformations and select from existing variables.

Used correctly, this can limit mistakes when entering expressions.

10. Select the Mathematical category folder. A list of mathematical operators is shown in the Functions

pane.

11. Select LOG(argument) and select the Insert button. The LOG function is placed in the Expression

Text area with the required numeric argument highlighted.

12. Select the Variables List tab. The lower half of the Expression Builder now shows a list of variables

in the analysis data.

14. Select OK to close the Expression Builder. The defined expression appears in the Add Transformation

dialog box.

15. Select OK to close the Add Transformation dialog box. The newly created LogMedHHInc variable

is listed in the bottom half of the Formulas window.

Recall that the original MedHHInc variable was skewed right. What does the distribution of the new

LogMedHHInc input look like?

The distribution of LogMedHHInc is shown in the lower half of the Formulas window. (The number

of histogram bars has been increased to 30 using the Graph Properties window.)

An interesting problem has occurred. The new variable has outlying values and is now slightly left

skewed. Because small outlying values are just as harmful to segmentations as large outlying values,

you might question the value of the log transformation. A simple adjustment to the transformation

will correct this problem.

17. Select the Edit Expression icon. The Expression Builder opens.

18. Edit the Expression Text area to have the following formula:

LOG(MAX(MedHHInc,10000))

This expression truncates the lower tail of the distribution: the newly created variable equals the logarithm of the larger

of MedHHInc or 10,000.
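
The same transformation can be reproduced outside the Expression Builder. The Python sketch below assumes the natural logarithm (as the SAS LOG function computes); the function name `log_med_hh_inc` and the income values are hypothetical.

```python
import math

def log_med_hh_inc(med_hh_inc, floor=10000):
    """Natural log of the larger of the income and the floor value."""
    return math.log(max(med_hh_inc, floor))

# Hypothetical median household incomes, including very small values.
incomes = [2500, 9000, 10000, 35000, 200000]
transformed = [log_med_hh_inc(x) for x in incomes]
print([round(t, 3) for t in transformed])
```

Every income at or below the floor maps to the same value, log(10000), so the small outlying values that caused the left skew collapse onto a single point while the ordering of larger incomes is preserved.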

19. Select OK to close the Expression Builder. You return to the Formulas window.

Note that the plot has not been updated (except for mysteriously returning to the default number of

bins).

20. Select Refresh Plot. Increase the number of histogram bars to 30 using Graph Properties.

The distribution for this variable is now nicely compact and nearly symmetric.

21. Select OK to close the Formulas window.

The variable LogMedHHInc is now part of the analysis data, ready for use in a segmentation analysis.

The Cluster tool performs k-means cluster analyses, a widely used method for cluster and segmentation

analysis. This demonstration shows you how to use the tool to segment the cases in the CENSUS2000

data set.

1. Select the Explore tab.

2. Locate and drag a Cluster tool into the diagram workspace.

3. Connect the Transform Variables node to the Cluster node.

To create meaningful segments, you will need to set the Cluster node to do the following:

- Ignore the MedHHInc input. (It has been replaced by the newly created LogMedHHInc input.)

- Standardize the inputs to have a similar range.

A node's Variables property determines which variables are used in an analysis.

1. Select the Variables property for the Cluster node. The Variables window opens. Click Update

Path to enable the LogMedHHInc variable to be displayed. MedHHInc is automatically suppressed.

The Cluster node will create segments using the inputs LogMedHHInc, MeanHHSz, and RegDens.

Segments are created based on the (Euclidean) distance between each case in the space of selected inputs.

If you want to use all the inputs to create clusters, these inputs should have similar measurement scales.

Calculating distances using standardized distance measurements (subtracting the mean and dividing by

the standard deviation of the input values) is one way to ensure this. You can standardize the input

measurements using the Transform Variables node. However, it is easier to use the built-in property in the

Cluster node.

Where is the built-in standardization property? It turns out that only a handful of node properties are

shown by default in SAS Enterprise Miner. To see the full extent of options for an analysis node, you

must view the node's advanced property sheet.

1. Select View ⇒ Property Sheet ⇒ Advanced. The full range of node options is now available for

changing.

2. Select Internal Standardization ⇒ Standardization. Distances between points are calculated based

on standardized measurements.

Another way to standardize an input is by subtracting the input's minimum value and dividing by

the input's range. This is called range standardization. Range standardization rescales the

distribution of each input to the unit interval, [0,1].
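
Both forms of standardization are simple to express in code. The Python sketch below is illustrative only: the function names and the sample values are hypothetical, and the use of the sample standard deviation is an assumption rather than a statement about what the Cluster node computes.

```python
from statistics import mean, stdev

def z_standardize(values):
    """Subtract the mean and divide by the standard deviation."""
    mu, sigma = mean(values), stdev(values)  # assumption: sample std. dev.
    return [(v - mu) / sigma for v in values]

def range_standardize(values):
    """Subtract the minimum and divide by the range, giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical values for one input.
reg_dens = [10.0, 20.0, 30.0, 40.0, 50.0]
z_scores = z_standardize(reg_dens)
unit_scaled = range_standardize(reg_dens)
print(unit_scaled)
```

After either rescaling, inputs measured in dollars, persons, and percentiles contribute to the Euclidean distance on comparable scales.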

After you have selected your inputs and manipulated the data to prepare it for the analysis, you can decide

whether you want SAS Enterprise Miner to determine the number of clusters to create or whether you

want to specify a number of clusters for the analysis. It is often useful to have SAS Enterprise Miner

determine the number of clusters automatically the first time you run the analysis. By default, the Cluster

tool attempts to automatically determine the number of clusters in the data. A three-step process is used.

Step 1 A large number of cluster seeds are chosen (50 by default) and placed in the input space. The

Euclidean distance from each case in the training data to each cluster seed (center) is

calculated. Cases are assigned to the closest cluster center. Because the distance metric is

Euclidean, it is important for the inputs to have compatible measurement scales. Unexpected

results can occur if one input's measurement scale differs greatly from the others. In this

manner, cases in the training data are assigned to the closest seed, and an initial clustering of

the data is completed. The means of the input variables in each of these preliminary clusters

are substituted for the original training data cases representing the seeds in the second step of

the process. Cases are reassigned to the closest cluster center. Cluster centers are updated and

cases are reassigned until the process converges. On convergence, final cluster assignments

are made. Each case is assigned to a unique segment. The segment definitions can be stored

and applied to new cases outside of the training data.

Step 2 A hierarchical clustering algorithm (Ward's method) is used to sequentially consolidate the

clusters that were formed in the first step. At each step of the consolidation, a statistic called

the cubic clustering criterion (CCC) is calculated. The first consolidation in which the CCC

exceeds 3 provides the third step with the number of clusters to use. If no consolidation yields

a CCC in excess of 3, the maximum number of clusters is selected.

More details on the CCC can be found in the following 59-page technical report:

https://support.sas.com/documentation/onlinedoc/v82/techreport_a108.pdf

Step 3 The number of clusters determined by the second step provides the value for k in a k-means

clustering of the original training data cases.
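
The assign-and-update loop described in Step 1 and reused in Step 3 can be sketched in a few lines of Python. This is a generic k-means illustration, not the procedure that SAS Enterprise Miner runs: it omits the Ward consolidation and the CCC calculation of Step 2, and the function names, points, and seeds are all hypothetical.

```python
import math

def nearest(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def k_means(points, seeds, max_iter=100):
    """Alternate between assigning cases and updating centers until stable."""
    centers = [tuple(s) for s in seeds]
    for _ in range(max_iter):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                # New center is the mean of each input over the cluster members.
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)  # leave an empty cluster's center unchanged
        if new_centers == centers:
            break  # converged: no center moved
        centers = new_centers
    return centers, [nearest(p, centers) for p in points]

# Hypothetical standardized cases and two seeds.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, labels = k_means(points, seeds=[(1.0, 1.0), (8.0, 8.0)])
print(labels)
```

Because the returned centers fully define the segments, new cases can be assigned with `nearest` alone, which is how stored segment definitions are applied outside the training data.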

Choosing meaningful inputs is clearly important for interpretation and explanation of the generated

clusters. Independence and limited input count make the resulting clusters more stable. An interval

measurement level is recommended for k-means to produce nontrivial clusters. Low skewness and

kurtosis on the inputs avoid creating single-case outlier clusters.

Enterprise Miner has three methods for calculating cluster distances:

Average: the distance between two clusters is the average distance between pairs of observations,

one in each cluster. This method:

o Tends to join clusters with small variances.

Centroid: the distance between two clusters is the Euclidean distance between their centroids or

means. This method is more robust to outliers than most of the other hierarchical methods, but

does not generally perform as well as Ward's method or the Average method.

Ward:

This method does not use cluster distances to combine clusters. Instead, it joins the clusters such

that the variation inside each cluster will not increase drastically. This method:

o Tends to join clusters with few observations

o Minimizes the variance within each cluster. Therefore, it tends to produce homogeneous

clusters and a symmetric hierarchy.

o Is biased toward finding clusters of equal size (similar to k-means) and approximately

spherical shape. It can be considered as the hierarchical analogue of k-means.

o Is poor at recovering elongated clusters.
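
The three criteria above can be compared on a toy example. In the Python sketch below, Ward's criterion is written using the standard identity for the increase in within-cluster sum of squares when two clusters are merged; the function names and the two small clusters are hypothetical, and the code is not taken from SAS Enterprise Miner.

```python
import math
from itertools import product

def centroid(cluster):
    """Mean of each input over the cases in a cluster."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))

def average_distance(a, b):
    """Average: mean distance over all pairs, one case from each cluster."""
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid_distance(a, b):
    """Centroid: Euclidean distance between the two cluster means."""
    return math.dist(centroid(a), centroid(b))

def ward_increase(a, b):
    """Ward: increase in within-cluster sum of squares if a and b are merged."""
    n_a, n_b = len(a), len(b)
    return n_a * n_b / (n_a + n_b) * math.dist(centroid(a), centroid(b)) ** 2

# Two hypothetical clusters of two cases each.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(4.0, 0.0), (4.0, 2.0)]
print(average_distance(cluster_a, cluster_b),
      centroid_distance(cluster_a, cluster_b),
      ward_increase(cluster_a, cluster_b))
```

In a hierarchical consolidation, the pair of clusters with the smallest value of the chosen criterion is merged at each step.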

The following methods are available for choosing the initial cluster seeds:

First: Selects the first k complete cases as the initial seeds. Also developed by MacQueen (see

below).

MacQueen: Chooses the centers randomly from the data points. The rationale behind this method

is that random selection is likely to pick points from dense regions, i.e., points that are good

candidates to be centers. However, there is no mechanism to avoid choosing outliers or points that

are too close to each other. This is the default method.

Full Replacement: Select initial seeds that are very well separated using a full replacement

algorithm. Use this if you see clusters that are too clumped together.

Partial Replacement: Select initial seeds that are well separated using a partial replacement

algorithm.
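
The two simplest seed choices, First and random selection as described for MacQueen above, can be sketched in Python. The replacement algorithms are not shown. The function names, the `rng_seed` parameter, and the sample cases are hypothetical, and a case is treated as complete when none of its values is `None`.

```python
import random

def complete_cases(cases):
    """Cases with no missing (None) values."""
    return [c for c in cases if all(v is not None for v in c)]

def first_seeds(cases, k):
    """First: take the first k complete cases as the initial seeds."""
    return complete_cases(cases)[:k]

def random_seeds(cases, k, rng_seed=0):
    """Random selection: draw k distinct complete cases at random."""
    return random.Random(rng_seed).sample(complete_cases(cases), k)

# Hypothetical standardized cases; the second one is incomplete.
cases = [(0.1, 0.2), (None, 0.5), (0.9, 0.8), (0.4, 0.4), (0.7, 0.1)]
print(first_seeds(cases, 2))
print(random_seeds(cases, 2))
```

Neither choice checks how far apart the seeds are, which is the gap that the replacement methods are meant to address.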

1. Run the Cluster node and select Results. The Results - Cluster window opens.

The Results - Cluster window contains four embedded windows. The Segment Plot window attempts

to show the distribution of each input variable by cluster. The Mean Statistics window lists various

descriptive statistics by cluster. The Segment Size window shows a pie chart describing the size of

each cluster formed. The Output window shows the output of various SAS procedures run by the

Cluster node.

Apparently, the Cluster node has found three clusters in the CENSUS2000 data. Because the number of

clusters is based on the cubic clustering criterion (CCC), it could be interesting to examine the values of this statistic

for various cluster counts.

2. Select View ⇒ Summary Statistics ⇒ CCC Plot. The CCC Plot window opens.

The CCC statistic is zero for Number of Clusters equal to one. The statistic decreases for Number of

Clusters equal to two and then rapidly increases. It reaches a maximum at Number of Clusters equal

to 14 and then slowly decreases. The number of clusters selected is the first instance that the CCC

goes from increasing to decreasing. In the above graph, the CCC decreases from 1 cluster to 2,

increases from 2 clusters to 4, then decreases from 4 clusters to 5. So, 4 clusters is chosen as the

optimal number.
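
The selection rule just described, taking the first point at which the CCC turns from increasing to decreasing, can be written as a short Python function. The CCC values below are invented to follow the pattern in the text and are not read from the actual plot; the function name `first_ccc_peak` is hypothetical.

```python
def first_ccc_peak(ccc_by_k):
    """Return the first number of clusters where the CCC rises and then falls."""
    ks = sorted(ccc_by_k)
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        rising = ccc_by_k[k] > ccc_by_k[prev_k]
        falling = ccc_by_k[next_k] < ccc_by_k[k]
        if rising and falling:
            return k
    return None  # no local peak found

# Hypothetical CCC values keyed by number of clusters.
ccc = {1: 0.0, 2: -3.0, 3: 5.0, 4: 12.0, 5: 9.0, 6: 15.0}
print(first_ccc_peak(ccc))
```

Note that the rule stops at the first local peak, so a higher CCC value at a larger number of clusters does not change the result.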

In theory, the number of clusters in a data set is revealed by the peak of the CCC versus Number of

Clusters plot. However, when no distinct concentrations of data exist, the utility of the CCC statistic

is somewhat suspect. SAS Enterprise Miner attempts to establish reasonable defaults for its analysis

tools. The appropriateness of these defaults, however, strongly depends on the analysis objective and

the nature of the data.

You might want to increase the number of clusters created by the Cluster node. You can do this by

changing the CCC cutoff property or by specifying the desired number of clusters.

1. On the Properties panel for the Cluster node, select Specification Method ⇒ User Specify.

The User Specify setting creates a number of segments indicated by the Maximum Number of

Clusters property listed above it (in this case, 10).

2. Run the Cluster node and select Results. The Results - Cluster window opens, this time

showing a total of 10 generated segments.

Segment frequency counts vary from 10 cases to more than 8,000 cases.

Exploring Segments

While the Results window shows a variety of data summarizing the analysis, it is difficult to understand

the composition of the generated clusters. If the number of cluster inputs is small, the Graph wizard can

aid in interpreting the cluster analysis.

1. Close the Results - Cluster window.

2. Select Exported Data from the Properties panel for the Cluster node. The Exported Data - Cluster

window opens.

This window shows the data sets that are generated and exported by the Cluster node.

3. Select the Train data set and select Explore. The Explore window opens.

You can use the Graph Wizard to generate a three-dimensional plot of the CENSUS2000 data.

4. Select Actions ⇒ Plot. The Select a Chart Type window opens.

5. Select the icon for a three-dimensional scatter plot.

6. Select Next >. The Graph Wizard proceeds to the next step, Select Chart Roles.

7. Select roles of X, Y, and Z for MeanHHSz, MedHHInc, and RegDens, respectively.

9. Select Finish.

The Explore window opens with a three-dimensional plot of the CENSUS2000 data.

Even though LogMedHHInc was used for the segmentation analysis, it is easier to interpret

the results using the original variable, MedHHInc. This works because the two variables are

monotonically related.

10. Rotate the plot by holding the CTRL key and dragging the mouse.

Each square in the plot represents a unique postal code. The squares are color-coded by cluster

segment.

1. Select Actions ⇒ Plot.

2. Select a Bar chart.

4. Select Role Category for the variable _SEGMENT_.

By itself, this plot is of limited use. However, when the plot is combined with the three-dimensional

plot, you can easily interpret the generated segments.

6. Select the tallest segment bar in the histogram, segment 8.

8. Rotate the three-dimensional plot to get a better look at the highlighted cases.

Cases in this largest segment correspond to households averaging between two and three members,

low population density, and median household incomes between $20,000 and $50,000.

9. For further interpretation, you can make a scatter plot of longitude and latitude to see where people in

this cluster reside.

Although these inputs are not used to create the clusters, there is an interesting correlation.

The geographic plot suggests that cases in segment 8 are located in the American heartland. The low

population density suggests rural rather than urban or suburban settings. You could accurately call this

segment Middle America.

By closing and tiling the windows, you can see many aspects of the cluster analysis

simultaneously.

Profiling Segments

You can gain a great deal of insight by creating plots as in the previous demonstration. Unfortunately, if

more than three variables are used to generate the segments, the interpretation of such plots becomes

difficult.

Fortunately, there is another useful tool in SAS Enterprise Miner for interpreting the composition of

clusters: the Segment Profile. This tool enables you to compare the distribution of a variable in an

individual segment to the distribution of the variable overall. As a bonus, the variables are sorted by how well

they characterize the segment.
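
The idea of comparing a segment with the overall data and ranking variables by how well they characterize it can be illustrated with a deliberately simplified Python sketch. It ranks variables by the standardized difference between the segment mean and the overall mean. This stand-in measure is an assumption chosen for illustration and is not the statistic that the Segment Profile tool reports; the function name and the rows are hypothetical.

```python
from statistics import mean, pstdev

def profile_segment(rows, segment, variables, segment_key="_SEGMENT_"):
    """Rank variables by |segment mean - overall mean| in overall std. units."""
    in_segment = [r for r in rows if r[segment_key] == segment]
    scores = {}
    for var in variables:
        overall = [r[var] for r in rows]
        inside = [r[var] for r in in_segment]
        scores[var] = (mean(inside) - mean(overall)) / pstdev(overall)
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical clustered rows.
rows = [
    {"_SEGMENT_": 8, "RegDens": 10, "MeanHHSz": 2.6},
    {"_SEGMENT_": 8, "RegDens": 20, "MeanHHSz": 2.7},
    {"_SEGMENT_": 1, "RegDens": 80, "MeanHHSz": 2.4},
    {"_SEGMENT_": 1, "RegDens": 90, "MeanHHSz": 2.5},
]
ranking = profile_segment(rows, 8, ["MeanHHSz", "RegDens"])
print(ranking)
```

A negative score means the segment sits below the overall mean on that variable, and a positive score means it sits above. Unlike a three-dimensional plot, this kind of ranking extends to any number of inputs.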

1. Drag a Segment Profile tool from the Assess tool palette into the diagram workspace.

2. Connect the Cluster node to the Segment Profile node.

3. Run the Segment Profile node and select Results. The Results - Segment Profile window opens.

Features of each segment become apparent. For example, segment 8, when compared to the overall

distributions, has a lower Region Density Percentile, a more central Median Household Income,

and a slightly higher Average Household Size.

The window shows the relative worth of each variable in characterizing each segment. For example,

segment 8 is largely characterized by the RegDens variable.

Again, similar analyses can be employed to describe the other segments. The advantage of the

Segment Profile window (compared to direct viewing of the segmentation) is that the descriptions can

be more than three-dimensional.
