Discussion
Important points
•Keep to the plan
•Keep in regular touch with supervisors, even if only to report no progress
•It's really valuable to write up as I go along
•Send an update every 2 weeks
•Ask Vicky for 1 or 2 good papers as an example
Plan
•Phone conference in January
•Later meeting just before presentation
•Intermediate drafts are good - gradually filled out versions
•Only expect 8-10 good references
•Classic format
•Intro
•Literature review
•Statement of what I did/how I did it
•Results
•Analysis
•How well it went - key chapter
Approach
•At first, build skeleton with section headings and sub-section headings
•Write a brief abstract for each section - even if it will change
•Include a "criteria for success" section and review it in, say, chapter 6: "the project was successful because…"
•For example, I might look at how the ontology-based search is better than the UI-based search (accuracy, number of hits, etc.)
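One way the "criteria for success" comparison could be made concrete is to score each search against a hand-labelled set of relevant results. The sketch below is only an illustration: the document IDs and both result lists are invented placeholders, not project data, and precision/recall is an assumed choice of "accuracy" measure.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) of a result list against a hand-labelled relevant set."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data for illustration only (not real results).
relevant = {"d1", "d2", "d3", "d4"}           # judged relevant by hand
ui_results = ["d1", "d5", "d6", "d2"]         # what a UI-based search might return
ontology_results = ["d1", "d2", "d3", "d7"]   # what an ontology-based search might return

ui_score = precision_recall(ui_results, relevant)
ontology_score = precision_recall(ontology_results, relevant)
print("UI-based:", ui_score)
print("Ontology-based:", ontology_score)
```

The number of hits is simply the length of each result list, so it can be reported alongside these two figures.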
Bayesian Analysis
•Some sort of processing on keywords
•Pick a training set
•Apply algorithm
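The three steps above could be sketched as a small naive Bayes classifier over keyword phrases. This is a minimal sketch under stated assumptions: the notes do not say which Bayesian method is meant, the whitespace tokeniser stands in for "some sort of processing", and the training phrases and category names are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(keyword):
    """Keyword processing step: lowercase and split on whitespace (an assumption)."""
    return keyword.lower().split()


def train(labelled):
    """Count words per category from (keyword, category) training pairs."""
    word_counts = defaultdict(Counter)
    cat_counts = Counter()
    for keyword, category in labelled:
        cat_counts[category] += 1
        word_counts[category].update(tokenize(keyword))
    return word_counts, cat_counts


def classify(keyword, word_counts, cat_counts):
    """Return the category with the highest log posterior, using add-one smoothing."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(cat_counts.values())
    best, best_score = None, -math.inf
    for cat in cat_counts:
        score = math.log(cat_counts[cat] / total)
        denom = sum(word_counts[cat].values()) + len(vocab)
        for w in tokenize(keyword):
            score += math.log((word_counts[cat][w] + 1) / denom)
        if score > best_score:
            best, best_score = cat, score
    return best


# Hypothetical hand-labelled training set (illustration only).
TRAINING = [
    ("plumber london", "trades"),
    ("emergency plumber", "trades"),
    ("electrician leeds", "trades"),
    ("pizza delivery london", "food"),
    ("italian restaurant leeds", "food"),
    ("pizza restaurant", "food"),
]
model = train(TRAINING)
print(classify("plumber leeds", *model))
print(classify("pizza london", *model))
```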
Cluster analysis
•Clusters would build up around similar products and services
•However, might be too small a data set
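A rough illustration of how clusters might build up around similar products and services: group keyword phrases greedily by word overlap (Jaccard similarity). The similarity measure, the threshold of 0.3 and the sample phrases are all assumptions made for this sketch; on a small data set many clusters would likely end up with a single member, which is the concern noted above.

```python
def jaccard(a, b):
    """Word-overlap similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(keywords, threshold=0.3):
    """Greedy single-pass clustering: join the most similar cluster or start a new one."""
    clusters = []  # each cluster is a list of (keyword, token set) pairs
    for kw in keywords:
        tokens = set(kw.lower().split())
        best, best_sim = None, threshold
        for c in clusters:
            sim = max(jaccard(tokens, t) for _, t in c)
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([(kw, tokens)])
        else:
            best.append((kw, tokens))
    return [[kw for kw, _ in c] for c in clusters]


# Hypothetical keyword phrases (illustration only).
groups = cluster([
    "cheap plumber",
    "emergency plumber",
    "pizza delivery",
    "pizza restaurant",
    "florist",
])
for g in groups:
    print(g)
```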
Analysis approach
•Do a scan of 4-5 well-defined keywords
•Search needs to include location
•Maybe look at 1000 and categorise them by hand
•Then do a proximity search (but this would require some ontology)
•Ian Witten (author of the data mining book) released a piece of open-source data mining software called "WEKA"
•Possibly a particle swarm approach; however, these are "wild" searches
•Might be worth seeing if we can find some mechanism to extract the ontology from the keywords
•First step - look through the keywords
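For the "look at 1000 and categorise them by hand" step, a reproducible random sample avoids only labelling whatever happens to come first in the keyword list. The helper below is a hypothetical sketch: the function name, the normalisation (trim and lowercase) and the fixed seed are assumptions, and the tiny input list is invented.

```python
import random


def sample_for_labelling(keywords, n=1000, seed=0):
    """Return up to n distinct, normalised keywords chosen at random (fixed seed)."""
    unique = sorted({k.strip().lower() for k in keywords if k.strip()})
    rng = random.Random(seed)
    return rng.sample(unique, min(n, len(unique)))


# Hypothetical keyword list (illustration only).
raw_keywords = ["Plumber London", "plumber london ", "pizza leeds", "", "florist york"]
to_label = sample_for_labelling(raw_keywords, n=2)
print(to_label)
```

Sorting before sampling keeps the result stable between runs, so the same 1000 keywords can be revisited later.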
Questions
•Are there successful keywords?
Good books
•Toby Segaran, Programming Collective Intelligence
•Ian Witten, Data Mining: Practical Machine Learning Tools and Techniques
Actions
•Talk to someone from marketing to get a money-driven view of what I am doing
•Google Technion - find Wikipedia article; woman in Darmstadt looking at disambiguation
•Check out "WEKA"
•Look at "particle swarm"
•Look through keywords and categorise 1000 by hand
•Scour LREC proceedings to see if someone's already done keyword disambiguation
•Email Darrel and ask about a book on information retrieval that includes the "bag of words" approach