
Taxonomy Development

An Infrastructure Model
Tom Reamy
Chief Knowledge Architect
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com

Agenda
Introduction
Types of Taxonomies
The Enterprise Context

Making the Business Case

Infrastructure Model of Taxonomy Development

Taxonomy in 4 Contexts
Content, People, Processes, Technology

Infrastructure Solutions - the Elements


Applying the Model - Practical Dimension

Starting and Resources

Conclusion

KAPS Group

Knowledge Architecture Professional Services (KAPS)


Consulting, strategy recommendations
Knowledge architecture audits
Partners - Convera, Inxight, FAST, and others
Taxonomies: Enterprise, Marketing, Insurance, etc.

Taxonomy customization

Intellectual infrastructure for organizations

Knowledge organization, technology, people and processes


Search, content management, portals, collaboration,
knowledge management, e-learning, etc.

Two Types of Taxonomies: Browse and Formal


Browse Taxonomy - Yahoo

Two Types of Taxonomies: Formal

Browse Taxonomies: Strengths and Weaknesses


Strengths: Browse is better than search

Context and discovery


Browse by task, type, etc.

Weaknesses:

Mix of organization
Catalogs, alphabetical listings, inventories
Subject matter, functional, publisher,
document type

Vocabulary and nomenclature Issues


Problems with maintenance, new material
Poor granularity and little relationship between parts.
Web site - unit of organization

No foundation for standards

Formal Taxonomies: Strengths and Weaknesses


Strengths:

Fixed Resource - little or no maintenance

Communication Platform - share ideas, standards
Infrastructure Resource
Controlled vocabulary and keywords
More depth, finer granularity

Weaknesses:

Difficult to develop and customize


Don't reflect users' perspectives
Users have to adapt to language

Facets and Dynamic Classification


Facets are not categories
Entities or concepts belong to a category
Entities have facets

Facets are metadata - properties or attributes


Entities or concepts fit into one category
All entities have all facets defined by set of values

Facets are orthogonal - mutually exclusive dimensions

An event is not a person is not a document is not a place.

Facets - variety of units, of structure

Date or price - numerical range
Location - big to small (partonomy)
Winery - alphabetical
Hierarchical - taxonomic
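
As a minimal sketch of the idea that every entity carries a value for every facet, and that facets differ in structure, the following models a wine with the four kinds of facet listed above. The class name `Wine`, the field names, the `price_band` and `within` helpers, and the sample record are all hypothetical illustrations, not part of the original talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wine:
    """One entity; every wine has a value for every facet."""
    name: str
    winery: str      # alphabetical facet
    price: float     # numerical-range facet
    location: tuple  # partonomy facet, ordered big to small
    varietal: tuple  # hierarchical (taxonomic) facet, broad to narrow


def price_band(price: float, width: int = 10) -> str:
    """Map a price onto a fixed-width range label such as '$20-30'."""
    low = int(price // width) * width
    return f"${low}-{low + width}"


def within(path: tuple, prefix: tuple) -> bool:
    """True if a partonomy or hierarchy path falls under the given prefix."""
    return path[:len(prefix)] == tuple(prefix)


# Hypothetical sample record, for illustration only.
wine = Wine(
    name="Example Red",
    winery="Hypothetical Cellars",
    price=24.0,
    location=("USA", "California", "Napa Valley"),
    varietal=("Red", "Cabernet Sauvignon"),
)

if __name__ == "__main__":
    print(price_band(wine.price))
    print(within(wine.location, ("USA", "California")))
```

Each facet is queried independently of the others, which is what makes the dimensions orthogonal in practice.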

Faceted Navigation: Strengths and Weaknesses


Strengths:

More intuitive - easy to guess what is behind each door

20 questions - we know and use

Dynamic selection of categories


Allow multiple perspectives

Trick Users into using Advanced Search


wine where color = red, price = x-y, etc.

Weaknesses:

Difficulty of expressing complex relationships


Simplicity of internal organization

Loss of Browse Context


Difficult to grasp scope and relationships

Limited Domain Applicability - type and size

Entities - not concepts, documents, web sites
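
The "wine where color = red, price = x-y" example can be sketched as a filter over facet values, with counts for the remaining choices recomputed after each selection. This is a sketch under stated assumptions: the sample data, the `select` and `facet_counts` function names, and the facet keys are hypothetical and do not describe any particular product.

```python
from collections import Counter

# Hypothetical catalog, for illustration only.
WINES = [
    {"name": "A", "color": "red", "price": 18, "region": "Napa"},
    {"name": "B", "color": "red", "price": 42, "region": "Sonoma"},
    {"name": "C", "color": "white", "price": 15, "region": "Napa"},
    {"name": "D", "color": "red", "price": 25, "region": "Napa"},
]


def select(items, color=None, price_range=None, region=None):
    """Keep items matching every facet value the user has chosen so far."""
    result = []
    for item in items:
        if color is not None and item["color"] != color:
            continue
        if price_range is not None:
            low, high = price_range
            if not low <= item["price"] <= high:
                continue
        if region is not None and item["region"] != region:
            continue
        result.append(item)
    return result


def facet_counts(items, facet):
    """Count how many of the remaining items carry each value of a facet."""
    return Counter(item[facet] for item in items)


if __name__ == "__main__":
    reds = select(WINES, color="red", price_range=(15, 30))
    print([w["name"] for w in reds])
    print(facet_counts(reds, "region"))
```

The user only ever clicks facet values, but the effect is the same as composing an advanced search with several fielded conditions.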

Dynamic Classification / Faceted navigation


Search and browse - better than either alone
Categorized search - context
Browse as an advanced search

Dynamic search and browse is best

Can't predict all the ways people think


Advanced cognitive differences
Panda, Monkey, Banana

Can't predict all the questions and activities

Intersections of what users are looking for and what documents are often about
China and Biotech
Economics and Regulatory


Business Case for Taxonomies:


The Right Context

Traditional Metrics

Time Savings - 22 minutes per user per day = $1Mil a Year

Apply to your organization - customer service, content creation, knowledge industry
Cost of not-finding = re-creating content

Research

Advantages of Browsing - Marti Hearst, Chen and Dumais

Nielsen - Poor classification costs a 10,000-user organization $10M each year, about $1,000 per employee.

Stories

Pain points, success and failure in your corporate language


Business Case for Taxonomies:


IDC White Paper
Information Tasks

Email 14.5 hours a week


Create documents 13.3 hours a week
Search 9.5 hours a week
Gather information for documents 8.3 hours a week
Find and organize documents 6.8 hours a week

Gartner: Businesses spend an estimated $750 billion annually seeking information necessary to do their jobs.
30-40% of a knowledge worker's time is spent managing documents.


Business Case for Taxonomies:


IDC White Paper
Time Wasted

Reformatting information - $5.7 million per 1,000 workers per year ($400M)

Not finding information - $5.3 million per 1,000 workers ($370M)
Recreating content - $4.5 million per 1,000 workers ($315M)

Small Percent Gain = large savings

1% - $10 million
5% - $50 million
10% - $100 million
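
The arithmetic behind these figures can be checked with a short calculation. This is a sketch: the workforce size of 70,000 is an assumption inferred from the slide's own totals (5.7 x 70 is roughly 400, 4.5 x 70 = 315), and the function names are hypothetical. The slide rounds the combined total to about $1 billion, which is why it quotes $10 million for a 1% gain where the unrounded sum gives about $10.85 million.

```python
# Figures from the slide: annual cost in $ millions per 1,000 workers.
COST_PER_1000_WORKERS = {
    "reformatting information": 5.7,
    "not finding information": 5.3,
    "recreating content": 4.5,
}

# Assumption: 70,000 workers, inferred from the totals quoted on the slide.
WORKERS = 70_000


def annual_waste_millions(workers: int) -> dict:
    """Scale each per-1,000 cost up to the whole workforce ($ millions)."""
    return {
        task: cost * workers / 1000
        for task, cost in COST_PER_1000_WORKERS.items()
    }


def savings_millions(workers: int, percent_gain: float) -> float:
    """Savings in $ millions if the wasted time shrinks by percent_gain."""
    total = sum(annual_waste_millions(workers).values())
    return total * percent_gain / 100


if __name__ == "__main__":
    for gain in (1, 5, 10):
        saved = savings_millions(WORKERS, gain)
        print(f"{gain}% gain: about ${saved:.1f}M per year")
```

Substituting your own headcount for `WORKERS` is the quickest way to restate the case in local terms.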


Business Case for Taxonomies:


The Right Context
Justification

Search Engine - $500K-$2Mil


Content Management - $500K-$2Mil
Portal - $500K-$2Mil
Plus maintenance and employee costs

Taxonomy

Small comparative cost


Needed to get full value from all the above

ROI - asking the wrong question

What is ROI for having an HR department?


What is ROI for organizing your company?


Infrastructure Model of Taxonomy Development


Taxonomy in 4 Basic Contexts

Ideas - Content Structure

Language and Mind of your organization


Applications - exchange meaning, not data

People - Company Structure

Communities, Users, Central Team

Activities - Business processes and procedures

Central team - establish standards, facilitate

Technology / Things

CMS, Search, portals, taxonomy tools


Applications - BI, CI, Text Mining


Taxonomy in Context
Structuring Content
All kinds of content and Content Structures

Structured and unstructured, Internet and desktop

Metadata standards - Dublin Core+


Keywords - poor performance
Need controlled vocabulary, taxonomies, semantic network

Other Metadata

Document Type
Form, policy, how-to, etc.

Audience
Role, function, expertise, information behaviors

Best bets metadata

Facets - entities and ideas

Wine.com


Taxonomy in Context:
Structuring People
Individual People

Tacit knowledge, information behaviors


Advanced personalization - category priority
Sales forms ---- New Account Form
Accountant ---- New Accounts ---- Forms

Communities

Variety of types - map of formal and informal
Variety of subject matter - vaccines, research, scuba
Variety of communication channels and information behaviors
Community-specific vocabularies, need for inter-community
communication (Cortical organization model)


Taxonomy in Context:
Structuring Processes and Technology
Technology: infrastructure and applications

Enterprise platforms: from creation to retrieval to application


Taxonomy as the computer network
Applications - integrated meaning, not just data

Creation - content management, innovation, communities of practice (CoPs)

When, who, how, and how much structure to add


Workflow with meaning, distributed subject matter experts (SMEs) and
centralized teams

Retrieval - standalone and embedded in applications and business processes

Portals, collaboration, text mining, business intelligence, CRM


Taxonomy in Context:
The Integrating Infrastructure
Starting point: knowledge architecture audit, K-Map

Social network analysis, information behaviors

People - knowledge architecture team

Infrastructure activities - taxonomies, analytics, best bets

Facilitation - knowledge transfer, partner with SMEs

Taxonomies of content, people, and activities

Dynamic Dimension - complexity not chaos


Analytics based on concepts, information behaviors

Taxonomy as part of a foundation, not a project

In an Infrastructure Context


Taxonomy in Context:
The Integrating Infrastructure
Integrated Enterprise requires both an infrastructure team and
distributed expertise.

Software and SMEs are not the answer - keywords

Taxonomies - not stand-alone


Metadata, controlled vocabularies, synonyms, etc.
Variety of taxonomies, plus categorization, classification, etc.

Important to know the differences, when to use which

Multiple Applications

Search, browse, content management, portals, BI & CI, etc.

Infrastructure as Operating System


Word vs. Word Perfect
Instead of sharing clipboard, share information and knowledge.


Infrastructure Solutions: The start and foundation


Knowledge Architecture Audit

Knowledge Map - Understand what you have, what you are, what you want

The foundation of the foundation

Contextual interviews, content analysis, surveys, focus groups, ethnographic studies
Category modeling - Intertwingledness - learning new categories is influenced by other, related categories
Natural level categories mapped to communities, activities
Novices prefer higher levels
Balance of informativeness and distinctiveness

Living, breathing, evolving foundation is the goal


Infrastructure Solutions: Resources


People and Processes: Roles and Functions

Knowledge Architect and learning object designers


Knowledge engineers and cognitive anthropologists
Knowledge facilitators and trainers and librarians
Part Time

Librarians and information architects


Corporate communication editors and writers

Partners

IT, web developers, applications programmers


Business analysts and project managers


Infrastructure Solutions: Resources


People and Processes: Central Team
Central Team supported by software and offering services

Creating, acquiring, evaluating taxonomies, metadata standards, vocabularies
Input into technology decisions and design - content management, portals, search
Socializing the benefits of metadata, creating a content culture
Evaluating metadata quality, facilitating author metadata
Analyzing the results of using metadata, how communities are using it
Research - metadata theory, user-centric metadata
Design content value structure - more nuanced than good / poor content.


Infrastructure Solutions: Resources


People and Processes: Facilitating Knowledge Transfer
Need for Facilitators
Amazon - hiring humans to refine recommendations
Google - humans answering queries

Facilitate projects, KM project teams

Facilitate knowledge capture in meetings, best practices

Answering online questions, facilitating online discussions, networking within a community
Design and run KM forums, education and innovation fairs
Work with content experts to develop training, incorporate
intelligence into applications
Support innovation, knowledge creation in communities


Infrastructure Solutions: Resources


People and Processes: Location of Team

KM/KA Dept. - Cross Organizational, Interdisciplinary


Balance of dedicated and virtual, partners

Library, Training, IT, HR, Corporate Communication

Balance of central and distributed


Industry variation

Pharmaceutical - dedicated department, major place in the organization
Insurance - Small central group with partners
Beans - a librarian and part time functions

Which design - knowledge architecture audit


Infrastructure Solutions: Resources


Technology

Taxonomy Management

Text and Visualization

Entity and Fact Extraction


Text Mining
Search for professionals

Different needs, different interfaces

Integration Platform technology

Enterprise Content Management


Taxonomy Development: Tips and Techniques


Stage One - How to Begin
Step One: Strategic Questions - why, what value from the taxonomy, how are you going to use it

Variety of taxonomies - important to know the differences, when to use what.

Step Two: Get a good taxonomist! (or learn)

Library Science + Cognitive Science + Cognitive Anthropology

Step Three: Software Shopping

Automatic Software - Fun Diversion for a rainy day


Uneven hierarchy, strange node names, weird clusters

Taxonomy Management, Entity Extraction, Visualization

Step Four: Get a good taxonomy!


Glossary, Index, Pull from multiple sources
Get a good document collection


Infrastructure Solutions: Taxonomy Development


Stage Two: Taxonomy Model
Enterprise Taxonomy
No single subject matter taxonomy
Need an ontology of facets or domains

Standards and Customization


Balance of corporate communication and departmental specifics
At what level are differences represented?
Customize pre-defined taxonomy - additional structure, add synonyms and acronyms and vocabulary

Enterprise Facet Model:

Actors, Events, Functions, Locations, Objects, Information Resources
Combine and map to subject domains
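
One way to picture customization by adding synonyms and acronyms is a lookup table that maps each departmental variant to a single preferred label within one of the six enterprise facets. This is only a sketch: the `Vocabulary` class, its method names, and the sample terms are hypothetical, and a real taxonomy management tool would offer far more than this.

```python
# The six facets named in the Enterprise Facet Model above.
FACETS = (
    "Actors", "Events", "Functions",
    "Locations", "Objects", "Information Resources",
)


class Vocabulary:
    """Controlled vocabulary: many variants, one preferred label per term."""

    def __init__(self):
        # lowercase variant -> (facet, preferred label)
        self._preferred = {}

    def add_term(self, facet, label, synonyms=()):
        """Register a preferred label plus departmental synonyms/acronyms."""
        if facet not in FACETS:
            raise ValueError(f"unknown facet: {facet}")
        for variant in (label, *synonyms):
            self._preferred[variant.lower()] = (facet, label)

    def lookup(self, text):
        """Return (facet, preferred label), or None if the term is unknown."""
        return self._preferred.get(text.strip().lower())


# Hypothetical sample terms, for illustration only.
vocab = Vocabulary()
vocab.add_term("Functions", "Human Resources", synonyms=("HR", "Personnel"))
vocab.add_term("Information Resources", "Policy", synonyms=("Policies",))

if __name__ == "__main__":
    print(vocab.lookup("hr"))
    print(vocab.lookup("Policies"))
```

Keeping the preferred label corporate-wide while letting departments add their own variants is one way to strike the balance between standards and customization.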


Taxonomy Development: Tips and Techniques


Stage Three: Development and/or Customization

Combination of top down and bottom up (and Essences)

Top: Design an ontology, facet selection


Bottom: Vocabulary extraction - documents, search logs, interview authors and users
Develop essential examples (Prototypes)
Most Intuitive Level - genus (oak, maple, rabbit)
Quintessential Chair - all the essential characteristics, no more

Work toward the prototype and out and up and down


Repeat until dizzy or done

Map the taxonomy to communities and activities

Category differences
Vocabulary differences
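
The bottom-up step of pulling vocabulary from documents and search logs can be sketched as a simple frequency count that surfaces terms users type but the taxonomy does not yet contain. The stopword list, the function names, and the sample search log are hypothetical, and single-word counting is a deliberate simplification of what a text-mining tool would do.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "for", "to", "in", "is", "on"}


def terms(text):
    """Lowercase the text and split it into non-stopword words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def candidate_terms(texts, known_labels, min_count=2):
    """Frequent terms that are not yet labels in the taxonomy."""
    counts = Counter(t for text in texts for t in terms(text))
    known = {label.lower() for label in known_labels}
    return [
        (term, n) for term, n in counts.most_common()
        if n >= min_count and term not in known
    ]


# Hypothetical search log, for illustration only.
SEARCH_LOG = [
    "new account form",
    "account closing policy",
    "form for new account",
    "vacation policy",
]

if __name__ == "__main__":
    for term, n in candidate_terms(SEARCH_LOG, known_labels=["Policy"]):
        print(term, n)
```

The output is only a list of candidates; deciding which belong in the taxonomy, and where, remains the taxonomist's job.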


Taxonomy Development: Tips and Techniques


Stage Four: Evaluate and Refine

Formal Evaluation

Quality of corpus - size, homogeneity, representativeness

Breadth of coverage - main ideas, outlier ideas (see next)
Structure - balance of depth and width
Kill the verbs
Evaluate speciation steps - understandable and systematic
Person - Unwelcome person - Unpleasant person - Selfish person

Avoid binary levels, duplication of contrasts


Primary and secondary education, public and private
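
Some of these structural checks can be roughed out in code. The sketch below assumes a taxonomy stored as nested dictionaries (label mapped to children) and reports depth, maximum width, nodes with exactly two children, and sets of child labels repeated under more than one parent. The representation, the function names, and the sample tree are hypothetical; the flags are prompts for human review, not verdicts.

```python
from collections import Counter

# Hypothetical taxonomy: label -> children, an empty dict marks a leaf.
TAXONOMY = {
    "Education": {
        "Primary": {"Public": {}, "Private": {}},
        "Secondary": {"Public": {}, "Private": {}},
    },
}


def depth(children):
    """Number of levels in the tree below this point."""
    if not children:
        return 0
    return 1 + max(depth(sub) for sub in children.values())


def max_width(children):
    """Largest number of siblings found under any single parent."""
    if not children:
        return 0
    return max(len(children), *(max_width(s) for s in children.values()))


def binary_nodes(children, path=()):
    """Paths of nodes that split into exactly two children."""
    found = []
    for label, sub in children.items():
        here = path + (label,)
        if len(sub) == 2:
            found.append(here)
        found.extend(binary_nodes(sub, here))
    return found


def duplicated_contrasts(children):
    """Sets of child labels that recur under more than one parent."""
    seen = Counter()

    def walk(node):
        for sub in node.values():
            if sub:
                seen[frozenset(sub)] += 1
                walk(sub)

    walk(children)
    return [set(labels) for labels, n in seen.items() if n > 1]


if __name__ == "__main__":
    print(depth(TAXONOMY), max_width(TAXONOMY))
    print(binary_nodes(TAXONOMY))
    print(duplicated_contrasts(TAXONOMY))
```

In the sample tree, the public/private contrast repeats under both primary and secondary education, which suggests it may work better as a separate facet than as a level in the hierarchy.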


Taxonomy Development: Tips and Techniques


Stage Four: Evaluate and Refine

Practical Evaluation

Test in real life application


Select representative users and documents
Test node labels with Subject Matter Experts
Balance of making sense and jargon

Test with representative key concepts


Test for un-representative strange little concepts that only
mean something to a few people but the people and ideas are
key and are normally impossible to find


Sources
Books

Women, Fire, and Dangerous Things: What Categories Reveal about the Mind - George Lakoff

The Geography of Thought - Richard E. Nisbett

Software

Convera RetrievalWare
Inxight Smart Discovery - entity and fact extraction

Courses

Convera Taxonomy Certification


Conclusion
Taxonomy development is not just a project

It has no beginning and no end

Taxonomy development is not an end in itself

It enables the accomplishment of many ends

Taxonomy development is not just about search or browse

It is about language, cognition, and applied intelligence

Strategic Vision (articulated by K-Map) is important

Even for your under-the-radar vocabulary project

Paying attention to theory is practical

So is adapting your language to business speak


Conclusion
Taxonomies are part of your intellectual infrastructure

Roads, transportation systems - not cars or types of cars

Taxonomies are part of creating smart organizations

Self aware, capable of learning and evolving

Think Big, Start Small, Scale Fast


If we really are in a knowledge economy, we need to pay attention to knowledge!


Questions?
Tom Reamy
tomr@kapsgroup.com
KAPS Group
Knowledge Architecture Professional Services
http://www.kapsgroup.com
