Uploaded by Varun Kalpurath

Copyright: Attribution Non-Commercial (BY-NC)

Enhanced Text Clustering Based On Conceptual Mining

ABSTRACT:
Most common text mining techniques are based on the statistical analysis of a term, either a word or a phrase. Statistical analysis of term frequency captures the importance of a term within a document only. However, two terms can have the same frequency in their documents while one contributes more to the meaning of its sentences than the other. The underlying text mining model should therefore identify the terms that capture the semantics of the text. Such a model can capture the terms that represent the concepts of a sentence, which leads to discovery of the topic of the document. A new concept-based mining model that analyzes terms on the sentence, document, and corpus levels is introduced. The concept-based mining model can effectively discriminate between terms that are unimportant to the sentence semantics and terms that hold the concepts representing the sentence meaning. The proposed mining model consists of sentence-based concept analysis, document-based concept analysis, corpus-based concept analysis, and a concept-based similarity measure. A term that contributes to the sentence semantics is analyzed on the sentence, document, and corpus levels rather than on the document level only, as in traditional analysis. The proposed model can efficiently find significant matching concepts between documents according to the semantics of their sentences. The similarity between documents is calculated with a new concept-based similarity measure, which takes full advantage of the concept analysis measures on the sentence, document, and corpus levels. Large sets of experiments using the proposed concept-based mining model on different data sets in text clustering are conducted. The experiments provide an extensive comparison between the concept-based analysis and the traditional analysis. Experimental results demonstrate a substantial enhancement of clustering quality using the sentence-based, document-based, corpus-based, and combined-approach concept analysis.
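
The three levels of analysis named in the abstract can be illustrated with simple counts. The following is a minimal sketch, not the model's actual definition: it assumes that each document is already split into sentences and that the concepts of each sentence have been extracted by an upstream step the text does not specify. The function names and the input layout are hypothetical.

```python
from collections import Counter

# Assumed input layout (hypothetical): a corpus is a list of documents,
# a document is a list of sentences, and a sentence is a list of
# concept labels that some upstream step has already extracted.


def sentence_level(doc):
    """Sentence level: how often each concept occurs in each sentence."""
    return [Counter(sentence) for sentence in doc]


def document_level(doc):
    """Document level: how often each concept occurs in the whole document."""
    return Counter(concept for sentence in doc for concept in sentence)


def corpus_level(corpus):
    """Corpus level: in how many documents each concept occurs."""
    df = Counter()
    for doc in corpus:
        # A set counts each concept at most once per document.
        df.update({concept for sentence in doc for concept in sentence})
    return df
```

A concept that is frequent in one document but present in most documents of the corpus discriminates poorly between documents, which is why the corpus-level count is kept separate from the other two.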

EXISTING SYSTEM
Document clustering (also referred to as text clustering) is closely related to the concept of data clustering. There are two ways of labeling data using learning techniques:
1. Unsupervised learning induces categories from unlabeled data.
2. Semi-supervised learning uses both labeled and unlabeled data to improve results.

PROPOSED SYSTEM
The proposed system consists of sentence-based concept analysis, document-based concept analysis, corpus-based concept analysis, and a concept-based similarity measure. A term that contributes to the sentence semantics is analyzed on the sentence, document, and corpus levels rather than on the document level only, as in traditional analysis. The proposed model can efficiently find significant matching concepts between documents.
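
The text does not give the formula of the concept-based similarity measure, so the sketch below substitutes a plain cosine similarity over per-concept weights. The weighting in `concept_weight` (sum of a document-level and a sentence-level frequency, scaled by an inverse-document-frequency factor) is an assumption made for illustration, and both function names are hypothetical.

```python
import math


def concept_weight(tf, mean_ctf, df, n_docs):
    """Combine the three levels of evidence into one weight per concept.

    Assumed weighting, not taken from the text:
      tf        document-level frequency of the concept
      mean_ctf  average sentence-level frequency of the concept
      df        number of documents that contain the concept
      n_docs    number of documents in the corpus
    """
    return (tf + mean_ctf) * math.log(n_docs / df)


def concept_similarity(weights_a, weights_b):
    """Cosine similarity between two documents given concept -> weight dicts."""
    shared = weights_a.keys() & weights_b.keys()
    dot = sum(weights_a[c] * weights_b[c] for c in shared)
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a document without weighted concepts matches nothing
    return dot / (norm_a * norm_b)
```

Only concepts that occur in both documents contribute to the numerator, which mirrors the idea of finding matching concepts between documents. A concept that appears in every document gets weight zero under this assumed scheme.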

MODULES

1. User interface:
The user can search for content using his or her keywords. The user can either use all algorithms (HAC, K-NN, and single pass) together or use them independently with the concept-based analysis algorithm.

2. Clustering:
This part implements the algorithms for web search. The keywords are processed and categorized to produce results.
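
Of the algorithms named in the user interface module, single pass is the simplest to show. The sketch below is one common formulation, not necessarily the one used in this project: each item is compared with the existing clusters and either joins the most similar one or starts a new cluster. Representing items as keyword sets, using Jaccard overlap as the similarity, and the default threshold of 0.3 are all assumptions.

```python
def jaccard(a, b):
    """Overlap between two keyword sets, in the range [0, 1]."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def single_pass(items, threshold=0.3):
    """Cluster keyword sets in one pass over the input.

    Each cluster is represented by the union of its members' keywords.
    Returns a list of clusters, each a list of item indices.
    """
    clusters = []
    for index, keywords in enumerate(items):
        best, best_score = None, 0.0
        for cluster in clusters:
            score = jaccard(keywords, cluster["keywords"])
            if score > best_score:
                best, best_score = cluster, score
        if best is not None and best_score >= threshold:
            # Similar enough: join the best cluster and widen its keyword set.
            best["keywords"] |= keywords
            best["members"].append(index)
        else:
            # Nothing similar enough: start a new cluster with a copy.
            clusters.append({"keywords": set(keywords), "members": [index]})
    return [cluster["members"] for cluster in clusters]
```

The result depends on the order of the input, which is the usual trade-off of single-pass clustering against hierarchical methods such as HAC.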

3. Result categorizing:
Results can be shown in a graphical structure, in a tree structure, or as lists.

4. Performance management:
Results can be cached and bookmarked for quick searching. Older searches are archived in the history.
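
One way to combine caching, bookmarks, and a search history is sketched below. The class `SearchCache`, its methods, and the least-recently-used eviction policy are assumptions for illustration; the text does not say how the module is implemented. The `run_search` argument stands for whatever function performs the actual search.

```python
from collections import OrderedDict


class SearchCache:
    """Small LRU cache of search results plus a history of all queries."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self._results = OrderedDict()  # query -> cached results
        self.history = []              # every query, oldest first
        self.bookmarks = set()         # queries protected from eviction

    def search(self, query, run_search):
        """Return cached results, calling run_search only on a cache miss."""
        self.history.append(query)
        if query in self._results:
            self._results.move_to_end(query)  # mark as most recently used
            return self._results[query]
        results = run_search(query)
        self._results[query] = results
        self._evict()
        return results

    def bookmark(self, query):
        """Keep the results of this query in the cache."""
        self.bookmarks.add(query)

    def _evict(self):
        """Drop least recently used entries that are not bookmarked.

        If every entry is bookmarked, the cache may exceed its capacity.
        """
        for query in list(self._results):
            if len(self._results) <= self.capacity:
                break
            if query not in self.bookmarks:
                del self._results[query]
```

Repeating a query then returns the stored results without running the search again, while the history still records every request in order.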
