Text Mining (English Edition)

hgjhzy 24 0 PDF 2020-04-26 14:04:16

P1: JZZ 0521836573pe CB1028 Feldman 0521836573 October 12, 2006 18:37

THE TEXT MINING HANDBOOK
Advanced Approaches in Analyzing Unstructured Data

Ronen Feldman, Bar-Ilan University, Israel
James Sanger, ABS Ventures, Waltham, Massachusetts

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521836579

© Ronen Feldman and James Sanger 2007

This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published in print format 2006

ISBN-13 978-0-511-33507-5 eBook (NetLibrary)
ISBN-10 0-511-33507-5 eBook (NetLibrary)
ISBN-13 978-0-521-83657-9 hardback
ISBN-10 0-521-83657-3 hardback

Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

In loving memory of my father, Issac Feldman

Contents

Preface
I. Introduction to Text Mining
  I.1 Defining Text Mining
  I.2 General Architecture of Text Mining Systems
II. Core Text Mining Operations
  II.1 Core Text Mining Operations
  II.2 Using Background Knowledge for Text Mining
  II.3 Text Mining Query Languages
III. Text Mining Preprocessing Techniques
  III.1 Task-Oriented Approaches
  III.2 Further Reading
IV. Categorization
  IV.1 Applications of Text Categorization
  IV.2 Definition of the Problem
  IV.3 Document Representation
  IV.4 Knowledge Engineering Approach to TC
  IV.5 Machine Learning Approach to TC
  IV.6 Using Unlabeled Data to Improve Classification
  IV.7 Evaluation of Text Classifiers
  IV.8 Citations and Notes
V. Clustering
  V.1 Clustering Tasks in Text Analysis
  V.2 The General Clustering Problem
  V.3 Clustering Algorithms
  V.4 Clustering of Textual Data
  V.5 Citations and Notes
VI. Information Extraction
  VI.1 Introduction to Information Extraction
  VI.2 Historical Evolution of IE: The Message Understanding Conferences and Tipster
  VI.3 IE Examples
  VI.4 Architecture of IE Systems
  VI.5 Anaphora Resolution
  VI.6 Inductive Algorithms for IE
  VI.7 Structural IE
  VI.8 Further Reading
VII. Probabilistic Models for Information Extraction
  VII.1 Hidden Markov Models
  VII.2 Stochastic Context-Free Grammars
  VII.3 Maximal Entropy Modeling
  VII.4 Maximal Entropy Markov Models
  VII.5 Conditional Random Fields
  VII.6 Further Reading
VIII. Preprocessing Applications Using Probabilistic and Hybrid Approaches
  VIII.1 Applications of HMM to Textual Analysis
  VIII.2 Using MEMM for Information Extraction
  VIII.3 Applications of CRFs to Textual Analysis
  VIII.4 TEG: Using SCFG Rules for Hybrid Statistical-Knowledge-Based IE
  VIII.5 Bootstrapping
  VIII.6 Further Reading
IX. Presentation-Layer Considerations for Browsing and Query Refinement
  IX.1 Browsing
  IX.2 Accessing Constraints and Simple Specification Filters at the Presentation Layer
  IX.3 Accessing the Underlying Query Language
  IX.4 Citations and Notes
X. Visualization Approaches
  X.1 Introduction
  X.2 Architectural Considerations
  X.3 Common Visualization Approaches for Text Mining
  X.4 Visualization Techniques in Link Analysis
  X.5 Real-World Example: The Document Explorer System
XI. Link Analysis
  XI.1 Preliminaries
  XI.2 Automatic Layout of Networks
  XI.3 Paths and Cycles in Graphs
  XI.4 Centrality
  XI.5 Partitioning of Networks
  XI.6 Pattern Matching in Networks
  XI.7 Software Packages for Link Analysis
  XI.8 Citations and Notes
XII. Text Mining Applications
  XII.1 General Considerations
  XII.2 Corporate Finance: Mining Industry Literature for Business Intelligence
  XII.3 A "Horizontal" Text Mining Application: Patent Analysis Solution Leveraging a Commercial Text Analytics Platform
  XII.4 Life Sciences Research: Mining Biological Pathway Information with GeneWays
Appendix A: DIAL: A Dedicated Information Extraction Language for Text Mining
  A.1 What Is the DIAL Language?
  A.2 Information Extraction in the DIAL Environment
  A.3 Text Tokenization
  A.4 Concept and Rule Structure
  A.5 Pattern Matching
  A.6 Pattern Elements
  A.7 Rule Constraints
  A.8 Concept Guard
  A.9 Complete DIAL Examples
Bibliography
Index

User comments
Anonymous user of 卡了网, 2020-04-26 14:04:16

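After a first look: a classic English-language introduction to content mining, well worth reading. A Chinese edition would be easier to follow.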
