Foundations of Statistical Natural Language Processing

© 1999 Massachusetts Institute of Technology
Second printing with corrections, 2000

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

Typeset in 10/13 Lucida Bright by the authors using LaTeX2e.
Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Information

Manning, Christopher D.
Foundations of statistical natural language processing / Christopher D. Manning, Hinrich Schütze.
p. cm.
Includes bibliographical references (p. ) and index.
ISBN 0-262-13360-1
1. Computational linguistics--Statistical methods. I. Schütze, Hinrich. II. Title.
P98.5.S83M36 1999
410'.285--dc21 99-21137
CIP

Brief Contents

I Preliminaries 1
1 Introduction 3
2 Mathematical Foundations 39
3 Linguistic Essentials 81
4 Corpus-Based Work 117

II Words 149
5 Collocations 151
6 Statistical Inference: n-gram Models over Sparse Data 191
7 Word Sense Disambiguation 229
8 Lexical Acquisition 265

III Grammar 315
9 Markov Models 317
10 Part-of-Speech Tagging 341
11 Probabilistic Context Free Grammars 381
12 Probabilistic Parsing 407

IV Applications and Techniques 461
13 Statistical Alignment and Machine Translation 463
14 Clustering 495
15 Topics in Information Retrieval 529
16 Text Categorization 575

Contents

List of Tables xv
List of Figures xxi
Table of Notations xxv
Preface xxix
Road Map xxxv

I Preliminaries 1

1 Introduction 3
1.1 Rationalist and Empiricist Approaches to Language 4
1.2 Scientific Content 7
1.2.1 Questions that linguistics should answer 8
1.2.2 Non-categorical phenomena in language 11
1.2.3 Language and cognition as probabilistic phenomena 15
1.3 The Ambiguity of Language: Why NLP Is Difficult 17
1.4 Dirty Hands 19
1.4.1 Lexical resources 19
1.4.2 Word counts 20
1.4.3 Zipf's laws 23
1.4.4 Collocations 29
1.4.5 Concordances 31
1.5 Further Reading 34
1.6 Exercises 35

2 Mathematical Foundations 39
2.1 Elementary Probability Theory 40
2.1.1 Probability spaces 40
2.1.2 Conditional probability and independence 42
2.1.3 Bayes' theorem 43
2.1.4 Random variables 45
2.1.5 Expectation and variance 46
2.1.6 Notation 47
2.1.7 Joint and conditional distributions 48
2.1.8 Determining P 48
2.1.9 Standard distributions 50
2.1.10 Bayesian statistics 54
2.1.11 Exercises 59
2.2 Essential Information Theory 60
2.2.1 Entropy 61
2.2.2 Joint entropy and conditional entropy 63
2.2.3 Mutual information 66
2.2.4 The noisy channel model 68
2.2.5 Relative entropy or Kullback-Leibler divergence 72
2.2.6 The relation to language: Cross entropy 73
2.2.7 The entropy of English 76
2.2.8 Perplexity 78
2.2.9 Exercises 78
2.3 Further Reading 79

3 Linguistic Essentials 81
3.1 Parts of Speech and Morphology 81
3.1.1 Nouns and pronouns 83
3.1.2 Words that accompany nouns: Determiners and adjectives 87
3.1.3 Verbs 88
3.1.4 Other parts of speech 91
3.2 Phrase Structure 93
3.2.1 Phrase structure grammars 96
3.2.2 Dependency: Arguments and adjuncts 101
3.2.3 X' theory 106
3.2.4 Phrase structure ambiguity 107
3.3 Semantics and Pragmatics 109
3.4 Other Areas 112
3.5 Further Reading 113
3.6 Exercises 114

4 Corpus-Based Work 117
4.1 Getting Set Up 118
4.1.1 Computers 118
4.1.2 Corpora 118
4.1.3 Software 120
4.2 Looking at Text 123
4.2.1 Low-level formatting issues 123
4.2.2 Tokenization: What is a word? 124
4.2.3 Morphology 131
4.2.4 Sentences 134
4.3 Marked-up Data 136
4.3.1 Markup schemes 137
4.3.2 Grammatical tagging 139
4.4 Further Reading 145
4.5 Exercises 147

II Words 149

5 Collocations 151
5.1 Frequency 153
5.2 Mean and Variance 157
5.3 Hypothesis Testing 162
5.3.1 The t test 163
5.3.2 Hypothesis testing of differences 166
5.3.3 Pearson's chi-square test 169
5.3.4 Likelihood ratios 172
5.4 Mutual Information 178
5.5 The Notion of Collocation 183
5.6 Further Reading 187

6 Statistical Inference: n-gram Models over Sparse Data 191
6.1 Bins: Forming Equivalence Classes 192
6.1.1 Reliability vs. discrimination 192
6.1.2 n-gram models 192
6.1.3 Building n-gram models 195
6.2 Statistical Estimators 196
6.2.1 Maximum Likelihood Estimation (MLE) 197
6.2.2 Laplace's law, Lidstone's law and the Jeffreys-Perks law 202
6.2.3 Held out estimation 205
6.2.4 Cross-validation (deleted estimation) 210
6.2.5 Good-Turing estimation 212
6.2.6 Briefly noted 216
6.3 Combining Estimators 217
6.3.1 Simple linear interpolation 218
6.3.2 Katz's backing-off 219
6.3.3 General linear interpolation 220
6.3.4 Briefly noted 222
6.3.5 Language models for Austen 223
6.4 Conclusions 224
6.5 Further Reading 225
6.6 Exercises 225

7 Word Sense Disambiguation 229
7.1 Methodological Preliminaries 232
7.1.1 Supervised and unsupervised learning 232
7.1.2 Pseudowords 233
7.1.3 Upper and lower bounds on performance 233
7.2 Supervised Disambiguation 235
7.2.1 Bayesian classification 235
7.2.2 An information-theoretic approach 239
7.3 Dictionary-Based Disambiguation 241
7.3.1 Disambiguation based on sense definitions 242
7.3.2 Thesaurus-based disambiguation 244
7.3.3 Disambiguation based on translations in a second-language corpus 247
7.3.4 One sense per discourse, one sense per collocation 249
7.4 Unsupervised Disambiguation 252
7.5 What Is a Word Sense? 256
7.6 Further Reading 260
7.7 Exercises 262

8 Lexical Acquisition 265
8.1 Evaluation Measures 267
8.2 Verb Subcategorization 271
8.3 Attachment Ambiguity 278
8.3.1 Hindle and Rooth (1993) 280
8.3.2 General remarks on PP attachment 284
8.4 Selectional Preferences 288
8.5 Semantic Similarity 294
8.5.1 Vector space measures 296
8.5.2 Probabilistic measures 303
8.6 The Role of Lexical Acquisition in Statistical NLP 308
8.7 Further Reading 312

III Grammar 315

9 Markov Models 317
9.1 Markov Models 318
9.2 Hidden Markov Models 320
9.2.1 Why use HMMs? 322
9.2.2 General form of an HMM 324
9.3 The Three Fundamental Questions for HMMs 325
9.3.1 Finding the probability of an observation 326
9.3.2 Finding the best state sequence 331
9.3.3 The third problem: Parameter estimation 333
9.4 HMMs: Implementation, Properties, and Variants 336
9.4.1 Implementation 336
9.4.2 Variants 337
9.4.3 Multiple input observations 338
9.4.4 Initialization of parameter values 339
9.5 Further Reading 339

10 Part-of-Speech Tagging 341
10.1 The Information Sources in Tagging 343
10.2 Markov Model Taggers 345
10.2.1 The probabilistic model 345
10.2.2 The Viterbi algorithm 349
10.2.3 Variations 351
10.3 Hidden Markov Model Taggers 356