Available for download if you need it; PDF only, source code not included. The abstract is as follows: A substantial subset of the web data follows some kind of underlying structure. Nevertheless, HTML does not contain any schema or semantic information about the data it represents. A program able to provide software applications with a structured view of those semi-stru