Scraping data with R

The most important functions in rvest are:

Create an html document from a url, a file on disk or a string containing html with read_html().
Select parts of a document using css selectors: html_nodes(doc, "table td") (or if you're a glutton for punishment, use xpath selectors with html_nodes(doc, xpath = "//table//td")). If you haven't heard of selectorgadget, make sure to read vignette("selectorgadget") to learn about it.
Note that the function used here is html_nodes(). There is also an html_node() function, but html_node() selects only a single node (the first match).
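The difference between html_nodes() and html_node() is easiest to see on a small document. The sketch below assumes rvest is installed; the inline HTML string and the variable names (doc, cells, first) are made up for illustration.

```r
library(rvest)

# A tiny HTML document, invented for this example.
doc <- read_html("<html><body>
  <table>
    <tr><td>a</td><td>b</td></tr>
    <tr><td>c</td><td>d</td></tr>
  </table>
</body></html>")

# html_nodes() returns every node matching the css selector.
cells <- html_nodes(doc, "table td")

# html_node() returns only the first matching node.
first <- html_node(doc, "table td")

# The same selection written as an xpath expression.
cells_xpath <- html_nodes(doc, xpath = "//table//td")

html_text(cells)
html_text(first)
```

Here cells holds all four td nodes, while first holds only the one containing "a".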
Extract components with html_tag() (the name of the tag), html_text() (all text inside the tag), html_attr() (contents of a single attribute) and html_attrs() (all attributes).
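A minimal sketch of the extraction functions on a single link node. The HTML string is invented for illustration. One assumption to flag: as far as I know, later rvest releases expose the tag name through html_name() rather than html_tag(), so the sketch uses html_name(); substitute html_tag() if your installed version still provides it.

```r
library(rvest)

# An invented one-link document for illustration.
doc <- read_html('<p><a href="https://example.com" title="Example">Example link</a></p>')
link <- html_node(doc, "a")

tag   <- html_name(link)          # name of the tag (html_tag() in older versions)
txt   <- html_text(link)          # all text inside the tag
href  <- html_attr(link, "href")  # contents of a single attribute
attrs <- html_attrs(link)         # all attributes, as a named character vector

tag
txt
href
attrs
```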
(You can also use rvest with XML files: parse with xml(), then extract components using xml_node(), xml_attr(), xml_attrs(), xml_text() and xml_tag().)
Parse tables into data frames with html_table().
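As a rough illustration of html_table(), the following parses a small made-up table whose first row uses th cells as a header. The table contents and the variable name tbl are assumptions for the example only.

```r
library(rvest)

# An invented table with a header row.
doc <- read_html("<table>
  <tr><th>name</th><th>score</th></tr>
  <tr><td>Ann</td><td>90</td></tr>
  <tr><td>Bob</td><td>85</td></tr>
</table>")

# Passing a single table node returns one data frame;
# passing a set of nodes would return a list of data frames.
tbl <- html_table(html_node(doc, "table"))

tbl
```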
Extract, modify and submit forms with html_form(), set_values() and submit_form().
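The form functions can be tried offline up to the point of submission. The sketch below parses an invented search form and fills in one field with set_values(). The form markup and the field names q and go are hypothetical. Newer rvest releases may emit a deprecation warning for set_values() and suggest a replacement. Submitting with submit_form() needs a live session, so it is left out here.

```r
library(rvest)

# An invented search form for illustration.
doc <- read_html('<form action="/search" method="get">
  <input type="text" name="q" value="">
  <input type="submit" name="go" value="Go">
</form>')

# html_form() on a document returns a list with one entry per form.
form <- html_form(doc)[[1]]

# Fill in the (hypothetical) field "q"; the form object is copied, not modified in place.
filled <- set_values(form, q = "rvest")

filled$fields$q$value
```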
Detect and repair encoding problems with guess_encoding() and repair_encoding().
Navigate around a website as if you're in a browser with html_session(), jump_to(), follow_link(), back(), forward(), submit_form() and so on. (This is still a work in progress, so I'd love your feedback.)