Getting TypeError: selector.includes is not a function when scraping with cheerio and jsonframe

I am trying to scrape a website with the following code:

    const cheerio = require('cheerio');
    const jsonframe = require('jsonframe-cheerio');

    const $ = cheerio.load('https://coinmarketcap.com/all/views/all/');
    jsonframe($); // initializes the plugin

    // exception handling
    process.on('uncaughtException', err =>
      console.error('uncaught exception: ', err))
    process.on('unhandledRejection', (reason, p) =>
      console.error('unhandled rejection: ', reason, p))

    const frame = {
      "crypto": {
        "selector": "tbody > tr",
        "data": [{
          "name": "td:nth-child(2) > a:nth-child(3)",
          "url": {
            "selector": "td:nth-child(2) > a:nth-child(3)",
            "attr": "href"
          },
          "marketcap": "tr > td:nth-child(4)",
          "price": "tr > td:nth-child(5) > a:nth-child(1)",
        }]
      }
    };

    let companiesList = $('tbody').scrape(frame);
    console.log(companiesList);

However, when I run the sample code above, I get an UnhandledPromiseRejectionWarning:

    (node:3890) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): TypeError: selector.includes is not a function

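Any suggestions as to what I am doing wrong?

I appreciate your replies!

UPDATE

I changed my code to the following. However, I can only scrape the first element.

Any suggestions as to why the other elements are not scraped?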

    const cheerio = require('cheerio')
    const jsonframe = require('jsonframe-cheerio')
    const got = require('got');

    async function scrapCoinmarketCap() {
      const url = 'https://coinmarketcap.com/all/views/all/'
      const html = await got(url)
      const $ = cheerio.load(html.body)

      jsonframe($) // initializing the plugin

      let frame = {
        "Coin": "td.no-wrap.currency-name > a",
        "url": "td.no-wrap.currency-name > a @ href",
        "Symbol": "td.text-left.col-symbol",
        "Price": "td:nth-child(5) > a",
      }

      console.log($('body').scrape(frame, { string: true }))
    }

    scrapCoinmarketCap()

Based on your updated code, you can get the data for every currency by iterating over each tr:

    $('body tr').each(function() {
      console.log($(this).scrape(frame, { string: true }))
    })

However, I think the cleanest approach (as I said in another answer) is to use jsonframe-cheerio's list/array frame pattern, which is designed to do exactly this:

    let frame = {
      currency: {
        _s: "tr", // the selector
        _d: [{ // allows you to get an array of data, not just the first item
          "Coin": "td.no-wrap.currency-name > a",
          "Url": "td.no-wrap.currency-name > a @ href",
          "Symbol": "td.text-left.col-symbol",
          "Price": "td:nth-child(5) > a"
        }]
      }
    }

    console.log($('body').scrape(frame, { string: true }))

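The cheerio.load() method does not accept a URL; it expects the HTML itself as a string.

Although I have not looked at cheerio's source code, it appears that the module tries to parse the URL as if it were an HTML document, which obviously fails, and various errors start to appear.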
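
One way to catch this mistake early is to check the input before handing it to cheerio. The sketch below is only an illustration: assertIsMarkup is a hypothetical helper, not part of cheerio or jsonframe-cheerio, and its URL check is a simple heuristic rather than a full validation.

```javascript
// Hypothetical guard (not a cheerio API): reject input that looks like a bare
// URL, because a parser such as cheerio.load() expects the HTML itself.
function assertIsMarkup(input) {
  if (typeof input !== 'string') {
    throw new TypeError('Expected an HTML string');
  }
  // Heuristic: the whole input is a single http(s) URL with no whitespace.
  if (/^https?:\/\/\S+$/i.test(input.trim())) {
    throw new TypeError(
      'Got a URL (' + input.trim() + '); fetch it first and pass the response body instead'
    );
  }
  return input;
}

// Intended use, assuming cheerio is installed:
//   const $ = cheerio.load(assertIsMarkup(html));
```

With this guard, passing 'https://coinmarketcap.com/all/views/all/' throws immediately with a clear message, instead of surfacing later as an unrelated-looking error.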

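To fix this, you first need to load the HTML content of that URL into a variable, and then pass that HTML to cheerio.

You can use the request or got module to do this.

Here is an example that uses got to load the page: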

    const got = require('got')
    const cheerio = require('cheerio')

    got('https://google.com')
      .then(res => {
        const $ = cheerio.load(res.body)
        // Continue as usual
      })
      .catch(console.error)
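
The same fetch-first, parse-second order can also be written with async/await. The sketch below is one possible shape, not the only one: loadPage is a hypothetical helper, and the fetching and parsing functions are passed in as arguments so that the ordering is explicit and either side can be swapped (for example, request instead of got).

```javascript
// Hypothetical helper (not part of cheerio or got): download the page first,
// then hand the resulting HTML string to the parser.
async function loadPage(url, fetchHtml, parseHtml) {
  const html = await fetchHtml(url); // step 1: URL -> HTML string
  if (typeof html !== 'string') {
    throw new TypeError('Expected an HTML string for ' + url);
  }
  return parseHtml(html);            // step 2: HTML string -> parsed document
}

// Intended wiring, assuming got and cheerio are loaded as in the answer:
//   const $ = await loadPage(
//     'https://coinmarketcap.com/all/views/all/',
//     async (u) => (await got(u)).body,
//     cheerio.load
//   );
//   jsonframe($);
```

Keeping the two steps as separate arguments also makes the helper easy to exercise with a canned HTML string instead of a live request.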
