Asynchronous parallel HTTP requests
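I have a control-flow problem with an application that loads a large number of URLs. I am using Caolan's Async and the NPM request module.
My problem is that the HTTP responses start arriving as soon as a function is added to the queue. Ideally, I want to build my queue and only start making HTTP requests once the queue is started. Otherwise the callbacks begin firing before the queue starts, which causes the queue to finish prematurely.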
    var request = require('request') // https://www.npmjs.com/package/request
      , async = require('async'); // https://www.npmjs.com/package/async

    var myLoaderQueue = []; // passed to async.parallel
    var myUrls = ['http://...', 'http://...', 'http://...']; // 1000+ urls here

    for (var i = 0; i < myUrls.length; i++) {
      myLoaderQueue.push(function (callback) {
        // Async http request
        request(myUrls[i], function (error, response, html) {
          // Some processing is happening here before the callback is invoked
          callback(error, html);
        });
      });
    }

    // The loader queue has been made, now start to process the queue
    async.parallel(myLoaderQueue, function (err, results) {
      // Done
    });
Is there a better way to approach this?
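Using a for loop in combination with asynchronous calls is problematic (in ES5) and can produce unexpected results (in your case, the wrong URL being retrieved).
Instead, consider using async.map():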
    async.map(myUrls, function (url, callback) {
      request(url, function (error, response, html) {
        // Some processing is happening here before the callback is invoked
        callback(error, html);
      });
    }, function (err, results) {
      ...
    });
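Given that you have 1000+ URLs to retrieve, async.mapLimit() may also be worth considering, since it caps how many requests are in flight at once.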
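To illustrate what a concurrency limit does, here is a minimal, dependency-free sketch of the idea behind async.mapLimit(). It is not the async library's implementation: `mapWithLimit` and `fakeRequest` are hypothetical names, the URLs are placeholders, and `fakeRequest` stands in for a real HTTP call by using setTimeout.

```javascript
// Hypothetical stand-in for an HTTP request: calls back asynchronously.
function fakeRequest(url, callback) {
  setTimeout(function () {
    callback(null, 'body of ' + url);
  }, 10);
}

// Minimal sketch of a concurrency-limited map (assumed helper, not async's code).
// Runs at most `limit` iteratee calls at a time and keeps results in input order.
// Assumes limit >= 1.
function mapWithLimit(items, limit, iteratee, done) {
  var results = new Array(items.length);
  var next = 0;       // index of the next item to start
  var running = 0;    // number of calls currently in flight
  var finished = 0;   // number of completed calls
  var failed = false; // set once an error has been reported

  if (items.length === 0) { return done(null, results); }

  function startMore() {
    while (!failed && running < limit && next < items.length) {
      launch(next++);
    }
  }

  function launch(index) {
    running++;
    iteratee(items[index], function (err, value) {
      running--;
      if (failed) { return; }
      if (err) { failed = true; return done(err); }
      results[index] = value;
      finished++;
      if (finished === items.length) { return done(null, results); }
      startMore(); // a slot is free, so start the next item if any remain
    });
  }

  startMore();
}

// Usage: at most two simulated requests are in flight at any time.
var urls = ['http://a.example', 'http://b.example', 'http://c.example'];
mapWithLimit(urls, 2, fakeRequest, function (err, bodies) {
  if (err) { return console.error(err); }
  console.log(bodies.length + ' responses received');
});
```

With the async library itself, the equivalent call would be async.mapLimit(myUrls, 10, iteratee, callback), where 10 is an arbitrary limit chosen for illustration.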
If you are willing to start using Bluebird and Babel to take advantage of promises and ES7 async/await, you can do the following:
    let Promise = require('bluebird');
    let request = Promise.promisify(require('request'));

    let myUrls = ['http://...', 'http://...', 'http://...']; // 1000+ urls here

    async function load() {
      try {
        // map myUrls array into array of request promises
        // (pass only the url: map() also supplies the index and the array)
        // wait until all request promises in the array resolve
        let results = await Promise.all(myUrls.map(url => request(url)));

        // don't know if Babel await supports syntax below
        // let results = await* myUrls.map(url => request(url));

        // print array of results or use forEach
        // to process / collect them in any other way
        console.log(results);
      } catch (e) {
        console.log(e);
      }
    }

    load();
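The same pattern can be sketched without Bluebird or Babel, since current Node.js releases support promises and async/await natively. In the sketch below, `fakeRequest`, `requestAsync` and `loadAll` are hypothetical names and the URLs are placeholders; `fakeRequest` stands in for the real request module so that the example runs without network access.

```javascript
// Hypothetical stand-in for the callback-style request(url, cb) function.
function fakeRequest(url, callback) {
  setTimeout(function () {
    callback(null, { statusCode: 200 }, '<html>' + url + '</html>');
  }, 10);
}

// Wrap the callback API in a promise by hand (assumed helper name).
function requestAsync(url) {
  return new Promise(function (resolve, reject) {
    fakeRequest(url, function (error, response, html) {
      if (error) { return reject(error); }
      resolve(html);
    });
  });
}

// Start all requests, then wait for every promise to resolve.
// The results come back in the same order as the input URLs.
async function loadAll(urls) {
  // Pass only the URL: map() also supplies the index and the array.
  return Promise.all(urls.map(function (url) { return requestAsync(url); }));
}

loadAll(['http://a.example', 'http://b.example'])
  .then(function (pages) { console.log(pages.length + ' pages loaded'); })
  .catch(function (err) { console.error(err); });
```

Note that Promise.all starts every request at once, so for 1000+ URLs a concurrency limit is still worth adding on top of this.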
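I'm fairly confident that what you are seeing is the result of a different bug. By the time your queued functions are evaluated, i has already been changed by the loop (with var, every queued function shares the same i), which can make it look as though you are missing the first URLs. Try wrapping the function in a closure when you queue it: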
    var request = require('request') // https://www.npmjs.com/package/request
      , async = require('async'); // https://www.npmjs.com/package/async

    var myLoaderQueue = []; // passed to async.parallel
    var myUrls = ['http://...', 'http://...', 'http://...']; // 1000+ urls here

    for (var i = 0; i < myUrls.length; i++) {
      // The IIFE captures the current value of i as URLIndex
      (function (URLIndex) {
        myLoaderQueue.push(function (callback) {
          // Async http request
          request(myUrls[URLIndex], function (error, response, html) {
            // Some processing is happening here before the callback is invoked
            callback(error, html);
          });
        });
      })(i);
    }

    // The loader queue has been made, now start to process the queue
    async.parallel(myLoaderQueue, function (err, results) {
      // Done
    });
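To see why the closure matters, here is a small self-contained sketch with no HTTP involved. `buildShared` and `buildCaptured` are hypothetical names used only for this illustration; each builds an array of functions in a loop, and the functions are called only after the loop has finished.

```javascript
// Every function shares the single `var i`, which equals items.length
// by the time any of them runs, so items[i] is undefined.
function buildShared(items) {
  var queue = [];
  for (var i = 0; i < items.length; i++) {
    queue.push(function () { return items[i]; });
  }
  return queue;
}

// The IIFE gives each function its own copy of the index.
function buildCaptured(items) {
  var queue = [];
  for (var i = 0; i < items.length; i++) {
    (function (index) {
      queue.push(function () { return items[index]; });
    })(i);
  }
  return queue;
}

var urls = ['a', 'b', 'c'];
var shared = buildShared(urls).map(function (fn) { return fn(); });
var captured = buildCaptured(urls).map(function (fn) { return fn(); });

console.log(shared);   // [ undefined, undefined, undefined ]
console.log(captured); // [ 'a', 'b', 'c' ]
```

In ES2015 and later, declaring the loop variable with let instead of var gives each iteration its own binding and has the same effect as the IIFE.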