Node.js: Breaking a large synchronous task into asynchronous tasks

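I'm processing a large amount of data and then writing it to a database. The workflow is:

  1. Read about 100 MB of data into a buffer
  2. Loop over the data, processing it (synchronous work) and writing it to disk (asynchronous work)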

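The problem I'm running into is that the loop runs through all 100 MB of data to completion, queuing up every disk write along the way. So it iterates over all of the data first, and only then runs the asynchronous jobs.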

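I'd like to break up the synchronous task of iterating over the array so that each iteration is queued as a later turn of the event loop.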

    var lotsOfWorkToBeDone = [
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job'
    ]

    while (true) {
      var job = lotsOfWorkToBeDone.pop()
      if (!job) {
        break
      }
      var syncResult = syncWork(job)
      asyncWork(syncResult)
    }

    function syncWork(job) {
      console.log('sync work:', job)
      return 'Sync Result of ' + job
    };

    function asyncWork(syncResult) {
      setTimeout(function() {
        console.log('async work: ', syncResult)
      }, 0)
    }

    // Desired Outcome
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job

    // Actual Outcome
    // sync work: tens of thousands of job
    // sync work: tens of thousands of job
    // sync work: tens of thousands of job
    // sync work: tens of thousands of job
    // sync work: tens of thousands of job
    // sync work: tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // async work: Sync Result of tens of thousands of job

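Note: this example is a simplified version of reality. I don't have an array to iterate over. I have one large buffer that I process until EOF (hence the while loop).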

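Using async.whilst seems to achieve the desired result.

I won't accept my own answer for now, because I'm interested in what people think of this solution. There may be a better one.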

    // Note: the synchronous test function below matches older releases of the
    // async library; in async v3 the test function receives a callback instead.
    var async = require('async')

    var lotsOfWorkToBeDone = [
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job',
      'tens of thousands of job'
    ]

    var job;

    async.whilst(function() {
      // Test: keep going while there is still a job to take
      job = lotsOfWorkToBeDone.pop()
      return job
    }, function(callback) {
      // Body: do the sync work, then wait for the async work before the next pass
      var syncResult = syncWork(job)
      asyncWork(syncResult, callback)
    }, function(err) {
      // Called once when the loop ends (err is undefined on success)
      console.log('error: ', err)
    })

    function syncWork(job) {
      console.log('sync work:', job)
      return 'Sync Result of ' + job
    };

    function asyncWork(syncResult, callback) {
      setTimeout(function() {
        console.log('async work: ', syncResult)
        callback()
      }, 0)
    }

    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
    // sync work: tens of thousands of job
    // async work: Sync Result of tens of thousands of job
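For comparison, the same one-job-at-a-time ordering can probably be had without the async library by awaiting each asynchronous step inside an ordinary loop. The sketch below is only an illustration under assumptions: `processAll` is a hypothetical name, `asyncWork` is rewritten here to return a Promise, and messages are collected in a `log` array rather than printed so the ordering is easy to inspect.

```javascript
// Stand-in for the real synchronous processing (same shape as in the question).
function syncWork(job) {
  return 'Sync Result of ' + job;
}

// Assumed Promise-returning variant of asyncWork, so the loop can await it.
function asyncWork(syncResult, log) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      log.push('async work: ' + syncResult);
      resolve();
    }, 0);
  });
}

// Hypothetical driver: the next syncWork call only starts after the
// previous asyncWork has finished, so the two kinds of work interleave.
async function processAll(jobs) {
  const log = [];
  let job;
  while ((job = jobs.pop()) !== undefined) {
    log.push('sync work: ' + job);
    const syncResult = syncWork(job);
    await asyncWork(syncResult, log);
  }
  return log;
}
```

Calling `processAll(lotsOfWorkToBeDone)` returns a Promise for the log. Because each iteration waits on a timer, other events get a chance to run between jobs, which is the behaviour the question asks for.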

Consider using streams. Buffering 100 MB sounds like a bad idea. The final code would look like this:

    inputStream.pipe(yourTransformStream).pipe(outputStream);

All of the logic would be implemented in the Transform stream.
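As a rough sketch of what that could look like, the snippet below puts the synchronous step inside a Transform and lets a slow Writable stand in for the disk write. Everything here is assumed for illustration: `createWorkTransform`, `createSlowSink`, and `run` are hypothetical names, the streams run in object mode over an in-memory list, and a real program would more likely use `fs.createReadStream` and `fs.createWriteStream` as the two ends.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Stand-in for the real synchronous processing.
function syncWork(chunk) {
  return 'Sync Result of ' + chunk;
}

// Transform whose per-chunk logic is the synchronous work.
function createWorkTransform() {
  return new Transform({
    objectMode: true,
    transform(chunk, encoding, callback) {
      callback(null, syncWork(chunk));
    }
  });
}

// Hypothetical sink simulating an asynchronous write. A chunk is only
// acknowledged once its "write" completes, so backpressure keeps the
// transform from racing far ahead of the writes.
function createSlowSink(written) {
  return new Writable({
    objectMode: true,
    highWaterMark: 1,
    write(chunk, encoding, callback) {
      setTimeout(function () {
        written.push(chunk);
        callback();
      }, 0);
    }
  });
}

// Wire the three stages together; done(err, written) fires when the
// pipeline finishes or fails.
function run(jobs, done) {
  const written = [];
  pipeline(
    Readable.from(jobs),
    createWorkTransform(),
    createSlowSink(written),
    function (err) {
      done(err, written);
    }
  );
}
```

`pipeline` is used here instead of chained `.pipe()` calls only because it forwards errors to a single callback; the data flow is the same as in the one-liner above.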