Sort results by priority and group value, then filter results

This is the current data in my DynamoDB table:

[Image: screenshot of the table data]

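My goal is to create a query that filters the results by group set (like "default"), then sorts them by priority, and then filters them down to the items where loggedIn == true and status == idle.

In SQL it would be something like this: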

    SELECT * FROM userstatustable
    WHERE group = 'default' AND loggedIn = true AND status = 'idle'
    ORDER BY priority DESC
    LIMIT 1

How would I create a query to do this?

Below is the description of the DynamoDB table in my serverless.yml file.

    userStatusTable: # This table is used to track a users current status.
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${self:custom.userStatusTable}
        AttributeDefinitions:
          # UserID in this case will be created once and constantly updated as it changes with status regarding the user.
          - AttributeName: userId
            AttributeType: S
        KeySchema:
          - AttributeName: userId
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: ${self:custom.dynamoDbCapacityUnits.${self:custom.pstage}}
          WriteCapacityUnits: ${self:custom.dynamoDbCapacityUnits.${self:custom.pstage}}

Things I have tried:

Below is my current code:

    const userStatusParams = {
      TableName: process.env.USERSTATUS_TABLE,
      FilterExpression: "loggedIn = :loggedIn and #s = :status and contains(#g, :group)",
      // Limit: 1,
      ExpressionAttributeValues: {
        ":loggedIn": true,
        ":status": "idle",
        ":group": "DEFAULT"
      },
      ExpressionAttributeNames: { "#s": "status", "#g": "group" }
    };

    var usersResult;
    try {
      usersResult = await dynamoDbLib.call("scan", userStatusParams);
      console.log(usersResult);
    } catch (e) {
      console.log("Error occurred querying for users belong to group.");
      console.log(e);
    }

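This uses a scan and is able to return all the results that match the criteria, but it does not sort the results by priority from highest to lowest.

Note: status and group are apparently reserved keywords, so I had to use ExpressionAttributeNames to work around that. Also note that this table will eventually have thousands of users in it.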

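Indexes are not about sorting. Sorting is just a means of retrieving rows efficiently, because a search in a sorted array can be done in logarithmic time, O(log n), instead of linear time, O(n). That rows come back in sorted order is only a consequence. So let's focus on the ability to locate the exact rows, that is, on the disk I/O needed to read them.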

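The filtering requirements of this type of query (by group, status, and more columns) are very hard for DynamoDB to handle efficiently. By efficiency I mean how many rows DynamoDB needs to retrieve from disk to determine which rows to return to the client. If it returns only 10% of the rows it read from the table, it is not efficient. That is why a regular Scan with filters is not as good as an indexed query. Filters are deceptive, because they still read the items from the database and count against provisioned capacity. An indexed query retrieves from storage a number of rows close to the number actually returned. DynamoDB can do this, but only within a single partition (items with the same partition/hash key) and over a range of the sort key (such as >=, <=, or begins_with).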
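One way to see this cost is to compare the Count and ScannedCount fields that DynamoDB returns with every Scan or Query response: ScannedCount is the number of items evaluated before the filter, and Count is the number left after it. A minimal sketch, where filterEfficiency is a hypothetical helper name and the sample response is made up for illustration:

```javascript
// Hypothetical helper: ratio of items returned to items read.
// `response` is the object returned by a DynamoDB Scan or Query call,
// which carries the standard Count and ScannedCount fields.
function filterEfficiency(response) {
  if (!response.ScannedCount) {
    return 0; // nothing was read, so there is no ratio to report
  }
  return response.Count / response.ScannedCount;
}

// Made-up numbers for illustration: 1000 items read, 100 passed the filter.
const sampleResponse = { Count: 100, ScannedCount: 1000 };
console.log(filterEfficiency(sampleResponse)); // 0.1
```

A ratio close to 1 means the request read little more than it returned; a ratio like the one above means most of the consumed read capacity went to items that were thrown away.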

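Why are rows not sorted by the sort key when performing a scan? Because DynamoDB uses the sort key inside Item Collections, and each collection is determined by its hash key. When the result set contains, for example, 2 unique hash keys, it will contain 2 separate sections, each sorted by the sort key. In other words, the rows will not be sorted in a single direction; the ordering restarts in the middle of the result set. Getting a single sorted collection requires sorting in memory.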

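Why not create an index on a column that has a single value for all rows? If we ran a scan on it, the rows would be sorted by priority (the sort key). However, having every item contain the same value for a field is a data anomaly.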

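So when should I create an index?

I want to query by a field that may be empty for some or many rows, so the index will contain fewer items than the whole table;
My query will apply a range operator that is able to select a small portion of the data;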

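Given that the group attribute should be the most selective one, hashing on that attribute in a global index would be faster. But this changes the model: each group has to be stored in a separate item instead of in a string set. That is not very convenient in the NoSQL world, since it calls for a more relational model.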

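So, considering that a scan can be used without a separate index, one approach is to perform the scan and sort in memory, using the Array#sort() method in Node.js. The performance characteristics are close to those of the secondary index approach, and in this case having an index would just be a waste of resources: if a query/scan on the index returns the same information as a scan on the table, use the table. Remember: indexes are about selectivity when retrieving rows.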
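The scan-and-sort approach could be sketched roughly as follows. A single Scan call returns at most 1 MB of data and sets LastEvaluatedKey when more remains, so the sketch reads every page before sorting. The names scanAllSortedByPriority and scanPage are hypothetical; scanPage stands for any function that takes Scan parameters and resolves to a DynamoDB-style response with plain JavaScript items.

```javascript
// Sketch: scan every page, then sort in memory by priority (highest first).
// `scanPage` is a hypothetical function that takes Scan parameters and resolves
// to a DynamoDB-style response: { Items: [...], LastEvaluatedKey?: {...} }.
async function scanAllSortedByPriority(scanPage, params) {
  const items = [];
  let startKey;
  do {
    // Continue from where the previous page stopped, if there was one.
    const pageParams = startKey ? { ...params, ExclusiveStartKey: startKey } : params;
    const page = await scanPage(pageParams);
    items.push(...(page.Items || []));
    startKey = page.LastEvaluatedKey; // undefined once the last page is read
  } while (startKey);

  // Array#sort with a comparator; b - a gives descending order.
  return items.sort((a, b) => b.priority - a.priority);
}
```

With the wrapper from the question, scanPage might be `(p) => dynamoDbLib.call("scan", p)`, assuming that wrapper resolves to the unmodified response. The first element of the returned array is then the highest-priority match.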

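How do I know whether this is a good approach for my use case? There is no hard rule, but I would say that if you want to retrieve more than 50% of the table rows, this will be fine. In terms of cost, keeping a separate index would not pay off. It may even be worth considering another design, since this one is not very selective. If you want 20% or less of the data, on the other hand, a different approach is worth investigating.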

Linking to my other answer, which uses a main-table design.

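This approach requires changing how group is modeled in UserStatus, from a single record with a string set to multiple records. This is because DynamoDB does not support keys on sets (not yet, anyway; it would make a good feature request).

The main table is used for updates/inserts/deletes and looks like this: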

    +--------+---------+-------+----------+----------+--------+
    | userId | group   | type  | priority | loggedIn | status |
    +--------+---------+-------+----------+----------+--------+
    | 123    | default | admin | 1        | true     | idle   |
    +--------+---------+-------+----------+----------+--------+
    | 123    | orange  | admin | 1        | true     | idle   |
    +--------+---------+-------+----------+----------+--------+
    | 124    | default | admin | 3        | false    | idle   |
    +--------+---------+-------+----------+----------+--------+
    | 125    | orange  | admin | 2        | false    | idle   |
    +--------+---------+-------+----------+----------+--------+

Partition/hash key: userId
Sort key: group
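Going from one record with a set of groups to one record per group could be sketched like this. The helper name expandUserByGroup and the input attribute groups are assumptions for illustration; the input is taken to be a plain object with the groups in an ordinary array.

```javascript
// Sketch: turn one user record holding several groups into one item per group,
// so that (userId, group) can serve as the primary key of the main table.
// `expandUserByGroup` and the `groups` attribute are hypothetical names.
function expandUserByGroup(user) {
  const { groups, ...rest } = user;
  // Copy every other attribute onto each item and attach a single group.
  return groups.map((group) => ({ ...rest, group }));
}

const items = expandUserByGroup({
  userId: "123",
  groups: ["default", "orange"],
  type: "admin",
  priority: 1,
  loggedIn: true,
  status: "idle"
});
console.log(items.length); // 2
```

Each resulting item corresponds to one row of the table above, so user 123 becomes the two rows with groups default and orange.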

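Set up a GSI on (group, priority). This will be used for the queries. Yes, the key combination chosen for this index will have duplicates; DynamoDB is not bothered by that and works fine.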

    +---------+----------+--------+-------+----------+--------+
    | group   | priority | userId | type  | loggedIn | status |
    +---------+----------+--------+-------+----------+--------+
    | default | 1        | 123    | admin | true     | idle   |
    +---------+----------+--------+-------+----------+--------+
    | default | 3        | 124    | admin | false    | idle   |
    +---------+----------+--------+-------+----------+--------+
    | orange  | 1        | 123    | admin | true     | idle   |
    +---------+----------+--------+-------+----------+--------+
    | orange  | 2        | 125    | admin | false    | idle   |
    +---------+----------+--------+-------+----------+--------+

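Tasks:

Updating a user in this table requires updating/inserting as many rows as there are groups the user belongs to;
Deleting a user means deleting all of that user's items.
Query by group = :group and priority >= :priority, filtering on status = 'idle' and loggedIn = true
- A variant is to make status or loggedIn the sort key, since you filter on them; this helps make the query more selective, and you then sort by priority on the client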
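The query task above might look roughly like the sketch below. It sets ScanIndexForward to false so that the sort key (priority) is read in descending order, and it keeps paging until a page contains a match, because DynamoDB applies the filter only after the items have been read. The names findTopIdleUser and queryPage, the table and index names passed in, and the lower bound of 0 for :priority are all assumptions for illustration.

```javascript
// Sketch: find the highest-priority idle, logged-in user of a group.
// `queryPage` is a hypothetical function that takes Query parameters and resolves
// to a DynamoDB-style response; `tableName` and `indexName` are placeholders.
async function findTopIdleUser(queryPage, tableName, indexName, group) {
  const params = {
    TableName: tableName,
    IndexName: indexName,
    KeyConditionExpression: "#g = :group and priority >= :priority",
    FilterExpression: "loggedIn = :loggedIn and #s = :status",
    ExpressionAttributeNames: { "#g": "group", "#s": "status" },
    ExpressionAttributeValues: {
      ":group": group,
      ":priority": 0, // assumed lower bound, so every priority qualifies
      ":loggedIn": true,
      ":status": "idle"
    },
    ScanIndexForward: false // read the sort key (priority) in descending order
  };

  let startKey;
  do {
    const page = await queryPage(startKey ? { ...params, ExclusiveStartKey: startKey } : params);
    if (page.Items && page.Items.length > 0) {
      return page.Items[0]; // first match on the first non-empty page
    }
    startKey = page.LastEvaluatedKey; // undefined once the partition is exhausted
  } while (startKey);
  return null; // no user matched
}
```

No Limit is set here, so each page reads up to 1 MB of the partition. Adding a small Limit would cut the data read per request, at the price of more round trips when the top-priority users do not pass the filter.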

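Should I follow this approach? I think it is a good design when there are many groups, a single group contains at most 20% of the total users, and each user belongs to only a couple of groups.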

So I found an interesting solution to this problem.

Here is my new code.

    const userStatusParams = {
      TableName: process.env.USERSTATUS_TABLE,
      IndexName: "typePriorityIndex",
      FilterExpression: "loggedIn = :loggedIn and #s = :status and contains(#g, :group)",
      KeyConditionExpression: "#t = :type and priority >= :priority",
      // Limit caps the number of items read, before the filter is applied.
      // No ScanIndexForward is set, so items are read in ascending priority order.
      Limit: 1,
      ExpressionAttributeValues: {
        ":loggedIn": true,
        ":status": "idle",
        ":group": "DEFAULT",
        ":priority": 0,
        ":type": "admin"
      },
      ExpressionAttributeNames: { "#s": "status", "#g": "group", "#t": "type" }
    };

    var usersResult;
    try {
      usersResult = await dynamoDbLib.call("query", userStatusParams);
      console.log(usersResult);
    } catch (e) {
      console.log("Error occurred querying for users belonging to group.");
      console.log(e);
    }

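Notice the use of IndexName: "typePriorityIndex". The trick here is to find, or create, an attribute in your table whose value is the same for all records and make it the hash key. The sort key should then be whatever you want to sort by, which in my case is priority.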

To give an idea, the index looks like this:

[Image: screenshot of the index]

My serverless file defines it like this:

    userStatusTable: # This table is used to track a users current status.
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${self:custom.userStatusTable}
        AttributeDefinitions:
          # UserID in this case will be created once and constantly updated as it changes with status regarding the user.
          - AttributeName: userId
            AttributeType: S
          - AttributeName: priority
            AttributeType: N
          - AttributeName: type
            AttributeType: S
        KeySchema:
          - AttributeName: userId
            KeyType: HASH
        ProvisionedThroughput:
          ReadCapacityUnits: ${self:custom.dynamoDbCapacityUnits.${self:custom.pstage}}
          WriteCapacityUnits: ${self:custom.dynamoDbCapacityUnits.${self:custom.pstage}}
        GlobalSecondaryIndexes:
          - IndexName: typePriorityIndex
            KeySchema:
              - AttributeName: type
                KeyType: HASH
              - AttributeName: priority
                KeyType: RANGE
            Projection:
              ProjectionType: ALL
            ProvisionedThroughput:
              ReadCapacityUnits: ${self:custom.dynamoDbCapacityUnits.${self:custom.pstage}}
              WriteCapacityUnits: ${self:custom.dynamoDbCapacityUnits.${self:custom.pstage}}
