DynamoDB GetItem vs Query – When to Use What?

Trying to understand the difference between DynamoDB GetItem and Query operations? This is the article for you.

DynamoDB has two important APIs to retrieve items from your table: GetItem and Query. There is a third API called Scan that typically should not be used in production use cases (more on that here) so we’ll ignore it for now.

At first glance, the two APIs may look similar. However, there is a very specific reason to use one API over the other. Let's compare them by first understanding how they work.

Note that I’ll refer to the following example table a couple of times in this article.

CustomerOrdersTable

CustomerId (PK) | OrderId (SK) | OrderDate (GSI)
----------------|--------------|-----------------------
C-1             | 123          | 2021-01-01
C-1             | 456          | 2021-01-02
C-2             | 789          | 2021-01-01

DynamoDB GetItem API

According to AWS Documentation, the GetItem API is defined as an operation that:

“Retrieves a single item from a table. This is the most efficient way to read a single item because it provides direct access to the physical location of the item.”

https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/SQLtoNoSQL.ReadData.html

Note that this API requires the item's full primary key: the partition key and, if your table defines one, the sort key. If your table doesn’t use a sort key, you only provide the partition key.

For example, if you have a table called CustomerOrders and the partition key is CustomerId, you would provide the id of the customer in your API call, like this.

    {
        "TableName": "CustomerOrders",
        "Key": {
            "CustomerId": {
                "S": "C-1"
            }
        }
    }

If your table has both a partition key and a sort key, say, an OrderId field, your request would change to look like this.

    {
        "TableName": "CustomerOrders",
        "Key": {
            "CustomerId": {
                "S": "C-1"
            },
            "OrderId": {
                "S": "123"
            }
        }
    }
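
The same request can be assembled in Python. The following is a minimal sketch, not a definitive implementation: the helper name build_get_item_request is hypothetical, and it only builds the parameter dictionary without calling AWS.

```python
from typing import Optional


def build_get_item_request(table_name: str, customer_id: str,
                           order_id: Optional[str] = None) -> dict:
    """Build GetItem parameters in DynamoDB's low-level attribute-value format."""
    key = {"CustomerId": {"S": customer_id}}
    if order_id is not None:
        # Only include the sort key when the table's key schema defines one.
        key["OrderId"] = {"S": order_id}
    return {"TableName": table_name, "Key": key}


request = build_get_item_request("CustomerOrders", "C-1", "123")
print(request)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").get_item(**request)
```

Keeping the request construction separate from the network call makes it easy to check the key shape before anything is sent to DynamoDB.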

Note that if your table's key schema includes a sort key, leaving it out of the GetItem request will result in a validation exception.

It's also important to note that you cannot perform a GetItem request on a global secondary index, or GSI. More on that below.

GetItem on a GSI?

A GSI is a secondary index on your DynamoDB table. It allows you to look up records by attributes other than your table’s primary key. More on GSIs here.

A quirk of the GetItem API is that you cannot use it with a GSI. This is because DynamoDB is a schemaless database: it only enforces uniqueness on the primary key of the main table (the partition key and, if defined, the sort key). Key values in a GSI do not have to be unique.

In other words, users can add items whose GSI key attributes hold the same values. Multiple items can therefore share the same partition key + sort key in the index, so GetItem could not guarantee returning just a single item.

The opposite also holds true: DynamoDB does not require the GSI key attribute to be present on every item. For example, if we add a GSI on a date field and neglect to populate that field when inserting a CustomerOrder record, this is no problem for DynamoDB; the item simply does not appear in the index. This is known as a sparse index (more on that here).

Instead, users need to use the Query or Scan API to access items through a GSI.
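
To make the two properties above concrete, here is a small in-memory model of an index written in plain Python. It is only an illustration of the idea, not how DynamoDB stores data; the build_index helper and the C-3 item without an OrderDate are hypothetical additions to the example table.

```python
from collections import defaultdict

items = [
    {"CustomerId": "C-1", "OrderId": "123", "OrderDate": "2021-01-01"},
    {"CustomerId": "C-1", "OrderId": "456", "OrderDate": "2021-01-02"},
    {"CustomerId": "C-2", "OrderId": "789", "OrderDate": "2021-01-01"},
    {"CustomerId": "C-3", "OrderId": "999"},  # hypothetical item with no OrderDate
]


def build_index(table_items, index_key):
    """Group items by index_key, skipping items that lack the attribute."""
    index = defaultdict(list)
    for item in table_items:
        if index_key in item:  # sparse: items without the key are left out
            index[item[index_key]].append(item)
    return dict(index)


order_date_index = build_index(items, "OrderDate")
# One index key can map to several items, so a single-item lookup is ambiguous.
print(len(order_date_index["2021-01-01"]))
```

Because a lookup by OrderDate can match any number of items (including none), an API that returns a list, such as Query, is the natural fit for an index.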

If you’d like to learn more about GSIs, check out the AWS Documentation here on “Using Global Secondary Indexes on DynamoDB”.

Getting Multiple Items at Once with BatchGetItem

If you’d like to retrieve multiple items at once, and you know the partition key (or partition key + sort key) of each one, you can use the BatchGetItem API.

Like GetItem, BatchGetItem looks items up by their full primary key, so you need to already know the keys of the items you are looking for.

BatchGetItem differs from Query because it can fetch items that have different partition keys. For example, a single BatchGetItem call could retrieve one record with a CustomerId of “C-1” and a second with a CustomerId of “C-2”.

As an example, say we had this table setup:

CustomerId (PK) | OrderId (SK) | OrderDate (GSI)
----------------|--------------|-----------------------
C-1             | 123          | 2021-01-01
C-1             | 456          | 2021-01-02
C-2             | 789          | 2021-01-01

Since the table has a sort key, every key that we pass to BatchGetItem must also include the OrderId (sort key). We can potentially retrieve all three items at once if we provide the corresponding PK + SK combinations, which are:

  1. C-1 + 123
  2. C-1 + 456
  3. C-2 + 789

Note that a single BatchGetItem operation can retrieve up to 16 MB of data and can contain as many as 100 items. If the response would exceed 16 MB (or the table's provisioned throughput is exhausted), DynamoDB returns a partial result along with a map in the response called UnprocessedKeys. You can use this value to retry the operation for the keys that were not retrieved. More details on this in the AWS documentation here.
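
The retry behaviour described above can be sketched as follows. This is a minimal sketch under stated assumptions: client is assumed to be any object exposing batch_get_item(RequestItems=...) with the same response shape as boto3's DynamoDB client, and the helper names and the max_rounds limit are hypothetical. Real code would normally also wait with exponential backoff between retries.

```python
def chunked(seq, size):
    """Yield consecutive slices of seq holding at most size elements."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def batch_get_all(client, table_name, keys, max_rounds=10):
    """Fetch all keys, 100 per request, retrying anything left unprocessed."""
    items = []
    for chunk in chunked(keys, 100):  # BatchGetItem accepts at most 100 keys
        pending = {table_name: {"Keys": chunk}}
        rounds = 0
        while pending:
            if rounds >= max_rounds:
                raise RuntimeError("keys still unprocessed after retries")
            response = client.batch_get_item(RequestItems=pending)
            items.extend(response.get("Responses", {}).get(table_name, []))
            # UnprocessedKeys has the same shape as RequestItems, so it can
            # be passed straight back in as the next request.
            pending = response.get("UnprocessedKeys") or {}
            rounds += 1
    return items


keys = [
    {"CustomerId": {"S": "C-1"}, "OrderId": {"S": "123"}},
    {"CustomerId": {"S": "C-1"}, "OrderId": {"S": "456"}},
    {"CustomerId": {"S": "C-2"}, "OrderId": {"S": "789"}},
]
# With boto3 installed and credentials configured, usage could look like:
#   batch_get_all(boto3.client("dynamodb"), "CustomerOrders", keys)
```

Note that BatchGetItem does not guarantee that items come back in the order the keys were requested, so callers should match results by key rather than by position.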

DynamoDB Query API

The DynamoDB Query API is useful in two cases:

  1. You have a partition key + sort key table structure, and would like to find all records with the same partition key.
  2. You are looking for an item on a GSI.

In either case, Query is the API to use.

In addition, you can add a condition on the sort key to narrow the results to specific records that share the same partition key. For example, if we have the same table:

CustomerId (PK) | OrderId (SK) | OrderDate (GSI)
----------------|--------------|-----------------------
C-1             | 123          | 2021-01-01
C-1             | 456          | 2021-01-02
C-2             | 789          | 2021-01-01

We can query with just the CustomerId, using a value of C-1. This query will return the 1st and 2nd records. We can also query our GSI for the value 2021-01-01, which will return the 1st and 3rd records (since they share that date). If we were to provide both CustomerId C-1 and OrderDate 2021-01-01 while querying our GSI, we would only retrieve the 1st record. Note that this last query only works if CustomerId is the GSI's sort key, or if it is applied as a filter expression.
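
The first two queries described above could be expressed as request parameters like this. Treat it as a hedged sketch: the helper names are hypothetical, the index name OrderDateIndex is an assumption (the article does not name the GSI), and the optional begins_with condition is just one example of a sort key condition.

```python
def query_by_customer(table_name, customer_id, order_id_prefix=None):
    """Query parameters for all orders that share one partition key."""
    params = {
        "TableName": table_name,
        "KeyConditionExpression": "CustomerId = :cid",
        "ExpressionAttributeValues": {":cid": {"S": customer_id}},
    }
    if order_id_prefix is not None:
        # Optional sort key condition to narrow results within the partition.
        params["KeyConditionExpression"] += " AND begins_with(OrderId, :prefix)"
        params["ExpressionAttributeValues"][":prefix"] = {"S": order_id_prefix}
    return params


def query_by_order_date(table_name, index_name, order_date):
    """Query parameters for a lookup on a GSI keyed by OrderDate."""
    return {
        "TableName": table_name,
        "IndexName": index_name,  # tells DynamoDB to query the GSI
        "KeyConditionExpression": "OrderDate = :d",
        "ExpressionAttributeValues": {":d": {"S": order_date}},
    }


customer_query = query_by_customer("CustomerOrders", "C-1")
date_query = query_by_order_date("CustomerOrders", "OrderDateIndex", "2021-01-01")
# With boto3 installed and credentials configured, either could be sent as:
#   boto3.client("dynamodb").query(**customer_query)
```

The only structural difference between the two requests is the IndexName parameter; without it, the key condition is evaluated against the main table's keys.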

To learn how to use the Query API using Python, check out my article How To Query DynamoDB with Boto3.

Summary

This article has shown you the difference between the GetItem and Query operations in DynamoDB. As a summary, you can use this guide when trying to decide which API to use.

  1. Looking for just a single item on the main table index? Use GetItem.
  2. Looking for just a single item on a GSI? Use Query.
  3. Looking for multiple items with different partition key and sort key combinations at once? Use BatchGetItem.
  4. Looking for multiple items that share the same partition key? Use Query.
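
The guide above can also be written as a small decision function. This is purely illustrative: the function name and its three boolean parameters are hypothetical, and it encodes nothing beyond the four rules listed.

```python
def choose_read_api(on_gsi: bool, single_item: bool, same_partition_key: bool) -> str:
    """Return the name of the DynamoDB read API suggested by the summary rules."""
    if on_gsi:
        # GetItem and BatchGetItem cannot target a GSI.
        return "Query"
    if single_item:
        return "GetItem"
    if same_partition_key:
        return "Query"
    return "BatchGetItem"


print(choose_read_api(on_gsi=False, single_item=True, same_partition_key=True))
```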