Get Started with Amazon DocumentDB

Amazon DocumentDB is a managed way to store and query large volumes of JSON-like document data. You start by creating a DocumentDB cluster in the AWS Management Console or with the AWS CLI.

Amazon DocumentDB is a document-oriented database service that's designed to work with MongoDB-compatible workloads. It's a fully managed service, which means you don't have to worry about provisioning, patching, or backing up your database.

To get started, you'll need to create a DocumentDB cluster with at least one instance in it. This involves choosing an instance class and configuring your VPC settings. You can choose from a variety of instance classes to suit your workload. Storage isn't something you size up front: it grows automatically with your data, and there is no SSD-versus-HDD choice to make.

Once your cluster is running, you can start loading data into your database. DocumentDB stores JSON-like documents, which MongoDB drivers exchange in BSON form. The AWS console manages clusters but doesn't upload documents; you load data programmatically with a MongoDB driver or with MongoDB tools such as mongoimport and mongorestore.

DocumentDB Basics

DocumentDB is a fully managed, non-relational database service that supports MongoDB workloads, making it compatible with MongoDB drivers, applications, and tools.

You can use the same MongoDB drivers, applications, and tools to run, manage, and scale workloads on DocumentDB without worrying about managing the underlying infrastructure.

LocalStack, which emulates AWS services locally, lets you use the DocumentDB APIs to create and manage DocumentDB clusters and instances on your own machine.

A NoSQL document database, like DocumentDB, is a non-relational database, meaning it doesn't use the traditional Structured Query Language (SQL) used in relational databases.

Here are some examples of NoSQL databases:

  • Key-value stores, like Azure Table Storage
  • Column-based stores, like Cassandra
  • Graph databases, like Neo4j
  • Document databases, like MongoDB and Azure DocumentDB

Introduction

Because DocumentDB supports MongoDB workloads, the MongoDB expertise you already have carries over: the same drivers, applications, and tools work without you managing the underlying infrastructure.

For local development, LocalStack can stand in for AWS, letting you manage DocumentDB clusters and instances through the DocumentDB APIs. The operations it supports are listed on LocalStack's API coverage page, which shows what you can do with DocumentDB and LocalStack together.

Getting Started

To get started with Amazon DocumentDB, you'll need to familiarize yourself with some basic commands. In the AWS CLI, the Amazon DocumentDB commands are grouped under `docdb`.

You can create an Amazon DocumentDB cluster using the `create-db-cluster` command. This command requires a unique identifier for the cluster, which is specified with the `--db-cluster-identifier` option. For example, you can set it to `test-docdb-cluster`, but you can customize it to a name of your choice.

The `--engine` option specifies the database engine, which in this case is set to `docdb` to indicate the use of Amazon DocumentDB.

Here are the basic commands you'll need to get started with Amazon DocumentDB:

  • `docdb` command
  • `create-db-cluster` command
  • `--db-cluster-identifier` option (e.g. `test-docdb-cluster`)
  • `--engine` option (e.g. `docdb`)
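
As a minimal sketch, the options above can be assembled into one command line. The Python helper below only builds and prints the command; it does not call AWS. The function name is hypothetical, and `test-docdb-cluster` is the example identifier from the text.

```python
import shlex


def build_create_cluster_command(cluster_id: str, engine: str = "docdb") -> list[str]:
    """Assemble the argument list for `aws docdb create-db-cluster`."""
    return [
        "aws", "docdb", "create-db-cluster",
        "--db-cluster-identifier", cluster_id,  # unique name for the cluster
        "--engine", engine,                     # "docdb" selects Amazon DocumentDB
    ]


command = build_create_cluster_command("test-docdb-cluster")
# Print a copy-pasteable, shell-quoted version of the command.
print(shlex.join(command))
```

Running the printed command requires the AWS CLI and valid credentials. With LocalStack, the same arguments are typically passed to its `awslocal` wrapper instead of `aws`.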

In the command output, notice that the DBClusterMembers field is empty: the cluster has no instances yet. Separately, if you run this command against LocalStack without specifying a MasterUsername or MasterUserPassword, the MongoDB Docker container that backs the cluster starts without any credentials set.

Sub Documents

The query examples in this section and in the keyword sections that follow use DocumentDB SQL, the SQL-like query language of Azure DocumentDB (now Azure Cosmos DB). Amazon DocumentDB is queried through the MongoDB API instead, so these keywords don't apply to it directly.

A sub document is a smaller subset of the source document. By specifying a sub-root as the source of a query, you retrieve only that subtree from each document.

For example, using the parents sub document as the source returns just the parents of each family, while querying a family by its ID returns the complete JSON document for WakefieldFamily.

A query can also return just part of a document, such as the address.

You can use parameterized queries to return specific parts of a document, such as the family name and address of each family. The examples assume a collection of three family documents.
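
The effect of choosing a sub-root can be sketched in plain Python. This is an illustration rather than the query engine itself: it mimics what a query such as `SELECT * FROM Families.parents` would return. The sample documents and the `select_subroot` helper are hypothetical.

```python
# Hypothetical sample documents; only WakefieldFamily is named in the text.
families = [
    {"id": "WakefieldFamily", "parents": [{"givenName": "Robin"}, {"givenName": "Ben"}]},
    {"id": "SmithFamily", "parents": [{"givenName": "James"}]},
    {"id": "NoParentsFamily"},
]

_MISSING = object()  # sentinel for "this path does not exist in the document"


def select_subroot(documents, path):
    """Return the subtree at `path` from every document that has it."""
    results = []
    for doc in documents:
        node = doc
        for key in path.split("."):
            node = node.get(key, _MISSING) if isinstance(node, dict) else _MISSING
            if node is _MISSING:
                break
        if node is not _MISSING:
            results.append(node)  # documents without the sub-root are skipped
    return results


parents = select_subroot(families, "parents")
print(parents)
```

Only the `parents` subtree of each document comes back, and the document without a `parents` property contributes nothing to the result.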

Data Types and Indexing

Data types in DocumentDB are based on the JSON model, extended with additional types. Like MongoDB, DocumentDB works with BSON, a binary-encoded format that is lightweight, fast, and highly traversable.

DocumentDB supports embedding objects and arrays within other objects and arrays, just like MongoDB's BSON implementation. However, not all BSON data types are supported by DocumentDB.

DocumentDB has limited support for data types compared to MongoDB, and it's essential to understand the differences to avoid any compatibility issues.

DocumentDB has a unique index on the _id field by default, which prevents clients from inserting two documents with the same value for the _id field. This index cannot be dropped.

DocumentDB provides several index types: single field, compound, multikey, and geospatial indexes, all of which support efficient queries and data retrieval. It also supports unique, sparse, and TTL indexes, which can be used to enforce data consistency and manage data retention.

Data Types

MongoDB uses a binary-encoded format called BSON to represent JSON documents, which is lightweight, fast, and highly traversable.

BSON extends the JSON model to provide additional data types and ordered fields, making it efficient for encoding and decoding in different languages.

DocumentDB, on the other hand, doesn't support all of the BSON data types, so check the supported types before you migrate a MongoDB workload.

Here are the JSON primitive types supported in the LINQ provider of Azure DocumentDB's .NET SDK:

  • Numeric
  • Boolean
  • String
  • Null

Indexes

Indexes are a powerful tool in DocumentDB that help improve the speed and efficiency of your database. They're essentially a shortcut that allows your database to quickly locate specific data.

DocumentDB supports several types of indexes, including the default _id index, which creates a unique index on the _id field during collection creation. This index prevents clients from inserting two documents with the same value for the _id field.

Single field indexes allow for user-defined ascending or descending indexes on a single field of a document. Compound indexes, on the other hand, support indexes on multiple fields.

DocumentDB also supports multikey indexes, which index the content stored in arrays. This allows queries to select documents by matching on one or more elements of an array.

Here's a summary of the different types of indexes supported by DocumentDB:

  • Default _id index: creates a unique index on the _id field during collection creation.
  • Single field index: supports user-defined ascending/descending indexes on a single field of a document.
  • Compound index: supports user-defined indexes on multiple fields.
  • Multikey index: indexes the content stored in arrays.
  • Geospatial index: supports efficient queries of geospatial coordinate data.

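Beyond these types, index properties change how an index behaves. Unique indexes reject duplicate values for the indexed field. Sparse indexes only contain entries for documents that have the indexed field, skipping documents that do not. TTL indexes let the database expire documents automatically after a set period.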
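
To make the index types concrete, here is a sketch that builds index definitions in the shape used by the MongoDB `createIndexes` command. It only constructs and prints the command document; nothing is sent to a database. The collection name `families`, all field names, and the helper functions are hypothetical, and you should confirm that each index type and option is supported by your DocumentDB engine version.

```python
import json


def index_name(keys):
    """Build a name following MongoDB's usual field_direction convention."""
    return "_".join(f"{field}_{direction}" for field, direction in keys)


def index_spec(keys, **options):
    """Return one index definition: key pattern, name, and optional properties."""
    spec = {"key": dict(keys), "name": index_name(keys)}
    spec.update(options)
    return spec


indexes = [
    index_spec([("familyName", 1)]),                          # single field, ascending
    index_spec([("familyName", 1), ("address.state", -1)]),   # compound
    index_spec([("children.grade", 1)]),                      # multikey when children is an array
    index_spec([("location", "2dsphere")]),                   # geospatial
    index_spec([("email", 1)], unique=True),                  # rejects duplicate values
    index_spec([("nickname", 1)], sparse=True),               # skips documents without the field
    index_spec([("createdAt", 1)], expireAfterSeconds=3600),  # TTL: expire after one hour
]

create_indexes_command = {"createIndexes": "families", "indexes": indexes}
print(json.dumps(create_indexes_command, indent=2))
```

With a MongoDB driver, a document like this could be passed to the driver's generic command method, or each index could be created through the driver's own index helper. The `_id` index is not listed because it is created automatically.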

Array Creation

Array creation is a feature of DocumentDB SQL that lets a query return arrays built from document fields.

You can use it to shape nested arrays and embedded objects. For example, by enclosing fields like city, county, and state in square brackets, you can project them as an array named Address.

To create an array, enclose a comma-separated list of the fields you want in a single pair of square brackets, like this: [city, county, state].

The common patterns are:

  • To create an array of a single field, enclose the field in square brackets, like this: [city].
  • To create an array of multiple fields, enclose the comma-separated fields in one pair of square brackets, like this: [city, county, state].
  • To create an array of nested fields, use the same syntax, like this: [address.city, address.county, address.state].

By using the array creation feature, you can shape your data into the format you need for your application.
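
The sketch below pairs an array-creation query with a plain-Python equivalent of its projection, so you can see the shape of the result. The query text, the `f` alias, and the sample documents are illustrative assumptions; the Python function mimics the projection and does not run the query.

```python
# Illustrative DocumentDB SQL query text (not executed here).
QUERY = (
    "SELECT [f.address.city, f.address.county, f.address.state] AS Address "
    "FROM Families f"
)

# Hypothetical sample documents.
families = [
    {"id": "AndersenFamily", "address": {"city": "Seattle", "county": "King", "state": "WA"}},
    {"id": "WakefieldFamily", "address": {"city": "NY", "county": "Manhattan", "state": "NY"}},
]


def project_address(documents):
    """Mimic the projection: one array named Address per document."""
    return [
        {"Address": [doc["address"]["city"], doc["address"]["county"], doc["address"]["state"]]}
        for doc in documents
    ]


result = project_address(families)
print(result)
```

Each result object holds a single Address array, with the values in the order the fields were listed inside the square brackets.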

Between Keyword

The BETWEEN keyword is used to express queries against ranges of values, just like in SQL. It can be used against strings or numbers.

Unlike ANSI SQL, DocumentDB lets you use BETWEEN against properties of mixed types. If comparing a document's value with the range bounds is "undefined", for example a string compared against numeric bounds, that document is skipped.

You can use BETWEEN to return all documents where a specific property falls within a certain range, such as all family documents in which the first child's grade is between 1 and 5.

To match grades outside that range instead, use NOT BETWEEN.

BETWEEN is a useful keyword in DocumentDB, allowing you to express queries against ranges of values with ease.
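
The skip-on-undefined behavior can be sketched in plain Python. This is a simplified model of the comparison rules, not the engine's implementation: a comparison across different types yields "undefined" (modeled as `None`), and such documents match neither BETWEEN nor NOT BETWEEN. The sample data and the `between` helper are hypothetical.

```python
def between(value, low, high):
    """Return True or False, or None when the comparison is undefined."""
    operands = (value, low, high)
    if any(isinstance(x, bool) for x in operands):
        return None  # keep booleans out of numeric comparisons
    all_numbers = all(isinstance(x, (int, float)) for x in operands)
    all_strings = all(isinstance(x, str) for x in operands)
    if all_numbers or all_strings:
        return low <= value <= high
    return None  # mixed or missing types: undefined


# Hypothetical children with numeric, string, and missing grades.
children = [
    {"name": "Jesse", "grade": 3},
    {"name": "Lisa", "grade": 8},
    {"name": "Sam", "grade": "five"},
    {"name": "Kim"},
]

# grade BETWEEN 1 AND 5
in_range = [c["name"] for c in children if between(c.get("grade"), 1, 5) is True]
# grade NOT BETWEEN 1 AND 5
out_of_range = [c["name"] for c in children if between(c.get("grade"), 1, 5) is False]

print(in_range, out_of_range)
```

The children with a string grade or no grade appear in neither list, which is the behavior the text describes for undefined comparisons.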

In Keyword

The IN keyword is a powerful tool in DocumentDB SQL, allowing you to check if a specified value matches any value in a list. It's equivalent to chaining multiple OR clauses.

You can use IN to specify multiple values in a WHERE clause, making it easier to retrieve data that meets certain conditions. For example, if you want to retrieve data whose familyName is either "Smith" or "Wakefield", you can use the IN keyword in your query.

IN can also be used to retrieve data where the id is one of multiple values. For instance, if you want to retrieve all family documents where the id is either "SmithFamily" or "AndersenFamily", you can use the IN keyword in your query.

Here's a summary of how IN works:

  • It allows you to specify multiple values in a WHERE clause.
  • It's equivalent to chaining multiple OR clauses.

By using IN, you can keep queries that match against many values short and readable.
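
The equivalence between IN and chained OR clauses can be checked with a small plain-Python sketch. The sample documents are hypothetical, and the comments show the WHERE clause each filter stands in for.

```python
# Hypothetical sample documents.
families = [
    {"id": "SmithFamily", "familyName": "Smith"},
    {"id": "WakefieldFamily", "familyName": "Wakefield"},
    {"id": "AndersenFamily", "familyName": "Andersen"},
]

# WHERE f.familyName IN ("Smith", "Wakefield")
with_in = [f["id"] for f in families if f["familyName"] in ("Smith", "Wakefield")]

# WHERE f.familyName = "Smith" OR f.familyName = "Wakefield"
with_or = [
    f["id"]
    for f in families
    if f["familyName"] == "Smith" or f["familyName"] == "Wakefield"
]

print(with_in == with_or, with_in)
```

Both filters select the same documents; the IN form simply stays compact as the list of values grows.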

Value Keyword

The VALUE keyword reduces the overhead of wrapping a result in a full-blown object when you only need a single value. It returns the JSON value itself.

You can use the VALUE keyword to produce a leaner result set. For example, a query can return just the scalar "Hello World, this is DocumentDB SQL Tutorial".

If you omit the VALUE keyword, each result is wrapped in an object instead. A query for a family's location, for instance, then returns the address under a location label rather than as a bare value.
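
The difference in result shape can be sketched in plain Python. The `select` helper and the sample documents are hypothetical; the helper mimics a query such as `SELECT VALUE f.location FROM Families f` versus the same query without VALUE.

```python
# Hypothetical sample documents.
families = [
    {"id": "AndersenFamily", "location": {"city": "Seattle", "state": "WA"}},
    {"id": "WakefieldFamily", "location": {"city": "NY", "state": "NY"}},
]


def select(documents, field, value_only=False):
    """Mimic projecting one property, with or without the VALUE keyword."""
    if value_only:
        return [doc[field] for doc in documents]       # bare JSON values
    return [{field: doc[field]} for doc in documents]  # values wrapped under a label


with_value = select(families, "location", value_only=True)
without_value = select(families, "location")

print(with_value)
print(without_value)
```

With VALUE, each result is the location itself; without it, each result is an object with a `location` label around the same data.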

Parameterized

Parameterized queries use placeholders for parameters and supply the values at execution time. This gives you robust handling and escaping of user input and helps prevent SQL injection attacks.

In DocumentDB, parameters are expressed with the familiar @ notation: you prefix each parameter name with the @ symbol. This makes it easy to write a query once and execute it for various values. For example, a query that looks up a collection by ID can take the ID as a parameter, so the same code can find and then delete any collection without hardcoding the collectionId.

You can also write a query that takes last name and address state as parameters, and then execute it for various values of lastname and location.state based on user input. This can be sent to DocumentDB as a parameterized JSON query.
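
A parameterized JSON query pairs the query text with a list of named parameter values. The sketch below builds such a document in Python without sending it anywhere. The property names `lastName` and `address.state`, the parameter names, and the helper function are assumptions made for illustration.

```python
import json

QUERY_TEXT = (
    "SELECT * FROM Families f "
    "WHERE f.lastName = @lastName AND f.address.state = @addressState"
)


def build_query_spec(last_name, state):
    """Pair the fixed query text with values supplied at execution time."""
    return {
        "query": QUERY_TEXT,
        "parameters": [
            {"name": "@lastName", "value": last_name},
            {"name": "@addressState", "value": state},
        ],
    }


spec = build_query_spec("Wakefield", "NY")
print(json.dumps(spec, indent=2))

# Hostile input stays a plain value; the query text itself never changes.
hostile = build_query_spec("x' OR '1'='1", "NY")
print(hostile["query"] == QUERY_TEXT)
```

Because user input only ever lands in the `parameters` list, it cannot alter the structure of the query, which is what makes this pattern resistant to injection.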

Types of Databases

There are several types of databases, each with its own strengths and weaknesses.

Relational databases, also known as SQL databases, are designed for traditional applications and enforce referential integrity in relationships between tables.

NoSQL databases, on the other hand, provide a variety of data models, including key-value, document, and graph, which are optimized for performance and scale.

NoSQL databases are great for modern applications such as mobile, web, and gaming, which need flexible, scalable, high-performance, and highly functional databases to provide great user experiences.

Some popular types of NoSQL databases include key-value, document, graph, in-memory, and search databases.

Key-value databases, like Amazon DynamoDB, provide single-digit millisecond latency and are highly partitionable, allowing for horizontal scaling.

Document databases, such as MongoDB and Amazon DocumentDB, are used for storing, retrieving, and managing semi-structured data, and provide powerful and intuitive APIs for flexible and iterative development.

Here are the main types of NoSQL databases:

  • Key-value: Amazon DynamoDB provides single-digit millisecond latency.
  • Document: MongoDB and Amazon DocumentDB are popular document databases.
  • Graph: Amazon Neptune is a fully managed graph database service.
  • In Memory: Amazon ElastiCache is a fully managed, in-memory caching service.
  • Search: Amazon OpenSearch Service (formerly Amazon Elasticsearch Service) is used for providing near-real-time visualizations and analytics of machine-generated data.

Frequently Asked Questions

What is DocumentDB used for?

DocumentDB is a versatile database solution suitable for big data, gaming, real-time applications, and more. It's ideal for applications that require high scalability, performance, and flexibility.

Is DocumentDB the same as MongoDB?

No, Amazon DocumentDB is not the same as MongoDB, but it emulates the MongoDB API to provide a similar experience. It is widely described as being built on the same backend platform as Amazon Aurora, not on the MongoDB server.

What is DynamoDB vs DocumentDB?

Both are managed AWS services. DynamoDB is serverless, with no instances to size, while DocumentDB is instance-based, so you still choose instance classes and decide how many instances your cluster runs.

Is DocumentDB an RDS?

No, Amazon DocumentDB is not an RDS database, but it shares its ARN format with Amazon RDS. As a result, DocumentDB ARNs contain 'rds' instead of 'docdb'.
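
To illustrate the ARN point, here is a small sketch that splits a cluster ARN into its parts. The region and the account ID `123456789012` are placeholders, and the `parse_arn` helper is hypothetical.

```python
def parse_arn(arn):
    """Split an ARN of the form arn:partition:service:region:account:type:name."""
    keys = ["prefix", "partition", "service", "region", "account",
            "resource_type", "resource_name"]
    return dict(zip(keys, arn.split(":", 6)))


# Placeholder region and account ID; the cluster name is the example from the text.
sample_arn = "arn:aws:rds:us-east-1:123456789012:cluster:test-docdb-cluster"
parsed = parse_arn(sample_arn)

print(parsed["service"], parsed["resource_type"], parsed["resource_name"])
```

The service segment reads `rds` even though the cluster belongs to DocumentDB, which is worth remembering when you write IAM policies or filter resources by ARN.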

Oscar Hettinger

Writer

Oscar Hettinger is a skilled writer with a passion for crafting informative and engaging content. With a keen eye for detail, he has established himself as a go-to expert in the tech industry, covering topics such as cloud storage and productivity tools. His work has been featured in various online publications, where he has shared his insights on Google Drive subtitle management and other related topics.
