Getting started with Azure Elasticsearch for Business can seem daunting, but it's actually quite straightforward. Azure Elasticsearch is a managed service that allows you to run Elasticsearch on the Microsoft Azure cloud platform.
Azure Elasticsearch is built on top of the popular open-source search and analytics engine, Elasticsearch. This means you get all the benefits of Elasticsearch, including its robust search functionality and scalability, without the hassle of managing the underlying infrastructure.
To get started, you'll need to create an Azure Elasticsearch cluster. This involves selecting the desired instance type and storage capacity for your cluster, as well as choosing the Azure region where your data will be stored.
What Is Azure Elasticsearch
Azure Elasticsearch is a powerful tool that allows you to deploy Elasticsearch clusters in a fully automated manner.
It's essentially a deployment template that guides you through the required steps and feeds your inputs to an Azure Resource Manager (ARM) template.
This template then deploys the required resources to a resource group, making it easy to get started with Elasticsearch on Azure.
You can access the ARM template independently, either from the Azure portal, or using Azure's command-line interface (CLI) or PowerShell command-line tools.
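As a rough illustration of the CLI route, the sketch below assembles an `az deployment group create` command in Python. The resource group name, the template URL, and the `esClusterName` parameter are assumptions made up for this example; only `dataNodesAreMasterEligible` is named in this article, so check the template's own parameter list before running anything.

```python
import shlex


def build_deploy_command(resource_group, template_uri, parameters):
    """Return the argv list for an `az deployment group create` call."""
    argv = [
        "az", "deployment", "group", "create",
        "--resource-group", resource_group,
        "--template-uri", template_uri,
    ]
    if parameters:
        argv.append("--parameters")
        # Sort the key=value pairs so the command line is reproducible.
        argv.extend(f"{key}={value}" for key, value in sorted(parameters.items()))
    return argv


# Hypothetical values for illustration only; none of them come from the article.
command = build_deploy_command(
    resource_group="my-elastic-rg",
    template_uri="https://example.com/path/to/mainTemplate.json",
    parameters={"esClusterName": "my-cluster", "dataNodesAreMasterEligible": "No"},
)

# Print a shell-safe version of the command for review before running it.
print(shlex.join(command))
```

Building the command as a list rather than a single string keeps quoting problems out of the picture if you later hand it to `subprocess.run`.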
Here are some useful resources to get you started:
- Elasticsearch Azure Resource Manager (ARM) Template
- Elasticsearch on Azure: Reference Architecture
- Getting Started with Elasticsearch on Azure
- Elasticsearch on Azure Q&A
- Elasticsearch on Azure with NetApp Cloud Volumes ONTAP
With Azure Elasticsearch, you get Elasticsearch delivered as software as a service, with Elastic Stack subscriptions that include services such as training, consulting, and technical support.
Clusters are deployed as part of the Elastic Stack, so you can deploy Kibana and Logstash alongside Elasticsearch in a fully automated way.
The Marketplace offering provides a guided user interface that collects your inputs, feeds them to the Azure Resource Manager template, and deploys the required resources to the chosen resource group.
Getting Started with Azure Elasticsearch
To get started with Azure Elasticsearch, you need to log into the Azure Marketplace portal and locate Elasticsearch. Click on Get it now, then Create, to start the process.
First, set credentials that will allow you to access the solution's virtual machines (VMs). This includes setting a valid Ubuntu username and selecting either a password or a Secure Shell (SSH) key for authentication.
To deploy the cluster, select a subscription, resource group, and a location where the solution should be deployed. You can also choose to deploy a new virtual network (VNet) or use an existing one.
The Cluster Settings tab is where you choose the preferred version of Elasticsearch and name the cluster. You can also specify the disk size and type, as well as the number of disks for data nodes.
You can also decide whether to deploy Kibana, Logstash, or both alongside Elasticsearch. Kibana deploys to a separate VM, which receives a public IP address and network security group.
To define users and role configuration, click on the Security tab. This feature is only available during the 30-day trial period offered by the Azure Marketplace. You can set up six built-in user accounts, providing credentials for each.
Here are the six built-in user accounts that need credentials:
- Elastic user account
- Kibana user account (named kibana_system from version 7.8.0)
- Logstash system user account (for saving monitoring data in Elasticsearch)
- Beats system user account
- APM system user account
- Remote monitoring user account
Once you have filled in the required tabs, go through the Certificates and Review + Create tabs to confirm your input values, then click OK to launch your new deployment.
Azure Elasticsearch Architecture
The Azure Elasticsearch architecture is generated automatically from an ARM template, which deploys several types of nodes that handle different tasks.
Data nodes are the core of the Elasticsearch cluster, performing search, aggregation, and other data-related operations. By default, three data nodes are deployed as virtual machines (VMs) that connect to the backend load balancers.
When the dataNodesAreMasterEligible parameter is set to No, data nodes cannot be elected as master nodes. Instead, three dedicated master nodes are deployed, which is the recommended option for larger clusters.
Coordinating nodes can optionally be added to gather incoming requests from clients, forward them to data nodes, and aggregate results. They are an option for clusters with more than 100 data nodes.
Ingest and machine learning nodes are also part of the architecture. By default, all deployed nodes function as Ingest Nodes, and if machine learning features are included in the Elasticsearch license, these nodes can double as Machine Learning Nodes.
Here's a breakdown of the node types in Azure Elasticsearch Architecture:
- Data nodes: perform search, aggregation, and other data-related operations
- Master nodes: dedicated nodes for larger clusters
- Coordinating nodes: help gather incoming requests and forward them to data nodes
- Ingest nodes: handle data ingestion by default
- Machine learning nodes: handle machine learning features if included in the license
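The node breakdown above can be sketched as a small planning helper. This is a simplified model of the behavior described in this article, not the template's actual logic: the function and field names are hypothetical, and only the default of three data nodes, the three dedicated master nodes, and the dataNodesAreMasterEligible switch come from the text.

```python
from dataclasses import dataclass

# Per the text: three dedicated masters when data nodes are not master-eligible.
DEDICATED_MASTER_COUNT = 3


@dataclass(frozen=True)
class ClusterPlan:
    data_nodes: int
    master_nodes: int
    coordinating_nodes: int

    @property
    def total_vms(self) -> int:
        """Number of Elasticsearch node VMs in the plan."""
        return self.data_nodes + self.master_nodes + self.coordinating_nodes


def plan_cluster(
    data_nodes: int = 3,
    data_nodes_are_master_eligible: bool = False,
    coordinating_nodes: int = 0,
) -> ClusterPlan:
    """Model how the node counts follow from the deployment inputs."""
    if data_nodes < 1:
        raise ValueError("a cluster needs at least one data node")
    if coordinating_nodes < 0:
        raise ValueError("coordinating_nodes cannot be negative")
    # Dedicated masters are only added when data nodes cannot be elected.
    master_nodes = 0 if data_nodes_are_master_eligible else DEDICATED_MASTER_COUNT
    return ClusterPlan(data_nodes, master_nodes, coordinating_nodes)


print(plan_cluster())
print(plan_cluster(data_nodes=5, data_nodes_are_master_eligible=True))
```

The count leaves out the optional Kibana and Logstash VMs, which are deployed separately from the Elasticsearch nodes.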
Use Separate Disks
When designing your Azure Elasticsearch architecture, it's essential to use separate disks for each VM to ensure easy upgrades and data redundancy. This approach allows you to destroy machines without copying files.
Mounting dedicated drives for log and data files is a good practice. You can attach one new disk to every VM and use it as that VM's data storage.
Having one shared disk with separate folders for each VM is also a viable option, especially for logs. This way, a single Logstash instance can read the logs of every VM from that one disk.
Changing the data and log paths is relatively simple and can be done by modifying the elasticsearch.yml file. Specifically, you can change the path.logs and path.data settings to point to the new disk locations.
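A minimal sketch of that change, written as a Python helper that renders the two settings for one VM. The mount point `/mnt/esdisk` and the node name `data-0` are hypothetical; `path.data` and `path.logs` are the setting names mentioned above.

```python
from pathlib import PurePosixPath


def render_path_settings(mount_point, node_name):
    """Return the elasticsearch.yml lines that move data and logs to a mounted disk."""
    # One folder per node keeps VMs apart even when they share a disk.
    base = PurePosixPath(mount_point) / node_name
    return "\n".join([
        f"path.data: {base / 'data'}",
        f"path.logs: {base / 'logs'}",
    ])


# Hypothetical mount point and node name, for illustration only.
print(render_path_settings("/mnt/esdisk", "data-0"))
```

The rendered lines can be appended to elasticsearch.yml on each VM. The node needs a restart for the new paths to take effect, and the directories must be writable by the user that runs Elasticsearch.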
Reference Architecture
The reference architecture for Elasticsearch on Azure is a well-structured framework that ensures efficient deployment and management of Elasticsearch resources.
Data nodes are virtual machines that perform search, aggregation, and other data-related operations and connect to the backend load balancers. By default, three data nodes are deployed.
The system will deploy three dedicated master nodes, a recommended option for larger clusters, when the dataNodesAreMasterEligible parameter is set to No.
Coordinating nodes can optionally be added to help gather incoming requests from clients, forward them to data nodes, and aggregate results for clusters that deploy more than 100 data nodes.
By default, all deployed nodes function as ingest nodes, and if machine learning features are included in your Elasticsearch license, these nodes can double as machine learning nodes.
The template uses incremental deployment mode, which means existing Elasticsearch resources in the resource group will stay the same, while new resources or those with different settings will be added or modified.
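To make the incremental behavior concrete, here is a toy model of it in Python. It treats a resource group as a plain dictionary of resource names to settings, which is a deliberate simplification; the resource names and settings are invented for the example, and real deployments are evaluated by Azure Resource Manager, not by code like this.

```python
def apply_incremental(existing, template):
    """Simulate an incremental deployment over a resource group.

    Resources named in the template are added or updated; resources that
    exist in the group but are absent from the template are left alone.
    """
    result = dict(existing)
    changes = {"added": [], "modified": [], "unchanged": []}
    for name, settings in template.items():
        if name not in existing:
            changes["added"].append(name)
        elif existing[name] != settings:
            changes["modified"].append(name)
        else:
            changes["unchanged"].append(name)
        result[name] = settings
    return result, changes


# Hypothetical resources, for illustration only.
existing = {"data-0": {"disk_gb": 512}, "legacy-vm": {"disk_gb": 128}}
template = {"data-0": {"disk_gb": 1024}, "data-1": {"disk_gb": 1024}}

result, changes = apply_incremental(existing, template)
print(changes)
print(sorted(result))
```

Note that `legacy-vm` survives the deployment even though the template never mentions it, which is the property that distinguishes incremental mode from a deployment that removes everything not listed.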
Managing and Troubleshooting Azure Elasticsearch
You can access all virtual network deployed VMs through a Kibana or Jumpbox VM, which can be accessed via TCP through port 22. This allows you to use SSH to securely connect to VMs, using either a password or SSH key.
To troubleshoot issues with Elasticsearch, it's essential to track your logs. If anything goes south, you should know where to look, and that place should be close at hand.
Using a good monitoring tool can also help you stay on top of your Elasticsearch cluster. See BigDesk, ElasticHQ, and Marvel, which are all great options - the first two are free, but Marvel requires a purchase.
Use a Monitoring Tool
Using a good monitoring tool is crucial for keeping an eye on your Elasticsearch cluster. BigDesk and ElasticHQ are both great options, and the best part is that they're free.
You won't be able to install these tools as a site-plugin unless you use HTTP authentication, so you'll need to deploy them via a secure host.
Marvel is another great option, but it requires a purchase.
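Whichever tool you pick, the basic signal it surfaces is the cluster health status. As a rough sketch, the Python below reads that status straight from the cluster health endpoint (`/_cluster/health`) using only the standard library. The base URL you pass in is an assumption about your own network, and the helper names are hypothetical.

```python
import json
import urllib.request


def summarize_health(payload):
    """Reduce a cluster health JSON document to the fields worth alerting on."""
    health = json.loads(payload)
    status = health["status"]  # "green", "yellow", or "red"
    return {
        "status": status,
        "nodes": health["number_of_nodes"],
        "unassigned_shards": health["unassigned_shards"],
        "needs_attention": status != "green",
    }


def fetch_health(base_url, timeout=5.0):
    """Query the cluster health endpoint and summarize the response."""
    url = f"{base_url.rstrip('/')}/_cluster/health"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return summarize_health(response.read().decode("utf-8"))
```

From a VM inside the virtual network, a call such as `fetch_health("http://<node-private-ip>:9200")` would return the summary. Parsing is kept separate from the network call so it can be checked without a running cluster.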
Connecting to VMs for Troubleshooting
You can access all virtual network deployed VMs through a Kibana or Jumpbox VM, which operate in a network security group.
To connect, use SSH through port 22. You'll need to know the target VM's private IP address, or its hostname if DNS is defined.
You can use either a password or SSH key to connect, as defined in the configuration Basics step.
Once connected, you'll have access to troubleshoot the VMs securely.
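One way to make that two-hop connection is OpenSSH's `-J` (ProxyJump) option, which tunnels through the Kibana or Jumpbox VM in a single command. The sketch below only builds the command; the user name, jumpbox host name, and private IP address are hypothetical placeholders.

```python
import shlex


def build_ssh_command(user, jumpbox_host, target_private_ip, port=22):
    """Return an ssh argv that reaches a private VM through the jumpbox."""
    return [
        "ssh",
        # -J makes ssh connect to the jumpbox first, then on to the target.
        "-J", f"{user}@{jumpbox_host}:{port}",
        "-p", str(port),
        f"{user}@{target_private_ip}",
    ]


# Hypothetical names and addresses, for illustration only.
command = build_ssh_command("esadmin", "jumpbox.example.com", "10.0.0.5")
print(shlex.join(command))
```

This assumes the same username is valid on both machines, which matches the single set of credentials chosen in the Basics step.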
Network Access
To access your Elasticsearch cluster from your applications, you have two options. You can deploy your application to Azure and run it on the same Virtual Network as your cluster.
This allows you to access the 9200 HTTP port, as well as the 9300 port if your app is in Java. Unfortunately, Azure Websites do not currently support running on a Virtual Network, so you can only deploy websites as a Cloud Service if you want them to access the cluster this way.
Alternatively, you can open the 9200 port and protect it with authentication. This is a good option if you need to access the cluster from elsewhere, or if you have to use Azure Websites and cannot convert them to a Cloud Service.
To open the 9200 port, you can use a Cloud Service, and to protect it with authentication, you can use plugins like elasticsearch-http-basic or elasticsearch-jetty.
Here are your options for accessing your Elasticsearch cluster:
- Deploy your application to Azure as a Cloud Service and run it on the same Virtual Network as your cluster.
- Open the 9200 port and protect it with authentication using plugins like elasticsearch-http-basic or elasticsearch-jetty.
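For the second option, a client has to send credentials with every request. The sketch below builds a search request with an HTTP Basic `Authorization` header using Python's standard library. The base URL, index name, and credentials are hypothetical, and it assumes the cluster sits behind one of the authentication plugins mentioned above or an equivalent.

```python
import base64
import urllib.request


def build_search_request(base_url, index, username, password):
    """Create a request for an index's _search endpoint with Basic auth."""
    # Basic auth is the base64 encoding of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(f"{base_url.rstrip('/')}/{index}/_search")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical endpoint and credentials, for illustration only.
request = build_search_request("https://search.example.com:9200", "logs", "user", "pass")
print(request.full_url)
```

Passing the request to `urllib.request.urlopen` would execute the search. Basic credentials are only encoded, not encrypted, so an endpoint exposed this way should be served over HTTPS.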
Benefits and Implications of Azure Elasticsearch
Being a Certified Software Solution for Azure means that Elasticsearch has passed rigorous testing and validation processes, ensuring the product adheres to Microsoft’s high standards for security, performance, and reliability.
This certification offers peace of mind when selecting cloud-native solutions for your critical applications. By achieving this designation, Elasticsearch has joined an exclusive group of applications that meet Microsoft’s stringent criteria, ensuring a fully optimized experience for organizations operating on the Azure cloud.
Elasticsearch’s Azure-certified solution simplifies integration for current and new customers, making it easier to deploy and integrate the solution within your Azure environment. This streamlined deployment process minimizes setup time, enabling your teams to focus on extracting insights from your data faster.
Certified solutions are trusted by Microsoft to meet enterprise-level compliance standards, providing you with robust security controls in alignment with Azure’s regulatory requirements. This means you can have confidence in the security of your data.
As a native integration within Azure, Elasticsearch uses Azure’s infrastructure to provide optimized performance and scalability. You can scale your search, analytics, and observability functions without worrying about the technical complexities of managing infrastructure.
Here are the key benefits of Elasticsearch’s Azure-certified solution:
- Seamless deployment: Elasticsearch is available directly in the Azure Marketplace, so you can easily deploy and integrate it within your Azure environment.
- Enhanced security and compliance: Certified solutions are trusted by Microsoft to meet enterprise-level compliance standards.
- Optimized performance on Azure: As a native integration within Azure, Elasticsearch uses Azure’s infrastructure to provide optimized performance and scalability.
Azure Elasticsearch Configuration and Customization
With Azure Elasticsearch, you can get up and running quickly with preconfigured solutions and deployment templates.
You don't need to worry about sizing the cluster, as the platform takes care of it for you.
Customization is a breeze, allowing you to adjust settings at any time to suit your needs.
Customizable Settings
Customizable settings allow you to fine-tune your deployment to suit your needs, starting from the preconfigured solutions and deployment templates.
You have the flexibility to customize deployments at any time, which is particularly useful when your needs change. For example, you can increase memory, and the system will automatically adjust for capacity and performance.
Changing the level of fault tolerance is also possible, giving you more control over the reliability of your deployment. Adding features like machine learning can also be done with ease.
Using the Plugin
The official Elasticsearch Azure plugin can discover which nodes are available, eliminating the need to specify IPs in the elasticsearch.yml config file.
I personally prefer not using the plugin for small, stable clusters, as it adds extra friction.
For large clusters whose nodes change often, this plugin will likely make a big difference in terms of ease of management.
You can find more information about the plugin on the Elasticsearch GitHub page at https://github.com/elasticsearch/elasticsearch-cloud-azure.
Another benefit of the plugin is enabling snapshot/restore for Azure Storage, which you can leverage even if you don't enable Azure Multicast.
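As a hedged sketch of the snapshot side, the Python below builds the request that registers an Azure-backed snapshot repository through the `_snapshot` API. The repository name, container, and base path are hypothetical. The `container` and `base_path` settings follow current Elasticsearch documentation for `azure` repositories, which may differ from the older plugin version linked above, and storage account credentials are configured separately and are not shown here.

```python
import json


def azure_repository_request(repository, container, base_path=""):
    """Return (method, path, body) for registering an Azure snapshot repository."""
    settings = {"container": container}
    if base_path:
        # Optional folder inside the container where snapshots are stored.
        settings["base_path"] = base_path
    body = json.dumps({"type": "azure", "settings": settings}, indent=2)
    return "PUT", f"/_snapshot/{repository}", body


# Hypothetical repository and container names, for illustration only.
method, path, body = azure_repository_request("nightly", "es-snapshots", "cluster-a")
print(method, path)
print(body)
```

Once the repository is registered, snapshots are created and restored through the same `_snapshot` endpoint.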
Frequently Asked Questions
What is the Microsoft equivalent of Elasticsearch?
Azure Search (now called Azure AI Search) is the Microsoft equivalent of Elasticsearch. It uses indexers to extract and index data from Azure data sources, with separate indexers for various Azure services, including Azure SQL Database and Azure Cosmos DB.