Using OpenSearch in Fedora Linux

Photo by Markus Winkler on Unsplash

OpenSearch is Amazon’s open-source search engine and analytics suite. Individuals, businesses, and organizations can use the service to search for a wide range of information and use visualization tools to better understand user behavior and search trends. This article will discuss how you can use OpenSearch in Fedora Linux.

Prerequisites

What can OpenSearch do?

OpenSearch provides several features and tools. These are:

  • Applications that monitor and debug your cluster.
  • Tools to manage security and event information.
  • Seamless, personalized search results.
  • A web-based user interface for searching and browsing search results.
  • The ability to search for specific terms or phrases within a document or webpage.
  • The ability to filter search results by date, relevance, or other criteria.
  • The ability to create and save searches for later use.
  • The ability to customize the appearance and functionality of the search results page.
  • Advanced analytics and reporting tools to help users understand and analyze search traffic and user behavior.

The following sections will guide you through the basics of creating a domain, uploading test data, and visualizing your information with OpenSearch Dashboards.

What is an OpenSearch Service domain?

An OpenSearch Service domain is a service provided by AWS that allows you to create, manage, and configure your cluster(s) using either the AWS console or the AWS command-line interface (CLI). This tutorial uses the AWS console to create and configure your domain.

Getting started

To begin the domain setup, launch your preferred browser and log in to your AWS console. Navigate to the Amazon OpenSearch Service page, then click Create domain.

Create domain page segment which features options to choose your domain name and create a custom endpoint.

Choose your domain name and leave the Enable custom endpoint box unmarked.

Create domain page segment which features options to choose your deployment type, which version of OpenSearch or Elasticsearch you'd like to use, and enable compatibility mode.

OpenSearch is a fork of Elasticsearch version 7.10. You can choose any version up to Elasticsearch version 7.10 in addition to OpenSearch versions.

Choose Development and testing for your deployment type, select the most recent OpenSearch version, and enable compatibility mode.

Create domain page segment which features options to enable Auto-Tune or add a maintenance window.

Leave Auto-Tune enabled and Add maintenance window unmarked.

Create domain page segment which features options to configure your nodes based on the needs of your application.

The Data nodes options allow you to customize your nodes based on the needs of your applications:

  • Availability Zones (AZ)
    • Amazon Web Services (AWS) Availability Zones are physically separate and isolated data center locations within an AWS region. Each Availability Zone is designed to be fault-tolerant, with redundant power, networking, and cooling infrastructure.
  • Instance type:
    • Refers to the type of virtual server you’d like to use for your application.
  • Number of nodes:
    • The number of nodes you’d like to allocate to each of your AZs.

Since we’re running in a small development setting, set your AZ to 2, your Instance type to t3.small.search, and Number of nodes to 2. Don’t change the default settings for your Storage type, EBS volume type, and EBS storage size per node.

Create domain page segment which features options to select Warm and cold data storage and the number of master nodes you'd like to use. Warm and cold data storage are cost effective solutions for storing large amounts of data and the default frequency of snapshots taken of your cluster is hourly.

Ignore these options for now, but read on for more information:

  • Warm and cold data storage:
    • For use cases that require a cost-effective solution for storing large amounts of non-mutable data.
  • Dedicated master nodes
    • Allows you to choose how many master nodes you’d like to use for your domain.
  • Snapshot configuration:
    • Set to hourly by default.

Create domain page segment which features options to set what type of network access you'd like to use and enable granular level control over your data.

VPC access is recommended for production environments. You’ll also need to create a master user login to access OpenSearch Dashboards, OpenSearch’s data visualization tool. We’ll discuss how to use OpenSearch Dashboards after you configure your domain.

Select Public access and Create master user, and set up your login.

Create domain page segment which features options to integrate your already existing authentication and Amazon Cognito authentication and set your domain's access policy.

Leave Prepare SAML authentication and Enable Amazon Cognito authentication option boxes unchecked and select Only use fine-grained access control for your access policy.

Create domain page segment which features an option to set what type of encryption you'd like your domain to use.

Select Use AWS owned key, ignore the optional configurations, click Create to create your domain, then wait for your domain to activate.
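
If you prefer the command line, the console choices above map roughly onto a single AWS CLI call. The following is a minimal sketch rather than the method this article follows: the function names are hypothetical, and the engine version, EBS volume type, and volume size are assumptions chosen for illustration (the walkthrough keeps the console defaults). It also leaves out the domain access policy, which you would still need to supply to match the Only use fine-grained access control option.

```shell
# Sketch: create a small development domain with the AWS CLI.
# Assumptions (not from the article): OpenSearch_2.11, gp3 volumes, 10 GiB per node.
create_dev_domain() {
  domain_name="$1"   # name of the new domain
  master_user="$2"   # master user name for OpenSearch Dashboards
  master_pass="$3"   # master user password

  aws opensearch create-domain \
    --domain-name "$domain_name" \
    --engine-version "OpenSearch_2.11" \
    --cluster-config "InstanceType=t3.small.search,InstanceCount=2,ZoneAwarenessEnabled=true,ZoneAwarenessConfig={AvailabilityZoneCount=2}" \
    --ebs-options "EBSEnabled=true,VolumeType=gp3,VolumeSize=10" \
    --node-to-node-encryption-options "Enabled=true" \
    --encryption-at-rest-options "Enabled=true" \
    --domain-endpoint-options "EnforceHTTPS=true" \
    --advanced-security-options "Enabled=true,InternalUserDatabaseEnabled=true,MasterUserOptions={MasterUserName=$master_user,MasterUserPassword=$master_pass}"
}

# Prints whether the domain is still being set up (the Processing flag).
domain_is_processing() {
  aws opensearch describe-domain \
    --domain-name "$1" \
    --query 'DomainStatus.Processing' \
    --output text
}
```

With AWS CLI credentials configured, you could run create_dev_domain my-domain master-user 'your-password' and then call domain_is_processing my-domain until the domain is active. A password containing commas or braces may need the JSON form of --advanced-security-options instead of the shorthand used here.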

Using OpenSearch Dashboards

OpenSearch Dashboards is a tool that allows you to create and customize interactive dashboards to visualize the data your site receives from user interaction. These dashboards are visual representations of data from various sources such as logs, metrics, and security events, which can be customized to meet your specific needs, including:

  • Dragging and dropping different types of visualizations, such as graphs, maps, and tables, onto a dashboard.
  • Filtering and manipulating data to highlight specific trends or patterns.
  • Sharing dashboards with other users or embedding them in other applications.
  • Collaborating with other users in real-time on the same dashboard.

Navigate to Domains and select your domain from the list.

A list of your domains that provides information on metrics such as Cluster Health, Searchable documents, Total free space, and more.

Click OpenSearch Dashboards URL to access your OpenSearch Dashboard.

Your domain page that lists general information (such as name and Cluster health) and cluster configuration.
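
The Cluster health value shown on the domain page can also be read from the command line. This is a small sketch that assumes you chose public access and created a master user as described earlier; OS_ENDPOINT, OS_USER, and OS_PASS are placeholder variable names invented for this example.

```shell
# Sketch: ask the cluster for its health over the REST API.
# Set these placeholders first, for example:
#   OS_ENDPOINT='https://your-domain-endpoint'
#   OS_USER='master-user'
#   OS_PASS='master-user-password'
cluster_health() {
  curl -sS -u "$OS_USER:$OS_PASS" "$OS_ENDPOINT/_cluster/health?pretty"
}
```

Calling cluster_health returns a JSON document whose status field is green, yellow, or red.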

You’ll be presented with one of the following screens after you’ve logged into your dashboard:

OpenSearch Dashboard initial login prompt. The prompt asks if you would like to add data or explore the platform.
Upon first login
OpenSearch Dashboards home page. Has options to add sample data or interact with the OpenSearch API
Upon subsequent logins

Visualization options

Click Add sample data to add sample data provided by AWS.

Page showing 3 options of sample data you can upload to your domain. The options are eCommerce orders, flight data, and web logs.

You may select any of the three options. This article uses the Sample web logs option to show the kinds of visualizations you can use to analyze your data.

OpenSearch Dashboard visualizations which include Unique visitors, Visitors by OS, and a search query to search for what OS users use in other countries.
OpenSearch Dashboard visualizations which include response codes over time + Annotations and Unique Visitors vs Average Bytes.
OpenSearch Dashboard visualizations which include a file type scatter plot, and a table that shows the hosts, bytes, and unique visits the site received in the last hour.
OpenSearch Dashboard visualization that shows a heatmap of which country visitors came from throughout the day.
OpenSearch Dashboard visualization that shows a map of which parts of the world visitors viewed the site from.
OpenSearch Dashboard visualization that shows a Source and Destination Sankey Chart.

Click Create new to add more visualization options.

Analyze your own data

You can upload one or more of your documents by entering commands through a CLI.

Add a single document

curl -XPUT -u 'master-user:[master-user-password]' 'domain-endpoint/[index name]/_doc/1' -d '{"field1": "string1", "field2": ["string3","string4"]}' -H 'Content-Type: application/json'
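
After adding a document, you may want to confirm that it arrived. The sketch below reads a document back by ID and runs a simple query-string search. The function names and the OS_ENDPOINT, OS_USER, and OS_PASS variables are placeholders invented for this example; the first path segment is the name of the index that holds the document.

```shell
# Sketch: read back and search documents with the master user.
# Placeholders: OS_ENDPOINT (https://your-domain-endpoint), OS_USER, OS_PASS.

# Fetch one document by index name and document ID.
get_document() {
  curl -sS -u "$OS_USER:$OS_PASS" "$OS_ENDPOINT/$1/_doc/$2?pretty"
}

# Run a query-string search (for example field1:string1) against an index.
search_index() {
  curl -sS -u "$OS_USER:$OS_PASS" -G "$OS_ENDPOINT/$1/_search" \
    --data-urlencode "q=$2" \
    --data-urlencode "pretty=true"
}
```

For example, get_document indexname 1 fetches the document with ID 1, and search_index indexname 'field1:string1' looks for documents whose field1 matches string1.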

Add multiple documents

Create a JSON file with your documents and run a command to add multiple documents:

JSON file format:

{ "index" : { "_index": "indexname", "_id" : "2" } }
{"field1": "string1", "field2": ["string2", "string3", "string4"], "field3": 1234, "field4": ["String, 5", "String, 6"]}
{ "index" : { "_index": "indexname", "_id" : "3" } }
{"field5": "string7", "field6": ["string8", "string9", "string10"], "field7": 5678, "field8": ["String, 11", "String, 12"]}
{ "index" : { "_index": "indexname", "_id" : "4" } }
{"field9": "string13", "field10": ["string14", "string15", "string16"], "field11": 1011, "field12": ["String, 17", "String, 18"]}
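
The file alternates between an action line and a document line, each on a single line, and the file has to end with a newline. If you would rather generate the file than type it by hand, a helper along these lines can do it. This is a sketch with a hypothetical function name, and it does not validate the JSON you pass in.

```shell
# Sketch: print one action line and one document line for the _bulk API.
# Arguments: index name, document ID, document as single-line JSON.
bulk_entry() {
  printf '{ "index" : { "_index": "%s", "_id" : "%s" } }\n' "$1" "$2"
  printf '%s\n' "$3"
}
```

You could then build a file with something like { bulk_entry indexname 2 '{"field1": "string1"}'; bulk_entry indexname 3 '{"field5": "string7"}'; } > bulk_indexname.json.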

Index naming restrictions:

  • All letters must be lowercase.
  • Index names cannot begin with _ or - .
  • Index names can’t contain spaces, commas, : , " , * , + , / , \ , | , ? , # , > , or < .
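
The naming rules listed above can be checked locally before you upload anything. The function below is a minimal sketch with a hypothetical name; it covers only the three rules in this list and not any other limits the service may apply.

```shell
# Sketch: return 0 if the name passes the rules listed above, 1 otherwise.
valid_index_name() {
  name="$1"
  [ -n "$name" ] || return 1              # an empty name is never valid
  case "$name" in
    _*|-*) return 1 ;;                    # must not begin with _ or -
    *[[:upper:]]*) return 1 ;;            # all letters must be lowercase
  esac
  # Reject space , : " * + / \ | ? # < > (a backslash is literal inside [...]).
  if printf '%s' "$name" | grep -q '[ ,:"*+/\|?#<>]'; then
    return 1
  fi
  return 0
}
```

For example, valid_index_name web-logs succeeds, while valid_index_name Movies and valid_index_name 'my index' both fail.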

Command to run:

curl -XPOST -u 'master-user:[master-user-password]' 'domain-endpoint/_bulk' --data-binary @bulk_[domain name].json -H 'Content-Type: application/json'
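
A bulk request can return successfully even when individual documents fail, so it is worth looking at the errors field in the response. The sketch below wraps the upload in a function and adds a simple check. The function names and the OS_ENDPOINT, OS_USER, and OS_PASS variables are placeholders invented for this example, and the check is a plain text match rather than a real JSON parser.

```shell
# Sketch: upload a bulk file and check the response for per-document errors.
# Placeholders: OS_ENDPOINT (https://your-domain-endpoint), OS_USER, OS_PASS.

# Send the given file to the _bulk API and print the JSON response.
bulk_upload() {
  curl -sS -XPOST -u "$OS_USER:$OS_PASS" "$OS_ENDPOINT/_bulk" \
    --data-binary "@$1" -H 'Content-Type: application/json'
}

# Read a _bulk response on stdin; succeed only if it reports "errors": false.
bulk_succeeded() {
  grep -q '"errors"[[:space:]]*:[[:space:]]*false'
}
```

A typical use would be bulk_upload bulk_indexname.json | bulk_succeeded && echo "all documents indexed". An empty or failed response counts as a failure because the check looks for an explicit false value.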

You can now create and configure your own domain and use OpenSearch Dashboards to visualize the data your domain receives.

Fedora Project community

13 Comments

  1. KarlisK

    Also worth noting that you can run a local instance of Opensearch ON your Fedora machine via Podman!

    You can grab the official container image off Docker Hub: https://hub.docker.com/r/opensearchproject/opensearch

    Or build it all from source if you fancy: https://github.com/opensearch-project/opensearch-build

  2. Anonymous Coward

    I’m disappointed in this article. It’s just an advertisement for AWS’s proprietary OpenSearch service. I would have expected an article about running an OpenSearch instance on Fedora Server. The above article doesn’t even mention Fedora once.

    • Anonymous Coward

      Agreed, this has literally nothing to do with Fedora and does not belong on this blog!

      • Admittedly, when I approved this article proposal, I thought the article would cover how to install the application locally. It seems the author took the article in a direction different from what I expected. Sorry that I did not catch it before publication. FWIW, it looks like this software can operate locally without relying on third-party servers:

        … You can run OpenSearch locally on a laptop—its system requirements are minimal—but you can also scale a single cluster to hundreds of powerful machines in a data center.

        In a single node cluster, such as a laptop, one machine has to do everything: manage the state of the cluster, index and search data, and perform any preprocessing of data prior to indexing it. As a cluster grows, however, you can subdivide responsibilities. Nodes with fast disks and plenty of RAM might be great at indexing and searching data, whereas a node with plenty of CPU power and a tiny disk could manage cluster state. For more information on setting node types, see Cluster formation. …

        (https://opensearch.org/docs/latest/about/)

        • I did not realize this article was already published. I’ll take this article down and take it into the direction the comments suggested.

          • Unfortunately, once they go out, they are pretty much out for good. We can’t really take them back from the various feed readers et al. You could write another version of the same topic if you want and then maybe we could add a redirect from this one to the new and improved version.

        • Another Anonymous Coward

          I am also very disappointed to see this. I think the post should be updated to remove the focus on an Amazon service. This community should not promote FAANG companies for free.

          • Amazon actually has donated (and continues to donate) to the Fedora Project (see https://communityblog.fedoraproject.org/special-thanks-to-nest-platinum-sponsor-amazon-aws/). So the “for free” part of your argument doesn’t apply and I’m not going to attempt to rework this post to remove all references to Amazon. I do agree, however, that as a general policy, we don’t want Fedora Magazine to be seen as just a bunch of ads for companies. That this one went through the way it did was accidental and we’ll try to make sure it doesn’t happen again.

            When the original author completes a revised version that shows how to install and use the software locally, I will set a redirect from this post’s URL to the new article.

            • That anonymous coward

              Either the article is free advertising because the decisions to post it, etc, were made independently of the existing sponsorship OR it was posted because of that sponsorship, in which case your argument that it’s not free advertising is correct but we’ve got the bigger problem that it’s an ad not marked as an ad.

              Might be a good idea in the future to make sure that any post which mentions a sponsor of Fedora explicitly mentions that relationship in a disclaimer.

          • RH

            I agree with the others expressing disappointment with this post. The only thing it seems to require Fedora for is a browser, which most other OSs have. Only the first comment, suggesting the possibility of using a container running on your Fedora system, would seem to make the article about Fedora, which the article isn’t. I’ve set up multiple Elastic clusters in AWS so I have seen these screenshots before for AWS administration via a web browser. If the article is admitted to not be free as AWS has paid for it with its donations, then shouldn’t it be labeled as an advertisement? Even if it wasn’t paid for, shouldn’t it be labeled as an ad for a 3rd party product?

            An article showing how to run open source Elastic or OpenSearch via Boxes, VirtualBox VMs, containers, or other virtualization seems like it would be more appropriate to display Fedora’s capability and functionality. Maybe better yet, an article showing how to set up a lab or small production cluster on Fedora Server and using the Fedora Workstation terminal CLI and browser for administration and content development (in Kibana). Just an idea.

  3. michaek

    never heard such a thing. always thought of it. it's way cool and i will take a further look into it.

  4. excellent post

Comments are Closed

The opinions expressed on this website are those of each author, not of the author's employer or of Red Hat. Fedora Magazine aspires to publish all content under a Creative Commons license but may not be able to do so in all cases. You are responsible for ensuring that you have the necessary permission to reuse any work on this site. The Fedora logo is a trademark of Red Hat, Inc. Terms and Conditions