API, Golang, Microservice

How to Create a Search Microservice

An essential part of almost every application is search: the ability to find items via a search bar. There are many ways to implement it; I've seen it done client side using a loop, or via full-text indexing in Postgres or MySQL. But there comes a point where you need a separate service for search, and people often reach for Elasticsearch.

In this post, I will walk you through how to build a basic search microservice using Golang. We are going to be searching users by email, username, and real name. You can find the entire source code on GitHub, here.

Architecture

Okay, so before we start writing any code, let's break down how this should work. It is important to understand that Elasticsearch isn't supposed to be directly exposed to the client, so building an intermediary microservice is essential. For this example, the microservice will need one endpoint for search. We also want to populate the Elasticsearch index, so we will need another endpoint for that. In a real application, you would use some sort of queue with producers and consumers to populate the system, but that is out of the scope of this post.

We are going to need two endpoints then:

  • /search
  • /populate

These endpoints are going to need parameters as well.

The search endpoint will need a couple of GET parameters. First, we need to specify the string for it to search on. We also want basic pagination: how many results to return, and where in the list of results to start.

  • q - string for search term
  • from - start index in the list of results
  • size - number of results to return

Populate

The populate endpoint needs only one GET parameter, which specifies the number of results for it to generate.

  • number - the number of results to insert into Elasticsearch

Endpoints

Now that we know the structure of our microservice and which endpoints and parameters we need, let's start with the code!

First things first, let's set up a basic main.go file with the endpoints and a basic HTTP server.

package main

import (
    "log"
    "net/http"
)

func main() {
    mux := http.NewServeMux()

    mux.HandleFunc("/populate", func(w http.ResponseWriter, req *http.Request) {
        
    })

    mux.HandleFunc("/search", func(w http.ResponseWriter, req *http.Request) {

    })

    log.Fatal(http.ListenAndServe(":8000", mux))
}

Since this is a reasonably small microservice, we are going to build the endpoints out in main.go. Generally, in production, or if you have more logic, it is good to move these endpoints out into a separate file.

Next, let's build out the search endpoint. We need to retrieve the GET parameters, which is relatively verbose with the Go standard library, so I'm leaving that helper out of the tutorial; if you're interested, it's here on GitHub.

After getting the parameters, we check whether they're valid; if not, we send a response letting the client know it's an invalid request. After that, we pass the term, from, and size parameters into our Elasticsearch Search helper function, which we will create in the next section. The function queries Elasticsearch and returns the results, and an error if an issue occurs. From there we JSON-encode the response and send it to the client (note that this handler needs encoding/json added to main.go's imports). Pretty simple, eh?

mux.HandleFunc("/search", func(w http.ResponseWriter, req *http.Request) {
    term, from, size, ok := getQueryParams(req)
    if !ok {
        w.WriteHeader(http.StatusBadRequest)
        w.Write([]byte("Attach proper parameters"))
        return
    }
    res, err := Search(term, from, size)
    if err != nil {
        w.WriteHeader(http.StatusInternalServerError)
        w.Write([]byte("Error searching"))
        return
    }

    w.WriteHeader(http.StatusOK)
    json.NewEncoder(w).Encode(res)
})

The next task is to create the endpoint for populating Elasticsearch. Having a search endpoint won't be much good if we have no data to search for!

The first thing we do is grab the number GET parameter and turn it into an integer with strconv.Atoi (so strconv joins main.go's imports). If it is invalid, we return a bad request response to the client. The next step is to populate Elasticsearch, which is done with a helper function that we will build later in this post. Finally, we return an error if the populate function fails.

mux.HandleFunc("/populate", func(w http.ResponseWriter, req *http.Request) {
    numberArr, ok := req.URL.Query()["number"]
    if !ok {
        w.WriteHeader(http.StatusBadRequest)
        w.Write([]byte("Attach proper parameters"))
        return
    }
    numberStr := numberArr[0]
    number, err := strconv.Atoi(numberStr)
    if err != nil {
        w.WriteHeader(http.StatusBadRequest)
        w.Write([]byte("Attach proper parameters"))
        return
    }
    err = Populate(number)
    if err != nil {
        w.WriteHeader(http.StatusInternalServerError)
        w.Write([]byte(err.Error()))
        return
    }
    w.WriteHeader(http.StatusOK)
})

Now that we've built out the endpoints, we need to tie it all together with the Elasticsearch helper functions.

Elasticsearch Helpers

The final piece of building the microservice is hooking in Elasticsearch. Let's make the helpers now, we'll do this in a new file, elastic.go.

Let's scaffold out how the file should look. First, we need a model to create and retrieve JSON for Elasticsearch, so create a User struct with the following fields:

  • Username with a JSON tag of username
  • Email with a JSON tag of email
  • RealName with a JSON tag of real_name

We also need to create the Populate and Search functions.

package main

import (
    "context"
    "encoding/json"

    "github.com/icrowley/fake"
    "github.com/olivere/elastic"
)

type User struct {
    Username string `json:"username"`
    Email    string `json:"email"`
    RealName string `json:"real_name"`
}

func Populate(number int) error {
    return nil // implemented below
}

func Search(term string, from, size int) ([]*User, error) {
    return nil, nil // implemented below
}

Now that we have the layout figured out, the next step is to implement the functions. We are using github.com/olivere/elastic to communicate with Elasticsearch.

Let's now create the logic for the Search function. The first step is to connect to Elasticsearch, which we do by creating a new client. If the connection is successful, we build up a query. We are using a multi-match query; if you are curious about the different options, here are the Elasticsearch docs.

Finally, we call the Search method and pass in the proper parameters: the users index, the query from the line above, and the pagination.

The response has quite a few interesting parts, but since this microservice isn't doing any logging or analytics, we're only interested in the results. We loop through the hits, use json.Unmarshal to decode each one into our struct, and append the structs to a slice to return to the client.

func Search(term string, from, size int) ([]*User, error) {
    client, err := elastic.NewClient(elastic.SetURL("http://elasticsearch:9200"))
    if err != nil {
        return nil, err
    }
    q := elastic.NewMultiMatchQuery(term, "username", "email", "real_name").Fuzziness("AUTO:2,5")
    res, err := client.Search().
        Index("users").
        Query(q).
        From(from).
        Size(size).
        Do(context.Background())
    if err != nil {
        return nil, err
    }
    users := make([]*User, 0)

    for _, hit := range res.Hits.Hits {
        var user User
        err := json.Unmarshal(*hit.Source, &user)
        if err != nil {
            return nil, err
        }
        users = append(users, &user)
    }
    return users, nil
}

The populate function starts the same, we need to connect to the client. But we check if the index exists, and if not we create it. Then we use a faker library to generate the number of users passed into the function and insert them into Elasticsearch.

func Populate(number int) error {
    client, err := elastic.NewClient(elastic.SetURL("http://elasticsearch:9200"))
    if err != nil {
        return err
    }

    idxExists, err := client.IndexExists("users").Do(context.Background())
    if err != nil {
        return err
    }
    if !idxExists {
        // CreateIndex can fail too, so check its error as well.
        _, err = client.CreateIndex("users").Do(context.Background())
        if err != nil {
            return err
        }
    }

    for i := 0; i < number; i++ {
        user := User{
            Username: fake.UserName(),
            Email: fake.EmailAddress(),
            RealName: fake.FullName(),
        }
        _, err = client.Index().
            Index("users").
            Type("doc").
            BodyJson(user).
            Do(context.Background())
        if err != nil {
            return err
        }
    }
    return nil
}
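Since we never define a mapping, Elasticsearch infers one dynamically when the first document is indexed, which is fine for this demo. If you wanted explicit control, you could pass a mapping body to CreateIndex; a hypothetical ES 6.x mapping for our doc type might look like:

```json
{
  "mappings": {
    "doc": {
      "properties": {
        "username":  { "type": "text" },
        "email":     { "type": "text" },
        "real_name": { "type": "text" }
      }
    }
  }
}
```

This is optional here, since 6.x dynamic mapping already maps string values to text fields.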

Docker Configuration

For this system, I chose docker-compose to link the search microservice to Elasticsearch. There is an excellent tutorial on this in the Elasticsearch docs, which I followed with a couple of small modifications. The only parts we really need to focus on are the Dockerfile and linking the service to Elasticsearch via docker-compose.

The Dockerfile is pretty basic: we grab the Go 1.10 alpine Docker image, add dep as our dependency manager, add our code into the GOPATH, fetch dependencies with dep, build the program, and then run it.

FROM golang:1.10-alpine

LABEL authors="Ryan McCue <[email protected]>"

RUN apk add --no-cache ca-certificates openssl git
RUN wget -O /usr/local/bin/dep https://github.com/golang/dep/releases/download/v0.4.1/dep-linux-amd64 && \
  chmod +x /usr/local/bin/dep

RUN mkdir /go/src/app

ADD . /go/src/app/

WORKDIR /go/src/app

RUN dep ensure

RUN go build -o main .

CMD ["/go/src/app/main"]

The docker-compose.yml is pretty standard; it is based off this, with the search service added and attached to the esnet network.

version: '2.2'
services:
  search:
    container_name: search
    build:
      context: .
      dockerfile: ./Dockerfile
    volumes:
      - ./search:/www
    ports:
      - "8080:8000"
    networks:
      - esnet
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4
    container_name: elasticsearch
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata1:/usr/share/elasticsearch/data
    healthcheck:
      test: "curl -f http://localhost:9200 || exit 1"
      interval: 1s
      retries: 20
    networks:
      - esnet
  elasticsearch2:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4
    container_name: elasticsearch2
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "discovery.zen.ping.unicast.hosts=elasticsearch"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - esdata2:/usr/share/elasticsearch/data
    networks:
      - esnet

volumes:
  esdata1:
    driver: local
  esdata2:
    driver: local

networks:
  esnet:

To test that the files work together as expected, run docker-compose build to build the search microservice, and then docker-compose up to run it. The Elasticsearch images will be pulled and started on docker-compose up, so don't be alarmed when you see that output.

Running It

Now that the microservice is built and communicating with Elasticsearch, it's time to put it to the test. Let's hit the endpoints and see what happens!

First, we must populate Elasticsearch with results, so let's hit the /populate endpoint with the URL below:

http://localhost:8080/populate?number=100

Once it is populated, the next step is to search the results. Since we're using Faker, the names aren't predetermined, so you might have to try a couple of names before you see results; but if you populate more than 10 users and search common names, you should get hits. You can hit the search endpoint with the URL below:

http://localhost:8080/search?q=ryan&from=0&size=20

Conclusion

This post is meant to show how easily you can make a microservice for a concern such as search and hook it up to Elasticsearch. In a real-world system, you would not be generating random data; you would likely be using webhooks or queue systems to populate the search index, but the premise is the same. If you have any questions, or would like to know why I did something, please comment below!


About Ryan McCue

Hi, my name is Ryan! I am a Software Developer with experience in many web frameworks and libraries including NodeJS, Django, Golang, and Laravel.