Elasticsearch + Elixir: create your own SDK

This article is the second part of http://radzserg.com/2017/01/12/how-your-code-should-interact-with-elasticsearch/. This time we are finally going to add some code. And yes, we are going to work with Elixir. But we also want to create our own SDK to work with Elasticsearch.

Before we go

First of all – I know that there are existing packages and I even tried some of them. I tried Zatvobor/tirexs, which looks impressive, but when I wanted to add something more complex I could not do it. There is probably a way to send raw JSON, but I couldn't find it. And I can send low-level JSON by myself.

On the other hand, I think the REST API that Elasticsearch provides is all you need. I mean, all you need is an HTTP client and a JSON decoder. That's it. From my experience, as soon as you need something more complex from ES, your SDK will either make you create raw JSON anyway or won't be able to do it at all 🙂

And finally, I'm still working with Elixir for fun. So this is another chance to do something interesting. Nobody says that you have to go this way; it's only one possible way to go.

What we need

Getting back to part #1. We need:

  • a low-level client that will talk to Elasticsearch
  • modules that implement ES models – they know how to build an index and how to search in it
  • a high-level module that simplifies common CRUD actions
  • some extra tools to sync PostgreSQL and ES

Final Result

This is a draft of what we're going to build.
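As a rough outline of the pieces (the module names here are illustrative placeholders used throughout the sketches in this article, not necessarily the ones from the original code):

    # App.EsClient            – low-level HTTP client that talks to Elasticsearch
    # App.EsDocument          – shared functionality for all ES document modules
    # App.EsDocument.Account  – a concrete index: mapping, build_body, build_query_dsl
    # App.EsQueryDsl          – helper that assembles the final query DSL
    # Mix.Tasks.Es.Sync       – mix task that re-syncs PostgreSQL data into ES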

Low level client

I think that's the most interesting part. As I mentioned above, I dropped the idea of using existing solutions, because creating a custom one is really easy. We can add what we need and extend our module when we need some extra functionality.

Now let's see what happens here. If you are familiar with the ES DSL, I'm sure this code is straightforward for you. Most of the functions are reflections of the ES DSL, for example index_exists(index), put_mapping(index, mapping), etc.

But let's talk about the build_url function. Unlike the other functions, this one is the most custom here. I rely on the convention over configuration principle. Some time ago I used different types inside one index, but then the Elastic team published an update on their blog – index vs type. I reviewed my projects and figured out that I don't need types at all: in all my projects I needed one type per index. So I made the following convention: the index name will equal the type name, i.e. index/type = product/product. For local environments you may also need the ability to keep indexes for multiple ENVs – the main index and a test one. For this reason I also added the ability to specify an index prefix, so you can use dev-product/product and test-product/product. The build_url method uses these conventions. I'll probably change or extend it in the future, but so far I'm satisfied with it.
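To make this concrete, here is a minimal sketch of such a client. It assumes HTTPoison as the HTTP client and Poison as the JSON codec; the module name, config keys and helper names are illustrative, not the exact ones from the original code:

    defmodule App.EsClient do
      @moduledoc "Minimal low-level Elasticsearch client (sketch)."

      # assumed config keys – adjust to your own application
      @es_host      Application.get_env(:app, :es_host, "http://localhost:9200")
      @index_prefix Application.get_env(:app, :es_index_prefix, "")

      @json_headers [{"Content-Type", "application/json"}]

      # HEAD /index – 200 means the index exists
      def index_exists(index) do
        match?(%HTTPoison.Response{status_code: 200}, HTTPoison.head!(index_url(index)))
      end

      # PUT /index with the mappings in the body (re)creates the index
      def put_mapping(index, mapping) do
        HTTPoison.put!(index_url(index), Poison.encode!(%{mappings: mapping}), @json_headers)
      end

      # DELETE /index
      def delete_index(index) do
        HTTPoison.delete!(index_url(index))
      end

      # PUT /index/type/id – insert or full replace under the PostgreSQL id
      def upsert(index, id, doc) do
        HTTPoison.put!(build_url(index, id), Poison.encode!(doc), @json_headers)
      end

      # POST /index/type/_search with the query DSL in the body
      def search(index, query) do
        HTTPoison.post!("#{build_url(index)}/_search", Poison.encode!(query), @json_headers).body
        |> Poison.decode!()
      end

      # the convention: type name == index name, plus an optional env prefix,
      # e.g. with prefix "dev-": build_url("product", 3) => ".../dev-product/product/3"
      def build_url(index, id \\ nil)
      def build_url(index, nil), do: "#{index_url(index)}/#{index}"
      def build_url(index, id),  do: "#{index_url(index)}/#{index}/#{id}"

      defp index_url(index), do: "#{@es_host}/#{@index_prefix}#{index}"
    end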

upsert is another method that is built on a convention. As I mentioned in the previous article, I do not use Elasticsearch as primary data storage. It's just a snapshot of the data that we need from the main database (MySQL, PostgreSQL) for searching. This means that I need to be able to match data in PG and ES. Obviously we can use a PK that will be the same in PG and ES, so PG product.id = 3 will match the product/product/3 URL in ES. For any insert/update operation in ES we will already have the ID from PG, which means we don't need the POST operation for our updates – in every case we can rely on PUT. Check more info here.
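So an insert and an update look exactly the same from the caller's side. A hypothetical usage of the upsert from the sketch above:

    # id 3 comes straight from the products table in PostgreSQL
    App.EsClient.upsert("product", 3, %{id: 3, name: "some product"})
    # => PUT http://localhost:9200/product/product/3 – works for both insert and update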

Time to build an index

Now that we have the low-level client, we can work on our domain models. As an example, I'm going to build an index for the account model. But before we start working on a particular model, I want to encapsulate the common functionality that will be used in all EsDocument.* modules in an additional module.

As you can see, we inject 3 methods:

  • truncate – deletes the index and then recreates it by putting the mapping
  • save – this method uses another one, build_body, that has to be overridden in the EsDocument.* modules. build_body fetches the data from the PG model that we need for searching. Then save does the upsert explained above.
  • search – similar to save, this method relies on a custom implementation of build_query_dsl, which knows how to build the query DSL from raw filter params. It also keeps some logic to format the results.

So the things that we need to implement in the custom EsDocument.* modules are build_body and build_query_dsl; we also need to know the index name, index(), and we need to provide the mapping data for the specific index.
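Here is a sketch of that shared module, assuming the App.EsClient from above. The callback names follow the article; everything else (the __using__ macro, the behaviour) is just one possible way to wire it up:

    defmodule App.EsDocument do
      @moduledoc "Shared functionality for all App.EsDocument.* modules (sketch)."

      # every concrete document module has to provide these
      @callback index() :: String.t()
      @callback mapping() :: map()
      @callback build_body(struct()) :: map()
      @callback build_query_dsl(map()) :: map()

      defmacro __using__(_opts) do
        quote do
          @behaviour App.EsDocument

          # drop the index and recreate it from the mapping
          def truncate do
            App.EsClient.delete_index(index())
            App.EsClient.put_mapping(index(), mapping())
          end

          # build the ES body from the PG model and upsert it under the PG id
          def save(model) do
            App.EsClient.upsert(index(), model.id, build_body(model))
          end

          # build the query DSL from raw filter params and run the search
          def search(filter_params) do
            App.EsClient.search(index(), build_query_dsl(filter_params))
          end
        end
      end
    end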

Building custom index

Say we want to search by first_name, last_name, category_id and tags. This means that we will need to fetch this data from the PostgreSQL Account model and its related models.

This code is very custom and refers to your specific app and its domain models. It's a demo of how I compile the ES document from the PostgreSQL models. The result is just a simple map that will be encoded to JSON.
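For illustration only – the schema fields below are assumptions about a typical Account model, not the original code – such a module could look like this:

    defmodule App.EsDocument.Account do
      use App.EsDocument

      def index, do: "account"

      # one type per index, named like the index; field types depend on your
      # ES version ("string" in 2.x, "text"/"keyword" in 5.x+)
      def mapping do
        %{
          account: %{
            properties: %{
              first_name:  %{type: "text"},
              last_name:   %{type: "text"},
              category_id: %{type: "integer"},
              tags:        %{type: "keyword"}
            }
          }
        }
      end

      # flatten the PG Account (with its preloaded associations) into a plain map
      def build_body(account) do
        %{
          id:          account.id,
          first_name:  account.first_name,
          last_name:   account.last_name,
          category_id: account.category_id,
          tags:        Enum.map(account.tags, & &1.name)
        }
      end

      # build_query_dsl/1 is sketched in the Searching section below
    end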

Searching

For searching I need another helper module. It keeps some common functions that build the final DSL query. (This code could be changed.)

It's very likely that this module will be updated and extended. So far I have defined a structure that keeps the main fields for the search. This struct will be encoded into the final query DSL JSON string.
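A sketch of what such a struct and helper could look like – the field and function names here are assumptions, not the original EsQueryDsl:

    defmodule App.EsQueryDsl do
      @moduledoc "Helper that assembles the final query DSL (sketch)."

      # the main building blocks of a search request
      defstruct query:   %{match_all: %{}},
                _source: ["id"],        # by default only the id comes back from ES
                from:    0,
                size:    25,
                sort:    []

      # add a full-text match clause
      def match(%__MODULE__{} = dsl, field, value) do
        put_in_bool(dsl, :must, %{match: %{field => value}})
      end

      # add an exact-value filter clause
      def filter_term(%__MODULE__{} = dsl, field, value) do
        put_in_bool(dsl, :filter, %{term: %{field => value}})
      end

      # encode straight to the JSON body Elasticsearch expects
      def to_json(%__MODULE__{} = dsl) do
        dsl |> Map.from_struct() |> Poison.encode!()
      end

      defp put_in_bool(%__MODULE__{query: query} = dsl, clause, condition) do
        bool =
          query
          |> Map.get(:bool, %{})
          |> Map.update(clause, [condition], &[condition | &1])

        %{dsl | query: %{bool: bool}}
      end
    end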

Note: by default I'm going to fetch only the id field from ES, and then find the PostgreSQL models using the received IDs. In most cases this is the more convenient way for me: I can preload other related data onto the resulting models, or get info that I don't have in ES from the PG Account model. The exception is autocomplete search – it should be quick, and as a rule all I need is the ID and the name/title fields. We will handle this case in the next example.

This is custom code that builds the DSL query based on the received filter_params. App.EsDocument.EsAdvAccount knows how to build it for a certain index. As you see, if we encode the resulting map to JSON we get a raw ES query DSL. As I mentioned before, I'll probably update EsQueryDsl in the future, but so far it's simple.
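As an illustration of that idea (continuing the hypothetical App.EsDocument.Account module from above – the real App.EsDocument.EsAdvAccount certainly differs), build_query_dsl/1 could be assembled from the helper like this:

    # Inside App.EsDocument.Account (sketch): turn raw filter params into query DSL.
    # Thanks to the _source default above, only the id field comes back from ES.
    def build_query_dsl(filter_params) do
      %App.EsQueryDsl{}
      |> maybe_match(:first_name, filter_params["first_name"])
      |> maybe_match(:last_name, filter_params["last_name"])
      |> maybe_filter(:category_id, filter_params["category_id"])
      |> maybe_filter(:tags, filter_params["tags"])
      |> Map.from_struct()
    end

    defp maybe_match(dsl, _field, nil), do: dsl
    defp maybe_match(dsl, field, value), do: App.EsQueryDsl.match(dsl, field, value)

    defp maybe_filter(dsl, _field, nil), do: dsl
    defp maybe_filter(dsl, field, value), do: App.EsQueryDsl.filter_term(dsl, field, value)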

Sync data

Additionally, I'm adding an example of a mix task that can synchronize data between PG and ES.
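A minimal sketch of such a task, reusing the modules above (the Repo and schema names are assumptions):

    defmodule Mix.Tasks.Es.Sync do
      use Mix.Task

      @shortdoc "Rebuilds the Elasticsearch index from PostgreSQL"

      # mix es.sync – truncate the index and re-save every account
      def run(_args) do
        Mix.Task.run("app.start")   # boot the app so Repo and the HTTP client are available

        App.EsDocument.Account.truncate()

        App.Repo.all(App.Account)
        |> App.Repo.preload(:tags)
        |> Enum.each(&App.EsDocument.Account.save/1)
      end
    end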

Conclusion

I'll repeat myself – it's not a final solution, it's a draft, a pattern that you can use. My main purpose was to show how simply this can be implemented without any SDK. Frankly speaking, I rewrote some modules a few times while writing this article. But finally I'm satisfied with the result, at least today LOL.

Original post is here http://radzserg.com/2017/02/01/elasticsearch-elixir-create-own-sdk/
