Azure Search Indexers – Index data without writing code

Azure Search logo big

The Microsoft Azure Search team recently released a new preview version which includes some interesting new features. I’ve briefly covered them in my previous Azure Search Services post but wanted to dive a bit deeper into the Indexers.

An indexer is basically a content crawler written for a specific data source eliminating the need for writing code responsible for extract data from your data source and feed this data within the index. The indexer runs within the Azure context and it’s therefore not required to create a service for hosting this process (billing details are not at available yet).

NOTE: At the time of writing, it’s only possible to configure a DocumentDB and Azure SQL Database as a data source. However, this might change in the future depending on customer needs.

Azure Search Indexers – Getting started

Before we start creating an indexer, I necessary to understand what’s involved. There are basically 4 parts when setting up an indexer; a supported data source (containing your data), a pointer to your data source (also called data source, storing the connection string), an index and the Indexer.

For now, it isn’t possible to create neither the data source pointer nor the indexer using the Azure portal/PowerShell management API. Therefore it’s required to use the REST based API. I will be using Chrome’s Postman for creating and sending HTTP requests.

API Details can be found here https://msdn.microsoft.com/en-us/library/azure/dn946891.aspx

Test Project Setup

I just wanted to keep everything very simple and therefore configured the following test setup using the Azure Management Portal.

  • Install Chrome’s Postman (or a similar tool) for creating and sending HTTP requests (getpostman.com)
  • Create a new Azure SQL Database. I’m using the AdventureWorksLT [V12] sample database for sample data.

    Azure Search Indexers Database

  • Open the Properties blade of the database, select “Show database connection strings” and copy the ADO.NET version of the connection string

    Azure-Search-Indexers-Database-2

  • Connect to the database using SQL Server Management Studio and create the following view. This because it’s required having a column named id

    UPDATE: Eugene Shvets, who is working on the Azure Search team, informed me that you only need to have a column with the same name as index’s key field. Therefore creating a new view or altering the existing view isn’t required for this example.

  • Create a new Azure Search service
  • After completion, open the Search Service Essentials Blade and copy one of the Administration keys.
    Azure Search Indexers Key
  • Create a new Index (I’ve named mine index1) and add the following fields
    Name: String,Searchable,Retrievable
    Description: String,Searchable,Retrievable
    Azure Search Indexers Index Fields

Creating a new Search Service Data Source

  • Open Postman and make sure change the type to POST
  • Provide the URL pointing to your Search Service endpoint including the api-version as shown below https://[servicename].search.windows.net/datasources?api-version=2015-02-28-Preview
  • Within the headers section add Content-Type : application/json and api-key : [your admin key]
  • Provide the following message body however make sure to alter the connection string

    The container value [SalesLT].[vProductAndDescriptionIndex] points to the view created earlier.

Azure Search Indexers Data Source

Creating a new Search Service Indexer

  • Open Postman and make sure to change the type to POST
  • Provide the URL pointing to your Search Service endpoint including the api-version as shown below https://[servicename].search.windows.net/indexers?api-version=2015-02-28-Preview
  • Within the headers section add Content-Type : application/json and api-key : [your admin key]

    In this example index1 refers to the name of the search Index created earlier.

Run the indexer and retrieve indexer status information

  • Within Postman and make sure to change the type to POST
  • Provide the URL pointing to your Search Service endpoint including the api-version and name of the indexer as shown below
    https://[servicename].search.windows.net/indexers/[indexer name]//run?api-version=2015-02-28-Preview
  • Within the headers section add api-key : [your admin key]

The Get Indexer Status operation retrieves the current status and execution history of an indexer:

  • Change the type to POST
  • Provide the URL pointing to your Search Service endpoint including the api-version and name of the indexer as shown below
    https://[servicename].search.windows.net/indexers/[indexer name]/status?api-version=2015-02-28-Preview
  • Within the headers section add api-key : [your admin key]

Testing the index

The following URL template can be used to test the index. Just keep in mind that it might take some time before the data has been indexed, depending on the amount of data and the selected tier.

https:// [servicename].search.windows.net/indexes/[index name]//docs?search=finger&api-version=2015-02-28

As you can see, it’s really simple to get started with Azure Search. Just keep in mind that I haven’t covered options like Data Change and Deletion Detection Policies, indexing schedules and parameters, resetting an indexer and Mapping constraints. However the detailed documentation makes it easy to explore on your own.

  • DSWH

    Thank you so much for this!! I have spent more hours than I care to remember working on many applications to process search indexes. Getting back to the basics and working with an SQL view was just what I needed. THANK YOU!!

    • Kevin Bronsdijk

      Thank you for your comment. Great to hear this was helpful for you.

  • Scott Simpson

    I don’t see any details in their documentation about multiple indexers and collection fields. If you have data in two tables with the same key field, can you run them both as separate indexers into the same index, or do you have to compile it all into one view?
    Can you populate collections? Can you have multiple rows in a table populate the same field, e.g. for employment – UserID, Employer in a table and convert that to an Employer collection field with multiple values? If anyone can point me in the right direction, it would be much appreciated.

    • Kevin Bronsdijk

      Hi Scott, thank you for your comment. I’ve included the answers below:

      1) You can have multiple indexers writing into the same index, and you can reuse the same data source for multiple indexers. However, an indexer can only consume one data source at a time, and can only write to a single index.

      Now it’s up to you to decide if your scenario would benefit from having multiple indexers instead of using a view within SQL server. Just keep in mind that you will be using additional resources (50 indexers max)

      2) Azure Search supports using a collection based datatype called Collection(Edm.String). There are some restrictions, which are all well documented.

      https://msdn.microsoft.com/en-us/library/azure/dn798938.aspx – Supported data types (Azure Search)

      I’ve tested using this datatype in conjunction with an Azure Search Indexer and can confirm that this is working as expected.

      • Scott Simpson

        Thanks Kevin, much appreciated. I’m still not clear on the format you have to have the source data table/view in though to process sql data (that usually exists as multiple records) into a collection field in the index. Do you have to coalesce data into a single field to be pulled in (tab, space separated?). What exactly is a “list of strings” in this context.

        • Within the view I’m using a function responsible for flattening the data (in my case I’m just using simple tags like Wi-Fi, Parking, Nature etc…)

          The output of the view looks like this:

          | id | col_x | … | tags |
          | 12 | xyz | … | Wi-Fi, Wildlife, Countryside |
          | 14 | xyz | … | Parking, Honeymoon, Pool, Family |

          The documentation is somewhat disordered to say the least… for example the page below contains relevant information regarding the limitations of a collection field.

          https://msdn.microsoft.com/en-us/library/azure/dn798941.aspx – Create Index (Azure Search Service REST API)

          – Fields of type Edm.String or Collection(Edm.String) are searchable by default.
          – Collection(Edm.String) that are filterable do not undergo word-breaking, so comparisons are for exact matches only.
          – Fields of typeCollection(Edm.String) cannot be sortable.

        • Just wanted to let you know that I wrote a blog post covering cover all the necessary steps in great detail. – http://devslice.net/2015/06/azure-search-indexers-indexing-collections/

  • Roberto Prevato

    Hi, thanks for the article. I noticed an error: to get an indexer status, it’s necessary to use GET method, and not POST.

Post Navigation