Transferring data to Elasticsearch from MS SQL Server using ElasticsearchCRUD and Entity Framework

This article shows how to transfer data from a MS SQL Server 2014 to Elasticsearch. The MS SQL Server is accessed using Entity Framework (Code first from a database). Elasticsearch documents are created using ElasticsearchCRUD and inserted in bulk requests. The Entity Framework entities can be used directly in ElasticsearchCRUD. Either the whole entity including all nested entities can be saved as a single document, or just the main entity without any child objects.

Code: https://github.com/damienbod/DataTransferSQLWithEntityFrameworkToElasticsearch

Other tutorials:

Part 1: ElasticsearchCRUD introduction
Part 2: MVC application search with simple documents using autocomplete, jQuery and jTable
Part 3: MVC Elasticsearch CRUD with nested documents
Part 4: Data Transfer from MS SQL Server using Entity Framework to Elasticsearch
Part 5: MVC Elasticsearch with child, parent documents
Part 6: MVC application with Entity Framework and Elasticsearch
Part 7: Live Reindex in Elasticsearch
Part 8: CSV export using Elasticsearch and Web API
Part 9: Elasticsearch Parent, Child, Grandchild Documents and Routing
Part 10: Elasticsearch Type mappings with ElasticsearchCRUD
Part 11: Elasticsearch Synonym Analyzer using ElasticsearchCRUD
Part 12: Using Elasticsearch German Analyzer
Part 13: MVC google maps search using Elasticsearch
Part 14: Search Queries and Filters with ElasticsearchCRUD
Part 15: Elasticsearch Bulk Insert
Part 16: Elasticsearch Aggregations With ElasticsearchCRUD
Part 17: Searching Multiple Indices and Types in Elasticsearch
Part 18: MVC searching with Elasticsearch Highlighting
Part 19: Index Warmers with ElasticsearchCRUD

AdventureWorks2012 is used as the database.It can be downloaded here. You need to install the database before the code will work.

The Application

Create a new console application and download ElasticsearchCRUD and Entity Framework from NuGet.

dt_el_sql_01

Creating the code first database from the AdventureWorks database.

Add a new item to the project, select ADO.NET Entity Data Model:

dt_el_sql_02

Now select the code first from database option. The database already exists.
dt_el_sql_03

Add all the tables from the Person schema. The Address table and the Person table will be used as the document roots.

dt_el_sql_04

The created Address class needs to be changed. The DbGeography SpatialLocation has to be removed because this is not a supported type. In ElasticsearchCRUD V1.0.8 or later, this can be ignored using the JsonIgnore attribute.

namespace DataTransferSQLToEl.SQLDomainModel
{
    using System;
    using System.Collections.Generic;
    using System.ComponentModel.DataAnnotations;
    using System.ComponentModel.DataAnnotations.Schema;
    using System.Data.Entity.Spatial;

    [Table("Person.Address")]
    public partial class Address
    {
        public int AddressID { get; set; }

        [Required]
        [StringLength(60)]
        public string AddressLine1 { get; set; }

        [StringLength(60)]
        public string AddressLine2 { get; set; }

        [Required]
        [StringLength(30)]
        public string City { get; set; }

        public int StateProvinceID { get; set; }

        [Required]
        [StringLength(15)]
        public string PostalCode { get; set; }

		// This type is not supported yet...
        //public DbGeography SpatialLocation { get; set; }

        public Guid rowguid { get; set; }

        public DateTime ModifiedDate { get; set; }

        public virtual StateProvince StateProvince { get; set; }
    }
}

Select the entities and add the documents to Elasticsearch. The Address transfer is implemented as follows:

public void SaveToElasticsearchAddress()
{
	IElasticsearchMappingResolver elasticsearchMappingResolver = new ElasticsearchMappingResolver();
	using (var elasticsearchContext = new ElasticsearchContext("http://localhost:9200/", elasticsearchMappingResolver))
	{
		//elasticsearchContext.TraceProvider = new ConsoleTraceProvider();
		using (var modelPerson = new ModelPerson())
		{
			int pointer = 0;
			const int interval = 100;
			int length = modelPerson.CountryRegion.Count();

			while (pointer < length)
			{
				stopwatch.Start();
				var collection = modelPerson.Address.OrderBy(t => t.AddressID).Skip(pointer).Take(interval).ToList<Address>();
				stopwatch.Stop();
				Console.WriteLine("Time taken for select {0} AddressID: {1}", interval, stopwatch.Elapsed);
				stopwatch.Reset();

				foreach (var item in collection)
				{
					elasticsearchContext.AddUpdateDocument(item, item.AddressID);
					string t = "yes";
				}

				stopwatch.Start();
				elasticsearchContext.SaveChanges();
				stopwatch.Stop();
				Console.WriteLine("Time taken to insert {0} AddressID documents: {1}", interval, stopwatch.Elapsed);
				stopwatch.Reset();
				pointer = pointer + interval;
				Console.WriteLine("Transferred: {0} items", pointer);
			}
		}
	}
}

One hundred entities are selected at a time and added to the ElasticsearchCRUD context. This just adds the objects to an in memory collection. When the SaveChanges method is called, each entity is serialized to a JSON object. when all items have been serialized, a HttpClient instance sends all objects in a HTTP bulk POST request to Elasticsearch. This is repeated until all items have been transferred. ElasticsearchCRUD serializes all child elements, 1-N only. Any references to parent objects, which have already been converted are ignored and saved as a null property.

The created Elasticsearch mapping is as follows (For both Person and Address documents):

{
    "addresss": {
        "mappings": {
            "address": {
                "properties": {
                    "addressid": {
                        "type": "long"
                    },
                    "addressline1": {
                        "type": "string"
                    },
                    "addressline2": {
                        "type": "string"
                    },
                    "city": {
                        "type": "string"
                    },
                    "modifieddate": {
                        "type": "date",
                        "format": "dateOptionalTime"
                    },
                    "postalcode": {
                        "type": "string"
                    },
                    "rowguid": {
                        "type": "string"
                    },
                    "stateprovince": {
                        "properties": {
                            "countryregion": {
                                "properties": {
                                    "countryregioncode": {
                                        "type": "string"
                                    },
                                    "modifieddate": {
                                        "type": "date",
                                        "format": "dateOptionalTime"
                                    },
                                    "name": {
                                        "type": "string"
                                    }
                                }
                            },
                            "countryregioncode": {
                                "type": "string"
                            },
                            "isonlystateprovinceflag": {
                                "type": "boolean"
                            },
                            "modifieddate": {
                                "type": "date",
                                "format": "dateOptionalTime"
                            },
                            "name": {
                                "type": "string"
                            },
                            "rowguid": {
                                "type": "string"
                            },
                            "stateprovincecode": {
                                "type": "string"
                            },
                            "stateprovinceid": {
                                "type": "long"
                            },
                            "territoryid": {
                                "type": "long"
                            }
                        }
                    },
                    "stateprovinceid": {
                        "type": "long"
                    }
                }
            }
        }
    },
    "persons": {
        "mappings": {
            "person": {
                "properties": {
                    "additionalcontactinfo": {
                        "type": "string"
                    },
                    "businessentityid": {
                        "type": "long"
                    },
                    "demographics": {
                        "type": "string"
                    },
                    "emailaddress": {
                        "properties": {
                            "businessentityid": {
                                "type": "long"
                            },
                            "emailaddress1": {
                                "type": "string"
                            },
                            "emailaddressid": {
                                "type": "long"
                            },
                            "modifieddate": {
                                "type": "date",
                                "format": "dateOptionalTime"
                            },
                            "rowguid": {
                                "type": "string"
                            }
                        }
                    },
                    "emailpromotion": {
                        "type": "long"
                    },
                    "firstname": {
                        "type": "string"
                    },
                    "lastname": {
                        "type": "string"
                    },
                    "middlename": {
                        "type": "string"
                    },
                    "modifieddate": {
                        "type": "date",
                        "format": "dateOptionalTime"
                    },
                    "namestyle": {
                        "type": "boolean"
                    },
                    "personphone": {
                        "properties": {
                            "businessentityid": {
                                "type": "long"
                            },
                            "modifieddate": {
                                "type": "date",
                                "format": "dateOptionalTime"
                            },
                            "phonenumber": {
                                "type": "string"
                            },
                            "phonenumbertype": {
                                "properties": {
                                    "modifieddate": {
                                        "type": "date",
                                        "format": "dateOptionalTime"
                                    },
                                    "name": {
                                        "type": "string"
                                    },
                                    "phonenumbertypeid": {
                                        "type": "long"
                                    }
                                }
                            },
                            "phonenumbertypeid": {
                                "type": "long"
                            }
                        }
                    },
                    "persontype": {
                        "type": "string"
                    },
                    "rowguid": {
                        "type": "string"
                    },
                    "suffix": {
                        "type": "string"
                    },
                    "title": {
                        "type": "string"
                    }
                }
            }
        }
    }

}

The saved objects can be read as follows using ElasticsearchCRUD.

public Address GetAddressFromElasticsearch(int id)
{
	Address address;
	IElasticsearchMappingResolver elasticsearchMappingResolver = new ElasticsearchMappingResolver();
	using (var elasticsearchContext = new ElasticsearchContext("http://localhost:9200/", elasticsearchMappingResolver))
	{
		address = elasticsearchContext.GetDocument<Address>(id);
	}

	return address;
}

This can then be used in the console application. When completed, all objects with the nested child objects are saved as documents in Elasticsearch.

using System;

namespace DataTransferSQLToEl
{
	class Program
	{
		static void Main(string[] args)
		{
			var repo = new Repo();
			repo.SaveToElasticsearchPerson();
			repo.SaveToElasticsearchAddress();

			var personX = repo.GetPersonFromElasticsearch(345);
			var addressX = repo.GetAddressFromElasticsearch(22);
			Console.WriteLine(addressX);
			Console.WriteLine(personX);
		}
	}
}

If you want that only the parent entities are saved and that it contains no child entities (NESTED objects in Elasticsearch), you can set this option in the ElasticsearchCRUD context constructor:

bool saveChildEntitiesAsNestedObjects = false;

using (var elasticSearchContext = 
              new ElasticsearchContext(
                 "http://localhost:9200/", 
                  elasticsearchMappingResolver, 
                  saveChildEntitiesAsNestedObjects 
              )
      )
{
  // Do you coding here...
}

Links:

https://damienbod.wordpress.com/2014/09/22/elasticsearch-crud-net-provider/

http://www.elasticsearch.org/

One comment

  1. david · · Reply

    i got error at ElasticsearchContext.SaveChanges();
    stack trace: StackTrace = ” at ElasticsearchCRUD.Utils.SyncExecute.b__7[T](Exception x) in c:\\git\\damienbod\\ElasticsearchCRUD\\ElasticsearchCRUD\\Utils\\SyncExecute.cs:line 74\r\n at System.AggregateException.Handle(Func`2 predicate)\r\n at Elasticsear…

    i dont know how to fix it

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.