This article shows how to transfer data from a MS SQL Server 2014 to Elasticsearch. The MS SQL Server is accessed using Entity Framework (Code first from a database). Elasticsearch documents are created using ElasticsearchCRUD and inserted in bulk requests. The Entity Framework entities can be used directly in ElasticsearchCRUD. Either the whole entity including all nested entities can be saved as a single document, or just the main entity without any child objects.
Code: https://github.com/damienbod/DataTransferSQLWithEntityFrameworkToElasticsearch
Other tutorials:
Part 1: ElasticsearchCRUD introduction
Part 2: MVC application search with simple documents using autocomplete, jQuery and jTable
Part 3: MVC Elasticsearch CRUD with nested documents
Part 4: Data Transfer from MS SQL Server using Entity Framework to Elasticsearch
Part 5: MVC Elasticsearch with child, parent documents
Part 6: MVC application with Entity Framework and Elasticsearch
Part 7: Live Reindex in Elasticsearch
Part 8: CSV export using Elasticsearch and Web API
Part 9: Elasticsearch Parent, Child, Grandchild Documents and Routing
Part 10: Elasticsearch Type mappings with ElasticsearchCRUD
Part 11: Elasticsearch Synonym Analyzer using ElasticsearchCRUD
Part 12: Using Elasticsearch German Analyzer
Part 13: MVC google maps search using Elasticsearch
Part 14: Search Queries and Filters with ElasticsearchCRUD
Part 15: Elasticsearch Bulk Insert
Part 16: Elasticsearch Aggregations With ElasticsearchCRUD
Part 17: Searching Multiple Indices and Types in Elasticsearch
Part 18: MVC searching with Elasticsearch Highlighting
Part 19: Index Warmers with ElasticsearchCRUD
AdventureWorks2012 is used as the database.It can be downloaded here. You need to install the database before the code will work.
The Application
Create a new console application and download ElasticsearchCRUD and Entity Framework from NuGet.
Creating the code first database from the AdventureWorks database.
Add a new item to the project, select ADO.NET Entity Data Model:
Now select the code first from database option. The database already exists.
Add all the tables from the Person schema. The Address table and the Person table will be used as the document roots.
The created Address class needs to be changed. The DbGeography SpatialLocation has to be removed because this is not a supported type. In ElasticsearchCRUD V1.0.8 or later, this can be ignored using the JsonIgnore attribute.
namespace DataTransferSQLToEl.SQLDomainModel { using System; using System.Collections.Generic; using System.ComponentModel.DataAnnotations; using System.ComponentModel.DataAnnotations.Schema; using System.Data.Entity.Spatial; [Table("Person.Address")] public partial class Address { public int AddressID { get; set; } [Required] [StringLength(60)] public string AddressLine1 { get; set; } [StringLength(60)] public string AddressLine2 { get; set; } [Required] [StringLength(30)] public string City { get; set; } public int StateProvinceID { get; set; } [Required] [StringLength(15)] public string PostalCode { get; set; } // This type is not supported yet... //public DbGeography SpatialLocation { get; set; } public Guid rowguid { get; set; } public DateTime ModifiedDate { get; set; } public virtual StateProvince StateProvince { get; set; } } }
Select the entities and add the documents to Elasticsearch. The Address transfer is implemented as follows:
public void SaveToElasticsearchAddress() { IElasticsearchMappingResolver elasticsearchMappingResolver = new ElasticsearchMappingResolver(); using (var elasticsearchContext = new ElasticsearchContext("http://localhost:9200/", elasticsearchMappingResolver)) { //elasticsearchContext.TraceProvider = new ConsoleTraceProvider(); using (var modelPerson = new ModelPerson()) { int pointer = 0; const int interval = 100; int length = modelPerson.CountryRegion.Count(); while (pointer < length) { stopwatch.Start(); var collection = modelPerson.Address.OrderBy(t => t.AddressID).Skip(pointer).Take(interval).ToList<Address>(); stopwatch.Stop(); Console.WriteLine("Time taken for select {0} AddressID: {1}", interval, stopwatch.Elapsed); stopwatch.Reset(); foreach (var item in collection) { elasticsearchContext.AddUpdateDocument(item, item.AddressID); string t = "yes"; } stopwatch.Start(); elasticsearchContext.SaveChanges(); stopwatch.Stop(); Console.WriteLine("Time taken to insert {0} AddressID documents: {1}", interval, stopwatch.Elapsed); stopwatch.Reset(); pointer = pointer + interval; Console.WriteLine("Transferred: {0} items", pointer); } } } }
One hundred entities are selected at a time and added to the ElasticsearchCRUD context. This just adds the objects to an in memory collection. When the SaveChanges method is called, each entity is serialized to a JSON object. when all items have been serialized, a HttpClient instance sends all objects in a HTTP bulk POST request to Elasticsearch. This is repeated until all items have been transferred. ElasticsearchCRUD serializes all child elements, 1-N only. Any references to parent objects, which have already been converted are ignored and saved as a null property.
The created Elasticsearch mapping is as follows (For both Person and Address documents):
{ "addresss": { "mappings": { "address": { "properties": { "addressid": { "type": "long" }, "addressline1": { "type": "string" }, "addressline2": { "type": "string" }, "city": { "type": "string" }, "modifieddate": { "type": "date", "format": "dateOptionalTime" }, "postalcode": { "type": "string" }, "rowguid": { "type": "string" }, "stateprovince": { "properties": { "countryregion": { "properties": { "countryregioncode": { "type": "string" }, "modifieddate": { "type": "date", "format": "dateOptionalTime" }, "name": { "type": "string" } } }, "countryregioncode": { "type": "string" }, "isonlystateprovinceflag": { "type": "boolean" }, "modifieddate": { "type": "date", "format": "dateOptionalTime" }, "name": { "type": "string" }, "rowguid": { "type": "string" }, "stateprovincecode": { "type": "string" }, "stateprovinceid": { "type": "long" }, "territoryid": { "type": "long" } } }, "stateprovinceid": { "type": "long" } } } } }, "persons": { "mappings": { "person": { "properties": { "additionalcontactinfo": { "type": "string" }, "businessentityid": { "type": "long" }, "demographics": { "type": "string" }, "emailaddress": { "properties": { "businessentityid": { "type": "long" }, "emailaddress1": { "type": "string" }, "emailaddressid": { "type": "long" }, "modifieddate": { "type": "date", "format": "dateOptionalTime" }, "rowguid": { "type": "string" } } }, "emailpromotion": { "type": "long" }, "firstname": { "type": "string" }, "lastname": { "type": "string" }, "middlename": { "type": "string" }, "modifieddate": { "type": "date", "format": "dateOptionalTime" }, "namestyle": { "type": "boolean" }, "personphone": { "properties": { "businessentityid": { "type": "long" }, "modifieddate": { "type": "date", "format": "dateOptionalTime" }, "phonenumber": { "type": "string" }, "phonenumbertype": { "properties": { "modifieddate": { "type": "date", "format": "dateOptionalTime" }, "name": { "type": "string" }, "phonenumbertypeid": { "type": "long" } } }, "phonenumbertypeid": { "type": "long" } } }, "persontype": { "type": "string" }, "rowguid": { "type": "string" }, "suffix": { "type": "string" }, "title": { "type": "string" } } } } } }
The saved objects can be read as follows using ElasticsearchCRUD.
public Address GetAddressFromElasticsearch(int id) { Address address; IElasticsearchMappingResolver elasticsearchMappingResolver = new ElasticsearchMappingResolver(); using (var elasticsearchContext = new ElasticsearchContext("http://localhost:9200/", elasticsearchMappingResolver)) { address = elasticsearchContext.GetDocument<Address>(id); } return address; }
This can then be used in the console application. When completed, all objects with the nested child objects are saved as documents in Elasticsearch.
using System; namespace DataTransferSQLToEl { class Program { static void Main(string[] args) { var repo = new Repo(); repo.SaveToElasticsearchPerson(); repo.SaveToElasticsearchAddress(); var personX = repo.GetPersonFromElasticsearch(345); var addressX = repo.GetAddressFromElasticsearch(22); Console.WriteLine(addressX); Console.WriteLine(personX); } } }
If you want that only the parent entities are saved and that it contains no child entities (NESTED objects in Elasticsearch), you can set this option in the ElasticsearchCRUD context constructor:
bool saveChildEntitiesAsNestedObjects = false; using (var elasticSearchContext = new ElasticsearchContext( "http://localhost:9200/", elasticsearchMappingResolver, saveChildEntitiesAsNestedObjects ) ) { // Do you coding here... }
Links:
https://damienbod.wordpress.com/2014/09/22/elasticsearch-crud-net-provider/
i got error at ElasticsearchContext.SaveChanges();
stack trace: StackTrace = ” at ElasticsearchCRUD.Utils.SyncExecute.b__7[T](Exception x) in c:\\git\\damienbod\\ElasticsearchCRUD\\ElasticsearchCRUD\\Utils\\SyncExecute.cs:line 74\r\n at System.AggregateException.Handle(Func`2 predicate)\r\n at Elasticsear…
i dont know how to fix it