Elasticsearch indexing works via the REST API, but not from C# code

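I am trying to index data that contains a geo point into Elasticsearch. When I index through code, it fails. When I index through the REST endpoint, it succeeds. However, I cannot find the difference between the JSON sent through the REST endpoint and the JSON sent when using the code.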

Here is the code used to configure the index (as a LINQPad program):

async Task Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultMappingFor<DataEntity>(m => m.IndexName("data").TypeName("_doc"));

    var client = new ElasticClient(connectionSettings);

    await client.CreateIndexAsync(
        "data",
        index => index.Mappings(mappings => mappings.Map<DataEntity>(mapping => mapping.AutoMap().Properties(
            properties => properties.GeoPoint(field => field.Name(x => x.Location))))));

//    var data = new DataEntity(new GeoLocationEntity(50, 30));
//            
//    var json = client.RequestResponseSerializer.SerializeToString(data);
//    json.Dump("JSON");
//            
//    var indexResult = await client.IndexDocumentAsync(data);
//    indexResult.DebugInformation.Dump("Debug Information");
}

public sealed class GeoLocationEntity
{
    [JsonConstructor]
    public GeoLocationEntity(
        double latitude,
        double longitude)
    {
        this.Latitude = latitude;
        this.Longitude = longitude;
    }

    [JsonProperty("lat")]
    public double Latitude { get; }

    [JsonProperty("lon")]
    public double Longitude { get; }
}

public sealed class DataEntity
{
    [JsonConstructor]
    public DataEntity(
        GeoLocationEntity location)
    {
        this.Location = location;
    }

    [JsonProperty("location")]
    public GeoLocationEntity Location { get; }
}

After running this, my mapping looks correct, since GET /data/_doc/_mapping returns:

{
  "data" : {
    "mappings" : {
      "_doc" : {
        "properties" : {
          "location" : {
            "type" : "geo_point"
          }
        }
      }
    }
  }
}

I can successfully add a document to the index through the Dev Console:

POST /data/_doc
{
  "location": {
    "lat": 88.59,
    "lon": -98.87
  }
}

The result is:

{
  "_index" : "data",
  "_type" : "_doc",
  "_id" : "RqpyjGgBZ27KOduFRIxL",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

However, when I uncomment the code in the LINQPad program above and run it, I get this error when indexing:

Invalid NEST response built from a unsuccessful low level call on POST: /data/_doc
# Audit trail of this API call:
 - [1] BadResponse: Node: http://localhost:9200/ Took: 00:00:00.0159927
# OriginalException: Elasticsearch.Net.ElasticsearchClientException: The remote server returned an error: (400) Bad Request.. Call: Status code 400 from: POST /data/_doc. ServerError: Type: mapper_parsing_exception Reason: "failed to parse" CausedBy: "Type: parse_exception Reason: "field must be either [lat], [lon] or [geohash]"" ---> System.Net.WebException: The remote server returned an error: (400) Bad Request.
   at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
   at Elasticsearch.Net.HttpWebRequestConnection.<>c__DisplayClass5_0`1.<RequestAsync>b__1(IAsyncResult r)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at Elasticsearch.Net.HttpWebRequestConnection.<RequestAsync>d__5`1.MoveNext()
   --- End of inner exception stack trace ---
# Request:
<Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
# Response:
<Response stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>

The dumped JSON looks like this:

{
  "location": {
    "latitude": 50.0,
    "longitude": 30.0
  }
}

So it matches the structure of the JSON that works from the Dev Console.

To work around this, I wrote a custom JsonConverter that serializes my GeoLocationEntity objects in the format {lat},{lon}:

public sealed class GeoLocationConverter : JsonConverter
{
    public override bool CanConvert(Type objectType) =>
        objectType == typeof(GeoLocationEntity);

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        var token = JToken.Load(reader);

        if (!(token is JValue))
        {
            throw new JsonSerializationException("Token was not a primitive.");
        }

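        var stringValue = (string)token;
        var split = stringValue.Split(',');
        // Parse with the invariant culture: the "lat,lon" string always uses '.' as the decimal separator.
        var latitude = double.Parse(split[0], System.Globalization.CultureInfo.InvariantCulture);
        var longitude = double.Parse(split[1], System.Globalization.CultureInfo.InvariantCulture);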

        return new GeoLocationEntity(latitude, longitude);
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var geoLocation = (GeoLocationEntity)value;

        if (geoLocation == null)
        {
            writer.WriteNull();
            return;
        }

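        // Format with the invariant culture so the decimal separator is always '.', whatever the machine's locale.
        var geoLocationValue = FormattableString.Invariant($"{geoLocation.Latitude},{geoLocation.Longitude}");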
        writer.WriteValue(geoLocationValue);
    }
}

Applying this JsonConverter to the serializer settings got me past the problem. However, I would rather not have to work around the issue like this.

Can anyone enlighten me as to how to solve this properly?

Answer:

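The 6.x Elasticsearch high level client, NEST, internalized its Json.NET dependency by:

> IL-merging the Json.NET assembly
> converting all of its types to internal
> renamespacing them under Nest.*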

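In practice, this means that the client has no direct dependency on Json.NET (read the release blog post to understand why we did this) and knows nothing about Json.NET types, including JsonPropertyAttribute or JsonConverter. This is why the dumped JSON in the question contains latitude and longitude rather than lat and lon: the [JsonProperty] attributes are ignored, and the client falls back to its default camel-cased property names, which a geo_point field rejects.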

There are a few ways to solve this. First, the following setup may be helpful during development

var defaultIndex = "default-index";
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));

var settings = new ConnectionSettings(pool)
    .DefaultMappingFor<DataEntity>(m => m
        .IndexName(defaultIndex)
        .TypeName("_doc")
    )
    .DisableDirectStreaming()
    .PrettyJson()
    .OnRequestCompleted(callDetails =>
    {
        if (callDetails.RequestBodyInBytes != null)
        {
            Console.WriteLine(
                $"{callDetails.HttpMethod} {callDetails.Uri} \n" +
                $"{Encoding.UTF8.GetString(callDetails.RequestBodyInBytes)}");
        }
        else
        {
            Console.WriteLine($"{callDetails.HttpMethod} {callDetails.Uri}");
        }

        Console.WriteLine();

        if (callDetails.ResponseBodyInBytes != null)
        {
            Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
                     $"{Encoding.UTF8.GetString(callDetails.ResponseBodyInBytes)}\n" +
                     $"{new string('-', 30)}\n");
        }
        else
        {
            Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
                     $"{new string('-', 30)}\n");
        }
    });

var client = new ElasticClient(settings);

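This writes all requests and responses out to the console, so you can see what the client sends to and receives from Elasticsearch. .DisableDirectStreaming() buffers the request and response bytes in memory so that they are available to the delegate passed to .OnRequestCompleted(). That makes it useful during development, but you will probably not want it in production because of its performance cost.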

Now, the solutions:

1. Use PropertyNameAttribute

Instead of using JsonPropertyAttribute, you can use PropertyNameAttribute to name the properties for serialization.

public sealed class GeoLocationEntity
{
    public GeoLocationEntity(
        double latitude,
        double longitude)
    {
        this.Latitude = latitude;
        this.Longitude = longitude;
    }

    [PropertyName("lat")]
    public double Latitude { get; }

    [PropertyName("lon")]
    public double Longitude { get; }
}

public sealed class DataEntity
{
    public DataEntity(
        GeoLocationEntity location)
    {
        this.Location = location;
    }

    [PropertyName("location")]
    public GeoLocationEntity Location { get; }
}

and used like so

if (client.IndexExists(defaultIndex).Exists)
    client.DeleteIndex(defaultIndex);


var createIndexResponse = client.CreateIndex(defaultIndex, c => c 
    .Mappings(m => m
        .Map<DataEntity>(mm => mm
            .AutoMap()
            .Properties(p => p
                .GeoPoint(g => g
                    .Name(n => n.Location)
                )
            )
        )
    )
);

var indexResponse = client.Index(
    new DataEntity(new GeoLocationEntity(88.59, -98.87)), 
    i => i.Refresh(Refresh.WaitFor)
);

var searchResponse = client.Search<DataEntity>(s => s
    .Query(q => q
        .MatchAll()
    )
);

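PropertyNameAttribute behaves much like JsonPropertyAttribute as you would typically use it with Json.NET.

2. Use DataMemberAttribute

In this case it works the same as PropertyNameAttribute, and is an option if you would prefer that your POCOs not be attributed with NEST types (although I would argue that the POCOs are tied to Elasticsearch anyway, so tying them to .NET Elasticsearch types is probably not an issue).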
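No code is shown for this option, so here is a minimal sketch of what the POCOs might look like with DataMemberAttribute, mirroring option 1. It assumes, as the GeoLocation example under option 3 does, that NEST 6.x honours [DataMember(Name = ...)] without a [DataContract] on the class. The WireNames helper is hypothetical and not part of NEST; it only reads the attributes back through reflection so the intended field names can be checked without a running cluster.

```csharp
using System;
using System.Linq;
using System.Reflection;
using System.Runtime.Serialization;

public sealed class GeoLocationEntity
{
    public GeoLocationEntity(double latitude, double longitude)
    {
        this.Latitude = latitude;
        this.Longitude = longitude;
    }

    // geo_point objects must use the keys "lat" and "lon".
    [DataMember(Name = "lat")]
    public double Latitude { get; }

    [DataMember(Name = "lon")]
    public double Longitude { get; }
}

public sealed class DataEntity
{
    public DataEntity(GeoLocationEntity location)
    {
        this.Location = location;
    }

    [DataMember(Name = "location")]
    public GeoLocationEntity Location { get; }
}

// Hypothetical helper (not a NEST API): lists the names declared through
// DataMemberAttribute, falling back to the CLR property name.
public static class WireNames
{
    public static string[] Of(Type type) =>
        type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Select(p => p.GetCustomAttribute<DataMemberAttribute>()?.Name ?? p.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
}

public static class Program
{
    public static void Main()
    {
        // prints "lat, lon"
        Console.WriteLine(string.Join(", ", WireNames.Of(typeof(GeoLocationEntity))));
        // prints "location"
        Console.WriteLine(string.Join(", ", WireNames.Of(typeof(DataEntity))));
    }
}
```

With these types in place, the index creation, indexing, and search calls from option 1 should apply unchanged.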

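3. Use the GeoLocation type

You can replace the GeoLocationEntity type with NEST's GeoLocation type, which maps to the geo_point field datatype. That is one POCO fewer, and the correct mapping can be inferred from the property type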

public sealed class DataEntity
{
    public DataEntity(
        GeoLocation location)
    {
        this.Location = location;
    }

    [DataMember(Name = "location")]
    public GeoLocation Location { get; }
}

// ---

if (client.IndexExists(defaultIndex).Exists)
    client.DeleteIndex(defaultIndex);

var createIndexResponse = client.CreateIndex(defaultIndex, c => c 
    .Mappings(m => m
        .Map<DataEntity>(mm => mm
            .AutoMap()
        )
    )
);

var indexResponse = client.Index(
    new DataEntity(new GeoLocation(88.59, -98.87)), 
    i => i.Refresh(Refresh.WaitFor)
);

var searchResponse = client.Search<DataEntity>(s => s
    .Query(q => q
        .MatchAll()
    )
);

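4. Hook up JsonNetSerializer

NEST allows a custom serializer to be hooked up to handle serialization of your types. A separate NuGet package, NEST.JsonNetSerializer, lets you use Json.NET to serialize your own types, while the serializer delegates back to the internal serializer for properties that are NEST types.

First, you need to pass JsonNetSerializer into the ConnectionSettings constructor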

var settings = new ConnectionSettings(pool, JsonNetSerializer.Default);

Then your original code will work as expected, without the custom JsonConverter

public sealed class GeoLocationEntity
{
    public GeoLocationEntity(
        double latitude,
        double longitude)
    {
        this.Latitude = latitude;
        this.Longitude = longitude;
    }

    [JsonProperty("lat")]
    public double Latitude { get; }

    [JsonProperty("lon")]
    public double Longitude { get; }
}

public sealed class DataEntity
{
    public DataEntity(
        GeoLocationEntity location)
    {
        this.Location = location;
    }

    [JsonProperty("location")]
    public GeoLocationEntity Location { get; }
}


// ---

if (client.IndexExists(defaultIndex).Exists)
    client.DeleteIndex(defaultIndex);


var createIndexResponse = client.CreateIndex(defaultIndex, c => c 
    .Mappings(m => m
        .Map<DataEntity>(mm => mm
            .AutoMap()
            .Properties(p => p
                .GeoPoint(g => g
                    .Name(n => n.Location)
                )
            )
        )
    )
);

var indexResponse = client.Index(
    new DataEntity(new GeoLocationEntity(88.59, -98.87)), 
    i => i.Refresh(Refresh.WaitFor)
);

var searchResponse = client.Search<DataEntity>(s => s
    .Query(q => q
        .MatchAll()
    )
);

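I have listed this option last because, internally, handing serialization off to Json.NET in this way carries a performance and allocation overhead. It is included to provide flexibility, but I would advocate using it only when you really need to, for example when a POCO needs custom serialization because its serialized structure is not conventional. We are working on much faster serialization that will reduce this overhead in the future.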
