Indices automatically created while es.index.auto.create = false

`es.index.auto.create` should govern whether elasticsearch-hadoop automatically creates indices. At least for Hadoop MapReduce, the check for whether the index exists is done in `org.elasticsearch.hadoop.mr.EsOutputFormat#init`, which is called when a job is submitted. After that check, however, the setting is never consulted again.

As a result, if an index is deleted while it is being written to, it can be recreated in `org.elasticsearch.hadoop.mr.EsOutputFormat.EsRecordWriter#init`, which runs on the first write to the `EsRecordWriter`.

For instance, if `action.auto_create_index` is disabled on the Elasticsearch cluster and an index is deleted, writes to it will fail. However, if a MapReduce task is retried because of this, the check in `EsOutputFormat#init` is not repeated, so the index is (re-)created in `EsRecordWriter#init`. For a bare index (e.g., one not managed by index templates), the index is created without a mapping, which can cause all sorts of trouble.

A partial stacktrace is included for reference below:

```
"REDACTED" prio=5 tid=0x215 nid=NA runnable
  java.lang.Thread.State: RUNNABLE
	  at org.elasticsearch.hadoop.rest.RestClient.touch(RestClient.java:556)
	  at org.elasticsearch.hadoop.rest.RestRepository.touch(RestRepository.java:373)
	  at org.elasticsearch.hadoop.rest.RestService.initSingleIndex(RestService.java:658)
	  at org.elasticsearch.hadoop.rest.RestService.createWriter(RestService.java:634)
	  at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.init(EsOutputFormat.java:175)
	  at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:150)
...
```

A possible solution would be to check `es.index.auto.create` in or near `org.elasticsearch.hadoop.rest.RestRepository#touch`.
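To make the idea concrete, here is a rough, self-contained sketch of the kind of guard I have in mind. It does not use any elasticsearch-hadoop classes: `AutoCreateGuardSketch`, `FakeCluster`, and `guardedTouch` are hypothetical names invented for illustration, and the in-memory set only stands in for the cluster. The only point is the ordering: look at the auto-create flag before creating a missing index, and fail instead of silently recreating it.

```java
import java.util.HashSet;
import java.util.Set;

public class AutoCreateGuardSketch {

    /** Hypothetical in-memory stand-in for a cluster; not an elasticsearch-hadoop type. */
    public static class FakeCluster {
        private final Set<String> indices = new HashSet<>();

        public boolean exists(String index) { return indices.contains(index); }
        public void create(String index) { indices.add(index); }
        public void delete(String index) { indices.remove(index); }
    }

    /**
     * Hypothetical guarded "touch": creates the index only if it is missing
     * AND auto-creation is allowed. Returns true if the index was created.
     */
    public static boolean guardedTouch(FakeCluster cluster, String index, boolean autoCreate) {
        if (cluster.exists(index)) {
            return false; // index is already there, nothing to do
        }
        if (!autoCreate) {
            // Assumed behaviour: fail loudly rather than recreate a bare index.
            throw new IllegalStateException(
                "Index [" + index + "] is missing and es.index.auto.create is false");
        }
        cluster.create(index);
        return true;
    }

    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        cluster.create("logs");
        cluster.delete("logs"); // simulates the index being deleted mid-job

        try {
            // Simulates the first write of a retried task with auto-create disabled.
            guardedTouch(cluster, "logs", false);
        } catch (IllegalStateException e) {
            System.out.println("write rejected: " + e.getMessage());
        }
        System.out.println("index recreated: " + cluster.exists("logs"));
    }
}
```

Where exactly such a check would live in the real code (and whether it should throw or just skip the touch) is one of the things I'd like feedback on.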

I'd be happy to do the coding and provide a PR. But I'd like to get some feedback first.

Indices automatically created while es.index.auto.create = false #2370

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Indices automatically created while es.index.auto.create = false #2370

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions