Impossible to import collection with binary _id #590

Open
nitmir opened this issue Feb 13, 2017 · 1 comment

nitmir commented Feb 13, 2017

Hi

I have installed this river following the wiki. Here is my config:

{
  "index": {
    "name": "testdb",
    "type": "torrents"
  },
  "mongodb": {
    "db": "testdb",
    "servers": [
      {
        "port": 27017,
        "host": "127.0.0.1"
      }
    ],
    "credentials": [
      {
        "db": "admin",
        "password": "password",
        "user": "username"
      }
    ],
    "collection": "torrents_data",
    "options": {
      "exclude_fields": [
        "files"
      ],
      "secondary_read_preference": true
    }
  },
  "type": "mongodb"
}

Here are some logs:

[2017-02-13 12:24:40,272][INFO ][river.mongodb            ] [Nomad] Creating MongoClient for [[127.0.0.1:27017]]
[2017-02-13 12:24:41,793][INFO ][river.mongodb            ] [Nomad] [mongodb][testdb] MongoDB version - 3.2.11
[2017-02-13 12:24:41,923][INFO ][river.mongodb            ] [Nomad] [mongodb][testdb] MongoDBRiver is beginning initial import of btdht-crawler.torrents_data
[2017-02-13 12:24:42,649][DEBUG][action.bulk              ] [Nomad] [testdb][2] failed to execute bulk item (index) index {[testdb][torrents][[B@4c438a69], source[{"seeds_peers":0,"file_nb":1,"added":1.486630897982914E9,"_id":"AMAiYk0SsXkBnCD9lxr55m6m/F0=","complete":0,"created":1486630897,"name":"Setup Terraria 1.3.0.3 GOG Version.exe","peers":0,"categories":["software"],"seeds":0,"last_scrape":1486630899,"size":137288792}]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [_id]
    at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:411)
    at org.elasticsearch.index.mapper.internal.IdFieldMapper.parse(IdFieldMapper.java:295)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:706)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:497)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:544)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:493)
    at org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:493)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:409)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:148)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase.performOnPrimary(TransportShardReplicationOperationAction.java:574)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$PrimaryPhase$1.doRun(TransportShardReplicationOperationAction.java:440)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.elasticsearch.index.mapper.MapperParsingException: Provided id [[B@4c438a69] does not match the content one [AMAiYk0SsXkBnCD9lxr55m6m/F0=]
    at org.elasticsearch.index.mapper.internal.IdFieldMapper.parseCreateField(IdFieldMapper.java:310)
    at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:401)
    ... 14 more

The import ends with an IMPORT_FAILED status.

Here is the MongoDB document:

rs1:PRIMARY> db.torrents_data.find({_id: BinData(0,"AMAiYk0SsXkBnCD9lxr55m6m/F0=")})
{ "_id" : BinData(0,"AMAiYk0SsXkBnCD9lxr55m6m/F0="), "files" : null, "added" : 1486630897.982914, "name" : "Setup Terraria 1.3.0.3 GOG Version.exe", "created" : 1486630897, "file_nb" : 1, "size" : 137288792, "peers" : 0, "seeds" : 0, "last_scrape" : 1486630899, "complete" : 0, "seeds_peers" : 0, "categories" : [ "software" ] }

So I am unable to index my MongoDB collection: for every document, I get the error shown in the logs above.
I am guessing that this may be because my _id values are binary data (non-ASCII, 20 bytes of binary data), but I am not sure.

Does anyone know how to solve this?
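
A note on the two ids in the error: `[B@4c438a69` has the shape of the default `toString()` of a Java `byte[]` (class name `[B`, then `@` and an identity hash), while `AMAiYk0SsXkBnCD9lxr55m6m/F0=` is the Base64 form of the same 20 bytes. That would be consistent with the binary `_id` being turned into a string in two different ways before the two values are compared, though the river source has not been checked here, so this is only a guess. A minimal sketch of the difference, assuming Java 8+ for `java.util.Base64`; the class and method names are made up for illustration:

```java
import java.util.Base64;

// Hypothetical demo class, not part of the river.
class BinaryIdDemo {
    // Decode the Base64 text shown in the log back into the raw _id bytes.
    static byte[] decodeId(String base64) {
        return Base64.getDecoder().decode(base64);
    }

    // True when a string has the shape of the default toString() of a byte[].
    static boolean looksLikeByteArrayToString(String s) {
        return s.startsWith("[B@");
    }

    public static void main(String[] args) {
        byte[] id = decodeId("AMAiYk0SsXkBnCD9lxr55m6m/F0=");

        // Identity-based string: changes from run to run and says nothing
        // about the bytes themselves.
        String viaToString = id.toString();

        // Content-based string: stable for the same bytes.
        String viaBase64 = Base64.getEncoder().encodeToString(id);

        System.out.println("length in bytes: " + id.length);
        System.out.println("toString(): " + viaToString);
        System.out.println("Base64:     " + viaBase64);
        System.out.println("equal? " + viaToString.equals(viaBase64));
    }
}
```

The two strings can never be equal, which matches the "does not match the content one" message for every document.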


nitmir commented Feb 13, 2017

I have tested with a cloned collection where the _id values are hex-encoded strings, and all the documents are indexed successfully, so I think this confirms that there is an issue with binary _id values.
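
For anyone wanting to reproduce the workaround, hex-encoding a binary `_id` could look roughly like the sketch below. It only shows the byte-to-string conversion; how the cloned collection was actually built is not shown in this thread, and `HexIdDemo` and `toHex` are hypothetical names. It assumes Java 8+ for `java.util.Base64`.

```java
import java.util.Base64;

// Hypothetical helper, not part of the river or of the MongoDB driver.
class HexIdDemo {
    // Render each byte as two lowercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The _id from the failing document, as printed by the mongo shell.
        byte[] id = Base64.getDecoder().decode("AMAiYk0SsXkBnCD9lxr55m6m/F0=");

        // A 20-byte _id becomes a 40-character plain ASCII string.
        String hexId = toHex(id);
        System.out.println(hexId + " (" + hexId.length() + " chars)");
    }
}
```

A plain string `_id` like this has a single textual form, so the id given to the index request and the `_id` inside the document body should agree.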

nitmir changed the title from "MapperParsingException: failed to parse [_id]" to "Impossible to import collection with binary _id" on Feb 13, 2017