ClassNotFoundException with MongoDB-Hadoop in Hive

Published in MongoDB, Hive, Jar, Classnotfoundexception at https://rmoff.net/2016/06/15/classnotfoundexception-with-mongodb-hadoop-in-hive/

I wasted literally two hours on this one, so putting down a note to hopefully help future Googlers.

Symptom

Here are the various errors that I got in hive-server2.log during my attempts to get a CREATE EXTERNAL TABLE to work against a MongoDB collection in Hive:

Caused by: java.lang.ClassNotFoundException: com.mongodb.hadoop.io.BSONWritable
Caused by: java.lang.ClassNotFoundException: com.mongodb.util.JSON
Caused by: java.lang.ClassNotFoundException: org.bson.conversions.Bson
Caused by: java.lang.ClassNotFoundException: org.bson.io.OutputBuffer

Meanwhile, Hive itself would throw errors along the lines of:

Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. org/bson/io/OutputBuffer (state=08S01,code=1)
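When the log is long, it can help to pull out just the distinct missing class names before going hunting for JARs. The following is a minimal sketch, not anything from the connector itself: the function name `missing_classes` and the `sample_log` variable are made up for illustration, and the sample text is simply the four lines quoted above.

```python
import re

# Matches the class name at the end of a ClassNotFoundException line.
MISSING_CLASS = re.compile(r"java\.lang\.ClassNotFoundException: ([\w.$]+)")

def missing_classes(log_text):
    """Return the distinct missing class names, in order of first appearance."""
    seen = []
    for name in MISSING_CLASS.findall(log_text):
        if name not in seen:
            seen.append(name)
    return seen

# Sample input: the four log lines quoted in this post.
sample_log = """\
Caused by: java.lang.ClassNotFoundException: com.mongodb.hadoop.io.BSONWritable
Caused by: java.lang.ClassNotFoundException: com.mongodb.util.JSON
Caused by: java.lang.ClassNotFoundException: org.bson.conversions.Bson
Caused by: java.lang.ClassNotFoundException: org.bson.io.OutputBuffer
"""

for name in missing_classes(sample_log):
    print(name)
```

In practice you would read the real hive-server2.log into a string and pass that in instead of `sample_log`.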

Solution

If you’re using the MongoDB-Hadoop connector with Hive, you need three JARs:

  • mongo-java-driver
  • mongo-hadoop-core
  • mongo-hadoop-hive

The latter two are part of the Mongo-Hadoop package and can be downloaded pre-compiled here. It’s the first one on the list, mongo-java-driver, that caused me much gnashing of teeth and wailing, because I mistakenly downloaded mongodb-driver instead. That is a different artifact from the “uber” JAR the connector asks for, so classes the connector expects were still missing from the classpath. Stupid me, right? Because to be fair, the documentation does say:

The connector requires at least version 3.0.0 of the driver “uber” jar (called “mongo-java-driver.jar”).

(my emphasis)

But the link to the download leads to http://mongodb.github.io/mongo-java-driver/, on which mongodb-driver is the default, not mongo-java-driver.
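One way to catch this kind of mix-up early is to check which of the downloaded JARs actually contains a class named in the error. A JAR is a zip archive, so Python's standard `zipfile` module can list its entries. This is only a sketch under assumptions: the helper names `class_entry` and `jars_containing` are invented for illustration, and the paths in the commented usage are placeholders, not real locations.

```python
import zipfile

def class_entry(class_name):
    """Convert a dotted class name to its entry path inside a JAR."""
    return class_name.replace(".", "/") + ".class"

def jars_containing(class_name, jar_paths):
    """Return the subset of jar_paths whose archive holds class_name."""
    entry = class_entry(class_name)
    hits = []
    for path in jar_paths:
        # A JAR is a zip file, so zipfile can read its table of contents.
        with zipfile.ZipFile(path) as jar:
            if entry in jar.namelist():
                hits.append(path)
    return hits

# Hypothetical usage (placeholder paths, adjust to wherever your JARs live):
# jars_containing("org.bson.io.OutputBuffer",
#                 ["/path/to/mongo-java-driver.jar", "/path/to/mongodb-driver.jar"])
```

If a class from the log shows up in none of the JARs you have given to Hive, that is a strong hint that one of the downloads is the wrong artifact.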

Hey ho, lesson learnt…