
Source/Sink configuration for using MongoDB with Cascading


z00b/cascading.mongodb

 
 


NOTE: This hasn't gotten much attention lately, as I've been swamped with other things.
Nonetheless, there are some massive updates to push up, and given a little free time, I intend to
complete this work.  Message me if you have any questions.

This is the Cascading.MongoDB module.

 It provides support for writing data to MongoDB 
 when bound to a Cascading data processing flow.

 Cascading is a feature-rich API for defining and executing complex,
 scale-free, and fault-tolerant data processing workflows on a Hadoop
 cluster. It can be found at the following location:

   http://www.cascading.org/

Building

 This release requires at least Cascading 1.1.1, Hadoop 0.19.x,
 and the related mongo-java-driver release.

 To build a jar,

 > ant -Dcascading.home=... -Dhadoop.home=... -Dmongo.driver.home=... jar

 To test,

 > ant -Dcascading.home=... -Dhadoop.home=... -Dmongo.driver.home=... test

where "..." is the install path of each of the dependencies.
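
 The two commands above share the same three properties, so a small
 wrapper can keep them in one place. The following is only a sketch:
 the environment variable names and the default paths under /opt are
 hypothetical and not part of this project, and the function name
 ant_props is made up for illustration.

```shell
#!/bin/sh
# Hypothetical install locations; override them via the environment.
: "${CASCADING_HOME:=/opt/cascading}"
: "${HADOOP_HOME:=/opt/hadoop}"
: "${MONGO_DRIVER_HOME:=/opt/mongo-java-driver}"

# Print the three -D property flags expected by the jar and test targets.
ant_props() {
  printf '%s %s %s\n' \
    "-Dcascading.home=$CASCADING_HOME" \
    "-Dhadoop.home=$HADOOP_HOME" \
    "-Dmongo.driver.home=$MONGO_DRIVER_HOME"
}

# Usage (requires ant on the PATH; paths must not contain spaces):
#   ant $(ant_props) jar
#   ant $(ant_props) test
```

 Because the flags are expanded unquoted, this form only works when
 none of the install paths contain whitespace.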


Using

  The cascading-mongodb.jar file should be added to the "lib"
  directory of your Hadoop application jar file along with all
  Cascading dependencies.

  You must also include the mongo-java-driver library compatible with your database.
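
  As a rough illustration of the layout described above, the jars can
  be staged into a "lib" directory before the application jar is
  assembled. This is a sketch under assumptions: the helper name
  stage_libs, the staging directory build/app, and the file names in
  the usage comment are hypothetical.

```shell
#!/bin/sh
# stage_libs DEST JAR...
# Copy each given jar into DEST/lib, creating the directory if needed.
stage_libs() {
  dest=$1
  shift
  mkdir -p "$dest/lib" || return 1
  for jar_file in "$@"; do
    cp "$jar_file" "$dest/lib/" || return 1
  done
}

# Usage (hypothetical paths):
#   stage_libs build/app cascading-mongodb.jar /path/to/mongo-java-driver.jar
#   jar cf myapp.jar -C build/app .
```

  The Cascading dependencies would be passed to stage_libs in the same
  way, alongside your own compiled classes in the staging directory.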

  The current master branch is usable only for sinking to MongoDB.  The API for that is still a little rough and is subject to change once I can simplify the parameters.
