Cascading | Cascading
http://www.cascading.org/projects/cascading/
GitHub - scalding-io/ScaldingUnit: TDD utils for Scalding developers
https://github.com/scalding-io/ScaldingUnit
GitHub - Cascading/cascading: Cascading is a feature rich API for defining and executing complex and fault tolerant data processing workflows on a Hadoop cluster. Please see https://github.com/cwensel/cascading for access to all WIP branches.
https://github.com/Cascading/cascading
Extensions, the SDK, and DSLs
A number of projects built on Cascading, as well as extensions to it, are available.
Visit the Cascading Extensions page for a current list.
Or download the Cascading SDK which includes many pre-built binaries.
Of note are three top-level projects:
Fluid - A fluent Java API for Cascading that is compatible with the default API.
Lingual - ANSI SQL and JDBC on Cascading
Pattern - Machine Learning scoring and PMML support with Cascading
And alternative languages:
Scalding - A Scala-based DSL
Cascalog - A Clojure-based DSL
PigPen - A Clojure-based DSL
And a third-party computing platform:
Apache Flink - Faster than MapReduce cluster computing
Extensions | Cascading
http://www.cascading.org/extensions/
[Datalog dialect]Cascalog | Cascading
http://www.cascading.org/projects/cascalog/
[sql]Lingual | Cascading
http://www.cascading.org/projects/lingual/
[Driven] for Cascading | Cascading
http://www.cascading.org/driven/
[algorithms]Pattern | Cascading
http://www.cascading.org/projects/pattern/
[scala]Scalding | Cascading
http://www.cascading.org/projects/scalding/
[fluent API]Fluid | Cascading
http://www.cascading.org/fluid/
Fluid (Cascading Fluid 1.0.0)
http://docs.cascading.org/fluid/1.0/javadoc/fluid-api/
GitHub - Cascading/fluid: A Fluent Java API for Cascading
https://github.com/Cascading/fluid
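import cascading.flow.FlowDef;
import static cascading.flow.FlowDef.flowDef;

// sourceLower, sourceUpper, and sink are Tap instances and tails holds the assembly's tail pipes, all defined elsewhere.
FlowDef flowDef = flowDef()
  .addSource( "lower", sourceLower )
  .addSource( "upper", sourceUpper )
  .addSink( "result", sink )
  .addTails( tails );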