Apache Spark: Graph Analysis via GraphX
From the GraphX Programming Guide:
GraphX is a component of Spark for graphs and graph-parallel computation. At a high level, GraphX extends the Spark RDD by introducing a new Graph abstraction: a directed multigraph with properties attached to each vertex and edge. To support graph computation, GraphX exposes a set of fundamental operators (e.g., subgraph, joinVertices, and aggregateMessages) as well as an optimized variant of the Pregel API. In addition, GraphX includes a growing collection of graph algorithms and builders to simplify graph analytics tasks.
GraphX extends the Spark RDD with a Resilient Distributed Property Graph.
The property graph is a directed multigraph, meaning it can have multiple parallel edges between the same pair of vertices. Every vertex and every edge has user-defined properties associated with it. Parallel edges make it possible to model several distinct relationships between the same two vertices.
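To make the property graph model concrete, here is a minimal sketch in plain Python. It does not use the GraphX API (which is exposed through Scala); the class name `PropertyMultigraph`, its methods, and the sample people and relationships are all hypothetical, chosen only to show a directed multigraph with properties on vertices and edges and two parallel edges between the same pair of vertices.

```python
class PropertyMultigraph:
    """Toy directed multigraph with properties on vertices and edges.

    Hypothetical illustration only; this is not how GraphX is implemented.
    """

    def __init__(self):
        self.vertices = {}  # vertex id -> user-defined property
        self.edges = []     # (src id, dst id, property); repeated (src, dst) pairs allowed

    def add_vertex(self, vid, prop):
        self.vertices[vid] = prop

    def add_edge(self, src, dst, prop):
        # Both endpoints must exist; the edge is directed from src to dst.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both endpoints must be added before the edge")
        self.edges.append((src, dst, prop))

    def edges_between(self, src, dst):
        """Return the properties of every edge going from src to dst."""
        return [p for s, d, p in self.edges if s == src and d == dst]


graph = PropertyMultigraph()
graph.add_vertex(1, {"name": "Alice"})
graph.add_vertex(2, {"name": "Bob"})

# Two parallel edges: the same ordered pair of vertices, two relationships.
graph.add_edge(1, 2, "colleague")
graph.add_edge(1, 2, "friend")

parallel = graph.edges_between(1, 2)
reverse = graph.edges_between(2, 1)
print(parallel)
print(reverse)
```

Because the edges are directed, looking them up in the opposite direction (from vertex 2 to vertex 1) finds nothing, while the lookup from 1 to 2 returns both relationships.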
What are Graphs?
A graph is a mathematical structure consisting of a set of objects in which some pairs of objects are related in some sense. These relations can be represented using vertices and edges: the vertices represent the objects, and the edges represent the relationships between them.
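As a small illustration of this definition, the following plain-Python sketch stores a graph as a set of vertices plus a set of edges and derives each vertex's neighbors from them. The edges are treated as ordered (directed) pairs, which is an assumption made here to match the directed graphs GraphX works with; the vertex names and the helper `out_neighbors` are hypothetical.

```python
# The objects (vertices) and the related pairs (directed edges) between them.
vertices = {"A", "B", "C", "D"}
edges = {("A", "B"), ("A", "C"), ("C", "A")}


def out_neighbors(vertex, edges):
    """Return the set of vertices reachable from `vertex` over a single edge."""
    return {dst for src, dst in edges if src == vertex}


neighbors = {v: out_neighbors(v, edges) for v in sorted(vertices)}
for v, ns in neighbors.items():
    print(v, sorted(ns))
```

A vertex such as "D" that takes part in no relation is still a member of the graph; it simply has no neighbors.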