improving the solution
angelip2303 committed Jun 11, 2023
1 parent 6492293 commit f518eab
Showing 4 changed files with 51 additions and 100 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "pregel-rs"
version = "0.0.11"
version = "0.0.12"
authors = [ "Ángel Iglesias Préstamo <[email protected]>" ]
description = "A Graph library written in Rust for implementing your own algorithms in a Pregel fashion"
documentation = "https://docs.rs/crate/pregel-rs/latest"
82 changes: 41 additions & 41 deletions README.md
@@ -6,73 +6,73 @@
[![documentation](https://img.shields.io/docsrs/pregel-rs/latest)](https://docs.rs/pregel-rs/latest/pregel_rs/)

`pregel-rs` is a Graph processing library written in Rust that features
a Pregel-based Framework for implementing your own algorithms in a
message-passing fashion. It is designed to be efficient and scalable,
a Pregel-based Framework for implementing your own algorithms in a
message-passing fashion. It is designed to be efficient and scalable,
making it suitable for processing large-scale graphs.

## Features

- _Pregel-based framework_: Pregel is a powerful graph processing model
that allows users to implement graph algorithms in a message-passing fashion,
where computation is performed on vertices and messages are passed along edges.
`pregel-rs` provides a framework that makes it easy to implement graph
algorithms using this model.
that allows users to implement graph algorithms in a message-passing fashion,
where computation is performed on vertices and messages are passed along edges.
`pregel-rs` provides a framework that makes it easy to implement graph
algorithms using this model.

- _Rust-based implementation_: `pregel-rs` is implemented in Rust, a systems
programming language known for its safety, concurrency, and performance.
Rust's strong type system and memory safety features help ensure that `pregel-rs`
is robust and reliable.
- _Rust-based implementation_: `pregel-rs` is implemented in Rust, a systems
programming language known for its safety, concurrency, and performance.
Rust's strong type system and memory safety features help ensure that `pregel-rs`
is robust and reliable.

- _Efficient and scalable_: `pregel-rs` is designed to be efficient and scalable,
making it suitable for processing large-scale graphs. It uses parallelism and
optimization techniques to minimize computation and communication overhead,
allowing it to handle graphs with millions or even billions of vertices and edges.
For us to achieve this, we have built it on top of [polars](https://github.com/pola-rs/polars)
a blazingly fast DataFrames library implemented in Rust using Apache Arrow
Columnar Format as the memory model.

- _Graph abstraction_: `pregel-rs` provides a graph abstraction that makes
it easy to represent and manipulate graphs in Rust. It supports both directed and
undirected graphs, and provides methods for adding, removing, and querying vertices
and edges.
making it suitable for processing large-scale graphs. It uses parallelism and
optimization techniques to minimize computation and communication overhead,
allowing it to handle graphs with millions or even billions of vertices and edges.
For us to achieve this, we have built it on top of [polars](https://github.com/pola-rs/polars)
a blazingly fast DataFrames library implemented in Rust using Apache Arrow
Columnar Format as the memory model.

- _Graph abstraction_: `pregel-rs` provides a graph abstraction that makes
it easy to represent and manipulate graphs in Rust. It supports both directed and
undirected graphs, and provides methods for adding, removing, and querying vertices
and edges.

- _Customizable computation_: `pregel-rs` allows users to implement their own
computation logic by defining vertex computation functions. This gives users the
flexibility to implement their own graph algorithms and customize the behavior
of `pregel-rs` to suit their specific needs.
computation logic by defining vertex computation functions. This gives users the
flexibility to implement their own graph algorithms and customize the behavior
of `pregel-rs` to suit their specific needs.

## Getting started

To get started with `pregel-rs`, you can follow these steps:

1. _Install Rust_: `pregel-rs` requires Rust to be installed on your system.
You can install Rust by following the instructions on the official Rust website:
https://www.rust-lang.org/tools/install
You can install Rust by following the instructions on the official Rust website:
https://www.rust-lang.org/tools/install

2. _Create a new Rust project_: Once Rust is installed, you can create a new Rust
project using the Cargo package manager, which is included with Rust. You can
create a new project by running the following command in your terminal:
project using the Cargo package manager, which is included with Rust. You can
create a new project by running the following command in your terminal:

```sh
cargo new my_pregel_project
```

3. _Add `pregel-rs` as a dependency_: Next, you need to add `pregel-rs` as a
dependency in your `Cargo.toml` file, which is located in the root directory
of your project. You can add the following line to your `Cargo.toml` file:
3. _Add `pregel-rs` as a dependency_: Next, you need to add `pregel-rs` as a
dependency in your `Cargo.toml` file, which is located in the root directory
of your project. You can add the following line to your `Cargo.toml` file:

```toml
[dependencies]
pregel-rs = "0.0.11"
pregel-rs = "0.0.12"
```

4. _Implement your graph algorithm_: Now you can start implementing your graph
algorithm using the `pregel-rs` framework. You can define your vertex computation
functions and use the graph abstraction provided by `pregel-rs` to manipulate the graph.
algorithm using the `pregel-rs` framework. You can define your vertex computation
functions and use the graph abstraction provided by `pregel-rs` to manipulate the graph.

5. _Build and run your project_: Once you have implemented your graph algorithm, you
can build and run your project using the Cargo package manager. You can build your
project by running the following command in your terminal:
can build and run your project using the Cargo package manager. You can build your
project by running the following command in your terminal:

```sh
cargo build
@@ -87,14 +87,14 @@ cargo run
## Acknowledgments

Read [Pregel: A System for Large-Scale Graph Processing](https://15799.courses.cs.cmu.edu/fall2013/static/papers/p135-malewicz.pdf)
for a reference on how to implement your own Graph processing algorithms in a Pregel fashion. If you want to take some
for a reference on how to implement your own Graph processing algorithms in a Pregel fashion. If you want to take some
inspiration from some curated-sources, just explore the [/examples](https://github.com/angelip2303/graph-rs/tree/main/examples)
folder of this repository.

## Related projects

1. [GraphX](https://github.com/apache/spark/tree/master/graphx) is a library enabling Graph processing in the context of
Apache Spark.
1. [GraphX](https://github.com/apache/spark/tree/master/graphx) is a library enabling Graph processing in the context of
Apache Spark.
2. [GraphFrames](https://github.com/graphframes/graphframes) is the DataFrame-based equivalent to GraphX.

## License
@@ -108,11 +108,11 @@ the Free Software Foundation, either version 3 of the License, or

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
along with this program. If not, see <https://www.gnu.org/licenses/>.

**By contributing to this project, you agree to release your
contributions under the same license.**
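The README describes the Pregel model as computation on vertices with messages passed along edges. As a rough illustration of that idea only, here is a minimal sketch in plain Rust that does not use `pregel-rs` or polars. The functions `superstep` and `run` are hypothetical names, and the rank update is a simplified PageRank without a damping factor.

```rust
use std::collections::HashMap;

/// One Pregel-style superstep of a simplified PageRank (no damping factor).
/// `edges` holds (source, destination) pairs; `ranks` maps vertex id to rank.
fn superstep(edges: &[(u32, u32)], ranks: &HashMap<u32, f64>) -> HashMap<u32, f64> {
    // Out-degree of every vertex that has at least one outgoing edge.
    let mut out_degree: HashMap<u32, f64> = HashMap::new();
    for &(src, _) in edges {
        *out_degree.entry(src).or_insert(0.0) += 1.0;
    }

    // Send and aggregate: each source sends rank / out_degree along every
    // outgoing edge, and the messages are summed per destination.
    let mut inbox: HashMap<u32, f64> = HashMap::new();
    for &(src, dst) in edges {
        let msg = ranks[&src] / out_degree[&src];
        *inbox.entry(dst).or_insert(0.0) += msg;
    }

    // Vertex program: the new rank is the aggregated message,
    // or 0.0 when the vertex received nothing.
    ranks
        .keys()
        .map(|&v| (v, inbox.get(&v).copied().unwrap_or(0.0)))
        .collect()
}

/// Runs `max_iterations` supersteps, starting every vertex at 1 / num_vertices.
fn run(edges: &[(u32, u32)], num_vertices: u32, max_iterations: usize) -> HashMap<u32, f64> {
    let mut ranks: HashMap<u32, f64> = (0..num_vertices)
        .map(|v| (v, 1.0 / num_vertices as f64))
        .collect();
    for _ in 0..max_iterations {
        ranks = superstep(edges, &ranks);
    }
    ranks
}

fn main() {
    // A directed 3-cycle: 0 -> 1 -> 2 -> 0.
    let edges = [(0, 1), (1, 2), (2, 0)];
    let ranks = run(&edges, 3, 4);
    for v in 0..3 {
        println!("vertex {}: {:.4}", v, ranks[&v]);
    }
}
```

The three phases in `superstep` loosely correspond to the `send_messages`, `aggregate_messages` and `v_prog` steps configured through the builder in the diffs below; the real library expresses them as DataFrame expressions rather than loops.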
4 changes: 4 additions & 0 deletions examples/pagerank.rs
@@ -26,6 +26,10 @@ fn main() -> Result<(), Box<dyn Error>> {
.max_iterations(4)
.with_vertex_column(Custom("rank"))
.initial_message(lit(1.0 / num_vertices))
.send_messages(
MessageReceiver::Subject,
Column::subject(Column::Custom("rank")) / Column::subject(Column::Custom("out_degree")),
)
.send_messages(
MessageReceiver::Object,
Column::subject(Custom("rank")) / Column::subject(Custom("out_degree")),
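The hunk above sends the same `rank / out_degree` expression to both the subject and the object of each edge. One effect, sketched below in plain Rust, is that a vertex with outgoing edges but no incoming ones still receives a message. The `Receiver` enum and `aggregate` function are hypothetical stand-ins rather than the `pregel-rs` API, and whether this was the motivation for the commit is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the receiver choice shown in the diff.
#[derive(Clone, Copy)]
enum Receiver {
    Subject,
    Object,
}

/// For every edge, computes rank[subject] / out_degree[subject] and adds it
/// to the inbox of each requested receiver. Vertices that receive nothing
/// are simply absent from the returned map.
fn aggregate(
    edges: &[(u32, u32)],
    ranks: &HashMap<u32, f64>,
    out_degree: &HashMap<u32, f64>,
    receivers: &[Receiver],
) -> HashMap<u32, f64> {
    let mut inbox = HashMap::new();
    for &(subject, object) in edges {
        let msg = ranks[&subject] / out_degree[&subject];
        for &receiver in receivers {
            let target = match receiver {
                Receiver::Subject => subject,
                Receiver::Object => object,
            };
            *inbox.entry(target).or_insert(0.0) += msg;
        }
    }
    inbox
}

fn main() {
    // Vertex 0 points at 1 and 2 and has no incoming edges.
    let edges = [(0, 1), (0, 2)];
    let ranks: HashMap<u32, f64> = (0..3).map(|v| (v, 1.0 / 3.0)).collect();
    let out_degree: HashMap<u32, f64> = HashMap::from([(0, 2.0)]);

    let object_only = aggregate(&edges, &ranks, &out_degree, &[Receiver::Object]);
    let both = aggregate(
        &edges,
        &ranks,
        &out_degree,
        &[Receiver::Subject, Receiver::Object],
    );

    println!("object only, vertex 0 has a message: {}", object_only.contains_key(&0));
    println!("both receivers, vertex 0 has a message: {}", both.contains_key(&0));
}
```

Note that sending to the subject also changes the sums themselves, so the resulting values are not those of textbook PageRank.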
63 changes: 5 additions & 58 deletions src/pregel.rs
@@ -233,17 +233,6 @@ impl<'a> SendMessage<'a> {
/// each iteration of the algorithm. The vertex program can take as input the current
/// state of the vertex, the messages received from its neighbors, and any other
/// relevant information.
///
/// * `replace_nulls`: `replace_nulls` is an expression that defines how null values
/// in the vertex DataFrame should be replaced. This is useful when the vertex
/// DataFrame contains null values that need to be replaced during the execution of
/// the Pregel algorithm. As an example, when not all vertices are connected to an
/// edge, the edge DataFrame will contain null values in the `dst` column. These
/// null values need to be replaced.
///
/// * `parquet_path` is a property of the `PregelBuilder` struct that represents
/// the path to the Parquet file where the results of the Pregel computation
/// will be stored.
pub struct Pregel<'a> {
/// The `graph` property is a `GraphFrame` struct that represents the
/// graph data structure used in the Pregel algorithm. It contains information about
@@ -277,13 +266,6 @@ pub struct Pregel<'a> {
/// current state of the vertex, the messages received from its neighbors,
/// and any other relevant information.
v_prog: FnBox<'a>,
/// `replace_nulls` is an expression that defines how null values in the vertex
/// DataFrame should be replaced. This is useful when the vertex DataFrame
/// contains null values that need to be replaced during the execution of the
/// Pregel algorithm. As an example, when not all vertices are connected to an
/// edge, the edge DataFrame will contain null values in the `dst` column. These
/// null values need to be replaced.
replace_nulls: Expr,
}

/// The `PregelBuilder` struct represents a builder for configuring the Pregel
@@ -325,13 +307,6 @@ pub struct Pregel<'a> {
/// each iteration of the algorithm. The vertex program can take as input the current
/// state of the vertex, the messages received from its neighbors, and any other
/// relevant information.
///
/// /// * `replace_nulls`: `replace_nulls` is an expression that defines how null values
/// in the vertex DataFrame should be replaced. This is useful when the vertex
/// DataFrame contains null values that need to be replaced during the execution of
/// the Pregel algorithm. As an example, when not all vertices are connected to an
/// edge, the edge DataFrame will contain null values in the `dst` column. These
/// null values need to be replaced.
pub struct PregelBuilder<'a> {
/// The `graph` property is a `GraphFrame` struct that represents the
/// graph data structure used in the Pregel algorithm. It contains information about
@@ -365,13 +340,6 @@ pub struct PregelBuilder<'a> {
/// current state of the vertex, the messages received from its neighbors,
/// and any other relevant information.
v_prog: FnBox<'a>,
/// `replace_nulls` is an expression that defines how null values in the vertex
/// DataFrame should be replaced. This is useful when the vertex DataFrame
/// contains null values that need to be replaced during the execution of the
/// Pregel algorithm. As an example, when not all vertices are connected to an
/// edge, the edge DataFrame will contain null values in the `dst` column. These
/// null values need to be replaced.
replace_nulls: Expr,
}

/// This code is defining an enumeration type `MessageReceiver` in Rust with
@@ -419,7 +387,6 @@ impl<'a> PregelBuilder<'a> {
send_messages: Default::default(),
aggregate_messages: Box::new(Default::default),
v_prog: Box::new(Default::default),
replace_nulls: Default::default(),
}
}

@@ -665,27 +632,6 @@ impl<'a> PregelBuilder<'a> {
self
}

/// This function sets the value of a field called "replace_nulls" in a struct to a
/// given expression and returns the modified struct.
///
/// Arguments:
///
/// * `replace_nulls`: `replace_nulls` is a parameter of type `Expr` that is used in
/// a method of a struct. The method takes ownership of the struct (`self`) and the
/// `replace_nulls` parameter, and sets the `replace_nulls` field of the struct to the
/// value of the `replace_nulls` parameter.
///
/// Returns:
///
/// The `replace_nulls` method returns `Self`, which refers to the same struct
/// instance that the method was called on. This allows for method chaining, where
/// multiple methods can be called on the same struct instance in a single
/// expression.
pub fn replace_nulls(mut self, replace_nulls: Expr) -> Self {
self.replace_nulls = replace_nulls;
self
}

/// The function returns a Pregel struct with the specified properties. That is,
/// Pregel structs are to be created using the `Builder Pattern`, a creational
/// design pattern that provides a way to construct complex structs in a
@@ -743,7 +689,6 @@ impl<'a> PregelBuilder<'a> {
send_messages: self.send_messages,
aggregate_messages: self.aggregate_messages,
v_prog: self.v_prog,
replace_nulls: self.replace_nulls,
}
}
}
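The doc comments above describe chained setters that take ownership of `self` and return `Self`, followed by a final `build`. The following is a minimal sketch of that consuming builder style, using a hypothetical `Config`/`ConfigBuilder` pair with illustrative fields and defaults; it is not the actual `PregelBuilder` definition.

```rust
/// Hypothetical target struct; the fields are illustrative only.
#[derive(Debug, PartialEq)]
struct Config {
    max_iterations: u8,
    vertex_column: String,
}

/// Hypothetical builder holding the same fields until `build` is called.
struct ConfigBuilder {
    max_iterations: u8,
    vertex_column: String,
}

impl ConfigBuilder {
    /// Starts from assumed defaults.
    fn new() -> Self {
        ConfigBuilder {
            max_iterations: 10,
            vertex_column: "value".to_string(),
        }
    }

    /// Consumes the builder, updates one field and hands the builder back,
    /// which is what makes method chaining possible.
    fn max_iterations(mut self, max_iterations: u8) -> Self {
        self.max_iterations = max_iterations;
        self
    }

    fn with_vertex_column(mut self, name: &str) -> Self {
        self.vertex_column = name.to_string();
        self
    }

    /// Moves the collected fields into the final struct.
    fn build(self) -> Config {
        Config {
            max_iterations: self.max_iterations,
            vertex_column: self.vertex_column,
        }
    }
}

fn main() {
    let config = ConfigBuilder::new()
        .max_iterations(4)
        .with_vertex_column("rank")
        .build();
    println!("{:?}", config);
}
```

With this layout, dropping an option such as `replace_nulls` means deleting one field, one setter and one line in `build`, which matches the shape of the removals in this commit.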
@@ -898,7 +843,6 @@ impl<'a> Pregel<'a> {
col(Column::VertexId.as_ref()), // id column of the current_vertices DataFrame
Column::msg(Some(Column::VertexId)), // msg.id column of the message_df DataFrame
)
.with_column(Column::msg(None).fill_null(self.replace_nulls.to_owned()))
.select(&[
col(Column::VertexId.as_ref()),
v_prog().alias(self.vertex_column.as_ref()),
@@ -971,7 +915,11 @@ mod tests {
.max_iterations(iterations)
.with_vertex_column(Column::Custom("rank"))
.initial_message(lit(1.0 / num_vertices))
.replace_nulls(lit(0.0))
.send_messages(
MessageReceiver::Subject,
Column::subject(Column::Custom("rank"))
/ Column::subject(Column::Custom("out_degree")),
)
.send_messages(
MessageReceiver::Object,
Column::subject(Column::Custom("rank"))
@@ -1077,7 +1025,6 @@ mod tests {
v_prog: Box::new(|| {
max_exprs([col(Column::Custom("max_value").as_ref()), Column::msg(None)])
}),
replace_nulls: lit(0),
})
}
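This commit removes the `fill_null(replace_nulls)` step that followed the join between vertices and aggregated messages. The sketch below models, in plain Rust rather than polars, why such a join produces missing values and how a vertex program can treat a missing message explicitly instead of relying on a global default. The functions `left_join` and `v_prog` are hypothetical, and the sketch makes no claim about how polars' own expressions handle nulls.

```rust
use std::collections::HashMap;

/// Left-joins vertex ids with their aggregated messages.
/// A vertex without any message gets `None`, the analogue of a null cell.
fn left_join(vertices: &[u32], messages: &HashMap<u32, i64>) -> Vec<(u32, Option<i64>)> {
    vertices
        .iter()
        .map(|&v| (v, messages.get(&v).copied()))
        .collect()
}

/// Vertex program in the spirit of the `max_value` test: keep the larger of
/// the current value and the message, and leave the value unchanged when
/// there is no message.
fn v_prog(current: i64, msg: Option<i64>) -> i64 {
    msg.map_or(current, |m| current.max(m))
}

fn main() {
    let messages = HashMap::from([(2, 7)]);
    for (vertex, msg) in left_join(&[1, 2, 3], &messages) {
        println!("vertex {} -> message {:?}, new value {}", vertex, msg, v_prog(5, msg));
    }
}
```

A fixed replacement value is not always neutral: with a current value of -3 and a missing message replaced by 0, a max-based program would return 0, whereas treating the message as absent keeps -3.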

