This assignment project demonstrates how to detect marine vessel proximity events in Python using pandas, NumPy, and the standard-library math module. Distances between vessels are computed with the haversine formula. The project also showcases pandas vectorization and visualizes the final result with Plotly.
- data/sample_data.csv: Contains the vessel positions, vessel IDs, and timestamps.
- src/Marine_Vessel_Proximity_Analysis.py and Marine_Vessel_Proximity_Analysis.ipynb: Python script and Jupyter notebook that run the analysis.
- README.md: Assignment documentation.
- requirements.txt: Python dependencies.
- Clone the repository:

      git clone https://github.com/<your-username>/<repository-name>.git
      cd <repository-name>

- Install required Python packages:

      pip install -r requirements.txt

- Run the analysis script:

      python src/Marine_Vessel_Proximity_Analysis.py
- pandas (pd):
  - Used for reading CSV files and handling structured data.
- numpy (np):
  - Used for performing mathematical operations.
- math (radians, sin, cos, sqrt, atan2):
  - Used for calculating distances using the haversine formula.
- plotly.express (px):
  - Used for creating scatter and line plots.
- plotly.graph_objects (go):
  - Used for enhancing plot customization.
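As a rough illustration of how the math functions listed above fit together, here is a minimal sketch of a haversine distance helper. The function name matches the one this project describes, but the signature, the degree-based inputs, and the mean Earth radius of 6371 km are assumptions, not the project's actual code.

```python
from math import radians, sin, cos, sqrt, atan2

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; the project may use another value


def haversine_distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    # The trigonometric functions expect radians, so convert first.
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine term: a = sin^2(dlat/2) + cos(lat1) * cos(lat2) * sin^2(dlon/2)
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    # Central angle between the two points.
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return EARTH_RADIUS_KM * c
```

With this radius, one degree of longitude along the equator, `haversine_distance(0, 0, 0, 1)`, comes out at roughly 111.19 km, which is a convenient sanity check.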
- Importing Libraries: Imported pandas for data handling, NumPy and math for mathematical operations, and Plotly for visualization.
- Reading Data: Read 'sample_data.csv' into a pandas DataFrame and inspected its structure.
- Haversine Distance Function: Defined the `haversine_distance` function, which calculates the haversine distance (the shortest distance over the Earth's surface) between two points given their latitudes and longitudes.
- Finding Vessel Proximity Events: Defined the `find_vessel_proximity` function to identify pairs of vessels that are within a specified distance of each other. It groups the data by timestamp and calculates the distance between each pair of vessels in a group.
- Setting Threshold Distance: Set a threshold distance of 1 kilometer to identify proximity events.
- Creating Proximity Events DataFrame: Stored the proximity events in a DataFrame that holds the timestamp and the pair of vessels involved in each event.
- Visualizing Results: Created two plots using Plotly. The first shows the number of proximity events over time; the second shows the pairs of vessels that were in proximity.
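The proximity search described in the steps above can be sketched roughly as follows. This is not the project's actual code: the column names `mmsi`, `timestamp`, `lat`, and `lon`, the output columns, and the tiny sample data are all hypothetical, and the Earth radius of 6371 km is an assumed value.

```python
from itertools import combinations
from math import radians, sin, cos, sqrt, atan2

import pandas as pd


def haversine_distance(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return radius_km * 2 * atan2(sqrt(a), sqrt(1 - a))


def find_vessel_proximity(df, threshold_km=1.0):
    """Return one row per vessel pair within threshold_km at the same timestamp.

    The input columns (mmsi, timestamp, lat, lon) are assumed for illustration.
    """
    events = []
    for timestamp, group in df.groupby("timestamp"):
        rows = group[["mmsi", "lat", "lon"]].itertuples(index=False)
        # Compare every unordered pair of vessels reported at this timestamp.
        for a, b in combinations(rows, 2):
            distance = haversine_distance(a.lat, a.lon, b.lat, b.lon)
            if distance <= threshold_km:
                events.append({
                    "timestamp": timestamp,
                    "vessel_1": a.mmsi,
                    "vessel_2": b.mmsi,
                    "distance_km": distance,
                })
    columns = ["timestamp", "vessel_1", "vessel_2", "distance_km"]
    return pd.DataFrame(events, columns=columns)


# Made-up data: vessels 1 and 2 are about 0.56 km apart, vessel 3 is far away.
sample = pd.DataFrame({
    "mmsi": [1, 2, 3],
    "timestamp": ["2023-01-01 00:00:00"] * 3,
    "lat": [10.000, 10.005, 11.000],
    "lon": [50.000, 50.000, 50.000],
})
proximity_events = find_vessel_proximity(sample, threshold_km=1.0)
print(proximity_events)
```

The pairwise loop is quadratic in the number of vessels per timestamp, which is fine for a small sample file. For larger datasets, a vectorized variant (for example, a self-merge on the timestamp column combined with NumPy trigonometric functions) would avoid the Python-level loop.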
Time Series Plot: This plot shows the number of vessel proximity events over time. The x-axis represents the timestamp, and the y-axis represents the number of proximity events. It helps us see trends and patterns in vessel interactions over time.
Scatter Plot: This plot displays pairs of vessels that were in proximity. Each point represents a pair of vessels that were within the threshold distance of each other. The color indicates the timestamp, helping us understand when these events occurred.
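One possible way to produce the two plots described above is sketched below. The events table layout (`timestamp`, `vessel_1`, `vessel_2`), the helper names, the choice of axes for the scatter plot, and the plot titles are all assumptions made for illustration; the project's own plotting code may differ.

```python
import pandas as pd


def count_events_per_timestamp(events):
    """Number of proximity events at each timestamp, in chronological order."""
    return events.groupby("timestamp").size().reset_index(name="event_count")


def build_figures(events):
    """Build the time series and scatter figures with Plotly Express."""
    # Imported here so the counting step above can be used without Plotly installed.
    import plotly.express as px

    counts = count_events_per_timestamp(events)
    time_series = px.line(counts, x="timestamp", y="event_count",
                          title="Vessel proximity events over time")
    # Color by a string label so each timestamp gets its own discrete color.
    labelled = events.assign(timestamp_label=events["timestamp"].astype(str))
    scatter = px.scatter(labelled, x="vessel_1", y="vessel_2",
                         color="timestamp_label",
                         title="Vessel pairs in proximity")
    return time_series, scatter


# Made-up events table using the hypothetical columns named above.
events = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2023-01-01 00:00", "2023-01-01 00:00", "2023-01-01 01:00"]
    ),
    "vessel_1": [1, 1, 2],
    "vessel_2": [2, 3, 3],
})
print(count_events_per_timestamp(events))
```

Calling `time_series, scatter = build_figures(events)` and then `time_series.show()` or `scatter.show()` would display the figures in a browser or notebook.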
Conclusion: This code reads the vessel data, identifies proximity events using the haversine formula, and visualizes these events to help analyze marine traffic patterns. This kind of analysis can help in understanding vessel interactions and flagging potential collision risks.