GMM change detection (#100)
* anomaly/gmm: Add GMM-based change detection algorithm

Signed-off-by: Jamie Santos <[email protected]>

* caching: add directory for saving files

Signed-off-by: Jamie Santos <[email protected]>

* emd: update the PuLP solver to glpk for compatibility with ARM architecture

Signed-off-by: Jamie Santos <[email protected]>

* gmm_change_detection: remove redundant packages

Signed-off-by: Jamie Santos <[email protected]>

* caching: streamline file handling

Signed-off-by: Jamie Santos <[email protected]>

* caching: add predictions for points for complete model

With this commit, the caching is now in a functional state.

Signed-off-by: Jamie Santos <[email protected]>

* Remove extraneous code from previous clustering algorithm

Signed-off-by: Jamie Santos <[email protected]>

* gmm: remove extra model fitting

Signed-off-by: Jamie Santos <[email protected]>

* caching: remove files for GMM cache debugging

Signed-off-by: Jamie Santos <[email protected]>

* passing python code through linters

* Add object disappearance detection

Signed-off-by: Jamie Santos <[email protected]>

* wip: preprocess: only save images once point cloud is obtained

Signed-off-by: Jamie Santos <[email protected]>

* artificial_data: randomly generate point clouds with specified changes

- Can now get random point clouds with specified start, appearances, and
disappearances

Signed-off-by: Jamie Santos <[email protected]>

* organize scripts into package structure

* adding some headers and converting file from dos to unix

* remove duplicate commented code that is in artificial_data.py

* running on ubuntu 20 the example fake data

* first pass at refactoring the code to simplify it

* removing test data since it can be generated easily

* starting jupyter notebook with basic features and no plot customization

* tested all modes

* merge ground truth with pre-process data

---------

Signed-off-by: Jamie Santos <[email protected]>
Co-authored-by: Jamie Santos <[email protected]>
marinagmoreira and jamiesantos committed Jul 19, 2023
1 parent 4c9c3ff commit 66aa2f0
Showing 15 changed files with 1,971 additions and 0 deletions.
448 changes: 448 additions & 0 deletions analyst/workspace/gmm-change-detection.ipynb

Large diffs are not rendered by default.

33 changes: 33 additions & 0 deletions anomaly/gmm-change-detection/CMakeLists.txt
@@ -0,0 +1,33 @@
# Copyright (c) 2021, United States Government, as represented by the
# Administrator of the National Aeronautics and Space Administration.
#
# All rights reserved.
#
# The "ISAAC - Integrated System for Autonomous and Adaptive Caretaking
# platform" software is licensed under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with the
# License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

cmake_minimum_required(VERSION 3.0 FATAL_ERROR)
project(gmm)

## Compile as C++14, supported in ROS Kinetic and newer
add_compile_options(-std=c++14)


## Find catkin macros and libraries
find_package(catkin REQUIRED COMPONENTS)

# Allow other packages to use python scripts from this package
catkin_python_setup()

catkin_package()

21 changes: 21 additions & 0 deletions anomaly/gmm-change-detection/README.md
@@ -0,0 +1,21 @@
\page gmm GMM Change Detection

# Overview

This GMM-based anomaly detection algorithm was implemented by Jamie Santos as part of a [Master's thesis]().
It detects changes in environments such as the ISS using 3D depth point cloud data.

# Requirements
pip3 install pulp
pip3 install scikit-learn
pip3 install pyntcloud
pip3 install pandas
pip3 install open3d
apt-get install glpk-utils
apt-get install ros-noetic-ros-numpy

# Usage

rosrun gmm gmm_change_detection.py


21 changes: 21 additions & 0 deletions anomaly/gmm-change-detection/package.xml
@@ -0,0 +1,21 @@
<?xml version="1.0"?>
<package format="2">
  <name>gmm</name>
  <version>0.0.0</version>
  <description>GMM Change Detection package</description>
  <license>
    Apache License, Version 2.0
  </license>
  <author email="[email protected]">
    ISAAC Flight Software
  </author>
  <maintainer email="[email protected]">
    ISAAC Flight Software
  </maintainer>

  <buildtool_depend>catkin</buildtool_depend>
  <build_depend>roscpp</build_depend>
  <build_depend>cv_bridge</build_depend>
  <build_export_depend>roscpp</build_export_depend>
  <exec_depend>roscpp</exec_depend>
</package>
Empty file.
68 changes: 68 additions & 0 deletions anomaly/gmm-change-detection/scripts/gmm/artificial_data.py
@@ -0,0 +1,68 @@
#!/usr/bin/env python
# Copyright (c) 2017, United States Government, as represented by the
# Administrator of the National Aeronautics and Space Administration.
#
# All rights reserved.
#
# The "ISAAC - Integrated System for Autonomous and Adaptive Caretaking
# platform" software is licensed under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with the
# License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import copy

import numpy as np


def generate_data(n_start, n_disappearances, n_appearances):
    N = 1000  # N points in each cluster
    point_set_1 = []
    point_set_2 = []

    # Define first set of points
    means_1 = np.random.uniform(-2, 2, (n_start, 3)).round(2)
    covs_1 = np.zeros(shape=(n_start, 3, 3))
    for i in range(n_start):
        covs_1[i] = np.diag(np.random.uniform(0.0, 0.1, (1, 3))[0].round(2))

    # Remove old clusters in second set of points
    means_2 = copy.deepcopy(means_1)
    covs_2 = copy.deepcopy(covs_1)

    for i in range(n_disappearances):
        means_2 = np.delete(means_2, 0, 0)
        covs_2 = np.delete(covs_2, 0, 0)

    # Add new clusters in second set of points
    means_appearances = np.random.uniform(-2, 2, (n_appearances, 3)).round(2)
    covs_appearances = np.zeros(shape=(n_appearances, 3, 3))
    for i in range(n_appearances):
        covs_appearances[i] = np.diag(np.random.uniform(0.0, 0.1, (1, 3))[0].round(2))
    means_2 = np.vstack((means_2, means_appearances))
    covs_2 = np.vstack((covs_2, covs_appearances))

    # Concatenate clusters into point clouds
    for i in range(len(means_1)):
        x = np.random.multivariate_normal(means_1[i], covs_1[i], N)
        point_set_1.append(x)
    points_1 = np.concatenate(point_set_1)

    for i in range(len(means_2)):
        x = np.random.multivariate_normal(means_2[i], covs_2[i], N)
        point_set_2.append(x)
    points_2 = np.concatenate(point_set_2)

    return points_1, points_2


if __name__ == "__main__":
    points_1, points_2 = generate_data(5, 3, 9)
    print(points_1.shape, points_2.shape)
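Reviewer note: the sizes of the two generated clouds follow directly from the cluster bookkeeping in `generate_data`. The sketch below is not part of the commit; `expected_cloud_sizes` is a hypothetical helper that only reproduces the arithmetic, assuming `n_disappearances <= n_start` and the hard-coded 1000 points per cluster.

```python
def expected_cloud_sizes(n_start, n_disappearances, n_appearances,
                         points_per_cluster=1000):
    """Row counts of the two point clouds implied by generate_data.

    Hypothetical helper for illustration only. It assumes that no more
    clusters are removed than were created in the first place.
    """
    if n_disappearances > n_start:
        raise ValueError("cannot remove more clusters than were generated")
    clusters_before = n_start
    clusters_after = n_start - n_disappearances + n_appearances
    return (clusters_before * points_per_cluster,
            clusters_after * points_per_cluster)


if __name__ == "__main__":
    # Same arguments as the __main__ block of artificial_data.py.
    print(expected_cloud_sizes(5, 3, 9))
```

With the arguments used in the script's `__main__` block, the first cloud should hold 5 clusters and the second 5 - 3 + 9 = 11 clusters, each cloud having three columns (x, y, z).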
103 changes: 103 additions & 0 deletions anomaly/gmm-change-detection/scripts/gmm/emd_gmm.py
@@ -0,0 +1,103 @@
#!/usr/bin/env python
# Copyright (c) 2017, United States Government, as represented by the
# Administrator of the National Aeronautics and Space Administration.
#
# All rights reserved.
#
# The "ISAAC - Integrated System for Autonomous and Adaptive Caretaking
# platform" software is licensed under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with the
# License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

# https://towardsdatascience.com/linear-programming-using-python-priyansh-22b5ee888fe0
import numpy as np
from pulp import *


class EMDGMM:
    def __init__(self, gmm1_weights, gmm2_weights):
        self.warehouse_supply = gmm1_weights  # Supply Matrix
        self.cust_demands = gmm2_weights  # Demand Matrix
        self.n_warehouses = gmm1_weights.size
        self.n_customers = gmm2_weights.size
        self.weight_sum1 = np.sum(self.warehouse_supply)
        self.weight_sum2 = np.sum(self.cust_demands)
        self.distances = None
        self.emd = None

    def get_distance(self, means1, means2):
        """Given two GMMs, generate a distance matrix between all cluster
        representatives (means) of GMM1 and GMM2. Output: K1 x K2 matrix"""

        distances = np.zeros((means1.shape[0], means2.shape[0]))
        for i, row1 in enumerate(means1):
            for j, row2 in enumerate(means2):
                distances[i][j] = np.linalg.norm(row1 - row2)
        self.distances = distances

    def calculate_emd(self):
        """Optimize the cost-distance (weight-distance) flow between the
        two GMMs and use the optimized distance as the EMD distance metric."""

        # Cost Matrix
        cost_matrix = self.distances

        # Initialize Model
        model = LpProblem("Supply-Demand-Problem", LpMinimize)

        # Define Variable Names
        variable_names = [
            str(i) + "_" + str(j)
            for j in range(1, self.n_customers + 1)
            for i in range(1, self.n_warehouses + 1)
        ]
        variable_names.sort()

        # Decision Variables
        DV_variables = LpVariable.matrix(
            "X", variable_names, cat="Continuous", lowBound=0
        )
        allocation = np.array(DV_variables).reshape(self.n_warehouses, self.n_customers)

        # Objective Function
        obj_func = lpSum(allocation * cost_matrix)
        model += obj_func

        # Constraints
        for i in range(self.n_warehouses):
            # print(lpSum(allocation[i][j] for j in range(self.n_customers)) <= warehouse_supply[i])
            model += lpSum(
                allocation[i][j] for j in range(self.n_customers)
            ) <= self.warehouse_supply[i], "Supply Constraints " + str(i)

        for j in range(self.n_customers):
            # print(lpSum(allocation[i][j] for i in range(self.n_warehouses)) >= cust_demands[j])
            model += lpSum(
                allocation[i][j] for i in range(self.n_warehouses)
            ) >= self.cust_demands[j], "Demand Constraints " + str(j)

        model.solve(GLPK_CMD(msg=0))
        status = LpStatus[model.status]
        # print(status)

        # print("Total Cost:", model.objective.value())
        # for v in model.variables():
        #     try:
        #         print(v.name, "=", v.value())
        #     except:
        #         print("error couldn't find value")

        # for i in range(self.n_warehouses):
        #     print("Warehouse ", str(i+1))
        #     print(lpSum(allocation[i][j].value() for j in range(self.n_customers)))

        total_flow = min(self.weight_sum1, self.weight_sum2)
        self.emd = model.objective.value() / total_flow
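Reviewer note: the quantity stored in `self.emd` can be checked by hand on a tiny case where the constraints admit only one flow, so no LP solver is needed. The sketch below is not part of the commit and does not use PuLP or GLPK; `pairwise_distances` and `normalized_cost` are hypothetical helpers that mirror `get_distance` and the final normalization in `calculate_emd` using only the standard library.

```python
import math


def pairwise_distances(means1, means2):
    """Euclidean distance between every pair of means (K1 x K2 nested list)."""
    return [[math.dist(m1, m2) for m2 in means2] for m1 in means1]


def normalized_cost(flow, distances, weights1, weights2):
    """Transport cost of a given flow, divided by the smaller total weight."""
    cost = sum(
        flow[i][j] * distances[i][j]
        for i in range(len(distances))
        for j in range(len(distances[0]))
    )
    return cost / min(sum(weights1), sum(weights2))


# Two components in the first mixture, one in the second (made-up values).
means1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
weights1 = [0.5, 0.5]
means2 = [(0.0, 0.0, 0.0)]
weights2 = [1.0]

distances = pairwise_distances(means1, means2)

# Each supply is capped at 0.5 and the single demand needs at least 1.0,
# so the only feasible flow sends 0.5 from each component.
flow = [[0.5], [0.5]]
emd = normalized_cost(flow, distances, weights1, weights2)
print(distances, emd)
```

Here half of the weight does not move and the other half moves a distance of 1, so the normalized cost is 0.5. For general inputs the flow has to come from the LP in `calculate_emd`; this sketch only fixes it by hand.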