
FrustumPointNet model #129

Open
wants to merge 10 commits into base: master

Conversation

jaswanthbjk (Contributor) commented:

Initial check-in of the FrustumPointNet model and the utilities for the Frustum PointNet model.

Files added:

paz/models/detection/frustum_pointnet.py
paz/models/detection/model_util.py

@oarriaga (Owner) left a comment:

Hi, thank you so much for taking the time to submit this PR. Before submitting, please read the "Contributing" documentation: https://oarriaga.github.io/paz/contributing/
As a general remark, please consider installing a Python linter in your IDE to address all the PEP8 conventions, including the 80-column limit. Also add documentation to the functions, and please include unit tests for the functions that you write. Thank you! I am looking forward to making this PR work. Also, maybe consider making smaller PRs next time :)

Comment on lines 180 to 185
def huber_loss(error, delta):
    abs_error = tf.abs(error)
    quadratic = tf.minimum(abs_error, delta)
    linear = (abs_error - quadratic)
    losses = 0.5 * quadratic ** 2 + delta * linear
    return tf.reduce_mean(losses)
oarriaga (Owner):

If this is part of the loss, it should go in the loss file.

jaswanthbjk (Contributor, Author):

Resolved by moving the huber_loss function to frustumpointnet_loss.py.

Comment on lines 8 to 11
from paz.models.detection.model_util import NUM_HEADING_BIN, NUM_SIZE_CLUSTER
from paz.models.detection.model_util import parse_output_to_tensors
from paz.models.detection.model_util import point_cloud_masking
from paz.optimization.losses.frustumpointnet_loss import FPointNet_loss
oarriaga (Owner):

Don't add a new file holding all the utils unless it's really required. Importing constants doesn't seem to justify having another file.
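For instance, the two constants could simply live at the top of frustum_pointnet.py (a sketch; the placement is only a suggestion, the values are the ones defined in this PR):

NUM_HEADING_BIN = 12
NUM_SIZE_CLUSTER = 8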

from paz.optimization.losses.frustumpointnet_loss import FPointNet_loss


def Instance_Seg_Net(point_cloud, one_hot_vec):
oarriaga (Owner):

Use full names. Also follow the same convention as in other PAZ models, i.e. CamelCase. For example: InstanceSegmentationNet.

jaswanthbjk (Contributor, Author) on Feb 3, 2021:

Changed and renamed all functions following the CamelCase style.


net = tf.expand_dims(point_cloud, 2)

net = Conv2D(64, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_1')(net)
oarriaga (Owner):

This doesn't seem to comply with PEP8 conventions, specifically the limit of 80 characters per line. I would highly encourage you to install a Python linter in your IDE to check this automatically. Also try to keep the arguments in the function's given order so that you don't specify more than required and take up more space than needed. In other words, the convolution layer can also be specified as:

x = Conv2D(64, 1, (1,1), 'valid', activation='relu', name='conv1_1')(x)

Also, the name "net" is misleading: it should be a tensor name rather than a model name. We often use the variable "x" to indicate a tensor during model building, or "inputs" and "outputs" if it's the first or last tensor, respectively.
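A sketch of how the first layers might look with both suggestions applied (illustrative only; it keeps the layer names from the PR):

import tensorflow as tf
from tensorflow.keras.layers import Conv2D

# "x" instead of "net"; positional arguments follow Conv2D's natural order
# (filters, kernel_size, strides, padding); every line stays under 80 columns.
# point_cloud is the input tensor of the enclosing function.
x = tf.expand_dims(point_cloud, 2)
x = Conv2D(64, 1, (1, 1), 'valid', activation='relu', name='conv1_1')(x)
x = Conv2D(64, 1, (1, 1), 'valid', activation='relu', name='conv1_2')(x)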

jaswanthbjk (Contributor, Author):

Changes done

Comment on lines 22 to 48
net = Conv2D(64, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_2')(net)

point_feat = Conv2D(64, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_3')(net)

net = Conv2D(128, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_4')(
    point_feat)

net = Conv2D(1024, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_5')(net)

global_feat = MaxPooling2D(pool_size=[num_points, 1], padding='VALID')(net)

global_feat = tf.concat([global_feat, tf.expand_dims(tf.expand_dims(one_hot_vec, 1), 1)], axis=3)
global_feat_expand = tf.tile(global_feat, [1, num_points, 1, 1])
concat_feat = tf.concat(axis=3, values=[point_feat, global_feat_expand])

net = Conv2D(512, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_6')(
    concat_feat)

net = Conv2D(256, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_7')(net)

net = Conv2D(128, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_8')(net)

net = Conv2D(128, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_9')(net)

net = Dropout(rate=0.5)(net)

logits = Conv2D(2, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation=None, name='conv1_10')(net)
oarriaga (Owner):

Same as above.

jaswanthbjk (Contributor, Author):

Done

Comment on lines 4 to 5
from paz.models.detection.model_util import NUM_HEADING_BIN, NUM_SIZE_CLUSTER, g_mean_size_arr, \
    get_box3d_corners_helper, get_box3d_corners
oarriaga (Owner):

Why are these variables not inputs to the loss? They should not exist as "global" variables.

oarriaga (Owner):

Also use full names, i.e. what do "g" and "arr" mean in "g_mean_size_arr"?
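One option is to pass the former globals in explicitly with spelled-out names (a hypothetical signature, not the PR's code; the default values are the ones defined in this PR):

def FPointNet_loss(args, mean_sizes_per_cluster,
                   num_heading_bins=12, num_size_clusters=8,
                   corner_loss_weight=10.0, box_loss_weight=1.0,
                   mask_weight=1.0):
    # mean_sizes_per_cluster replaces the global g_mean_size_arr; the bin and
    # cluster counts replace NUM_HEADING_BIN and NUM_SIZE_CLUSTER.
    ...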

Comment on lines 16 to 28
def FPointNet_loss(args, corner_loss_weight=10.0, box_loss_weight=1.0, mask_weight=1.0):
    """ Loss functions for 3D object detection.
    Input:
        args: dict
            mask_label: TF int32 tensor in shape (B,N)
            center_label: TF tensor in shape (B,3)
            heading_class_label: TF int32 tensor in shape (B,)
            heading_residual_label: TF tensor in shape (B,)
            size_class_label: TF int32 tensor in shape (B,)
            size_residual_label: TF tensor in shape (B,)
            end_points: dict, outputs from our model
        corner_loss_weight: float scalar
        box_loss_weight: float scalar
oarriaga (Owner):

These are too many inputs for a single loss. Try to separate the loss into simple functions that take (y_true, y_pred) inputs.
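For example, the center term could become a standalone (y_true, y_pred) function (a sketch; compute_center_loss is a hypothetical name, while huber_loss is the function defined in this PR):

import tensorflow as tf

def huber_loss(error, delta):
    # huber_loss as defined in this PR.
    abs_error = tf.abs(error)
    quadratic = tf.minimum(abs_error, delta)
    linear = abs_error - quadratic
    losses = 0.5 * quadratic ** 2 + delta * linear
    return tf.reduce_mean(losses)

def compute_center_loss(y_true, y_pred, delta=2.0):
    # y_true: center_label with shape (B, 3); y_pred: predicted centers (B, 3).
    center_distance = tf.norm(y_true - y_pred, axis=-1)
    return huber_loss(center_distance, delta=delta)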

Comment on lines 37 to 64
mask_label, center_label, heading_class_label, heading_residual_label, size_class_label, size_residual_label, \
    end_points = args[0], args[1], args[2], args[3], args[4], args[5], args[6]

mask_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=end_points['mask_logits'],
                                                                          labels=tf.cast(mask_label, tf.int64)))

center_dist = tf.norm(center_label - end_points['center'], axis=-1)
center_loss = huber_loss(center_dist, delta=2.0)
stage1_center_dist = tf.norm(center_label - end_points['stage1_center'], axis=-1)
stage1_center_loss = huber_loss(stage1_center_dist, delta=1.0)

heading_class_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=end_points['heading_scores'], labels=tf.cast(heading_class_label, tf.int64)))

hcls_onehot = tf.one_hot(tf.cast(heading_class_label, tf.int64), depth=NUM_HEADING_BIN, on_value=1, off_value=0,
                         axis=-1)  # BxNUM_HEADING_BIN
heading_residual_normalized_label = heading_residual_label / (np.pi / NUM_HEADING_BIN)
heading_residual_normalized_loss = huber_loss(tf.reduce_sum(
    end_points['heading_residuals_normalized'] * tf.cast(hcls_onehot, dtype=tf.float32), axis=1) -
    heading_residual_normalized_label, delta=1.0)

size_class_loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=end_points['size_scores'], labels=tf.cast(size_class_label, tf.int64)))

scls_onehot = tf.one_hot(tf.cast(size_class_label, tf.int64),
                         depth=NUM_SIZE_CLUSTER,
                         on_value=1, off_value=0, axis=-1)  # BxNUM_SIZE_CLUSTER
oarriaga (Owner):

Separate into smaller functions and write a unit test for each.
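For instance, the heading-classification term above could be extracted and tested on its own (a sketch; the function names are hypothetical):

import numpy as np
import tensorflow as tf

def compute_heading_class_loss(heading_class_label, heading_scores):
    # Extracted heading-classification term of the loss above.
    return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=heading_scores, labels=tf.cast(heading_class_label, tf.int64)))

def test_compute_heading_class_loss():
    # Logits that strongly favor the true class should give a near-zero loss.
    labels = np.array([0, 1])
    logits = np.array([[10.0, 0.0], [0.0, 10.0]], dtype=np.float32)
    assert float(compute_heading_class_loss(labels, logits)) < 1e-3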

Comment on lines 65 to 90
scls_onehot_tiled = tf.tile(tf.expand_dims(tf.cast(scls_onehot, dtype=tf.float32), -1),
                            [1, 1, 3])  # BxNUM_SIZE_CLUSTERx3
predicted_size_residual_normalized = tf.reduce_sum(end_points['size_residuals_normalized'] *
                                                   scls_onehot_tiled, axis=[1])  # Bx3

mean_size_arr_expand = tf.expand_dims(tf.constant(g_mean_size_arr, dtype=tf.float32), 0)  # 1xNUM_SIZE_CLUSTERx3
mean_size_label = tf.reduce_sum(scls_onehot_tiled * mean_size_arr_expand, axis=[1])  # Bx3
size_residual_label_normalized = size_residual_label / mean_size_label
size_normalized_dist = tf.norm(size_residual_label_normalized - predicted_size_residual_normalized, axis=-1)
size_residual_normalized_loss = huber_loss(size_normalized_dist, delta=1.0)

corners_3d = get_box3d_corners(end_points['center'],
                               end_points['heading_residuals'],
                               end_points['size_residuals'])  # (B,NH,NS,8,3)
gt_mask = tf.tile(tf.expand_dims(hcls_onehot, 2), [1, 1, NUM_SIZE_CLUSTER]) * \
    tf.tile(tf.expand_dims(scls_onehot, 1), [1, NUM_HEADING_BIN, 1])  # (B,NH,NS)
corners_3d_pred = tf.reduce_sum(tf.cast(tf.expand_dims(tf.expand_dims(gt_mask, -1), -1), dtype=tf.float32) *
                                corners_3d, axis=[1, 2])  # (B,8,3)

heading_bin_centers = tf.constant(np.arange(0, 2 * np.pi, 2 * np.pi / NUM_HEADING_BIN), dtype=tf.float32)  # (NH,)
heading_label = tf.expand_dims(heading_residual_label, 1) + tf.expand_dims(heading_bin_centers, 0)  # (B,NH)
heading_label = tf.reduce_sum(tf.cast(hcls_onehot, dtype=tf.float32) * heading_label, 1)
mean_sizes = tf.expand_dims(tf.constant(g_mean_size_arr, dtype=tf.float32), 0)  # (1,NS,3)
size_label = mean_sizes + tf.expand_dims(size_residual_label, 1)  # (1,NS,3) + (B,1,3) = (B,NS,3)
size_label = tf.reduce_sum(tf.expand_dims(tf.cast(scls_onehot, dtype=tf.float32), -1) * size_label,
                           axis=[1])  # (B,3)
oarriaga (Owner):

Separate into smaller functions and write unit tests for each.

Comment on lines 91 to 104
corners_3d_gt = get_box3d_corners_helper(center_label, heading_label, size_label)  # (B,8,3)
corners_3d_gt_flip = get_box3d_corners_helper(center_label, heading_label + np.pi, size_label)  # (B,8,3)

corners_dist = tf.minimum(tf.norm(corners_3d_pred - corners_3d_gt, axis=-1),
                          tf.norm(corners_3d_pred - corners_3d_gt_flip, axis=-1))
corners_loss = huber_loss(corners_dist, delta=1.0)

total_loss = mask_loss * mask_weight + box_loss_weight * (center_loss + heading_class_loss + size_class_loss +
                                                          heading_residual_normalized_loss * 20 +
                                                          size_residual_normalized_loss * 20 +
                                                          stage1_center_loss +
                                                          corner_loss_weight * corners_loss)

return total_loss
oarriaga (Owner):

Separate into smaller functions and write unit tests for each.

@oarriaga (Owner) left a comment:

Thank you for the changes. I made annotations on some elements that I think require further attention.

NUM_SIZE_CLUSTER = 8
NUM_OBJECT_POINT = 512

type2class = {'Car': 0, 'Van': 1, 'Truck': 2, 'Pedestrian': 3,
oarriaga (Owner):

In PAZ we use full names, i.e. type_to_class.

              'Person_sitting': 4, 'Cyclist': 5, 'Tram': 6, 'Misc': 7}
class2type = {type2class[t]: t for t in type2class}
type2onehotclass = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
type_mean_size = {'Car': np.array([3.88311640418, 1.62856739989, 1.52563191]),
oarriaga (Owner):

The name type_mean_size doesn't seem to reflect what the variable holds. Maybe something like class_to_mean?

                  'Cyclist': np.array([1.76282397, 0.59706367, 1.73698127]),
                  'Tram': np.array([16.17150617, 2.53246914, 3.53079012]),
                  'Misc': np.array([3.64300781, 1.54298177, 1.92320313])}
g_mean_size_arr = np.zeros((NUM_SIZE_CLUSTER, 3))  # size clusters
oarriaga (Owner):

What is g? Use full names.

                  'Tram': np.array([16.17150617, 2.53246914, 3.53079012]),
                  'Misc': np.array([3.64300781, 1.54298177, 1.92320313])}
g_mean_size_arr = np.zeros((NUM_SIZE_CLUSTER, 3))  # size clusters
for i in range(NUM_SIZE_CLUSTER):
oarriaga (Owner):

What is i? Use full names, even for iteration variables.

Comment on lines +3 to +5
NUM_HEADING_BIN = 12
NUM_SIZE_CLUSTER = 8
NUM_OBJECT_POINT = 512
oarriaga (Owner):

Strange variable names. Maybe something like NUM_HEAD_BINS, NUM_CLUSTERS, NUM_OBJECTS? Please change them adequately, or maybe I am not understanding what these variables describe.
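Taken together, the renames suggested in the last few comments might look like this (a sketch; every new name is only a suggestion, and only the mean sizes visible in this diff are included):

import numpy as np

NUM_HEADING_BINS = 12
NUM_SIZE_CLUSTERS = 8

type_to_class = {'Car': 0, 'Van': 1, 'Truck': 2, 'Pedestrian': 3,
                 'Person_sitting': 4, 'Cyclist': 5, 'Tram': 6, 'Misc': 7}
class_to_type = {class_arg: name for name, class_arg in type_to_class.items()}
class_to_mean_size = {'Car': np.array([3.88311640418, 1.62856739989, 1.52563191]),
                      'Cyclist': np.array([1.76282397, 0.59706367, 1.73698127]),
                      'Tram': np.array([16.17150617, 2.53246914, 3.53079012]),
                      'Misc': np.array([3.64300781, 1.54298177, 1.92320313])}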

                   size_residual_label], loss,
                  name='f_pointnet_train')

det_model = Model(inputs=[point_cloud, one_hot_vector],
oarriaga (Owner):

Use full variable names. What is det? Detection?

Comment on lines +80 to +92
class FrustumPointNetLoss:
    """ Loss functions for 3D object detection.
    Input:
        args: dict
            mask_label: TF int32 tensor in shape (B,N)
            center_label: TF tensor in shape (B,3)
            heading_class_label: TF int32 tensor in shape (B,)
            heading_residual_label: TF tensor in shape (B,)
            size_class_label: TF int32 tensor in shape (B,)
            size_residual_label: TF tensor in shape (B,)
            end_points: dict, outputs from our model
        corner_loss_weight: float scalar
        box_loss_weight: float scalar
oarriaga (Owner):

This requires tests for every method or important part of the loss calculation.

            M = NUM_OBJECT_POINT as a hyper-parameter
        mask_xyz_mean: TF tensor in shape (B,3)
    """
    batch_size = point_cloud.get_shape()[0]
oarriaga (Owner):

Is this batch_size variable necessary?

Comment on lines +8 to +15
    """ TF layer.
    Inputs:
        center: (B,3)
        heading_residuals: (B,NH)
        size_residuals: (B,NS,3)
    Outputs:
        box3d_corners: (B,NH,NS,8,3) tensor
    """
oarriaga (Owner):

This docstring does not follow the PAZ convention.
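Assuming PAZ follows the Keras-style "# Arguments" / "# Returns" docstring layout (an assumption, not confirmed in this thread), the docstring could be rewritten along these lines, here shown on get_box3d_corners, which takes these arguments elsewhere in the PR:

def get_box3d_corners(center, heading_residuals, size_residuals):
    """Computes 3D box corners for every heading bin and size cluster.

    # Arguments
        center: Tensor of shape (B, 3).
        heading_residuals: Tensor of shape (B, NH).
        size_residuals: Tensor of shape (B, NS, 3).

    # Returns
        Tensor of shape (B, NH, NS, 8, 3) with the box corners.
    """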

from examples.frustum_pointnet.dataset_utils import NUM_HEADING_BIN, \
    g_mean_size_arr, NUM_SIZE_CLUSTER

def ExtractBox3DCornersHelper(centers, headings, sizes):
oarriaga (Owner):

This is a function, so it should start with a verb and use lowercase separated by underscores. Also, the word "Helper" does not convey what it's doing, so we should try to be explicit.
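A possible verb-first, snake_case rename (the specific name is only a suggestion):

def compute_box3d_corners(centers, headings, sizes):
    ...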
