FrustumPointNet model #129
base: master
Conversation
Hi, thank you so much for taking the time to submit this PR. Before submitting, please read the "Contributing" documentation: https://oarriaga.github.io/paz/contributing/
As general remarks, please consider installing a Python linter in your IDE to resolve all the PEP 8 violations, including the 80-column limit. Also add documentation to the functions, and please include unit tests for the functions that you write. Thank you! I am looking forward to making this PR work! Also, maybe consider making smaller PRs next time :)
paz/models/detection/model_util.py
Outdated
def huber_loss(error, delta):
    abs_error = tf.abs(error)
    # the quadratic part is capped at delta; the remainder is penalized linearly
    quadratic = tf.minimum(abs_error, delta)
    linear = abs_error - quadratic
    losses = 0.5 * quadratic ** 2 + delta * linear
    return tf.reduce_mean(losses)
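For reference, this implements the standard Huber loss: 0.5 * error**2 where |error| <= delta, and delta * (|error| - 0.5 * delta) otherwise. A quick sanity check, as a sketch assuming TF 2.x eager execution:

error = tf.constant([0.5, 3.0])
# |0.5| <= 1 -> 0.5 * 0.5**2 = 0.125; |3.0| > 1 -> 1.0 * (3.0 - 0.5) = 2.5
print(huber_loss(error, delta=1.0).numpy())  # mean of [0.125, 2.5] = 1.3125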
If this is part of the loss, it should go in the loss file.
Resolved by moving the huber_loss function to frustumpointnet_loss.py.
from paz.models.detection.model_util import NUM_HEADING_BIN, NUM_SIZE_CLUSTER
from paz.models.detection.model_util import parse_output_to_tensors
from paz.models.detection.model_util import point_cloud_masking
from paz.optimization.losses.frustumpointnet_loss import FPointNet_loss |
Don't add a new file holding all the utils unless it's really required. Importing constants doesn't seem to justify having another file.
from paz.optimization.losses.frustumpointnet_loss import FPointNet_loss

def Instance_Seg_Net(point_cloud, one_hot_vec): |
Use full names. Also follow the same convention as in other PAZ models, i.e. CamelCase. For example: InstanceSegmentationNet.
Changed and renamed all functions following the CamelCase style.
    net = tf.expand_dims(point_cloud, 2)

    net = Conv2D(64, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_1')(net)
This doesn't seem to comply with PEP 8 conventions, specifically the 80-characters-per-line limit. I would highly encourage you to install a Python linter in your IDE to automatically check for this. Also try to keep the values in the function's given argument order, so that you don't specify more than required and thus take up more space than needed. In other words, the convolution layer can equally be specified as:
x = Conv2D(64, 1, (1,1), 'valid', activation='relu', name='conv1_1')(x)
Also, the name "net" is misleading: it should be a tensor name rather than a model name. We often use the variable "x" to indicate a tensor during model building, or "inputs" / "outputs" if it is the first or last tensor, respectively.
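Applied to the first layers of the snippet above, the suggestion would read roughly as follows (a sketch; x simply replaces net):

x = tf.expand_dims(point_cloud, 2)
x = Conv2D(64, 1, (1, 1), 'valid', activation='relu', name='conv1_1')(x)
x = Conv2D(64, 1, (1, 1), 'valid', activation='relu', name='conv1_2')(x)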
Changes done
    net = Conv2D(64, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_2')(net)

    point_feat = Conv2D(64, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_3')(net)

    net = Conv2D(128, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_4')(point_feat)

    net = Conv2D(1024, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_5')(net)

    global_feat = MaxPooling2D(pool_size=[num_points, 1], padding='VALID')(net)

    global_feat = tf.concat([global_feat, tf.expand_dims(tf.expand_dims(one_hot_vec, 1), 1)], axis=3)
    global_feat_expand = tf.tile(global_feat, [1, num_points, 1, 1])
    concat_feat = tf.concat(axis=3, values=[point_feat, global_feat_expand])

    net = Conv2D(512, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_6')(concat_feat)

    net = Conv2D(256, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_7')(net)

    net = Conv2D(128, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_8')(net)

    net = Conv2D(128, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation='relu', name='conv1_9')(net)

    net = Dropout(rate=0.5)(net)

    logits = Conv2D(2, kernel_size=[1, 1], padding='VALID', strides=[1, 1], activation=None, name='conv1_10')(net)
same as above
Done
from paz.models.detection.model_util import NUM_HEADING_BIN, NUM_SIZE_CLUSTER, g_mean_size_arr, \
    get_box3d_corners_helper, get_box3d_corners
Why are these variables not inputs to the loss? They should not exist as "global" variables.
Also use full names, i.e. what do "g" and "arr" mean in "g_mean_size_arr"?
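A minimal sketch of how the constants could travel as arguments instead of module-level globals (the parameter names below are hypothetical):

def FPointNet_loss(args, mean_sizes_per_class, num_heading_bins=12,
                   num_size_clusters=8, corner_loss_weight=10.0,
                   box_loss_weight=1.0, mask_weight=1.0):
    # mean_sizes_per_class would replace g_mean_size_arr,
    # with shape (num_size_clusters, 3)
    ...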
def FPointNet_loss(args, corner_loss_weight=10.0, box_loss_weight=1.0, mask_weight=1.0):
    """ Loss functions for 3D object detection.
    Input:
        args: dict
            mask_label: TF int32 tensor in shape (B,N)
            center_label: TF tensor in shape (B,3)
            heading_class_label: TF int32 tensor in shape (B,)
            heading_residual_label: TF tensor in shape (B,)
            size_class_label: TF int32 tensor in shape (B,)
            size_residual_label: TF tensor in shape (B,3)
            end_points: dict, outputs from our model
        corner_loss_weight: float scalar
        box_loss_weight: float scalar
These are too many inputs for a single loss. Try to separate the loss into simple functions that take (y_true, y_pred) inputs.
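A minimal sketch of that shape, reusing the huber_loss defined earlier (compute_center_loss is a hypothetical name):

def compute_center_loss(y_true, y_pred, delta=2.0):
    # Huber loss on the distance between true and predicted box centers
    center_distance = tf.norm(y_true - y_pred, axis=-1)
    return huber_loss(center_distance, delta=delta)

The total loss then reduces to a weighted sum of such small, individually testable terms.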
    mask_label, center_label, heading_class_label, heading_residual_label, size_class_label, size_residual_label, \
        end_points = args[0], args[1], args[2], args[3], args[4], args[5], args[6]

    mask_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=end_points['mask_logits'], labels=tf.cast(mask_label, tf.int64)))

    center_dist = tf.norm(center_label - end_points['center'], axis=-1)
    center_loss = huber_loss(center_dist, delta=2.0)
    stage1_center_dist = tf.norm(center_label - end_points['stage1_center'], axis=-1)
    stage1_center_loss = huber_loss(stage1_center_dist, delta=1.0)

    heading_class_loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=end_points['heading_scores'], labels=tf.cast(heading_class_label, tf.int64)))

    hcls_onehot = tf.one_hot(tf.cast(heading_class_label, tf.int64), depth=NUM_HEADING_BIN,
                             on_value=1, off_value=0, axis=-1)  # BxNUM_HEADING_BIN
    heading_residual_normalized_label = heading_residual_label / (np.pi / NUM_HEADING_BIN)
    heading_residual_normalized_loss = huber_loss(tf.reduce_sum(
        end_points['heading_residuals_normalized'] * tf.cast(hcls_onehot, dtype=tf.float32),
        axis=1) - heading_residual_normalized_label, delta=1.0)

    size_class_loss = tf.reduce_mean(
        tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=end_points['size_scores'], labels=tf.cast(size_class_label, tf.int64)))

    scls_onehot = tf.one_hot(tf.cast(size_class_label, tf.int64),
                             depth=NUM_SIZE_CLUSTER,
                             on_value=1, off_value=0, axis=-1)  # BxNUM_SIZE_CLUSTER
Separate into smaller functions and make a unit test for each.
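For example, the mask term above could be extracted and checked against a hand-computed case (a sketch; the function and test names are hypothetical):

def compute_mask_loss(mask_label, mask_logits):
    return tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=mask_logits, labels=tf.cast(mask_label, tf.int64)))

def test_compute_mask_loss_confident_prediction():
    logits = tf.constant([[[10.0, -10.0]]])  # one point, confidently class 0
    labels = tf.constant([[0]])
    assert compute_mask_loss(labels, logits).numpy() < 1e-6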
    scls_onehot_tiled = tf.tile(tf.expand_dims(tf.cast(scls_onehot, dtype=tf.float32), -1),
                                [1, 1, 3])  # BxNUM_SIZE_CLUSTERx3
    predicted_size_residual_normalized = tf.reduce_sum(end_points['size_residuals_normalized'] *
                                                       scls_onehot_tiled, axis=[1])  # Bx3

    mean_size_arr_expand = tf.expand_dims(tf.constant(g_mean_size_arr, dtype=tf.float32), 0)  # 1xNUM_SIZE_CLUSTERx3
    mean_size_label = tf.reduce_sum(scls_onehot_tiled * mean_size_arr_expand, axis=[1])  # Bx3
    size_residual_label_normalized = size_residual_label / mean_size_label
    size_normalized_dist = tf.norm(size_residual_label_normalized - predicted_size_residual_normalized, axis=-1)
    size_residual_normalized_loss = huber_loss(size_normalized_dist, delta=1.0)

    corners_3d = get_box3d_corners(end_points['center'],
                                   end_points['heading_residuals'],
                                   end_points['size_residuals'])  # (B,NH,NS,8,3)
    gt_mask = tf.tile(tf.expand_dims(hcls_onehot, 2), [1, 1, NUM_SIZE_CLUSTER]) * \
        tf.tile(tf.expand_dims(scls_onehot, 1), [1, NUM_HEADING_BIN, 1])  # (B,NH,NS)
    corners_3d_pred = tf.reduce_sum(tf.cast(tf.expand_dims(tf.expand_dims(gt_mask, -1), -1), dtype=tf.float32) *
                                    corners_3d, axis=[1, 2])  # (B,8,3)

    heading_bin_centers = tf.constant(np.arange(0, 2 * np.pi, 2 * np.pi / NUM_HEADING_BIN), dtype=tf.float32)  # (NH,)
    heading_label = tf.expand_dims(heading_residual_label, 1) + tf.expand_dims(heading_bin_centers, 0)  # (B,NH)
    heading_label = tf.reduce_sum(tf.cast(hcls_onehot, dtype=tf.float32) * heading_label, 1)
    mean_sizes = tf.expand_dims(tf.constant(g_mean_size_arr, dtype=tf.float32), 0)  # (1,NS,3)
    size_label = mean_sizes + tf.expand_dims(size_residual_label, 1)  # (1,NS,3) + (B,1,3) = (B,NS,3)
    size_label = tf.reduce_sum(tf.expand_dims(tf.cast(scls_onehot, dtype=tf.float32), -1) * size_label,
                               axis=[1])  # (B,3)
Separate into smaller functions and make unit tests for each.
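For instance, the size-label computation above is self-contained and could become its own tested function (a sketch; the name is hypothetical):

def compute_size_label(size_class_onehot, size_residual_label, mean_sizes):
    # mean_sizes: (1, NUM_SIZE_CLUSTER, 3); size_residual_label: (B, 3)
    size_label = mean_sizes + tf.expand_dims(size_residual_label, 1)  # (B, NS, 3)
    onehot = tf.expand_dims(tf.cast(size_class_onehot, tf.float32), -1)  # (B, NS, 1)
    return tf.reduce_sum(onehot * size_label, axis=1)  # (B, 3)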
    corners_3d_gt = get_box3d_corners_helper(center_label, heading_label, size_label)  # (B,8,3)
    corners_3d_gt_flip = get_box3d_corners_helper(center_label, heading_label + np.pi, size_label)  # (B,8,3)

    corners_dist = tf.minimum(tf.norm(corners_3d_pred - corners_3d_gt, axis=-1),
                              tf.norm(corners_3d_pred - corners_3d_gt_flip, axis=-1))
    corners_loss = huber_loss(corners_dist, delta=1.0)

    total_loss = mask_loss * mask_weight + box_loss_weight * (center_loss + heading_class_loss + size_class_loss +
                                                              heading_residual_normalized_loss * 20 +
                                                              size_residual_normalized_loss * 20 +
                                                              stage1_center_loss +
                                                              corner_loss_weight * corners_loss)

    return total_loss
Separate into smaller functions and make unit tests for each.
Thank you for the changes. I made some annotations on elements that I think require further attention.
NUM_SIZE_CLUSTER = 8
NUM_OBJECT_POINT = 512

type2class = {'Car': 0, 'Van': 1, 'Truck': 2, 'Pedestrian': 3, |
In PAZ we use full names, i.e. type_to_class.
              'Person_sitting': 4, 'Cyclist': 5, 'Tram': 6, 'Misc': 7}
class2type = {type2class[t]: t for t in type2class}
type2onehotclass = {'Car': 0, 'Pedestrian': 1, 'Cyclist': 2}
type_mean_size = {'Car': np.array([3.88311640418, 1.62856739989, 1.52563191]), |
type_mean_size doesn't seem to reflect the proper variable name. Maybe something like class_to_mean?
                  'Cyclist': np.array([1.76282397, 0.59706367, 1.73698127]),
                  'Tram': np.array([16.17150617, 2.53246914, 3.53079012]),
                  'Misc': np.array([3.64300781, 1.54298177, 1.92320313])}
g_mean_size_arr = np.zeros((NUM_SIZE_CLUSTER, 3))  # size clusters
What is "g"? Use full names.
for i in range(NUM_SIZE_CLUSTER): |
What is i? Use full names, even in iteration arguments.
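A possible rename, assuming the loop fills the array from type_mean_size as in the original Frustum PointNets code (mean_size_array and class_arg are hypothetical replacement names):

mean_size_array = np.zeros((NUM_SIZE_CLUSTER, 3))
for class_arg in range(NUM_SIZE_CLUSTER):
    mean_size_array[class_arg, :] = type_mean_size[class2type[class_arg]]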
NUM_HEADING_BIN = 12
NUM_SIZE_CLUSTER = 8
NUM_OBJECT_POINT = 512
Strange variable names. Maybe something like NUM_HEAD_BINS, NUM_CLUSTERS, NUM_OBJECTS? Please change adequately. Or maybe I am not understanding what these variables describe.
                  size_residual_label], loss,
                  name='f_pointnet_train')

det_model = Model(inputs=[point_cloud, one_hot_vector], |
Use full variable names. What is "det"? Detection?
class FrustumPointNetLoss:
    """ Loss functions for 3D object detection.
    Input:
        args: dict
            mask_label: TF int32 tensor in shape (B,N)
            center_label: TF tensor in shape (B,3)
            heading_class_label: TF int32 tensor in shape (B,)
            heading_residual_label: TF tensor in shape (B,)
            size_class_label: TF int32 tensor in shape (B,)
            size_residual_label: TF tensor in shape (B,3)
            end_points: dict, outputs from our model
        corner_loss_weight: float scalar
        box_loss_weight: float scalar
This requires tests for every method or important loss-calculation part.
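For example, a shape test against the annotated output of get_box3d_corners_helper, which according to the comments above returns (B,8,3) corners (a sketch; the test name is hypothetical):

def test_get_box3d_corners_helper_shape():
    centers = tf.zeros((2, 3))
    headings = tf.zeros((2,))
    sizes = tf.ones((2, 3))
    corners = get_box3d_corners_helper(centers, headings, sizes)
    assert corners.shape == (2, 8, 3)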
        M = NUM_OBJECT_POINT as a hyper-parameter
        mask_xyz_mean: TF tensor in shape (B,3)
    """
    batch_size = point_cloud.get_shape()[0]
Is this batch_size variable necessary?
""" TF layer. | ||
Inputs: | ||
center: (B,3) | ||
heading_residuals: (B,NH) | ||
size_residuals: (B,NS,3) | ||
Outputs: | ||
box3d_corners: (B,NH,NS,8,3) tensor | ||
""" |
Docstring not using the PAZ convention.
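PAZ largely follows the Keras docstring markup with # Arguments and # Returns sections; a sketch of the same docstring in that style, assuming it documents get_box3d_corners given the output shape:

def get_box3d_corners(center, heading_residuals, size_residuals):
    """Computes 3D box corners from center, heading and size predictions.

    # Arguments
        center: Tensor of shape (B, 3).
        heading_residuals: Tensor of shape (B, NH).
        size_residuals: Tensor of shape (B, NS, 3).

    # Returns
        Tensor of shape (B, NH, NS, 8, 3) with the 3D box corners.
    """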
from examples.frustum_pointnet.dataset_utils import NUM_HEADING_BIN, \
    g_mean_size_arr, NUM_SIZE_CLUSTER

def ExtractBox3DCornersHelper(centers, headings, sizes): |
This is a function: it should start with a verb and use lowercase separated by underscores. Also, the word "Helper" does not convey what the function is doing, so we should try to be explicit.
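For instance, a name like compute_box3d_corners (hypothetical) would satisfy both conventions:

def compute_box3d_corners(centers, headings, sizes):
    ...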
Initial check-in of the FrustumPointNet model and the utilities for the Frustum PointNet model.
Files added:
paz/models/detection/frustum_pointnet.py
paz/models/detection/model_util.py