Semantic Labeling #203
Draft: sriramk117 wants to merge 46 commits into ros2-devel from sriramk/semantic-labeling
Commits (46)

- 8ed6683 init (sriramk117)
- dcb74b7 added neccessary callback functions (sriramk117)
- c611e78 implemented functionality to run sam and groundingdino (sriramk117)
- 4fb0258 wrote vision pipeline and execute callback (sriramk117)
- 5d81216 created result message returned by vision pipeline (sriramk117)
- bfcaeaa modified launch file and created yaml file for parameters (sriramk117)
- 19e9275 updated setup.py and modified parameters (sriramk117)
- c29520d Merge branch 'ros2-devel' into sriramk/semantic-labeling (sriramk117)
- 4e7391f added requirements to install and fixed imports (sriramk117)
- 9c43fc4 changed grounding dino path and added checkpoint (sriramk117)
- 31551e4 Added config file + fixed image transformations (sriramk117)
- 3ec7b50 Added GroundingDINO visualization function (sriramk117)
- 9dc9a40 created GroundingDINO publisher for testing (sriramk117)
- 929e570 added more testing code for bbox visualization (sriramk117)
- e1ebf8b fixed groundingdino results visualization (sriramk117)
- 704caa1 corrected image preprocessing? (sriramk117)
- 4f9305d groundingdino works! (sriramk117)
- 024c71c masks are now displayable (sriramk117)
- c78cd4a record vision pipeline inference time (sriramk117)
- e503800 wrote code to generate mask messages during action calls (sriramk117)
- 648a46e masks msgs are generated but action keeps aborting (sriramk117)
- 3032f65 Added gpt-4o query functionality (sriramk117)
- 85f9577 groundingdino can be downloaded via github url
- 0049598 updated comments/code quality changes
- e9fd4d5 invoking gpt-4o has been transformed into a service (sriramk117)
- 9d52d98 segment all items action now takes a single string as input (sriramk117)
- 30bc036 added env variables (sriramk117)
- 4d3b27c environment variables not loading? (sriramk117)
- 94af48e ran black formatter (sriramk117)
- 29ed345 Merge branch 'ros2-devel' into sriramk/semantic-labeling (sriramk117)
- 23577ae changes to segmentallitems node initializing it as a perception node (sriramk117)
- 3688541 fixed error of topics not being received by segmentallitems action (sriramk117)
- 195b123 code cleanup
- b8a4ccb running gpt-4o inference is now an action not a service (sriramk117)
- 5363732 cleaned up some comments (sriramk117)
- d73c983 goal status cancellation (sriramk117)
- 2326742 temporary changes for running testing procedures (sriramk117)
- b95ac8e republisher.yaml reverted to original (sriramk117)
- 4bf52ea fixed cv2 visualization merge conflict (sriramk117)
- 9382b67 segmentation inference optimization workin (sriramk117)
- 5cc10dd Merge branch 'ros2-devel' into sriramk/semantic-labeling (sriramk117)
- 7f70167 updated prompts (sriramk117)
- 74b8cda temporarily removed segment from point (sriramk117)
- e7f7282 implemented nms to remove dino prediction duplicates (sriramk117)
- 3a98848 added nearest label interpolation (sriramk117)
- 914d0a0 added acquire food client for robot demos (sriramk117)
              
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -2,6 +2,9 @@ | ||
| build/ | ||
| __pycache__/ | ||
|  | ||
| + # Environment Variables file | ||
| + .env | ||
| + | ||
| # Compiled Object files | ||
| *.slo | ||
| *.lo | ||
  
    
    
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # The interface for an action that takes in a list of input labels | ||
| # describing the food items on a plate and returns a sentence caption compiling | ||
| # these labels used as a query for GroundingDINO detection. | ||
|  | ||
| # A list of semantic labels corresponding to each of the masks of detected | ||
| # items in the image | ||
| string[] input_labels | ||
| --- | ||
| # Possible return statuses | ||
| uint8 STATUS_SUCCEEDED=0 | ||
| uint8 STATUS_FAILED=1 | ||
| uint8 STATUS_CANCELED=3 | ||
| uint8 STATUS_UNKNOWN=99 | ||
|  | ||
| # Whether the vision pipeline succeeded and if not, why | ||
| uint8 status | ||
|  | ||
| # The header for the image that the generated caption by GPT-4o | ||
| # corresponds to | ||
| std_msgs/Header header | ||
| # The camera intrinsics | ||
| sensor_msgs/CameraInfo camera_info | ||
| # A sentence caption compiling the semantic labels used as a query for | ||
| # GroundingDINO to perform bounding box detections. | ||
| string caption | ||
| --- | ||
| # How much time the action has spent running inference on GPT-4o | ||
| builtin_interfaces/Duration elapsed_time | 
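
To make the `input_labels` to `caption` contract above concrete, here is a minimal sketch of one way a list of labels could be compiled into a single text query. This is not the PR's method: the PR delegates caption generation to GPT-4o, whereas this stand-in simply normalizes, de-duplicates, and joins the labels with periods, which is the separator convention GroundingDINO prompts commonly use. The function name `build_caption` is hypothetical.

```python
from typing import Iterable, List


def build_caption(input_labels: Iterable[str]) -> str:
    """Join labels into one period-separated text query.

    Hypothetical helper: a deterministic stand-in for the GPT-4o step,
    useful for testing the downstream detection code offline.
    """
    seen = set()
    unique: List[str] = []
    for label in input_labels:
        # Lowercase and collapse internal whitespace so that
        # "Grape" and "grape " count as the same label.
        cleaned = " ".join(label.lower().split())
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            unique.append(cleaned)
    # An empty label list yields an empty caption.
    return (" . ".join(unique) + " .") if unique else ""


if __name__ == "__main__":
    print(build_caption(["Grape", "grape ", "carrot  stick"]))
```

A caption built this way could be passed as the `caption` string of the action that runs the vision pipeline, assuming that action accepts any period-separated query.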
  
    
    
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,22 +1,27 @@ | ||
| # The interface for an action that gets an image from the camera and returns | ||
| - # the masks of all segmented items within that image. | ||
| + # the bounding boxes of all items within that image. | ||
|  | ||
| + # The list of input semantic labels for the food items on the plate | ||
| + string caption | ||
| --- | ||
| # Possible return statuses | ||
| uint8 STATUS_SUCCEEDED=0 | ||
| uint8 STATUS_FAILED=1 | ||
| uint8 STATUS_CANCELED=3 | ||
| uint8 STATUS_UNKNOWN=99 | ||
|  | ||
| - # Whether the segmentation succeeded and if not, why | ||
| + # Whether the vision pipeline succeeded and if not, why | ||
| uint8 status | ||
|  | ||
| # The header for the image that the masks corresponds to | ||
| std_msgs/Header header | ||
| # The camera intrinsics | ||
| sensor_msgs/CameraInfo camera_info | ||
| - # Masks of all the detected items in the image | ||
| - ada_feeding_msgs/Mask[] detected_items | ||
| + # Bounding boxes of all the detected items in the image | ||
| + sensor_msgs/RegionOfInterest[] detected_items | ||
| + # A list of semantic labels corresponding to each of the masks of detected | ||
| + # items in the image | ||
| + string[] item_labels | ||
| --- | ||
| - # How much time the action has spent segmenting the food item | ||
| + # How much time the action has spent running the vision pipeline | ||
| builtin_interfaces/Duration elapsed_time | ||

Review comment on lines +4 to +5: Comment seems misleading, I suspect this was an old comment for
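
The result above returns one bounding box per detected item, and the commit list mentions using NMS to remove duplicate GroundingDINO predictions. The sketch below shows greedy non-maximum suppression over boxes shaped like `sensor_msgs/RegionOfInterest` (`x_offset`, `y_offset`, `width`, `height`). It is an illustration under assumptions, not the PR's code: the `Roi` dataclass, the `iou` and `nms` helpers, and the 0.5 threshold are all hypothetical, and the PR may rely on a library routine instead.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Roi:
    """Hypothetical stand-in for the fields of sensor_msgs/RegionOfInterest."""
    x_offset: int
    y_offset: int
    width: int
    height: int


def iou(a: Roi, b: Roi) -> float:
    """Intersection over union of two axis-aligned boxes."""
    x1 = max(a.x_offset, b.x_offset)
    y1 = max(a.y_offset, b.y_offset)
    x2 = min(a.x_offset + a.width, b.x_offset + b.width)
    y2 = min(a.y_offset + a.height, b.y_offset + b.height)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = a.width * a.height + b.width * b.height - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Roi], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that was already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [Roi(0, 0, 10, 10), Roi(1, 1, 10, 10), Roi(50, 50, 10, 10)]
    print(nms(boxes, [0.9, 0.8, 0.7]))
```

The returned indices can be used to filter `detected_items` and `item_labels` together so the two arrays stay aligned.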
  
    
    
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # The interface for an action that gets an image from the camera and a bounding | ||
| # box of the desired item to segment, and then returns the pixel-wise mask | ||
| # of that item | ||
|  | ||
| # The region of interest (bounding box) to seed the segmentation algorithm with | ||
| sensor_msgs/RegionOfInterest region_of_interest | ||
|  | ||
| # The semantic label describing the item bounded by the region of interest | ||
| string label | ||
| --- | ||
| # Possible return statuses | ||
| uint8 STATUS_SUCCEEDED=0 | ||
| uint8 STATUS_FAILED=1 | ||
| uint8 STATUS_CANCELED=3 | ||
| uint8 STATUS_UNKNOWN=99 | ||
|  | ||
| # Whether the segmentation succeeded and if not, why | ||
| uint8 status | ||
|  | ||
| # The header for the image that the masks corresponds to | ||
| std_msgs/Header header | ||
| # The camera intrinsics | ||
| sensor_msgs/CameraInfo camera_info | ||
| # Top contender mask segmented given a bounding box of an item | ||
| ada_feeding_msgs/Mask detected_item | ||
| --- | ||
| # How much time the action has spent segmenting the food item | ||
| builtin_interfaces/Duration elapsed_time | 
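
All three action results share the same `uint8 status` constants (0, 1, 3, 99). As a rough sketch of how client code might interpret them, the enum below mirrors those values and maps any unrecognized code to `UNKNOWN`. The names `ActionStatus` and `parse_status` are hypothetical; generated ROS 2 Python messages expose the constants directly on the result class, so this wrapper is only a convenience for logging and branching.

```python
from enum import IntEnum


class ActionStatus(IntEnum):
    """Mirror of the STATUS_* constants declared in the action results."""
    SUCCEEDED = 0
    FAILED = 1
    CANCELED = 3
    UNKNOWN = 99


def parse_status(code: int) -> ActionStatus:
    """Convert a raw uint8 status into an enum member.

    Codes not declared in the interface fall back to UNKNOWN
    instead of raising.
    """
    try:
        return ActionStatus(code)
    except ValueError:
        return ActionStatus.UNKNOWN


if __name__ == "__main__":
    for raw in (0, 1, 2, 3, 99):
        print(raw, parse_status(raw).name)
```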
  
    
I don't see a .env file added in this PR, but I'm guessing this was more for personal use. I'd recommend omitting this change unless it's relevant for the functionality of the PR.

I noticed more references to an env file later in the code. Where exactly does this come into play?

I'm adding environment variable functionality to our codebase so we can privately store API keys without exposing them publicly on GitHub. In this particular case, it is for accessing the PRL OpenAI API key to invoke GPT-4o.

I'm assuming this may come in handy later on as well if we power perception w/ foundation models in the future.
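
For readers unfamiliar with the pattern discussed in this thread, here is a minimal sketch of reading an API key from a git-ignored `.env` file using only the standard library. It is an assumption-laden illustration, not the PR's code: the helper names `parse_env_text` and `load_env_file` and the variable name `OPENAI_API_KEY` are hypothetical, and the PR may well use a library such as python-dotenv instead.

```python
import os
from pathlib import Path
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blanks, comments, and malformed lines."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path: str = ".env", override: bool = False) -> Dict[str, str]:
    """Load variables from a .env file into os.environ.

    Returns the parsed values; a missing file yields an empty dict so the
    caller can decide how to fail.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return {}
    values = parse_env_text(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value
    return values


if __name__ == "__main__":
    load_env_file()
    # OPENAI_API_KEY is an assumed variable name for illustration.
    if os.environ.get("OPENAI_API_KEY") is None:
        print("OPENAI_API_KEY is not set; add it to a local .env file")
```

Because `.env` is listed in `.gitignore`, the key stays on the developer's machine while the code that reads it can be committed.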