A webapp interface to SAM2 for labelling batches of image data. The tool runs inside a Docker container to avoid environment dependency issues, and the Dockerfile for building it is included here. SAM2 enables multi-label object masking and tracking across subsequent video frames. The tool supports exporting masks and YOLO-style bounding box annotations.

- Run the container script with the full path of your data folder as the first argument. If you haven't built the container yet, this script will build it automatically, so the first run may take a while. It mounts your input data folder to `/workspace/data` inside the container. Alternatively, you can change the `DEFAULT_LOCAL_DATA_DIR` variable inside the script and use no command line arguments.

```shell
./run_sam_container.sh /path/to/data
```
- Modify the `config/annotator_config.json` file. This file lets you specify the model configuration, object labels, and a default `data_folder` to load. (Coming soon: a regex option for matching image files of a specified pattern to load into the model.) Different data folders may also be loaded from the interface while the webapp is running. NOTE: check how your data is mounted by running `ls /workspace/data` inside the container.
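For orientation, a minimal config might look something like the sketch below. The exact field names here are assumptions based on the description above (model configuration, object labels, and a default `data_folder`); check the file shipped in `config/` for the real schema.

```json
{
    "model": {
        "checkpoint": "checkpoints/sam2_hiera_large.pt",
        "config": "sam2_hiera_l.yaml"
    },
    "labels": ["ball", "player", "referee"],
    "data_folder": "/workspace/data"
}
```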
- Run the Annotator Webapp script:

```shell
python3 scripts/annotator_webapp.py
```
- Open the address listed in the terminal in your web browser and you should see the annotator tool displayed.
- Using the controls at the top of the page, you can specify a new input folder of images to load and the REGEX pattern to match when finding frames. The Find Frames button simply searches for frames and updates the First Frame and Last Frame input boxes based on the frames found. The REGEX pattern must contain a group (bounded by parentheses) that matches the frame number, as this is used for sorting and filtering frames.
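The capture-group requirement can be illustrated with a short sketch. The filenames and pattern below are made up for illustration, not taken from the tool itself:

```python
import re

# Hypothetical frame search regex: the single capture group holds the frame number.
pattern = re.compile(r"frame_(\d+)\.jpg")

filenames = ["frame_10.jpg", "notes.txt", "frame_2.jpg"]

# Keep only matching files, sorted numerically by the captured frame number.
frames = sorted(
    (int(m.group(1)), name)
    for name in filenames
    if (m := pattern.fullmatch(name))
)
print(frames)  # [(2, 'frame_2.jpg'), (10, 'frame_10.jpg')]
```

Without the group, there is nothing to sort on numerically, which is why the pattern is rejected if the parentheses are missing.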
- The Load Frames into SAM2 button does exactly that: using the Frames Path, Frame search regex, First Frame, and Last Frame values, it finds all matching frames within the specified range and loads them into the model for segmenting. If Last Frame is less than First Frame or greater than the highest frame number found, it will simply load all frames numbered First Frame and above.
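The range rule described above might look like this sketch (a hypothetical helper, not the webapp's actual code):

```python
def select_range(frame_numbers, first, last):
    """If `last` is below `first` or above the highest frame found,
    fall back to taking every frame from `first` up."""
    frame_numbers = sorted(frame_numbers)
    if last < first or last > frame_numbers[-1]:
        return [n for n in frame_numbers if n >= first]
    return [n for n in frame_numbers if first <= n <= last]

print(select_range([1, 2, 3, 4, 5], 2, 4))   # [2, 3, 4]
print(select_range([1, 2, 3, 4, 5], 2, 99))  # [2, 3, 4, 5]
```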
- Previous, Next, and Jump (to N) buttons allow you to step through the loaded images.
- Select the object you're currently labelling with the 'Select Label' dropdown.
- Left-Clicking on the image will add a positive prompt for the selected object. The 'negative prompt' checkbox will make it a negative prompt instead.
- You can propagate/track masks across frames using the 'Propagate All From Here' and 'Propagate N From Here' buttons. Neither of these buttons will change any of the masks in previous frames. If you need to make corrections on any frame, you should add the appropriate prompts, then re-propagate from that frame.
- Once you have all the masks you want, use the export buttons to save everything.
- YOLO Annotations - Exports YOLO-style bounding box annotation files. By default this will export them with the object NAME as the label rather than the internal numeric object ID (which is what YOLO expects). TODO: Make this optional.
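For reference, a standard YOLO annotation line is `<class> <x_center> <y_center> <width> <height>` with coordinates normalized to the image size. The helper below is an illustrative sketch of that format, not the tool's actual export code:

```python
def yolo_line(label, x_min, y_min, x_max, y_max, img_w, img_h):
    """Build one YOLO annotation line from a pixel-space bounding box."""
    xc = (x_min + x_max) / 2 / img_w
    yc = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{label} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}"

# As exported here, the label is the object name; standard YOLO tooling
# expects a numeric class ID in that first field instead.
print(yolo_line("ball", 100, 50, 300, 250, 640, 480))
```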
- Binary Masks - This will export black and white masks for each loaded image. By default, the object ID is encoded into the masks such that the mask pixel value is 255-{obj_id}.
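That encoding can be illustrated with a small sketch (pure Python, pixel values made up; background is assumed to be 0):

```python
# Foreground pixels store 255 - obj_id, so a pixel of 254 belongs to
# object 1 and 253 to object 2. Recover the IDs present in a mask:
mask = [
    [0, 254, 254],
    [0, 253, 0],
]

obj_ids = sorted({255 - v for row in mask for v in row if v != 0})
print(obj_ids)  # [1, 2]
```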
- Colored Masks - This will export all images with their colored masks overlaid (same as how each frame is displayed within the interface).