当前位置:网站首页>Detectron2 using custom datasets

Detectron2 using custom datasets

2022-04-23 21:01:00 Top of the program

This document explains the data set API(DatasetCatalog、MetadataCatalog) How to work , And how to use them to add custom datasets .

If you want to use a custom dataset , Reuse at the same time detectron2 Data loader (data loaders), You need :

  1. Register your dataset ( namely , tell detectron2 How to get your dataset ).
  2. ( Optional ) Register metadata for your dataset .

In order to make detectron2 Know how to get a file named “my_dataset” Data set of , The user needs to implement a function , This function is used to return the dictionary list of the dataset mentioned below , Then tell detectron2 This function :

def my_dataset_function():
  ...
  return list[dict] in the following format

from detectron2.data import DatasetCatalog
DatasetCatalog.register("my_dataset", my_dataset_function)
# later, to access the data:
data: List[Dict] = DatasetCatalog.get("my_dataset")

ad locum , The code snippet will be named “my_dataset” The data set of is associated with the function that returns the data . If you call more than once , The function must return the same data ( In the same order ). Registration is always valid , Until the process exits .
This function can do anything , And should return list[dict] Data in , Every dict All in one of the following formats :

  • Detectron2 Standard dataset Dictionary , As follows . This will make it similar to detectron2 Many of the other built-in features in , Therefore, it is recommended to use it when sufficient .
  • Any custom format . You can also return any... In your own format dicts, For example, add additional keys for new tasks . then , You also need to handle them correctly downstream . See more details below .
    Standard dataset dictionary
    For standard tasks ( Instance detection 、 example / semantics / Panoramic segmentation 、 Key point detection ), We load the original dataset into the dictionary list ( list[dict]) in , Its specification is similar to COCO The annotation .
    Every dict Contains information about an image . dict May have the following fields , The required fields vary depending on the needs of the data loader or task ( See below )
TaskFields
Commonfile_name, height, width, image_id
Instance detection/segmentationannotations
Semantic segmentationsem_seg_file_name
Panoptic segmentationpan_seg_file_name, segments_info
  • file_name: The full path of the image file .
  • height, width: Integers , The shape of the image .
  • image_id(str or int): A unique... That identifies this image id.
  • annotations (list[dict]): Instance detection / Required for segmentation or key detection tasks . Every dict The corresponding annotation of an instance in this image , And may include the following keys:

 1. bbox (list[float], required):  Represents the instance bounding box  4  A list of numbers .
 2. bbox_mode (int, required): bbox  The format of . It has to be structures.BoxMode Members of . At present, we support :BoxMode.XYXY_ABS、BoxMode.XYWH_ABS.
 3. category_id (int, required)[0, num_categories-1]  Range of integers , Indicates a category label . If applicable , Reserved values  num_categories  To indicate that “ background ” Category .
 4. segmentation (list[list[float]] or dict): Split mask of the instance .
 		a.  If it is  list[list[float]], It represents a list of polygons , One for each connected component of the object . Every  list[float]  Is a simple polygon , The format is  [x1, y1, ..., xn, yn] (n≥3). Xs  and  Ys  Is the absolute coordinate in pixels .		
		b.  If it is  dict, said  COCO  Compress  RLE  Per pixel segmentation mask in format . dict  There should be a key “size” and “counts”. You can  pycocotools.mask.encode(np.asarray(mask, order="F"))  take  0s  and  1s  Of  uint8  The segmentation mask is converted to such  dict. If you use the default data loader in this format ,cfg.INPUT.MASK_FORMAT  Must be set to bitmask .		
 5. keypoints (list[float]): The format is  [x1, y1, v1,..., xn, yn, vn]. v[i]  Indicates the visibility of the key . n  Must be equal to the number of key categories . Xs  and  Ys  yes  [0, W  or  H]  Absolute real value coordinates within the range .		
		( Be careful ,COCO  The key coordinates of the format are  [0, W-1  or  H-1]  Range of integers , This is different from our standard format .Detectron2  take  COCO  Key coordinates plus  0.5  To convert them from discrete pixel indexes to floating point coordinates .)		
 6. iscrowd:0( Default ) or 1. Whether this instance is marked as COCO Of “ Crowd area ”. If you don't know what it means , Please do not include this field .

If annotations It's an empty list , It means that the image is marked as having no object . By default , Such images will be deleted from the training , But you can use DATALOADER.FILTER_EMPTY_ANNOTATIONS Included .

  • sem_seg_file_name (str): Semantic segmentation ground truth The full path to the file . It should be a grayscale image , Its pixel value is an integer label .
  • pan_seg_file_name (str): Panoramic segmentation ground truth The full path to the file . It should be a RGB Images , Its pixel value is using panopticapi.utils.id2rgb Function encoded integer id. id from segments_info Definition . If id
    Not in segments_info in , Then the pixel is considered unmarked , And often overlooked in training and evaluation .
  • Segments_info(list[dict]): Define panoramic segmentation ground truth Each of them id The meaning of . Every dict All have the following keys :

 1. id (int): Appear in the ground truth Integer in image .
 2. category_id(int)[0, num_categories-1]  Range of integers , Indicates a category label .
 3. iscrowd:0( Default ) or 1. Whether this instance is marked as COCO Of “ Crowd area ”.

版权声明
本文为[Top of the program]所创,转载请带上原文链接,感谢
https://yzsam.com/2022/111/202204210545090239.html