Proprietary/SpineNet
RetinaNet object detection model with a SpineNet backbone. SpineNet is an image object detection backbone generated by neural architecture search, proposed by Du et al. (2020) in the paper "SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization." The model is pretrained on a Google internal dataset, and the pretrained checkpoint is loaded as the initial checkpoint. The derived models can be used in commercial products, but the weights can't be exported.
Convolutional neural networks typically encode an input image into a series of intermediate features with decreasing resolutions. While this structure is well suited to classification, it performs poorly on tasks that require simultaneous recognition and localization (e.g., object detection). Encoder-decoder architectures attempt to resolve this by applying a decoder network on top of a backbone designed for classification. Du et al. argue that the encoder-decoder approach is ineffective at generating strong multi-scale features because the backbone itself is scale-decreased. SpineNet is a backbone with scale-permuted intermediate features and cross-scale connections, learned via neural architecture search on an object detection task, as sketched below.
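For intuition, a scale-permuted backbone can be written down as an ordered list of block specifications rather than a monotone stack of stages. The sketch below uses a made-up spec format and made-up values, not the searched SpineNet architecture; the real implementations (e.g., in the TF Model Garden) define much longer lists in a similar shape.

```python
# Illustrative sketch (NOT the searched SpineNet architecture): each spec
# records its feature level, block type, which two earlier blocks feed it
# (cross-scale connections), and whether it is an output feature map.
ILLUSTRATIVE_SPECS = [
    (2, 'bottleneck', (0, 1), False),  # starts at a low feature level
    (4, 'residual',   (0, 1), False),  # jumps two levels up: scale-permuted
    (3, 'bottleneck', (1, 2), False),  # connects across non-adjacent scales
    (5, 'residual',   (2, 3), True),   # contributes an output feature map
]

for level, block, inputs, is_output in ILLUSTRATIVE_SPECS:
    print(f'L{level} {block:<10} <- blocks {inputs} output={is_output}')
```

Because the ordering of levels is itself searchable, the network can revisit high-resolution features late in the computation instead of only decreasing resolution, which is the property the paper credits for stronger multi-scale features.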
Using similar building blocks, SpineNet models outperform ResNet-FPN models by ~3% AP at various scales while using 10-20% fewer FLOPs. In particular, SpineNet-190 achieves 52.5% AP with a Mask R-CNN detector and 52.1% AP with a RetinaNet detector on COCO for a single model without test-time augmentation, significantly outperforming prior state-of-the-art detectors. SpineNet also transfers to classification tasks, achieving a 5% top-1 accuracy improvement on the challenging fine-grained iNaturalist dataset.
This model is implemented in the TPU Object Detection and Segmentation Model Zoo repository on GitHub.
This model can be used in a notebook. Click Open notebook to use the model in Colab.
This model checkpoint was pretrained on Google's proprietary dataset. Training images are resized to a fixed resolution (1024x1024) and normalized across the RGB channels with mean (0.5, 0.5, 0.5) and standard deviation (0.5, 0.5, 0.5). For more details, see the notebook.
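A minimal preprocessing sketch matching that description, assuming a plain (non-aspect-preserving) resize; the notebook defines the exact pipeline, and the file path below is a placeholder.

```python
import numpy as np
from PIL import Image

def preprocess(path: str) -> np.ndarray:
    """Resize to 1024x1024 and normalize RGB with mean 0.5 and std 0.5."""
    image = Image.open(path).convert('RGB').resize((1024, 1024))
    pixels = np.asarray(image, dtype=np.float32) / 255.0  # scale to [0, 1]
    return (pixels - 0.5) / 0.5  # per-channel mean/std 0.5 -> roughly [-1, 1]

# 'example.jpg' is a placeholder; most serving setups expect a batch axis.
batch = preprocess('example.jpg')[np.newaxis, ...]  # shape (1, 1024, 1024, 3)
```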
Given an image, the model outputs axis-aligned bounding boxes and, for each box, a predicted object class (label) with a confidence score.
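As a sketch of post-processing, assuming the served model returns detections in the common boxes/classes/scores layout (the key names below are assumptions, not the model's documented signature; see the notebook for the real one), filtering predictions by confidence might look like this:

```python
import numpy as np

# Hypothetical output layout with assumed key names and box convention.
outputs = {
    'detection_boxes': [[0.10, 0.20, 0.55, 0.60]],  # normalized [ymin, xmin, ymax, xmax]
    'detection_classes': [3],                       # class (label) ids
    'detection_scores': [0.87],                     # confidence scores
}

def keep_confident(outputs: dict, threshold: float = 0.5):
    """Drop detections whose confidence score is below `threshold`."""
    boxes = np.asarray(outputs['detection_boxes'])
    classes = np.asarray(outputs['detection_classes'])
    scores = np.asarray(outputs['detection_scores'])
    keep = scores >= threshold
    return boxes[keep], classes[keep], scores[keep]

boxes, classes, scores = keep_confident(outputs)
```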
Resource ID | Release date | Release stage | Description
---|---|---|---
google-proprietary/retina_sp49s | 2023-06-30 | Public Preview | Image object detection fine-tuning and serving
google-proprietary/retina_sp96 | 2023-06-30 | Public Preview | Image object detection fine-tuning and serving