ChatPDF can be more "artificial stupidity" than artificial intelligence, but used well it is a very good assistant.
Spatial Consistency Loss for Training Multi-Label Classifiers from Single-Label Annotations
I could not work out what H^t_n means, so when I asked about it I referred to it as "big H".
Learning Deep Features for Discriminative Localization
(1) I asked it: explain the big F and big H, and what is the score heatmap ?
It answered:
In the context of the article “Learning Deep Features for Discriminative Localization,” the terms “big F” and “big H” are used to represent important concepts related to class activation mapping and discriminative localization.
Big F:
Big H (Class Activation Map):
Score Heatmap:
In summary, big F represents the result of global average pooling on the feature maps, big H (class activation map) highlights discriminative regions, and the score heatmap visually represents the activated regions on the input image for a specific class. These concepts are fundamental to the proposed approach for discriminative localization using convolutional neural networks with global average pooling.
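(I tidied up the formulas above a little; they did not display this nicely originally.)
Excellent, this solved my problem. It mostly comes down to how you phrase the question.
I then asked it: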
(2) what the shape of small f, tell me something like (1, 2048, 7, 7), please
The shape of the small f, representing the activation of a unit in the last convolutional layer at a spatial location, is (k, 7, 7), where k is the number of units in the last convolutional layer. This indicates that for each unit, the activation is represented by a 7x7 feature map.
(3) why the k of shape (k, 7, 7) is the number of units in the last convolutional layer ?
The shape of small f, representing the activation of a unit in the last convolutional layer at a spatial location, is (k, 7, 7), where k is the number of units in the last convolutional layer.
The reason for this is that the last convolutional layer of a convolutional neural network typically consists of multiple units, each of which is responsible for detecting a specific feature or pattern in the input image. These units are arranged in a 3D grid, with each unit having a 2D feature map representing its activations at different spatial locations in the input image.
In the case of the proposed approach for discriminative localization using global average pooling, the activations of these units are used to compute the class activation maps, which indicate the importance of the activation at spatial locations for the classification of an image to a specific class. Therefore, the number of units in the last convolutional layer determines the number of feature maps in the activation tensor, resulting in the shape of (k, 7, 7) for small f.
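To make the shapes in these answers concrete, here is a minimal NumPy sketch of class activation mapping. It is my own illustration, not code from the paper or from ChatPDF. It assumes k = 2048 and a 7x7 grid (the numbers from my prompt, typical of a ResNet-50-style backbone on a 224x224 input), a hypothetical 5-class problem, and random values in place of real activations and weights. It computes big F as the spatial sum of each unit's map, which equals average pooling up to the constant factor 1/(7*7).

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: k units in the last conv layer, each with a 7x7 feature map.
k, height, width = 2048, 7, 7
num_classes = 5  # hypothetical number of classes, for illustration only

# small f: activation of every unit at every spatial location -> (k, 7, 7)
f = rng.standard_normal((k, height, width))

# Weights of the linear layer that follows pooling -> (num_classes, k)
class_weights = rng.standard_normal((num_classes, k))

# big F: one number per unit, pooled over the spatial grid -> (k,)
# (sum over positions; average pooling differs only by the factor 1/(7*7))
F = f.sum(axis=(1, 2))

# Class scores (bias ignored): weighted sum of the pooled features -> (num_classes,)
scores = class_weights @ F

# Class activation maps: the same weighted sum, but taken before pooling,
# so the spatial grid is kept -> (num_classes, 7, 7)
cam = np.einsum("ck,khw->chw", class_weights, f)

print(f.shape, F.shape, scores.shape, cam.shape)

# Summing a class's map over all locations recovers that class's score.
print(np.allclose(cam.sum(axis=(1, 2)), scores))
```

With a batch dimension in front, small f would have the shape I asked about, (1, 2048, 7, 7). The last check shows why the map can be read as a score heatmap: each location's value is its contribution to the class score.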
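Then there are supplementary files like Spatial_Consistency_Loss_WACV_2023_supplemental.pdf. "Supplemental" is sometimes abbreviated in the file name. Sometimes the appendix is placed directly after the main paper, and sometimes the version you downloaded has no appendix even though the paper refers to one, so you have to track it down yourself.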