about network structure and loss function

Hello, thank you for sharing! @youngerous 

I have some questions:

1. In the paper, the MMD loss is applied to the global feature output by the backbone, before the FC (linear) layer, which is labeled "classifier" in [Figure 2](https://github.com/youngerous/ddgsd-pytorch/blob/main/img/architecture.png). However, in your [implementation (L143 ~ L159)](https://github.com/youngerous/ddgsd-pytorch/blob/main/src/trainer.py), all losses are computed on the output of the FC layer. Why is that? Does it make no difference?

2. The paper uses an MMD loss, but your implementation uses an MSE loss. Are they the same thing? Have you compared them experimentally?
 
3. Based on your implementation, the predictor in the figure is not a network module but only a softmax transformation. Is that right?

4. In Table 3 of the paper, the CIFAR-100 top-1 errors of the ResNet18 baseline and DDGSD are 23.45 and 21.47, but your reproduced results are 30.15 and 26.60.
 - In your opinion, what has caused this large gap?
 - Have you ever reproduced results closer to the paper's with your implementation? Could you share them?
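
To make question 2 concrete, here is a minimal sketch of how I understand the difference between the two losses. It assumes MMD with a linear kernel, i.e. the squared distance between batch-mean features. `mean_vector`, `linear_mmd`, and `mse` are hypothetical helpers written in plain Python for illustration; they are not code from the repository or the paper.

```python
# Hypothetical helpers for illustration only (not from the repo or the paper).

def mean_vector(batch):
    """Column-wise mean of a batch given as a list of equal-length feature lists."""
    n = len(batch)
    return [sum(col) / n for col in zip(*batch)]


def linear_mmd(feats_a, feats_b):
    """Squared distance between the two batch means (assumed linear-kernel MMD)."""
    mu_a, mu_b = mean_vector(feats_a), mean_vector(feats_b)
    return sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))


def mse(out_a, out_b):
    """Mean squared error over every sample and every dimension."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(out_a, out_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


# Two toy batches with identical batch means but different per-sample values.
batch_a = [[1.0, 0.0], [0.0, 1.0]]
batch_b = [[0.0, 1.0], [1.0, 0.0]]

print(linear_mmd(batch_a, batch_b))  # compares batch-level statistics
print(mse(batch_a, batch_b))         # compares sample by sample
```

On these toy batches the batch means match, so the linear-kernel MMD vanishes while the MSE does not. If my assumption about the MMD form is right, the two losses would not be interchangeable in general.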

Sorry to bother you with so many questions. I look forward to your reply. Thank you!
