Conversation

amangupta2 (Contributor) commented · #27

Added a new model definition to test ANN performance with 10 hidden layers; for now I have just added an ablation flag to the training file. Would you suggest having a separate file for these ablation studies instead? There is also the possibility of writing a more general neural net definition that takes the number of hidden layers as an argument. I'm not sure I want to work on that right now, but I'd be happy to look into it in the future.

Background:
Current ANNs have 6 hidden layers, which I don't think is too few, but it is good to conduct some sensitivity tests around this.

Having too many layers is also not ideal due to vanishing gradients, so I have selected 10 layers (which I believe is how many Qiang uses).
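
A minimal sketch of what such an ablation flag can look like in the training script (the flag and class names below are illustrative, not necessarily what was added in this PR):

    import argparse

    from model_definition import ANN_CNN, ANN_CNN10

    parser = argparse.ArgumentParser()
    # "--ablation-10layer" is an illustrative flag name, not necessarily the one in this PR
    parser.add_argument("--ablation-10layer", action="store_true",
                        help="train the 10-hidden-layer ANN instead of the default 6-layer one")
    args = parser.parse_args()

    # pick the model class based on the flag
    ModelClass = ANN_CNN10 if args.ablation_10layer else ANN_CNN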

@TomMelt (Collaborator) left a comment

Perhaps one silly question... but what is ablation?

These changes look fine to me. But in general, we should really be refactoring the whole code base.

For example, each of the folders has duplicate files, e.g. function_training.py, model_definition.py, etc. These should be combined into a common set of utils so that we can reduce code duplication.
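
After such a refactor, each training script would import from a single shared package rather than a local copy, something like the following (the package name common is just a placeholder):

    # hypothetical shared package; each experiment folder imports from it instead of
    # keeping its own copy of model_definition.py, function_training.py, etc.
    from common.model_definition import ANN_CNN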

I would also like to introduce docstrings, at least on new additions to the code.

I quite like Google's style (e.g., see here).

Could you please add some docstrings to the new functions in this PR?

Going forwards we can add them to all new changes.
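
As a sketch, a Google-style docstring for a new training helper could look like this (the function name and signature are illustrative, not the actual ones in this PR):

    def train_ann(model, dataloader, nepochs):
        """Trains the ANN emulator for a fixed number of epochs.

        Args:
            model: The torch.nn.Module to train.
            dataloader: Iterable of (input, target) batches.
            nepochs: Number of passes over the training data.

        Returns:
            The trained model.
        """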

As for the code, just a few minor changes.

Comment on lines +416 to +446
self.layer2 = nn.Linear(hdim, hdim)
self.act2 = nn.LeakyReLU()
# self.bnorm2 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------
self.layer3 = nn.Linear(hdim, hdim)
self.act3 = nn.LeakyReLU()
# self.bnorm3 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------
self.layer4 = nn.Linear(hdim, hdim)
self.act4 = nn.LeakyReLU()
# self.bnorm4 = nn.BatchNorm1d(2 * hdim)
# --------------------------------------------------------
self.layer5 = nn.Linear(hdim, hdim)
self.act5 = nn.LeakyReLU()
# self.bnorm5 = nn.BatchNorm1d(hdim)
# -------------------------------------------------------

self.layer6 = nn.Linear(hdim, hdim)
self.act6 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer7 = nn.Linear(hdim, hdim)
self.act7 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer8 = nn.Linear(hdim, hdim)
self.act8 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer9 = nn.Linear(hdim, hdim)
self.act9 = nn.LeakyReLU()
# -------------------------------------------------------
self.layer10 = nn.Linear(hdim, 2 * odim)
self.act10 = nn.LeakyReLU()

We should gather this into a list and loop over the layers and activation functions

see here for example:

https://github.com/DataWaveProject/CAM_GW_pytorch_emulator/blob/fd780f5b23fbdbe83d0bf7b33f3ff5c3e216fede/newCAM_emulation/Model.py#L61-L68

        layers = []
        input_size = in_ver * ilev + in_nover  
        for _ in range(hidden_layers):
            layers.append(nn.Linear(input_size, hidden_size, dtype=torch.float64))
            layers.append(nn.SiLU())
            input_size = hidden_size
        layers.append(nn.Linear(hidden_size, out_ver * ilev, dtype=torch.float64))
        self.linear_stack = nn.Sequential(*layers)

Comment on lines +458 to +467
x = self.dropout(self.act1(self.layer1(x)))
x = self.dropout(self.act2(self.layer2(x)))
x = self.dropout(self.act3(self.layer3(x)))
x = self.dropout(self.act4(self.layer4(x)))
x = self.dropout(self.act5(self.layer5(x)))
x = self.dropout(self.act6(self.layer6(x)))
x = self.dropout(self.act7(self.layer7(x)))
x = self.dropout(self.act8(self.layer8(x)))
x = self.dropout(self.act9(self.layer9(x)))
x = self.dropout(self.act10(self.layer10(x)))

similarly here:

We should gather this into a list and loop over the layers and activation functions
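
A rough sketch of the looped version, assuming the first layer maps idim to hdim (that layer is not visible in the excerpt above) and using illustrative names rather than the actual ANN_CNN implementation:

    import torch.nn as nn

    class TenLayerANN(nn.Module):
        """Sketch only: a looped equivalent of the stacked layers above."""

        def __init__(self, idim, hdim, odim, n_hidden=10, dropout=0.1):
            super().__init__()
            # first layer maps idim -> hdim (assumed), middle layers map hdim -> hdim,
            # and the last layer maps hdim -> 2 * odim, as in the class under review
            dims = [idim] + [hdim] * (n_hidden - 1) + [2 * odim]
            self.layers = nn.ModuleList(
                nn.Linear(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
            )
            self.act = nn.LeakyReLU()
            self.dropout = nn.Dropout(p=dropout)

        def forward(self, x):
            # activation and dropout applied after every layer, matching the forward pass above
            for layer in self.layers:
                x = self.dropout(self.act(layer(x)))
            return x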


from dataloader_definition import Dataset_ANN_CNN
from model_definition import ANN_CNN
from model_definition import ANN_CNN, ANN_CNN10

As part of the refactor we should perhaps make the number of layers a parameter rather than adding a new class?
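
For illustration, the call site could then look something like this (the n_hidden_layers keyword and the placeholder sizes are hypothetical; the actual constructor arguments may differ):

    from model_definition import ANN_CNN

    idim, hdim, odim = 128, 256, 64   # placeholder sizes

    model_default  = ANN_CNN(idim, odim, hdim, n_hidden_layers=6)   # current default
    model_ablation = ANN_CNN(idim, odim, hdim, n_hidden_layers=10)  # the ablation in this PR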

@TomMelt TomMelt added the enhancement New feature or request label Dec 17, 2024
@amangupta2 amangupta2 mentioned this pull request Jan 23, 2025