@@ -39,7 +39,7 @@
"In the literature, the explainable variance is also known as the *signal\n",
"power*. \n",
"\n",
"For more information, see {cite:t}`Sahani2002`, {cite:t}`Hsu2004`, and {cite:t}`Schoppe2016`."
"For more information, see {cite}`Sahani2002,Hsu2004,Schoppe2016`."
]
},
{
85 changes: 52 additions & 33 deletions tutorials/notebooks/shortclips/05_fit_wordnet_model.ipynb
@@ -104,6 +104,36 @@
"print(\"(n_repeats, n_samples_test, n_voxels) =\", Y_test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before fitting an encoding model, the fMRI responses are typically z-scored over time. This normalization step is performed for two reasons.\n",
"First, the regularized regression methods used to estimate encoding models generally assume the data to be normalized {cite}`Hastie2009`. \n",
"Second, the temporal mean and standard deviation of a voxel are typically considered uninformative in fMRI because they can vary due to factors unrelated to the task, such as differences in signal-to-noise ratio (SNR).\n",
"\n",
"To preserve each run independent from the others, we z-score each run separately."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from scipy.stats import zscore\n",
"\n",
"# indice of first sample of each run\n",
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
"print(run_onsets)\n",
"\n",
"# zscore each training run separately\n",
"Y_train = np.split(Y_train, run_onsets[1:])\n",
"Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
"# zscore each test run separately\n",
"Y_test = zscore(Y_test, axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -125,6 +155,9 @@
"outputs": [],
"source": [
"Y_test = Y_test.mean(0)\n",
"# We need to zscore the test data again, because we took the mean across repetitions.\n",
"# This averaging step makes the standard deviation approximately equal to 1/sqrt(n_repeats)\n",
"Y_test = zscore(Y_test, axis=0)\n",
"\n",
"print(\"(n_samples_test, n_voxels) =\", Y_test.shape)"
]
@@ -192,7 +225,8 @@
"following time sample in the validation set. Thus, we define here a\n",
"leave-one-run-out cross-validation split that keeps each recording run\n",
"intact.\n",
"\n"
"\n",
"We define a cross-validation splitter, compatible with ``scikit-learn`` API."
]
},
{
@@ -206,27 +240,6 @@
"from sklearn.model_selection import check_cv\n",
"from voxelwise_tutorials.utils import generate_leave_one_run_out\n",
"\n",
"# indice of first sample of each run\n",
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
"print(run_onsets)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We define a cross-validation splitter, compatible with ``scikit-learn`` API.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"n_samples_train = X_train.shape[0]\n",
"cv = generate_leave_one_run_out(n_samples_train, run_onsets)\n",
"cv = check_cv(cv) # copy the cross-validation splitter into a reusable list"
@@ -240,19 +253,24 @@
"\n",
"Now, let's define the model pipeline.\n",
"\n",
"With regularized linear regression models, it is generally recommended to normalize \n",
"(z-score) both the responses and the features before fitting the model {cite}`Hastie2009`. \n",
"Z-scoring corresponds to removing the temporal mean and dividing by the temporal standard deviation.\n",
"We already z-scored the fMRI responses after loading them, so now we need to specify\n",
"in the model how to deal with the features. \n",
"\n",
"We first center the features, since we will not use an intercept. The mean\n",
"value in fMRI recording is non-informative, so each run is detrended and\n",
"demeaned independently, and we do not need to predict an intercept value in\n",
"the linear model.\n",
"\n",
"However, we prefer to avoid normalizing by the standard deviation of each\n",
"feature. If the features are extracted in a consistent way from the stimulus,\n",
"For this particular dataset and example, we do not normalize by the standard deviation \n",
"of each feature. If the features are extracted in a consistent way from the stimulus,\n",
"their relative scale is meaningful. Normalizing them independently from each\n",
"other would remove this information. Moreover, the wordnet features are\n",
"one-hot-encoded, which means that each feature is either present (1) or not\n",
"present (0) in each sample. Normalizing one-hot-encoded features is not\n",
"recommended, since it would scale disproportionately the infrequent features.\n",
"\n"
"recommended, since it would scale disproportionately the infrequent features."
]
},
{
@@ -778,7 +796,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Similarly to [1]_, we correct the coefficients of features linked by a\n",
"Similarly to {cite:t}`huth2012`, we correct the coefficients of features linked by a\n",
"semantic relationship. When building the wordnet features, if a frame was\n",
"labeled with `wolf`, the authors automatically added the semantically linked\n",
"categories `canine`, `carnivore`, `placental mammal`, `mamma`, `vertebrate`,\n",
@@ -954,10 +972,11 @@
"voxel_colors = scale_to_rgb_cube(average_coef_transformed[1:4].T, clip=3).T\n",
"print(\"(n_channels, n_voxels) =\", voxel_colors.shape)\n",
"\n",
"ax = plot_3d_flatmap_from_mapper(voxel_colors[0], voxel_colors[1],\n",
" voxel_colors[2], mapper_file=mapper_file,\n",
" vmin=0, vmax=1, vmin2=0, vmax2=1, vmin3=0,\n",
" vmax3=1)\n",
"ax = plot_3d_flatmap_from_mapper(\n",
" voxel_colors[0], voxel_colors[1], voxel_colors[2], \n",
" mapper_file=mapper_file, \n",
" vmin=0, vmax=1, vmin2=0, vmax2=1, vmin3=0, vmax3=1\n",
")\n",
"plt.show()"
]
},
@@ -984,7 +1003,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "voxelwise_tutorials",
"language": "python",
"name": "python3"
},
@@ -998,7 +1017,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.12"
"version": "3.10.13"
}
},
"nbformat": 4,
Original file line number Diff line number Diff line change
@@ -69,8 +69,7 @@
"source": [
"## Load the data\n",
"\n",
"We first load the fMRI responses.\n",
"\n"
"We first load and normalize the fMRI responses."
]
},
{
@@ -83,23 +82,32 @@
"source": [
"import os\n",
"import numpy as np\n",
"from scipy.stats import zscore\n",
"from voxelwise_tutorials.io import load_hdf5_array\n",
"\n",
"file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
"Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
"Y_test = load_hdf5_array(file_name, key=\"Y_test\")\n",
"\n",
"print(\"(n_samples_train, n_voxels) =\", Y_train.shape)\n",
"print(\"(n_repeats, n_samples_test, n_voxels) =\", Y_test.shape)"
"print(\"(n_repeats, n_samples_test, n_voxels) =\", Y_test.shape)\n",
"\n",
"# indice of first sample of each run\n",
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
"\n",
"# zscore each training run separately\n",
"Y_train = np.split(Y_train, run_onsets[1:])\n",
"Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
"# zscore each test run separately\n",
"Y_test = zscore(Y_test, axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We average the test repeats, to remove the non-repeatable part of fMRI\n",
"responses.\n",
"\n"
"responses, and normalize the average across repeats."
]
},
{
@@ -111,6 +119,7 @@
"outputs": [],
"source": [
"Y_test = Y_test.mean(0)\n",
"Y_test = zscore(Y_test, axis=0)\n",
"\n",
"print(\"(n_samples_test, n_voxels) =\", Y_test.shape)"
]
@@ -169,7 +178,8 @@
"\n",
"We define the same leave-one-run-out cross-validation split as in the\n",
"previous example.\n",
"\n"
"\n",
"We define a cross-validation splitter, compatible with ``scikit-learn`` API."
]
},
{
@@ -183,27 +193,6 @@
"from sklearn.model_selection import check_cv\n",
"from voxelwise_tutorials.utils import generate_leave_one_run_out\n",
"\n",
"# indice of first sample of each run\n",
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
"print(run_onsets)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We define a cross-validation splitter, compatible with ``scikit-learn`` API.\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"n_samples_train = X_train.shape[0]\n",
"cv = generate_leave_one_run_out(n_samples_train, run_onsets)\n",
"cv = check_cv(cv) # copy the cross-validation splitter into a reusable list"
@@ -571,7 +560,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "voxelwise_tutorials",
"language": "python",
"name": "python3"
},
@@ -585,7 +574,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.12"
"version": "3.10.13"
}
},
"nbformat": 4,
22 changes: 16 additions & 6 deletions tutorials/notebooks/shortclips/08_fit_motion_energy_model.ipynb
@@ -75,7 +75,7 @@
"source": [
"## Load the data\n",
"\n",
"We first load the fMRI responses.\n",
"We first load and normalize the fMRI responses.\n",
"\n"
]
},
@@ -89,23 +89,32 @@
"source": [
"import os\n",
"import numpy as np\n",
"from scipy.stats import zscore\n",
"from voxelwise_tutorials.io import load_hdf5_array\n",
"\n",
"file_name = os.path.join(directory, \"responses\", f\"{subject}_responses.hdf\")\n",
"Y_train = load_hdf5_array(file_name, key=\"Y_train\")\n",
"Y_test = load_hdf5_array(file_name, key=\"Y_test\")\n",
"\n",
"print(\"(n_samples_train, n_voxels) =\", Y_train.shape)\n",
"print(\"(n_repeats, n_samples_test, n_voxels) =\", Y_test.shape)"
"print(\"(n_repeats, n_samples_test, n_voxels) =\", Y_test.shape)\n",
"\n",
"# indice of first sample of each run\n",
"run_onsets = load_hdf5_array(file_name, key=\"run_onsets\")\n",
"\n",
"# zscore each training run separately\n",
"Y_train = np.split(Y_train, run_onsets[1:])\n",
"Y_train = np.concatenate([zscore(run, axis=0) for run in Y_train], axis=0)\n",
"# zscore each test run separately\n",
"Y_test = zscore(Y_test, axis=1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We average the test repeats, to remove the non-repeatable part of fMRI\n",
"responses.\n",
"\n"
"responses, and normalize the average across repeats."
]
},
{
@@ -117,6 +126,7 @@
"outputs": [],
"source": [
"Y_test = Y_test.mean(0)\n",
"Y_test = zscore(Y_test, axis=0)\n",
"\n",
"print(\"(n_samples_test, n_voxels) =\", Y_test.shape)"
]
@@ -496,7 +506,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "voxelwise_tutorials",
"language": "python",
"name": "python3"
},
@@ -510,7 +520,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.12"
"version": "3.10.13"
}
},
"nbformat": 4,