Dataset Index not included as DataFrame column in ._to_dataframe() when name different from dimension name #10851

@stijnvanhoey

What happened?

The documentation for .to_dataframe states: "Other coordinates are included as columns in the DataFrame."

When the function is applied to a Dataset whose index has a name different from that of its dimension, the indexed coordinate is not included in the resulting pandas DataFrame. For example:

import xarray as xr
import pandas as pd
import numpy as np

ds_temp = xr.Dataset(
    data_vars=dict(temp=(["time", "pos"], np.array([[5, 10, 15, 20, 25]]))),
    coords=dict(
        pf=("pos", [1., 2., 4.2, 8., 10.]),
        time=("time", [pd.to_datetime("2025-01-01")]),
    ),
).set_xindex("pf")

The example Dataset looks like

<xarray.Dataset> Size: 88B
Dimensions:  (time: 1, pos: 5)
Coordinates:
  * time     (time) datetime64[ns] 8B 2025-01-01
  * pf       (pos) float64 40B 1.0 2.0 4.2 8.0 10.0
Dimensions without coordinates: pos
Data variables:
    temp     (time, pos) int64 40B 5 10 15 20 25

Converting the Dataset to a Pandas DataFrame:

ds_temp.to_dataframe()

The returned DataFrame is missing the pf coordinate:

                temp
time       pos      
2025-01-01 0       5
           1      10
           2      15
           3      20
           4      25

Dropping the index before calling to_dataframe does include the coordinate in the DataFrame:

>>> ds_temp.drop_indexes("pf").to_dataframe()
                temp    pf
time       pos            
2025-01-01 0       5   1.0
           1      10   2.0
           2      15   4.2
           3      20   8.0
           4      25  10.0
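
Until this is resolved, the drop_indexes workaround above could be wrapped in a small helper. This is only a sketch under assumptions: the name to_dataframe_with_indexed_coords is hypothetical and not part of xarray, and it has only been thought through for the simple case of this example (one single-coordinate index whose name differs from its dimension).

```python
import numpy as np
import pandas as pd
import xarray as xr


def to_dataframe_with_indexed_coords(ds: xr.Dataset) -> pd.DataFrame:
    """Hypothetical helper (not an xarray API): keep indexed coords as columns."""
    df = ds.to_dataframe()
    # Indexed coordinates that appear neither as a column nor as an index level.
    missing = [
        name
        for name in ds.xindexes
        if name not in df.columns and name not in df.index.names
    ]
    if missing:
        # Same workaround as in the report: drop those indexes, then convert again.
        df = ds.drop_indexes(missing).to_dataframe()
    return df


ds_temp = xr.Dataset(
    data_vars=dict(temp=(["time", "pos"], np.array([[5, 10, 15, 20, 25]]))),
    coords=dict(
        pf=("pos", [1.0, 2.0, 4.2, 8.0, 10.0]),
        time=("time", [pd.to_datetime("2025-01-01")]),
    ),
).set_xindex("pf")

df = to_dataframe_with_indexed_coords(ds_temp)
print(df)
```

On versions where to_dataframe already includes pf, the list of missing names is empty and the helper returns the plain to_dataframe result unchanged.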

This behavior changed between recent releases: in version 2025.1.2 the column was still included. I assume the change results from the added support for ExtensionArray.

What did you expect to happen?

An index whose name differs from its dimension name should also be included in the resulting DataFrame; in the example, pf should appear as a column in the final DataFrame.

Minimal Complete Verifiable Example

import xarray as xr
import pandas as pd
import numpy as np
xr.show_versions()

ds_temp = xr.Dataset(
    data_vars=dict(temp=(["time", "pos"], np.array([[5, 10, 15, 20, 25]]))),
    coords=dict(
        pf=("pos", [1., 2., 4.2, 8., 10.]),
        time=("time", [pd.to_datetime("2025-01-01")]),
    ),
).set_xindex("pf")
df = ds_temp.to_dataframe()
assert "pf" in df.columns

Steps to reproduce

Run the example above. The assertion fails because the resulting DataFrame lacks pf as a column.

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

Anything else we need to know?

No response

Environment

INSTALLED VERSIONS

commit: None
python: 3.11.2 (main, Nov 30 2024, 21:22:50) [GCC 12.2.0]
python-bits: 64
OS: Linux
OS-release: 6.1.0-37-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.2
libnetcdf: 4.9.4-development
xarray: 2025.10.1
pandas: 2.3.3
numpy: 2.2.3
scipy: 1.15.2
netCDF4: 1.7.2
pydap: None
h5netcdf: None
h5py: None
zarr: 3.0.4
cftime: 1.6.4.post1
nc_time_axis: None
iris: None
bottleneck: None
dask: 2025.2.0
distributed: None
matplotlib: None
cartopy: None
seaborn: None
numbagg: None
fsspec: 2025.2.0
cupy: None
pint: None
sparse: None
flox: None
numpy_groupies: None
setuptools: 75.8.0
pip: 25.0.1
conda: None
pytest: 8.3.5
mypy: 1.15.0
IPython: 9.0.1
sphinx: 8.2.3

Labels: bug, needs triage
