Ordering of Coords in Repr #11039

@ianhi

What is your issue?

Problem

The repr currently places dimension coordinates first, which makes a lot of sense. But it has a negative side effect for visualizing non-dimension coords that share a dimension with a dim coord, e.g. here the word coord and the other coords sharing the word dim are disconnected:

[Image: repr in which the word coord and the other coords on the word dim are separated]

Or a simple example:

import numpy as np
import xarray as xr

ds = xr.Dataset(
    {"data": (("a", "b", "c"), np.random.rand(3, 3, 3))},
    coords={"a": [0, 1, 2], "b": [0, 1, 2], "c": [3, 2, 1]},
)

ds = ds.assign_coords({"a_1": ("a", [5, 6, 7])})
ds = ds.assign_coords({"b_1": ("b", [5, 6, 7])})
ds = ds.assign_coords({"c_1": ("c", [5, 6, 7])})
print(ds)
<xarray.Dataset> Size: 360B
Dimensions:  (a: 3, b: 3, c: 3)
Coordinates:
  * a        (a) int64 24B 0 1 2
  * b        (b) int64 24B 0 1 2
  * c        (c) int64 24B 3 2 1
    a_1      (a) int64 24B 5 6 7
    b_1      (b) int64 24B 5 6 7
    c_1      (c) int64 24B 5 6 7
Data variables:
    data     (a, b, c) float64 216B 0.2621 0.7528 0.7416 ... 0.3515 0.1 0.2662

Solution

The order of coords in the repr is currently defined here:

dim_ordered_coords = sorted(
    coords.items(), key=lambda x: dims.index(x[0]) if x[0] in dims else len(dims)
)
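To see why this key leaves related coords apart, here is a minimal sketch that applies the same key to plain Python data. The `dims` list and `coords` dict are stand-ins I made up for illustration (not xarray internals); the current key only looks at the coord names, so the values do not matter here.

```python
# Hypothetical stand-ins: dataset dims, and coords mapped to each coord's dims.
dims = ["a", "b", "c"]
coords = {
    "a": ("a",), "b": ("b",), "c": ("c",),
    "a_1": ("a",), "b_1": ("b",), "c_1": ("c",),
}

# Same key as the snippet above: dimension coords sort by their position in
# dims; every other coord gets len(dims) and therefore lands at the end.
dim_ordered_coords = sorted(
    coords.items(), key=lambda x: dims.index(x[0]) if x[0] in dims else len(dims)
)
current_order = [name for name, _ in dim_ordered_coords]
print(current_order)  # ['a', 'b', 'c', 'a_1', 'b_1', 'c_1']
```

Because `sorted` is stable and all non-dimension coords share the same key, they simply keep their insertion order after the dimension coords.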

We could (possibly optionally) change the sort key to something like this:

def _coord_sort_key(name, var, dims):
    """Sort key for coordinate ordering.

    Orders by:
    1. Primary: index of the first matching dimension in dataset dims
    2. Secondary: dimension coordinates (name == dim) come before non-dimension coordinates

    This groups non-dimension coordinates right after their associated dimension
    coordinate.
    """
    # Dimension coordinates sort by their position in dims, come first (0)
    if name in dims:
        return (dims.index(name), 0)

    # Non-dimension coordinates sort by their first dim, come second (1)
    for d in var.dims:
        if d in dims:
            return (dims.index(d), 1)

    # Scalar coords or coords with dims not in dataset dims go at end
    return (len(dims), 1)

which gives

<xarray.Dataset> Size: 360B
Dimensions:  (a: 3, b: 3, c: 3)
Coordinates:
  * a        (a) int64 24B 0 1 2
    a_1      (a) int64 24B 5 6 7
  * b        (b) int64 24B 0 1 2
    b_1      (b) int64 24B 5 6 7
  * c        (c) int64 24B 3 2 1
    c_1      (c) int64 24B 5 6 7
Data variables:
    data     (a, b, c) float64 216B 0.0534 0.7124 0.705 ... 0.5102 0.563 0.1605

Or we could potentially go even further and indent the non-dimension coords.
