An aggregation of GPU utilization (GPU and VRAM) per cgroup? #344

@gleventhal

From what I can tell, the G (cgroup v2) view and the E (GPU) view are mutually exclusive, as are the G and e views, so as far as I can see there is no aggregation of GPU utilization per cgroup (v2).

As far as I know, cgroup v2 has no control files for GPU, so I think atop would need to perform the calculations/aggregations itself.

I think this could be useful, for instance on a multi-tenant research cluster with many GPUs, to show which jobs are responsible for how much GPU utilization at the cgroup level (assuming a cgroup name is more meaningful to the admins than a PID in some cases).
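
To make the request concrete, here is a rough sketch of the kind of aggregation I mean. It is not atop code. It assumes per-PID GPU figures are already available from somewhere (the `samples` dict and its `(gpu_busy_percent, vram_bytes)` shape are hypothetical), and it maps each PID to its cgroup v2 path by reading the `0::<path>` line in `/proc/<pid>/cgroup`. All function names are made up for illustration.

```python
from collections import defaultdict
from pathlib import Path


def cgroup_v2_path(cgroup_text):
    """Return the cgroup v2 path from the contents of /proc/<pid>/cgroup, or None."""
    for line in cgroup_text.splitlines():
        # The unified (v2) hierarchy entry has the form "0::<path>".
        if line.startswith("0::"):
            return line[3:]
    return None


def read_cgroup(pid):
    """Read the cgroup v2 path of a live process; None if it cannot be read."""
    try:
        return cgroup_v2_path(Path(f"/proc/{pid}/cgroup").read_text())
    except OSError:
        # The process may have exited, or we may lack permission.
        return None


def aggregate_by_cgroup(samples, lookup=read_cgroup):
    """Sum per-PID GPU samples into per-cgroup totals.

    samples: {pid: (gpu_busy_percent, vram_bytes)} (hypothetical input shape).
    lookup:  function mapping a PID to a cgroup path, or None if unknown.
    """
    totals = defaultdict(lambda: [0, 0])
    for pid, (busy, vram) in samples.items():
        cgroup = lookup(pid) or "<unknown>"
        totals[cgroup][0] += busy
        totals[cgroup][1] += vram
    return {cg: tuple(vals) for cg, vals in totals.items()}
```

Called as `aggregate_by_cgroup({1234: (40, 2 * 1024**3)})`, this returns one `(busy, vram)` total per cgroup path. Note that simply summing busy percentages can exceed 100 when several processes share a GPU, so a real implementation would probably need to report per-GPU figures or normalize.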
