@GuillaumeGomez
Member

cc @antoyo

r? ghost

bjorn3 and others added 30 commits August 24, 2025 11:20
As opposed to passing it around through Result.
…lsewhere

A lot of places had special handling just in case they would get an
allocator module even though most of these places could never get one or
would have a trivial implementation for the allocator module. Moving all
handling of the allocator module to a single place simplifies things a
fair bit.
It is always false nowadays. ThinLTO summary writing is instead done by
llvm_optimize.
Misc LTO cleanups

Follow up to rust-lang#145955.

* Remove want_summary argument from `prepare_thin`.
   Since rust-lang#133250 ThinLTO summary writing is instead done by `llvm_optimize`.
* Two minor cleanups
We need a different attribute than `rustc_align` because unstable attributes are
tied to their feature (we can't have two unstable features use the same
unstable attribute). Otherwise this uses all of the same infrastructure
as `#[rustc_align]`.
…lmann,ralfjung,traviscross

Implement `#[rustc_align_static(N)]` on `static`s

Tracking issue: rust-lang#146177

```rust
#![feature(static_align)]

#[rustc_align_static(64)]
static SO_ALIGNED: u64 = 0;
```

We need a different attribute than `rustc_align` because unstable attributes are tied to their feature (we can't have two unstable features use the same unstable attribute). Otherwise this uses all of the same infrastructure as `#[rustc_align]`.

r? `@traviscross`
…ethercote

Add panic=immediate-abort

MCP: rust-lang/compiler-team#909

This adds a new panic strategy, `-Cpanic=immediate-abort`. This panic strategy essentially just codifies use of `-Zbuild-std-features=panic_immediate_abort`. This PR is intended to just set up infrastructure, and while it will change how the compiler is invoked for users of the feature, there should be no other impacts.

In many parts of the compiler, `PanicStrategy::ImmediateAbort` behaves just like `PanicStrategy::Abort`, because most parts of the compiler really just mean to ask "can this unwind?". I've added a helper function so we can say `sess.panic_strategy().unwinds()`.
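A minimal standalone sketch of what such a helper could look like. The variant names follow the description above, but the enum below is a hypothetical model for illustration, not the compiler's actual definition:

```rust
/// Hypothetical stand-in for the compiler's panic strategy enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum PanicStrategy {
    Unwind,
    Abort,
    ImmediateAbort,
}

impl PanicStrategy {
    /// Answers the question most call sites care about: "can this unwind?"
    /// Both aborting strategies answer `false`.
    fn unwinds(self) -> bool {
        matches!(self, PanicStrategy::Unwind)
    }
}
```

With a helper like this, call sites that previously compared against `PanicStrategy::Abort` do not need to learn about the new variant: `PanicStrategy::ImmediateAbort.unwinds()` is simply `false`.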

The panic and unwind strategies have some level of compatibility, which mostly means that we can pre-compile the sysroot with unwinding panics then the sysroot can be linked with aborting panics later. The immediate-abort strategy is all-or-nothing, enforced by `compiler/rustc_metadata/src/dependency_format.rs` and this is tested for in `tests/ui/panic-runtime/`. We could _technically_ be more compatible with the other panic strategies, but immediately-aborting panics primarily exist for users who want to eliminate all the code size responsible for the panic runtime. I'm open to other use cases if people want to present them, but not right now. This PR is already large.

`-Cpanic=immediate-abort` sets both `cfg(panic = "immediate-abort")` _and_ `cfg(panic = "abort")`. bjorn3 pointed out that people may be checking for the abort cfg to ask if panics will unwind, and also the sysroot feature this is replacing used to require `-Cpanic=abort` so this seems like a good back-compat step. At least for the moment. Unclear if this is a good idea indefinitely. I can imagine this being confusing.
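As a rough illustration of the back-compat point, here is the kind of existing user code that checks the abort cfg. The function names are made up for this sketch; only `cfg!(panic = "abort")` and `cfg!(panic = "unwind")` are relied on:

```rust
/// Reports the panic strategy family this crate was compiled with.
/// Because `-Cpanic=immediate-abort` also sets `cfg(panic = "abort")`,
/// code like this should keep reporting "abort" under the new strategy.
fn panic_mode() -> &'static str {
    if cfg!(panic = "abort") {
        "abort"
    } else {
        "unwind"
    }
}

/// The question such checks usually stand in for: "can panics unwind here?"
fn panics_may_unwind() -> bool {
    cfg!(panic = "unwind")
}
```

Code that specifically needs to tell the two aborting strategies apart would have to check `cfg(panic = "immediate-abort")` in addition.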

The changes to the standard library attributes are purely mechanical. Apart from that, I removed an `unsafe` we haven't needed for a while since the `abort` intrinsic became safe, and I've added a helpful diagnostic for people trying to use the old feature.

To test that `-Cpanic=immediate-abort` conflicts with other panic strategies, I've beefed up the core-stubs infrastructure a bit. There is now a separate attribute to set flags on it.

I've added a test that this produces the desired codegen, called `tests/run-make-cargo/panic-immediate-abort-codegen/` and also a separate run-make-cargo test that checks that we can build a binary.
…monomorphization

Unify zero-length and oversized SIMD errors
…, r=lcnr,RalfJung

Add an attribute to check the number of lanes in a SIMD vector after monomorphization

Allows std::simd to drop the `LaneCount<N>: SupportedLaneCount` trait and maintain good error messages.

Also, extends rust-lang#145967 by including spans in layout errors for all ADTs.

r? `@RalfJung`

cc `@workingjubilee` `@programmerjake`
TypeTree support in autodiff

# TypeTrees for Autodiff

## What are TypeTrees?
Memory layout descriptors for Enzyme. Tell Enzyme exactly how types are structured in memory so it can compute derivatives efficiently.

## Structure
```rust
TypeTree(Vec<Type>)

Type {
    offset: isize,  // byte offset (-1 = everywhere)
    size: usize,    // size in bytes
    kind: Kind,     // Float, Integer, Pointer, etc.
    child: TypeTree // nested structure
}
```

## Example: `fn compute(x: &f32, data: &[f32]) -> f32`

**Input 0: `x: &f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,
        child: TypeTree::new()
    }])
}])
```

**Input 1: `data: &[f32]`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, size: 4, kind: Float,  // -1 = all elements
        child: TypeTree::new()
    }])
}])
```

**Output: `f32`**
```rust
TypeTree(vec![Type {
    offset: -1, size: 4, kind: Float,
    child: TypeTree::new()
}])
```
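The three trees above can be written against a small self-contained model of the structure. This is a sketch only: the `Kind` variants, the derives, and the helper functions `f32_tree`, `pointer_to`, and `depth` are assumptions for illustration, and the pointer size of 8 assumes a 64-bit target:

```rust
#[derive(Clone, Debug, PartialEq)]
enum Kind {
    Float,
    Integer,
    Pointer,
}

#[derive(Clone, Debug, PartialEq)]
struct Type {
    offset: isize,   // byte offset (-1 = everywhere)
    size: usize,     // size in bytes
    kind: Kind,      // what kind of data lives here
    child: TypeTree, // nested structure (empty for leaves)
}

#[derive(Clone, Debug, PartialEq, Default)]
struct TypeTree(Vec<Type>);

impl TypeTree {
    fn new() -> Self {
        TypeTree(Vec::new())
    }
}

/// Tree for a plain `f32` value (the "Output" case above).
fn f32_tree() -> TypeTree {
    TypeTree(vec![Type { offset: -1, size: 4, kind: Kind::Float, child: TypeTree::new() }])
}

/// Tree for a pointer to `pointee` (the `&f32` and `&[f32]` cases above).
fn pointer_to(pointee: TypeTree) -> TypeTree {
    TypeTree(vec![Type { offset: -1, size: 8, kind: Kind::Pointer, child: pointee }])
}

/// Nesting depth of a tree: 0 for an empty tree, 1 for a leaf, and so on.
fn depth(tree: &TypeTree) -> usize {
    tree.0.iter().map(|t| 1 + depth(&t.child)).max().unwrap_or(0)
}
```

In this model `pointer_to(f32_tree())` describes both `x: &f32` and `data: &[f32]`, which matches the examples above: the two inputs get the same tree because the inner `-1` offset already means "every element".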

## Why Needed?
- Enzyme can't deduce complex type layouts from LLVM IR
- Prevents slow memory pattern analysis
- Enables correct derivative computation for nested structures
- Tells Enzyme which bytes are differentiable vs metadata

## What Enzyme Does With This Information:

Without TypeTrees (current state):
```llvm
; Enzyme sees generic LLVM IR:
define float @distance(ptr %p1, ptr %p2) {
; Has to guess what these pointers point to
; Slow analysis of all memory operations
; May miss optimization opportunities
}
```

With TypeTrees (our implementation):
```llvm
define "enzyme_type"="{[]:Float@float}" float @distance(
    ptr "enzyme_type"="{[]:Pointer}" %p1,
    ptr "enzyme_type"="{[]:Pointer}" %p2
) {
; Enzyme knows exact type layout
; Can generate efficient derivative code directly
}
```

# TypeTrees - Offset and -1 Explained

## Type Structure

```rust
Type {
    offset: isize, // WHERE this type starts
    size: usize,   // HOW BIG this type is
    kind: Kind,    // WHAT KIND of data (Float, Integer, Pointer)
    child: TypeTree // WHAT'S INSIDE (for pointers/containers)
}
```

## Offset Values

### Regular Offset (0, 4, 8, etc.)
**Specific byte position within a structure**

```rust
struct Point {
    x: f32, // offset 0, size 4
    y: f32, // offset 4, size 4
    id: i32, // offset 8, size 4
}
```

TypeTree for `&Point` (internal representation):
```rust
TypeTree(vec![
    Type { offset: 0, size: 4, kind: Float },   // x at byte 0
    Type { offset: 4, size: 4, kind: Float },   // y at byte 4
    Type { offset: 8, size: 4, kind: Integer }  // id at byte 8
])
```

Generates LLVM:
```llvm
"enzyme_type"="{[]:Float@float}"
```

### Offset -1 (Special: "Everywhere")
**Means "this pattern repeats for ALL elements"**

#### Example 1: Array `[f32; 100]`
```rust
TypeTree(vec![Type {
    offset: -1, // ALL positions
    size: 4,    // each f32 is 4 bytes
    kind: Float, // every element is float
}])
```

This replaces listing 100 separate Types with offsets `0, 4, 8, 12, ..., 396`.
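To make the saving concrete, here is a small sketch of the explicit listing that a single `-1` entry stands in for. The function name `explicit_offsets` is hypothetical and only illustrates the arithmetic:

```rust
/// Byte offsets that would be needed to describe every element of an
/// array of `len` elements, each `elem_size` bytes wide, one entry per element.
fn explicit_offsets(len: usize, elem_size: usize) -> Vec<isize> {
    (0..len).map(|i| (i * elem_size) as isize).collect()
}
```

For `[f32; 100]`, `explicit_offsets(100, 4)` yields 100 entries running from 0 to 396, all of which collapse into the one `offset: -1` entry shown above.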

#### Example 2: Slice `&[i32]`
```rust
// Pointer to slice data
TypeTree(vec![Type {
    offset: -1, size: 8, kind: Pointer,
    child: TypeTree(vec![Type {
        offset: -1, // ALL slice elements
        size: 4,    // each i32 is 4 bytes
        kind: Integer
    }])
}])
```

#### Example 3: Mixed Structure
```rust
struct Container {
    header: i64,        // offset 0
    data: [f32; 1000],  // offset 8, but elements use -1
}
```

```rust
TypeTree(vec![
    Type { offset: 0, size: 8, kind: Integer }, // header
    Type { offset: 8, size: 4000, kind: Pointer,
        child: TypeTree(vec![Type {
            offset: -1, size: 4, kind: Float // ALL array elements
        }])
    }
])
```
@GuillaumeGomez
Member Author

Sending a PR.

@GuillaumeGomez
Member Author

GuillaumeGomez commented Nov 7, 2025

Opened rust-lang/ci-mirrors#17.

@GuillaumeGomez
Member Author

Restarted CI, let's see if it's happier now.

@GuillaumeGomez
Member Author

GuillaumeGomez commented Nov 7, 2025

Seems not.

@Kobzol when/how are files uploaded to our CI mirrors?

@Kobzol
Member

Kobzol commented Nov 7, 2025

Immediately when the PR is merged. The file has been uploaded to the mirrors correctly: https://ci-mirrors.rust-lang.org/rustc/gcc/gmp-6.3.0.tar.bz2 and it was downloaded in the latest CI run of this PR. The SHA512 hash doesn't seem to match though:

gmp-6.3.0.tar.bz2: FAILED
  sha512sum: WARNING: 1 computed checksum did NOT match
  error: Cannot verify integrity of possibly corrupted file gmp-6.3.0.tar.bz2

@thesamesam

https://github.com/rust-lang/ci-mirrors/pull/17/files#r2505298478

@GuillaumeGomez
Member Author

Sent rust-lang/ci-mirrors#18 to fix it.

@GuillaumeGomez
Member Author

It'd be much more convenient if we knew ahead of time all the missing files... :-/

@Kobzol
Member

Kobzol commented Nov 12, 2025

The files are listed in the GCC source code, you can check if they are present on the mirrors 😆 But yeah, this is kinda annoying.

@GuillaumeGomez
Member Author

Gonna take a broader look tomorrow to see all I missed. In the meantime I'll send a PR to add mpc.

@rustbot
Collaborator

rustbot commented Nov 13, 2025

⚠️ Warning ⚠️

@GuillaumeGomez
Member Author

Finally! \o/

@bors r+ p=1 rollup=never

@bors
Collaborator

bors commented Nov 13, 2025

📌 Commit 866de5f has been approved by GuillaumeGomez

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Nov 13, 2025
@bors
Collaborator

bors commented Nov 13, 2025

⌛ Testing commit 866de5f with merge 2286e5d...

@bors
Collaborator

bors commented Nov 13, 2025

☀️ Test successful - checks-actions
Approved by: GuillaumeGomez
Pushing 2286e5d to main...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Nov 13, 2025
@bors bors merged commit 2286e5d into rust-lang:main Nov 13, 2025
12 checks passed
@rustbot rustbot added this to the 1.93.0 milestone Nov 13, 2025
@github-actions
Contributor

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing af5c5b7 (parent) -> 2286e5d (this PR)

Test differences

Stage 2

  • [ui] tests/ui/lto/lto-global-allocator.rs: pass -> ignore (gcc backend is marked as ignore) (J0)

Job group index

Test dashboard

Run

cargo run --manifest-path src/ci/citool/Cargo.toml -- \
    test-dashboard 2286e5d224b3413484cf4f398a9f078487e7b49d --output-dir test-dashboard

And then open test-dashboard/index.html in your browser to see an overview of all executed tests.

Job duration changes

  1. x86_64-gnu-llvm-20: 2408.7s -> 3415.5s (+41.8%)
  2. x86_64-gnu-gcc: 3081.6s -> 4038.1s (+31.0%)
  3. dist-aarch64-apple: 5534.5s -> 7029.2s (+27.0%)
  4. dist-apple-various: 4430.9s -> 5508.4s (+24.3%)
  5. pr-check-1: 1303.1s -> 1503.5s (+15.4%)
  6. test-various: 6618.1s -> 5842.4s (-11.7%)
  7. x86_64-rust-for-linux: 2488.7s -> 2750.2s (+10.5%)
  8. x86_64-gnu-llvm-20-3: 6264.1s -> 5643.3s (-9.9%)
  9. tidy: 150.3s -> 164.3s (+9.3%)
  10. i686-gnu-2: 5004.4s -> 5416.2s (+8.2%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Collaborator

Finished benchmarking commit (2286e5d): comparison URL.

Overall result: ✅ improvements - no action needed

@rustbot label: -perf-regression

Instruction count

Our most reliable metric. Used to determine the overall result above. However, even this metric can be noisy.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | - | - | 0 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | -1.1% | [-1.4%, -0.2%] | 7 |
| All ❌✅ (primary) | - | - | 0 |

Max RSS (memory usage)

Results (secondary 3.2%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 3.2% | [3.2%, 3.2%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Cycles

Results (secondary 2.3%)

A less reliable metric. May be of interest, but not used to determine the overall result above.

| | mean | range | count |
|---|---|---|---|
| Regressions ❌ (primary) | - | - | 0 |
| Regressions ❌ (secondary) | 2.3% | [2.3%, 2.3%] | 1 |
| Improvements ✅ (primary) | - | - | 0 |
| Improvements ✅ (secondary) | - | - | 0 |
| All ❌✅ (primary) | - | - | 0 |

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 475.694s -> 474.605s (-0.23%)
Artifact size: 388.39 MiB -> 388.41 MiB (0.00%)

@GuillaumeGomez GuillaumeGomez deleted the subtree-update_cg_gcc_2025-11-04 branch November 14, 2025 10:20

Labels

- has-merge-commits: PR has merge commits, merge with caution.
- merged-by-bors: This PR was explicitly merged by bors.
- S-waiting-on-bors: Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
- T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
