GCC Rust Monthly Report #19 July 2022

Thanks again to Open Source Security, Inc. and Embecosm for their ongoing support for this project.

Milestone Progress

July saw a lot of news for GCC Rust, most prominently the approval to merge upstream into GCC 13. We see upstreaming as part of our development process: it ensures we handle copyright approval and coding standards properly for GCC. The first patch set has already been reviewed, and our next task is to split the front-end into a buildable series of patches, which will form version two of the patch set. I will be extracting each compiler pass into a separate patch, starting with:

  • Skeleton front-end for rust
  • AST structures
  • Lexer and Parser (might be split up)
  • Expansion pass
  • Name resolution pass
  • HIR IR and lowering pass
  • Type resolution pass
  • Deadcode pass
  • Unsafe pass
  • GCC Generic code-generation and constexpr code
  • Smaller lints like unused var
  • Metadata output
  • Compiler driver update
  • Testsuite

This is a lot of work to split up while ensuring each component is buildable, but between Arthur and Faisal, our Google Summer of Code student, we can keep moving forward each week.

Monthly Community Call

It is time for our next monthly community call:

Completed Activities

  • Porting more constexpr code PR1350 PR1356 PR1369
  • Support keyword self path in expressions and types PR1346
  • Add new -frust-dump-pretty for our new AST dump mechanism PR1353
  • Cleanup header and source file declarations PR1359 PR1371 PR1372
  • Add name resolution to const-generic parameters PR1354
  • Implement disambiguation of const-generic arguments PR1355
  • Fix bad ABI enum switch PR1368
  • Add extern blocks to new AST dump pass PR1365
  • Support optional nullptr linemap PR1364
  • Refactor lexer to support internal buffers as well as file sources PR1363
  • Fix use after move PR1370
  • Add initial support for match expression on Tuples PR1367
  • Refactor our mappings class across crates PR1366
  • Remove unused code PR1374
  • Support missing ABI options PR1375
  • cpp constexpr porting PR1369
  • Support more foreign ABIs PR1375 PR1379
  • Bug fix bad arithmetic type checking on generics PR1384
  • Support generics in AST dump PR1382
  • Support arithmetic expressions in AST dump PR1381
  • Bug fix support aggregate types in transmute PR1380
  • Add crate helpers in mappings class PR1388
  • External items with Rust ABI need name mangling PR1387
  • Fix undefined behaviour with unique_ptr PR1386
  • Add missing include PR1385
  • Update build farm badges PR1390
  • Extern crate loading PR1362
  • Fix ICE on extern block PR1391
  • Typechecking of default const generic parameters PR1373
  • Disambiguation of generic params PR1358
  • Parse any possible inner attribute items on module expansion PR1392
  • Fix grouped tail expression parsing PR1394
  • Add support for keywords based on rust editions PR1397
  • Fix make check-rust in parallel mode for link tests PR1404
  • Fix bug in recursive macro expansion PR1401
  • Allow repeating metavars in macros PR1405
  • Refactor analysis passes in the compiler pipeline PR1409
  • Add new attribute checking pass PR1406
  • Experiment: Add error codes to error diagnostics along with embedded URL PR1408
  • Add unsafe checks PR1410 PR1415 PR1417 PR1416 PR1427
  • Add skeleton improved hir-dump PR1378
  • Add const checks PR1419
  • Fix remark automation PR1402
  • Bug fix recursive macros PR1421

Contributors this month

Overall Task Status

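| Category    | Last Month | This Month | Delta |
|-------------|------------|------------|-------|
| TODO        | 152        | 160        | +8    |
| In Progress | 28         | 29         | +1    |
| Completed   | 405        | 420        | +15   |
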
GitHub Issues

Test Cases

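| Category | Last Month | This Month | Delta |
|----------|------------|------------|-------|
| Passing  | 6395       | 6531       | +136  |
| Failed   | -          | -          | -     |
| XFAIL    | 31         | 51         | +20   |
| XPASS    | -          | -          | -     |
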
make check-rust

Bugs

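| Category    | Last Month | This Month | Delta |
|-------------|------------|------------|-------|
| TODO        | 57         | 55         | -2    |
| In Progress | 11         | 13         | +2    |
| Completed   | 169        | 178        | +9    |
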
GitHub Bugs

Milestone Progress

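| Milestone                         | Last Month | This Month | Delta | Start Date    | Completion Date | Target        |
|-----------------------------------|------------|------------|-------|---------------|-----------------|---------------|
| Data Structures 1 – Core          | 100%       | 100%       | -     | 30th Nov 2020 | 27th Jan 2021   | 29th Jan 2021 |
| Control Flow 1 – Core             | 100%       | 100%       | -     | 28th Jan 2021 | 10th Feb 2021   | 26th Feb 2021 |
| Data Structures 2 – Generics      | 100%       | 100%       | -     | 11th Feb 2021 | 14th May 2021   | 28th May 2021 |
| Data Structures 3 – Traits        | 100%       | 100%       | -     | 20th May 2021 | 17th Sept 2021  | 27th Aug 2021 |
| Control Flow 2 – Pattern Matching | 100%       | 100%       | -     | 20th Sept 2021 | 9th Dec 2021   | 29th Nov 2021 |
| Macros and cfg expansion          | 100%       | 100%       | -     | 1st Dec 2021  | 31st Mar 2022   | 28th Mar 2022 |
| Imports and Visibility            | 97%        | 100%       | +3%   | 29th Mar 2022 | 13th Jul 2022   | 27th May 2022 |
| Const Generics                    | 15%        | 45%        | +30%  | 30th May 2022 | -               | 17th Oct 2022 |
| Intrinsics                        | 0%         | 0%         | -     | 6th Sept 2022 | -               | 14th Nov 2022 |
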
GitHub Milestones

Risks

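| Risk                           | Impact (1-3) | Likelihood (0-10) | Risk (I * L) | Mitigation                                    |
|--------------------------------|--------------|-------------------|--------------|-----------------------------------------------|
| Rust Language Changes          | 2            | 7                 | 14           | Target a specific Rustc version               |
| Missing GCC 13 upstream window | 1            | 6                 | 6            | Merge in GCC 14 and be proactive about reviews |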

Planned Activities

  • Prepare gcc patches v2
  • Continue work on const evaluation

Detailed changelog

Unsafe checks

One important feature that we hadn’t implemented so far in the compiler was the check for unsafe code. This is a core feature of Rust, as a lot of operations permitted by other languages may prove dangerous and need some extra consideration. The restricted operations include dereferencing raw pointers, calling unsafe or extern functions, accessing a union’s member, and using certain kinds of static variables (among others). However, these operations are necessary in certain situations, in which case they need to be wrapped in unsafe blocks or functions.

gccrs will now, as expected, report an error for Rust programs in the following situations:

unsafe fn unsafoo() {}

static mut GLOBAL: i32 = 15;

fn bar(value: i32) {}

fn foo() {
    unsafoo(); // call to unsafe function!

    let a = 15;
    let b = &a as *const i32; // this is allowed

    let c = *b; // this is unsafe!

    bar(*b); // here as well!

    let d = GLOBAL; // this is unsafe as well!
}

You can follow our progress in adding unsafe checks on this tracking issue on our repository.
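For comparison, here is a minimal sketch of how the same operations could be written so that the unsafe checks accept them, by wrapping each flagged operation in an unsafe block. The return values were added purely so the result can be observed; they are not part of the example above.

```rust
unsafe fn unsafoo() {}

static mut GLOBAL: i32 = 15;

// Returns its argument so the call has an observable result (an addition
// made for this sketch only).
fn bar(value: i32) -> i32 {
    value
}

fn foo() -> i32 {
    // Calling an unsafe function is only allowed in an unsafe context.
    unsafe { unsafoo() };

    let a = 15;
    let b = &a as *const i32; // creating a raw pointer is still safe

    // Dereferencing a raw pointer has to be wrapped in `unsafe`.
    let c = unsafe { *b };
    let e = bar(unsafe { *b });

    // Reading a mutable static is unsafe as well.
    let d = unsafe { GLOBAL };

    c + d + e
}

fn main() {
    println!("{}", foo());
}
```

Each unsafe block is kept as small as possible, so it stays obvious which operation needs the extra consideration.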

Linking crates

In Rust, the entire crate is the compilation unit; for reference, a compilation unit is often referred to as the translation unit in GCC. This means that, unlike in other languages, a single crate is built from multiple source files. This is all managed by the mod keyword in your source code: mod foo expands automatically to the relative path foo.rs and includes its source code, akin to an include nested within a namespace in C++. This has some exciting benefits, notably no need for header files, but it also adds complexity because, when linking code, the caller needs to know the calling conventions and type layout information.
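As a rough illustration of the mod expansion, the single-file sketch below uses an inline module in place of a separate foo.rs. Conceptually, writing mod foo; with the function in foo.rs gives the same nesting. The module contents here are made up for the illustration.

```rust
// Inline equivalent of `mod foo;` with the body living in foo.rs:
// the file's items end up nested inside a module named `foo`.
mod foo {
    pub fn bar(a: i32) -> i32 {
        a + 2
    }
}

// Items are then reached through the module path, much like a namespace.
use foo::bar;

fn main() {
    println!("{}", bar(123));
    println!("{}", foo::bar(1));
}
```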

To support linking against crates, many things have to come together, so let us look at this through a simple example of calling a function in a library. Let us assume we have a library foo with the following source file:

// libfoo/src/lib.rs
pub fn bar(a: i32) -> i32 {
  a + 2
}

We can compile this by running:

gccrs -g -O2 -frust-crate=foo -c src/lib.rs -o foo.o

This will generate the object file you expect, but you will also notice a new output in your current working directory: foo.rox. This is your crate metadata; it contains all the “header” information, such as functions and type layouts. There is also code to embed this metadata directly into the object file, where it is preserved in static libraries; the compiler will support reading it from object files and archives but, unfortunately, not from shared objects. Emitting a separate file keeps the metadata independent of the object file format, which matters because the embedding method does not seem to be supported for us on macOS.

Back to the example, in order to link against this object and call the function, we must write code to import it:

// test/src/main.rs
extern crate foo;
use foo::bar;

fn main() {
  let a = bar(123);
}

Now to compile and link this.

gccrs -g -O2 -I../libfoo -c src/main.rs -o main.o
gccrs -o test main.o ../libfoo/foo.o

In the compiler, we see the extern crate declaration, which tells the compiler to look for the external crate foo. This in turn triggers a search for foo.rox, foo.o or libfoo.a; in this case, we will find foo.rox. The front-end loads this data, so we know there is a function named bar. Internally, the crate foo just exports:

extern "Rust" {
  fn bar(a:i32) -> i32;
}

This is more complicated for generics and impl blocks, but the idea is the same. The benefit of exporting raw Rust code here is that we get support for public generics for free by reusing the same compiler pipeline.
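The lookup step described above can be sketched as follows. This is a simplified model, not the gccrs implementation: the function names, the order in which the three file names are tried, and the way search directories are passed in are all assumptions made for illustration.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: the file names that could provide crate `name`
/// inside one search directory (the ordering is an assumption).
fn candidate_files(dir: &Path, name: &str) -> Vec<PathBuf> {
    vec![
        dir.join(format!("{name}.rox")),
        dir.join(format!("{name}.o")),
        dir.join(format!("lib{name}.a")),
    ]
}

/// Hypothetical lookup: return the first candidate that exists.
/// `exists` is injected so the logic can be exercised without real files.
fn find_crate(
    search_dirs: &[PathBuf],
    name: &str,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    search_dirs
        .iter()
        .flat_map(|dir| candidate_files(dir, name))
        .find(|path| exists(path))
}

fn main() {
    let dirs = vec![PathBuf::from("../libfoo")];
    // Real filesystem check; prints None unless the files are present.
    let found = find_crate(&dirs, "foo", |p| p.exists());
    println!("{:?}", found);
}
```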

So far, you can use the following options to control this metadata output:

  • -frust-embed-metadata: toggles embedding the metadata into the .rust_export section of the target asm output (default off)
  • -frust-metadata-output=: specifies the path of the file to write the metadata to

Note 1: when specifying the location to write this metadata file, the compiler will enforce a naming convention of crate_name.rox on the basename of the path, as the crate name is critical here.

Note 2: this link model is heavily inspired by that of gccgo.
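To make the naming convention from Note 1 concrete, here is a small sketch of a check on the basename of a metadata path. The helper name is hypothetical, and the sketch only tests whether a path follows the convention; how gccrs itself reacts to a non-conforming path is not covered here.

```rust
use std::path::Path;

/// Hypothetical check: does the basename of `path` equal `<crate_name>.rox`?
fn metadata_path_matches(path: &Path, crate_name: &str) -> bool {
    let expected = format!("{crate_name}.rox");
    path.file_name().and_then(|f| f.to_str()) == Some(expected.as_str())
}

fn main() {
    println!("{}", metadata_path_matches(Path::new("out/foo.rox"), "foo"));
    println!("{}", metadata_path_matches(Path::new("out/bar.rox"), "foo"));
}
```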

1 thought on “GCC Rust Monthly Report #19 July 2022”

  1. Please double-check with the GCC maintainers whether each patch set actually needs to be buildable. If I remember correctly, the split is only requested to ease reviewing (so that experts on each part of the compiler know which patch set they need to review), but you can commit the whole thing in one go. Usually the split can be done simply by filtering the merge diff by path. Patches that modify previous patches in the sequence are actually undesirable (i.e., ideally each patch can be independently applied to the main branch). Making each part independently buildable seems unnecessary and a waste of your time.

    On the other hand, changes to the common part of the compiler (especially bugfixes) should be committed separately (and thus be independently buildable) if that is possible and easy. Otherwise, I’m pretty sure the maintainers would also accept a single commit of multiple patches if they are related.

    You can also request a freeze period of the GCC main branch if you prefer to do multiple commits that need independent builds and testing to avoid conflicts with other commits.

    Ask the GCC global maintainers what would be their preference before wasting your time!
