Thanks again to Open Source Security, Inc. and Embecosm for their ongoing support for this project.
Milestone Progress
July saw a lot of news for GCC Rust, most prominently the approval to merge upstream into GCC 13. We see this as part of the development process, ensuring that we handle copyright approval and coding standards properly for GCC. The first patch set has already been reviewed, and our next job is to split the front-end into a buildable series of patches, which will form version two of the patch set. I will be extracting each compiler pass into a separate patch, starting with:
- Skeleton front-end for rust
- AST structures
- Lexer and Parser (might be split up)
- Expansion pass
- Name resolution pass
- HIR IR and lowering pass
- Type resolution pass
- Deadcode pass
- Unsafe pass
- GCC Generic code-generation and constexpr code
- Smaller lints like unused var
- Metadata output
- Compiler driver update
- Testsuite
This is a lot of work to split up while ensuring each component is buildable, but between Arthur and Faisal, our Google Summer of Code student, we can keep moving forward each week.
Monthly Community Call
It is time for our next monthly community call:
- Date: 5th August 2022 at: 09h00 UTC
- Agenda: https://hackmd.io/ZVgm1LaPQly173-OML2X7Q
- Jitsi: https://meet.jit.si/gccrs-community-call
Completed Activities
- Porting more constexpr code PR1350 PR1356 PR1369
- Support keyword self path in expressions and types PR1346
- Add new -frust-dump-pretty for our new AST dump mechanism PR1353
- Cleanup header and source file declarations PR1359 PR1371 PR1372
- Add name resolution to const-generic parameters PR1354
- Implement disambiguation of const-generic arguments PR1355
- Fix bad ABI enum switch PR1368
- Add extern blocks to new AST dump pass PR1365
- Support optional nullptr linemap PR1364
- Refactor lexer to support internal buffers as well as file sources PR1363
- Fix use after move PR1370
- Add initial support for match expression on Tuples PR1367
- Refactor our mappings class across crates PR1366
- Remove unused code PR1374
- Support missing ABI options PR1375
- C++ constexpr porting PR1369
- Support more foreign ABIs PR1375 PR1379
- Bug fix bad arithmetic type checking on generics PR1384
- Support generics in AST dump PR1382
- Support arithmetic expressions in AST dump PR1381
- Bug fix support aggregate types in transmute PR1380
- Add crate helpers in mappings class PR1388
- External items with Rust ABI need name mangling PR1387
- Fix undefined behaviour with unique_ptr PR1386
- Add missing include PR1385
- Update build farm badges PR1390
- Extern crate loading PR1362
- Fix ICE on extern block PR1391
- Typechecking of default const generic parameters PR1373
- Disambiguation of generic params PR1358
- Parse any possible inner attribute items on module expansion PR1392
- Fix grouped tail expression parsing PR1394
- Add support for keywords based on rust editions PR1397
- Fix make check-rust in parallel mode for link tests PR1404
- Fix bug in recursive macro expansion PR1401
- Allow repeating metavars in macros PR1405
- Refactor analysis passes in the compiler pipeline PR1409
- Add new attribute checking pass PR1406
- Experiment: Add error codes to error diagnostics along with an embedded URL PR1408
- Add unsafe checks PR1410 PR1415 PR1417 PR1416 PR1427
- Add skeleton improved hir-dump PR1378
- Add const checks PR1419
- Fix remark automation PR1402
- Bug fix recursive macros PR1421
Contributors this month
Overall Task Status
Category | Last Month | This Month | Delta |
TODO | 152 | 160 | +8 |
In Progress | 28 | 29 | +1 |
Completed | 405 | 420 | +15 |
Test Cases
Category | Last Month | This Month | Delta |
Passing | 6395 | 6531 | +136 |
Failed | – | – | – |
XFAIL | 31 | 51 | +20 |
XPASS | – | – | – |
Bugs
Category | Last Month | This Month | Delta |
TODO | 57 | 55 | -2 |
In Progress | 11 | 13 | +2 |
Completed | 169 | 178 | +9 |
Milestone Progress
Milestone | Last Month | This Month | Delta | Start Date | Completion Date | Target |
Data Structures 1 – Core | 100% | 100% | – | 30th Nov 2020 | 27th Jan 2021 | 29th Jan 2021 |
Control Flow 1 – Core | 100% | 100% | – | 28th Jan 2021 | 10th Feb 2021 | 26th Feb 2021 |
Data Structures 2 – Generics | 100% | 100% | – | 11th Feb 2021 | 14th May 2021 | 28th May 2021 |
Data Structures 3 – Traits | 100% | 100% | – | 20th May 2021 | 17th Sept 2021 | 27th Aug 2021 |
Control Flow 2 – Pattern Matching | 100% | 100% | – | 20th Sept 2021 | 9th Dec 2021 | 29th Nov 2021 |
Macros and cfg expansion | 100% | 100% | – | 1st Dec 2021 | 31st Mar 2022 | 28th Mar 2022 |
Imports and Visibility | 97% | 100% | +3% | 29th Mar 2022 | 13th Jul 2022 | 27th May 2022 |
Const Generics | 15% | 45% | +30% | 30th May 2022 | – | 17th Oct 2022 |
Intrinsics | 0% | 0% | – | 6th Sept 2022 | – | 14th Nov 2022 |
Risks
Risk | Impact (1-3) | Likelihood (0-10) | Risk (I * L) | Mitigation |
Rust Language Changes | 2 | 7 | 14 | Target a specific Rustc version |
Missing GCC 13 upstream window | 1 | 6 | 6 | Merge in GCC 14 and be proactive about reviews |
Planned Activities
- Prepare gcc patches v2
- Continue work on const evaluation
Detailed changelog
Unsafe checks
One important feature that we hadn’t implemented so far in the compiler was the check for unsafe code. This is a core feature of Rust, as a lot of operations permitted by other languages may prove dangerous and need some extra consideration. These limitations include the dereferencing of raw pointers, calls to unsafe or extern functions, accessing a union’s member or using certain kinds of static variables (and more). However, these behaviors are necessary in certain situations, in which case they need to be wrapped in unsafe blocks or functions. gccrs will now error out as expected from Rust programs in the following situations:
unsafe fn unsafoo() {}

static mut GLOBAL: i32 = 15;

fn bar(value: i32) {}

fn foo() {
    unsafoo(); // call to unsafe function!

    let a = 15;
    let b = &a as *const i32; // this is allowed

    let c = *b; // this is unsafe!
    bar(*b); // here as well!

    let d = GLOBAL; // this is unsafe as well!
}
You can follow our progress in adding unsafe checks on this tracking issue on our repository.
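For contrast, here is a minimal sketch of how the same operations can be written so that the unsafe checks accept them. It reuses the names from the snippet above; the return values of bar and foo are additions made purely so the result can be observed, and they are not part of the original example.

```rust
unsafe fn unsafoo() {}

static mut GLOBAL: i32 = 15;

// Assumption for illustration: bar returns its argument so the result is visible.
fn bar(value: i32) -> i32 {
    value
}

fn foo() -> i32 {
    let a = 15;
    let b = &a as *const i32; // creating a raw pointer is still allowed in safe code

    // Every operation flagged by the unsafe checks is moved into an unsafe block.
    unsafe {
        unsafoo(); // call to an unsafe function
        let c = *b; // dereference of a raw pointer
        let d = bar(*b); // dereference inside a call argument
        let e = GLOBAL; // read of a mutable static
        c + d + e
    }
}
```

The unsafe block does not make these operations safe; it only records that the programmer takes responsibility for them, which is why the compiler needs to detect every place where such a block is missing.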
Linking crates
In Rust, the entire crate is the compilation unit; for reference, a compilation unit is often referred to as the translation unit in GCC. This means that, unlike in many other languages, a single compilation unit is built up from multiple source files. This is all managed by the mod keyword in your source code: mod foo expands automatically to the relative path foo.rs and includes its source code, akin to an include nested within a namespace in C++. This has some exciting benefits, notably no need for header files, but it also adds complexity because, when linking code, the caller needs to know the calling conventions and type layout information.
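As a rough illustration of the module expansion, the sketch below writes the module inline in a single file, which is conceptually what the compiler sees once a mod foo; declaration has been resolved to foo.rs. The contents of the module and the names double and use_foo are made up for this example.

```rust
// Conceptually, writing `mod foo;` pulls in foo.rs as if its contents
// had been written inline like this, nested under the name `foo`.
mod foo {
    // Hypothetical contents of foo.rs.
    pub fn double(x: i32) -> i32 {
        x * 2
    }
}

// Items from the module are reached through its path, much like a
// C++ namespace, with no header file involved.
pub fn use_foo() -> i32 {
    foo::double(21)
}
```

Within a single crate this needs no metadata at all, because every module ends up in the same compilation unit. The extra machinery only becomes necessary once a call crosses a crate boundary.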
Many things have to come together to support linking against crates, so let us walk through a simple example of calling a function in a library. Assume we have a library foo with the following source file:
// libfoo/src/lib.rs
// pub so that the function is visible to other crates
pub fn bar(a: i32) -> i32 {
    a + 2
}
We can compile this by running:
gccrs -g -O2 -frust-crate=foo -c src/lib.rs -o foo.o
This will generate the object file you expect, but you will notice a new output in your current working directory: foo.rox. This is your crate metadata; it contains all the “header” information, such as functions and type layouts. There is also code to embed this metadata directly into the object file, where it is preserved in static libraries; the compiler supports reading it back from object files and archives but, unfortunately, not from shared objects. Emitting a separate file keeps the metadata independent of the object file format, which matters because the embedding method does not seem to be supported for us on macOS.
Back to the example, in order to link against this object and call the function, we must write code to import it:
// test/src/main.rs
extern crate foo;
use foo::bar;
fn main() {
    let a = bar(123);
}
Now to compile and link this.
gccrs -g -O2 -I../libfoo -c src/main.rs -o main.o
gccrs -o test main.o ../libfoo/foo.o
In the compiler, we see the extern crate declaration, which tells the compiler to look for the external crate foo. This in turn triggers a search for foo.rox, foo.o or libfoo.a; in this case, we will find foo.rox. The front-end loads this data, so we know there is a function named bar. Internally, the crate foo just exports:
extern "Rust" {
    fn bar(a: i32) -> i32;
}
This is more complicated for generics and impl blocks, but the idea is the same. The benefit of exporting raw Rust code here is that support for public generics comes for free, because we reuse the same compiler pipeline.
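To make the lookup step more concrete, here is a small sketch of how the candidate file names for an extern crate could be enumerated from the search directories passed with -I. It is not the gccrs implementation: the function name candidate_paths is hypothetical, and the order in which the three file kinds are tried is an assumption based only on the order they are listed in above.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: list the files that could provide metadata for
/// `crate_name`, for each search directory in turn.
fn candidate_paths(crate_name: &str, search_dirs: &[&str]) -> Vec<PathBuf> {
    // Assumed order: separate metadata file, object file, static archive.
    let file_names = [
        format!("{}.rox", crate_name),
        format!("{}.o", crate_name),
        format!("lib{}.a", crate_name),
    ];

    let mut candidates = Vec::new();
    for &dir in search_dirs {
        for name in &file_names {
            candidates.push(Path::new(dir).join(name));
        }
    }
    candidates
}
```

For the example above, candidate_paths("foo", &["../libfoo"]) yields ../libfoo/foo.rox first, which matches the file the compiler ends up loading in this walkthrough.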
Note you can use the following options to control this metadata output so far:
- -frust-embed-metadata toggles embedding the metadata into the .rust_export section of the target asm output (default: off)
- -frust-metadata-output= specifies the path of the file to write the metadata to directly
Note 1: when specifying the location to write this metadata file, the compiler will enforce a naming convention of crate_name.rox on the basename of the path, as the crate name is critical here.
Note 2: this link model is heavily inspired by that of gccgo.
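The naming convention from Note 1 can be sketched as a simple check on the basename of the output path. This is only an illustration of the rule as described here, not the compiler's code; the function name metadata_path_is_valid is hypothetical.

```rust
use std::path::Path;

/// Hypothetical helper: returns true when the basename of `output`
/// is exactly `<crate_name>.rox`.
fn metadata_path_is_valid(crate_name: &str, output: &str) -> bool {
    let expected = format!("{}.rox", crate_name);
    Path::new(output)
        .file_name() // basename of the path, if there is one
        .and_then(|name| name.to_str())
        .map_or(false, |name| name == expected)
}
```

Under this sketch, a path such as build/meta/foo.rox would be accepted for the crate foo, while build/meta/metadata.rox would be rejected because the crate name could no longer be recovered from the file name.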
Please double-check with the GCC maintainers whether each patch in the set actually needs to be buildable. If I remember correctly, the split is only requested to ease reviewing (so that the experts for each part of the compiler know which patches they need to review), but you can commit the whole thing in one go. Usually the split can be done simply by filtering the merge diff by path. Patches that modify previous patches in the sequence are actually undesirable (i.e., ideally each patch can be independently applied to the main branch). Making each part independently buildable seems unnecessary and a waste of your time.
On the other hand, changes to the common part of the compiler (especially bug fixes) should be committed separately (and thus be independently buildable) if that is possible and easy. Otherwise, I’m pretty sure the maintainers would also accept a single commit of multiple patches if they are related.
You can also request a freeze period of the GCC main branch if you prefer to do multiple commits that need independent builds and testing to avoid conflicts with other commits.
Ask the GCC global maintainers what would be their preference before wasting your time!