GCC Rust Weekly Status Report 34

Thanks again to Open Source Security, Inc. and Embecosm for their ongoing support of this project.

Milestone Progress

As the previous traits milestone ran over by three weeks, I aim to close out this control flow milestone by the end of this week (3rd December); this means we will get back two of those weeks to avoid skewing the following milestone targets for the upcoming year.

This week I merged bug fixes around the type system and qualified paths, support for array index access through references (implicit indirection), and support for dereference operator overloading. The remaining tasks are merging enum code generation and initial pattern matching support for enum access. Pattern matching will be an ongoing task since it requires a lot of static analysis, but this analysis is not necessary to generate code. The exciting thing about enums is that we are using GCC's qualified union type (QUAL_UNION_TYPE), which is otherwise only used by the Ada front-end, which makes me want to investigate Ada someday.

See: https://gcc.gnu.org/onlinedocs/gccint/Types.html.

Thank you to everyone who continues to support and work on the compiler.

Monthly Community Call

We will be having our 8th community call on the first Friday of the month.

Completed Activities

  • Deref Operator Overloading support PR818 PR821 PR823
  • BugFix QualifiedPaths within traits PR812 PR813
  • BugFix name mangling on QualifiedPaths PR819
  • BugFix mutability within the type system for reference and pointer types PR820 PR817
  • GCC requires TREE_ADDRESSABLE on declarations that require address operations PR814
  • Cleanup generic substitution code PR822

Contributors this month

Overall Task Status

Category    | Last Week | This Week | Delta
------------|-----------|-----------|------
TODO        | 93        | 93        | -
In Progress | 17        | 14        | -3
Completed   | 246       | 251       | +5

GitHub Issues

Test Cases

Category | Last Week | This Week | Delta
---------|-----------|-----------|------
Passing  | 5106      | 5337      | +255
XFAIL    | 21        | 21        | -

make check-rust

Bugs

Category    | Last Week | This Week | Delta
------------|-----------|-----------|------
TODO        | 23        | 24        | +1
In Progress | 4         | 4         | -
Completed   | 86        | 80        | +4

GitHub Bugs

Milestones Progress

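Milestone                         | Last Week | This Week | Delta | Start Date     | Completion Date | Target
----------------------------------|-----------|-----------|-------|----------------|-----------------|---------------
Data Structures 1 – Core          | 100%      | 100%      | -     | 30th Nov 2020  | 27th Jan 2021   | 29th Jan 2021
Control Flow 1 – Core             | 100%      | 100%      | -     | 28th Jan 2021  | 10th Feb 2021   | 26th Feb 2021
Data Structures 2 – Generics      | 100%      | 100%      | -     | 11th Feb 2021  | 14th May 2021   | 28th May 2021
Data Structures 3 – Traits        | 100%      | 100%      | -     | 20th May 2021  | 17th Sept 2021  | 27th Aug 2021
Control Flow 2 – Pattern Matching | 80%       | 94%       | +14%  | 20th Sept 2021 | -               | 29th Nov 2021
Macros and cfg expansion          | 0%        | 0%        | -     | 1st Dec 2021   | -               | 28th Mar 2022
Imports and Visibility            | 0%        | 0%        | -     | 29th Mar 2022  | -               | 27th May 2022
Const Generics                    | 0%        | 0%        | -     | 30th May 2022  | -               | 25th Jul 2022
Intrinsics                        | 0%        | 0%        | -     | 6th Sept 2021  | -               | 30th Sept 2022
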
GitHub Milestones

Risks

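Risk                    | Impact (1-3) | Likelihood (0-10) | Risk (I * L) | Mitigation
------------------------|--------------|-------------------|--------------|-----------
Rust Language Changes   | 3            | 7                 | 21           | Keep up to date with the Rust language on a regular basis
Going over target dates | 3            | 5                 | 15           | Maintain status reports and issue tracking to stakeholders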

Planned Activities

  • Merge code generation for enums
  • Pattern matching on enums

Detailed changelog

GCC TREE_ADDRESSABLE

GCC requires VAR_DECLs and PARM_DECLs to be marked with TREE_ADDRESSABLE when the declaration will be used in a borrow (‘&’, which takes its address). This includes the implicit borrows added when we do autoderef in method resolution and operator overloading. When TREE_ADDRESSABLE is not set, the optimizers are free to keep the declaration in a register, since no address in memory is required for it, but this means we end up in cases like this:

#[lang = "add_assign"]
pub trait AddAssign<Rhs = Self> {
    fn add_assign(&mut self, rhs: Rhs);
}

impl AddAssign for i32 {
    fn add_assign(&mut self, other: i32) {
        *self += other
    }
}

fn main() {
    let mut a = 1;
    a += 2;
}

This generated GCC GENERIC IR such as:

i32 main ()
{
  i32 a.1; // <-- This is the copy
  i32 D.86;
  i32 a;

  a = 1;
  a.1 = a; // <-- Taking a copy

  <i32 as AddAssign>::add_assign (&a.1, 2);
  //                               ^
  //                              ----

  D.86 = 0;
  return D.86;
}

You can see GCC will automatically make a copy of the VAR_DECL, resulting in bad code generation: add_assign mutates the copy a.1 rather than a. With TREE_ADDRESSABLE set, the output looks like this:

i32 main ()
{
  i32 D.86;
  i32 a;

  a = 1;
  <i32 as AddAssign>::add_assign (&a, 2);
  D.86 = 0;
  return D.86;
}

The fix marks declarations according to whether their address is needed, which allows the GCC optimizers to work as we expect. For more information see this useful comment: https://github.com/Rust-GCC/gccrs/blob/0024bc2f028369b871a65ceb11b2fddfb0f9c3aa/gcc/tree.h#L634-L649
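
To see why borrowing the copy is a miscompilation rather than just an inefficiency, the two dumps above can be mimicked in ordinary Rust. This is a minimal sketch, not gccrs output: the helper names add_assign_in_place, add_via_copy and add_via_direct_borrow are hypothetical and only model the effect of borrowing a.1 versus borrowing a.

```rust
/// Stand-in for `<i32 as AddAssign>::add_assign` (hypothetical helper).
fn add_assign_in_place(target: &mut i32, rhs: i32) {
    *target += rhs;
}

/// Models the dump without TREE_ADDRESSABLE: the borrow is taken on a copy
/// (`a.1`), so the update never reaches `a`.
fn add_via_copy(a: i32, rhs: i32) -> i32 {
    let mut a_copy = a; // a.1 = a
    add_assign_in_place(&mut a_copy, rhs); // add_assign (&a.1, rhs)
    a // `a` still holds its old value
}

/// Models the dump with TREE_ADDRESSABLE: the borrow is taken on `a` itself.
fn add_via_direct_borrow(a: i32, rhs: i32) -> i32 {
    let mut a = a;
    add_assign_in_place(&mut a, rhs); // add_assign (&a, rhs)
    a
}

fn main() {
    println!("borrow of the copy: {}", add_via_copy(1, 2));
    println!("borrow of a itself: {}", add_via_direct_borrow(1, 2));
}
```

With the copy, the caller still observes the old value of a; with the direct borrow, the addition is visible.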

Qualified Path BugFix

We found that the implementation of qualified paths was reliant on some implicitly injected names within the name-resolution process, so that we could try to at least resolve the root of the qualified path. This implementation was never going to hold up, but it served as a simple hack to get the type system off the ground during the traits milestone. These hacks and implicit names are now removed, and qualified paths are now, just like TypePaths, resolved during the type checking pass. The bug here was that the qualified path “<Self as Foo>::A” was unable to resolve the root “<Self as Foo>” since no implicit name was generated for it. Now the type system is able to properly project Self as Foo and then probe for A, which means it can handle more complex qualified paths.

pub trait Foo {
    type A;

    fn boo(&self) -> <Self as Foo>::A;
}

fn foo2<I: Foo>(x: I) {
    x.boo();
}
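
As a rough illustration of what projecting Self as Foo and then probing for A means, here is a sketch in standard Rust. The Meters type and its impl are hypothetical, and foo2 is changed to return the projected type (which the test case above does not do) so the result can be observed.

```rust
pub trait Foo {
    type A;

    fn boo(&self) -> <Self as Foo>::A;
}

/// Hypothetical implementor, added only for illustration.
struct Meters(u32);

impl Foo for Meters {
    type A = u32;

    // `<Self as Foo>::A` projects `Meters` through `Foo`, then looks up `A`,
    // which resolves to `u32` here.
    fn boo(&self) -> <Self as Foo>::A {
        self.0
    }
}

// In the generic case the projection stays symbolic (`<I as Foo>::A`) until
// `I` is substituted with a concrete type.
fn foo2<I: Foo>(x: I) -> <I as Foo>::A {
    x.boo()
}

fn main() {
    let v: u32 = foo2(Meters(7));
    println!("{}", v);
}
```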

Add implicit indirection to array access

When we have an array-index expression, Rust allows the array to be behind a reference, and the compiler is required to add the necessary implicit indirection. Note: Rust also supports this implicit indirection in tuple and struct field access.

fn foo(state: &mut [u32; 16], a: usize) {
    state[a] = 1;
}
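
The implicit indirection can be made visible by writing the dereference out by hand. The sketch below is standard Rust rather than compiler internals; set_explicit, Point, field_access and tuple_access are hypothetical names introduced for the comparison.

```rust
/// Relies on the implicit indirection: `state` is a reference, yet it is
/// indexed directly.
fn set_implicit(state: &mut [u32; 16], a: usize) {
    state[a] = 1;
}

/// The same operation with the dereference written explicitly.
fn set_explicit(state: &mut [u32; 16], a: usize) {
    (*state)[a] = 1;
}

struct Point {
    x: i32,
}

/// Struct field access through a reference: `p.x` behaves like `(*p).x`.
fn field_access(p: &Point) -> (i32, i32) {
    (p.x, (*p).x)
}

/// Tuple field access through a reference: `t.1` behaves like `(*t).1`.
fn tuple_access(t: &(i32, i32)) -> (i32, i32) {
    (t.1, (*t).1)
}

fn main() {
    let mut lhs = [0u32; 16];
    let mut rhs = [0u32; 16];
    set_implicit(&mut lhs, 3);
    set_explicit(&mut rhs, 3);
    println!("arrays equal: {}", lhs == rhs);
    println!("{:?}", field_access(&Point { x: 5 }));
    println!("{:?}", tuple_access(&(1, 2)));
}
```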

Support Dereference operator overloading

Deref operator overloading is a core piece of Rust's control flow mechanism; it adds support for more complex method resolution cases as part of the autoderef mechanism. It has also served as a good test of the current state of the type system.

extern "C" {
    fn printf(s: *const i8, ...);
}

#[lang = "deref"]
pub trait Deref {
    type Target;

    fn deref(&self) -> &Self::Target;
}

impl<T> Deref for &T {
    type Target = T;

    fn deref(&self) -> &T {
        *self
    }
}

impl<T> Deref for &mut T {
    type Target = T;

    fn deref(&self) -> &T {
        *self
    }
}

struct Foo<T>(T);
impl<T> Deref for Foo<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

fn main() -> i32 {
    let foo: Foo<i32> = Foo(123);
    let bar: i32 = *foo;

    unsafe {
        let a = "%i\n\0";
        let b = a as *const str;
        let c = b as *const i8;

        printf(c, bar);
    }

    0
}

The interesting piece about dereferences is that the deref method that is actually implemented always returns a reference to the associated type ‘Target’. The compiler must implicitly call this method, and because the trait and type checking ensure that the result is a reference, the compiler can then safely dereference it implicitly. I point this out simply because of the function prototype:

fn deref(&self) -> &Self::Target {
    &self.0
}

Here the function type is:

fn deref(self: &Foo<T>) -> &T { &self.0 }

So the dereference operation, even on custom types, always returns a reference. Dereference operator overloading is therefore a two-step mechanism: the compiler first calls deref to obtain a reference, and then dereferences that reference.
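
The two steps can be spelled out in standard Rust. This sketch uses std::ops::Deref instead of the lang item declared in the test case above, and the function names via_operator, via_two_steps and via_autoderef are hypothetical.

```rust
use std::ops::Deref;

struct Foo<T>(T);

impl<T> Deref for Foo<T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

/// What the user writes: a single `*`.
fn via_operator(foo: Foo<i32>) -> i32 {
    *foo
}

/// Roughly what `*foo` stands for, written out by hand.
fn via_two_steps(foo: Foo<i32>) -> i32 {
    let r: &i32 = Deref::deref(&foo); // step 1: call the overloaded method
    *r // step 2: built-in dereference of the returned reference
}

/// Autoderef in method resolution: `abs` is a method on `i32`, not on `Foo`,
/// so it is found by dereferencing `foo` through `Deref`.
fn via_autoderef(foo: Foo<i32>) -> i32 {
    foo.abs()
}

fn main() {
    println!("{}", via_operator(Foo(123)));
    println!("{}", via_two_steps(Foo(123)));
    println!("{}", via_autoderef(Foo(-7)));
}
```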
