Thanks again to Open Source Security, inc and Embecosm for their ongoing support for this project.
Milestone Progress
We are approaching the end of macro expansion, we have one outstanding issue on how we handle expansion around types. Apart from that, we have some small, non-blocking macro issues marked with the good-first-pr
label for new contributors to make their mark. Feel free to ask about them on our Zulip or GitHub!
Macros have gotten more correct as more restrictions have been implemented, such as the verification of follow-set ambiguities. Rust 1.49.0 slices will soon be entirely supported, with only some cleanup of the various implementations remaining to be performed. We will start drafting some issues for the next milestone, which concerns Imports and Visibility. This includes source-code visibility (pub
, pub(crate)
, …) as well as metadata exports and proper handling of .rlib and .rmeta files.
Finally, we have also started looking into improving our set of CI checks: Thanks to a new contributor Daniel del Castillo, we can now easily check that our frontend does not introduce new warnings, which would contradict with gcc’s expectations of a full boostrapping build. We are also working on getting gccrs compilable using an older gcc version (4.8), which is a requirement for upstreaming as well as backporting.
Monthly Community Call
We will be having our regular community call as the first Friday of the month:
- Date and Time 1st April 2022 at: 14h00 UTC
- Agenda: https://hackmd.io/xsCkyWQ-Sp6jzczS6rz8RQ please feel free to add agenda items you wish to see discussed.
- Jitsi: https://meet.jit.si/gccrs-community-call
Completed Activities
- Enable -Werror in CI PR1026
- Do not propagate parser errors in match_repetitions PR1040
- Only expand merged repetitions if they contain the same amount PR1041
- Implement include_bytes! and include_str! PR1043
- Restrict follow up tokens on :expr and :stmt PR1044
- Add helper function for subsituted tokens debugging PR1047
- Add better restrictions around semicolons in statements parsing PR1049
- Add remaining restrictions for follow-set restrictions PR1051
- Add hints for valid follow-set tokens PR1052
- Fix overzealous follow-set ambiguity PR1054
- Allow checking past zeroable matches for follow-set restrictions PR1055
- Fix #include <algorithm> PR1056
- Provide std::hash for Rust::AST::MacroFragSpec::Kind enum class PR1057
- Properly perform follow-set checking on matchers PR1062
- Handle :tt fragments properly PR1064
- Handle :meta fragments properly PR1063
Contributors this week
Overall Task Status
Category | Last Week | This Week | Delta |
TODO | 110 | 105 | -5 |
In Progress | 18 | 23 | +5 |
Completed | 327 | 336 | +9 |
Test Cases
Category | Last Week | This Week | Delta |
Passing | 5511 | 5677 | +166 |
Failed | – | – | – |
XFAIL | 22 | 22 | – |
XPASS | – | – | – |
Bugs
Category | Last Week | This Week | Delta |
TODO | 34 | 35 | +1 |
In Progress | 8 | 10 | +2 |
Completed | 125 | 129 | +4 |
Milestones Progress
Milestone | Last Week | This Week | Delta | Start Date | Completion Date | Target |
Data Structures 1 – Core | 100% | 100% | – | 30th Nov 2020 | 27th Jan 2021 | 29th Jan 2021 |
Control Flow 1 – Core | 100% | 100% | – | 28th Jan 2021 | 10th Feb 2021 | 26th Feb 2021 |
Data Structures 2 – Generics | 100% | 100% | – | 11th Feb 2021 | 14th May 2021 | 28th May 2021 |
Data Structures 3 – Traits | 100% | 100% | – | 20th May 2021 | 17th Sept 2021 | 27th Aug 2021 |
Control Flow 2 – Pattern Matching | 100% | 100% | – | 20th Sept 2021 | 9th Dec 2021 | 29th Nov 2021 |
Macros and cfg expansion | 87% | 99% | +12% | 1st Dec 2021 | – | 28th Mar 2022 |
Imports and Visibility | 0% | 0% | – | 29th Mar 2022 | – | 27th May 2022 |
Const Generics | 0% | 0% | – | 30th May 2022 | – | 25th Jul 2022 |
Intrinsics | 0% | 0% | – | 6th Sept 2021 | – | 30th Sept 2022 |
Risks
Risk | Impact (1-3) | Likelihood (0-10) | Risk (I * L) | Mitigation |
Rust Language Changes | 3 | 7 | 21 | Keep up to date with the Rust language on a regular basis |
Going over target dates | 2 | 5 | 10 | Maintain status reports and issue tracking to stakeholders |
Planned Activities
- Finish working out the various quirks of macros
- Make sure follow-set ambiguities are implemented properly
- Merge unsized method resolution
- Handle macro opacity properly
- Plan out next milestone
Detailed changelog
Two new macro builtins have been added to the compiler thanks to David Faust: include_bytes!
and include_str!
. They allow the user to include files at compilation time, either as bytes or valid UTF-8 strings. This can be extremely useful for anyone dealing with binary blobs, and adds even more code for new contributors to reuse when adding more builtin macros.
Their definition is as follows:
macro_rules! include_str {
($file:expr $(,)?) => {{ /* compiler built-in */ }};
}
macro_rules! include_bytes {
($file:expr $(,)?) => {{ /* compiler built-in */ }};
}
Follow-set ambiguities
While rust macros are extremely powerful, they are also heavily restricted to prevent ambiguities. These restrictions include sets of allowed fragments that can follow a certain metavariable fragment, which are referred to as follow-sets.
As an example, the follow set of :expr
fragments is { COMMA
, SEMICOLON
, MATCH_ARROW
}. Any other token cannot follow an :expr
fragment, as it might cause ambiguities in later versions of the language.
This was previously not handled by gccrs at all. As a result, we had some test cases that contained ambiguous macro definitions that rustc rejected.
We dedicated some time this week to implement (almost!) all of these restrictions, including some complex cases involving repetitions:
Looking past zeroable repetitions
macro_rules! invalid {
($e:expr $(,)? $(;)* $(=>)* forbidden) => {{}};
// 1 2 3 4 5 (matches)
}
Since matches 2
, 3
and 4
might occur zero times (kleene operators *
or ?
), we need to check that the forbidden
token is allowed to follow an :expr
fragment, which is not the case since identifier tokens are not contained in its follow-set.
On the other hand, this macro is perfectly valid since a comma, contained in the follow-set of :expr
, is required to appear at least once before any forbidden tokens
macro_rules! invalid {
($e:expr $(;)* $(,)+ $(=>)* forbidden) => {{}};
// `+` kleen operator indicates one or more, meaning that there will always be at least one comma
}
Metavar fragments following other metavar fragments
macro_rules! mac {
($t:ty $lit:literal) => {{}}; // invalid
($t:ty $lit:block) => {{}}; // valid
}
The follow-set of :ty
fragments allows the user to specify another fragment as follow-up, but only if this metavar fragment is a :block
one.
An interesting tidbit is that these checks are performed at the beginning of the expansion phase in rustc, while we go through them during parsing. This is not set in stone, and we’d love to perform them later if required.
The remaining issues are marked as good-first-pr
as they are simple and offer an entrypoint into the compiler’s implementation of macros.
Restrict merged repetitions to metavars with the same amount of repetitions
Likewise, you cannot merge together repetitions which do not have the same amount of repetitions:
macro_rules! tuplomatron {
($($e:expr),* ; $($f:expr),*) => { ( $( ( $e, $f ) ),* ) };
}
let tuple = tuplomatron!(1, 2, 3; 4, 5, 6); // valid
let tuple = tuplomatron!(1, 2, 3; 4, 5); // invalid since both metavars do not have the same amount of repetitions
This gets expanded properly into one big tuple:
let tuple = TupleExpr:
outer attributes: none
inner attributes: none
Tuple elements:
TupleExpr:
outer attributes: none
inner attributes: none
Tuple elements:
1
4
TupleExpr:
outer attributes: none
inner attributes: none
Tuple elements:
2
5
TupleExpr:
outer attributes: none
inner attributes: none
Tuple elements:
3
6
final expression: none
Handle :tt fragments properly
Having :tt
fragments handled properly allows us to dwelve into the world of tt-munchers, a very powerful pattern which allows the implementation of extremely complex behaviors or DSLs. The target code we’re using for this comes directly from The Little Book of Rust Macros by Lukas Wirth, adapted to fit our non-println-aware compiler.
extern "C" {
fn printf(fmt: *const i8, ...);
}
fn print(name: &str, value: i32) {
unsafe {
printf(
"%s = %d\n\0" as *const str as *const i8,
name as *const str as *const i8,
value,
);
}
}
macro_rules! mixed_rules {
() => {{}};
(trace $name_str:literal $name:ident; $($tail:tt)*) => {
{
print($name_str, $name);
mixed_rules!($($tail)*);
}
};
(trace $name_str:literal $name:ident = $init:expr; $($tail:tt)*) => {
{
let $name = $init;
print($name_str, $name);
mixed_rules!($($tail)*);
}
};
}
fn main() {
mixed_rules! (trace "a\0" a = 14; trace "a\0" a; trace "b\0" b = 15;);
}
This is now handled by gccrs, and produces the same output as rustc.
~/G/gccrs > rustc tt-muncher.rs
~/G/gccrs > ./tt-muncher
a = 14
a = 14
b = 15
~/G/gccrs > gccrs tt-muncher.rs -o tt-muncher-gccrs
~/G/gccrs > ./tt-muncher-gccrs
a = 14
a = 14
b = 15