RustFest Barcelona – Pietro Albini: Shipping a stable compiler every six weeks

PIETRO: Hi, everyone. In this talk I am going to shed a bit of light on how the Rust release process works and why we do it that way. As they said, I'm Pietro, a member of the Rust release team and co-lead of the infrastructure team. I'm actually a member of other teams too, and
I do a lot of stuff in the project. I think everyone is aware by now that we got our release out a few days ago, with some features everyone has waited on for a long time. But that's not the only release. Six weeks earlier we had released 1.38, which came out in September and changed a lot of code; users reported just five regressions after the release, and two of them broke valid code. Six weeks before that, there was another release, which changed tens of thousands of lines of code, and we got just three regressions reported. Unfortunately all of them broke valid code, but it's a very small number. Even before that, we got 1.36 out in July with just four regressions reported. So I wanted to explain a little bit why we do releases
this fast which creates a lot of problems for us and how can we prevent regressions,
and just get very few reported after the release is out? So why do we have this schedule? The question is interesting because it’s really
unusual in the compiler world. I collected some stats on some of the most
popular languages, and while there are some efforts to shorten release cycles, Python only recently announced that they are going to switch to a yearly schedule. Rust, which is fairly popular by now, has a six-week release cycle. In the compiler world that's pretty fast, but there is a simple reason why we do it: we have no pressure to ship things. If a feature is not ready or has issues, we can just delay it by a few weeks, and nobody is going to care whether it gets stabilised today or in a month and a half. And we actually do that a lot. The most obvious example is a few weeks ago,
when we decided that async/await wasn't ready enough to ship in Rust 1.38: it turned out it hadn't actually been stabilised when the beta freeze happened, and there were blocker issues, so we would have had to rush the feature, and that's something we would not love to do. We actually tried long release cycles, and
it turns out it doesn't work for us. The 2018 edition came out in early December, and in September we still had open questions about how to make the module system work. We had a proposal at the end of September which was not implemented yet, and that's what actually shipped, but users didn't have much time to test it and try to break it. It broke a lot of our internal processes. We actually did something I'm still not comfortable with: we landed a change in the behaviour of the module system directly on the beta branch, two weeks before the stable release came out. If we had made a mistake there, we would have had no way to roll it back without waiting for the next edition, and we don't even know if we are going to do a 2021 edition yet. This broke almost all the policies we have,
but we had to do it, because otherwise we would not have been able to ship a working edition; thankfully it ended well. I'm not aware of any huge mistakes we made, but if we had made any it would have been really bad, because we would have had to wait a long time to fix them, and we would have been stuck in the 2018 edition with a broken feature set for compatibility reasons. So with such a fast release cycle, how can you
actually prevent regressions from reaching the stable channel? Of course, the first answer is the compiler’s
test suite, because rustc has a lot of tests. We have thousands of them that check both the compiler's output and its messages, and we have builders for each pull request, which take three to four hours, so we actually do a lot of testing. But that's not enough, because a test suite can't really express everything the Rust language can do. So of course we use the compiler to build
the compiler itself: for every release we use the previous one to build the compiler. On nightly we use beta, on beta we use stable, and on stable we use the previous stable. That allows us to catch some issues, though we cut some corners, because the compiler's codebase uses some unstable features; and the bootstrap compiler is a bit old, so there are a lot of quirks in it, and still that can't catch everything. We also get bug reports from you all, but we get them mostly from nightly, not from beta,
because people don’t actually use beta. Asking our users to test beta is something
we can't really do. With such a fast cycle we don't have time to test everything manually with the new compiler every six weeks; languages with long release cycles can afford to have their beta releases tested, but we can't, and even when we ask, people don't really do it. So we had an idea: why don't we test our users' code ourselves? This is an idea that seems really bad and
seems to waste a lot of money, but it actually works: it's Crater. Crater is a project that Brian Anderson started and that I now maintain. It runs experiments that take all the source code available on crates.io and all the Rust repositories on GitHub that build with Cargo, and tests them for every release to catch regressions. For each project we run cargo test two times,
once with stable and once with beta, and if cargo test passes on stable but fails on beta, then that's a regression, and we get a nice colourful report where we can inspect them. This is the actual report for 1.38, and we
got just 46 crates that failed, and those are regressions nobody reported before. Then someone manually goes through each of them – I hope we didn't break any of yours – analyses the log and files an issue, so the compiler team can look at the issues, fix them, and ship the fix to you all. 1.39 went pretty well; for 1.38 we actually had 600 crates that were broken, so if we didn't have Crater there is a good chance your project wouldn't compile anymore, and this would break the trust you have in us. We know it's not perfect. We can't test every kind of code, because we
don't have access to your source code, nobody has got round to writing scrapers yet, and not every crate can be built in a sandboxed environment. And it's not really something we can scale forever, because it already uses a lot of compute resources, which thankfully are sponsored; but if usage skyrockets we are going to reach a point where it's not economically feasible to run Crater in a timely fashion anymore. Those are real problems, but for now it works
great. It allows us to catch the kinds of regressions that often affect thousands of crates, and it's a real reason why we can afford to make such fast releases. This is my personal opinion, but I know it's shared by other members of the release team: I wouldn't be comfortable making releases every six weeks without Crater, because they would be so buggy I wouldn't use them myself. So to recap, the fast release cycle we have allows the team not to burn out and to simply ignore deadlines, and that's great, especially for a community of mostly volunteers. And Crater is the real reason why we can afford
to do that. It's a tool that costs a lot of money but actually gets us great results. So unfortunately I don't think I have time for questions, but I'm going to be around the conference today, so if you have any questions, or you want to implement support for more repositories, reach out to me. I'm happy to talk to you all, and thanks. [Applause]
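The per-crate check described in the talk – run cargo test once with stable and once with beta, and flag crates that pass on stable but fail on beta – can be sketched as a small decision function. This is an illustrative sketch, not Crater's actual code, and the type and function names are mine:

```rust
/// Simplified sketch of Crater's regression check (names are hypothetical,
/// not Crater's real API). A crate counts as a regression only when it
/// passes on stable but fails on beta.
#[derive(Debug, PartialEq)]
enum Verdict {
    Ok,            // passes on both toolchains
    Regressed,     // passes on stable, fails on beta: worth filing an issue
    AlreadyBroken, // fails on stable too, so beta is not to blame
}

fn classify(passes_on_stable: bool, passes_on_beta: bool) -> Verdict {
    match (passes_on_stable, passes_on_beta) {
        (true, true) => Verdict::Ok,
        (true, false) => Verdict::Regressed,
        (false, _) => Verdict::AlreadyBroken,
    }
}

fn main() {
    // The interesting case: a crate that works on stable but breaks on beta.
    assert_eq!(classify(true, false), Verdict::Regressed);
    // A crate that already fails on stable is not a beta regression.
    assert_eq!(classify(false, false), Verdict::AlreadyBroken);
    println!("all verdicts classified as expected");
}
```

The asymmetry in the last match arm reflects the talk's point: Crater is only looking for breakage the new release introduced, so crates that never compiled on stable are ignored rather than reported.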
>>Actually, you have some time for questions.
PIETRO: Oh, that's great.
>>Do we have time for one question or two? Because I actually have two.
PIETRO: I don't know.
>>Okay, so first question: you were hinting that maybe the edition idea wasn't such a success for us. Do you think that jeopardises a possible 2021 edition of the language?
PIETRO: The main issue wasn't really the edition itself; it was that we actually started working on it really late, so basically we went way over time implementing the features. This is my personal opinion, not the official opinion of the project, but if we make another edition I want an explicit promise that we won't accept any more changes after a certain date, and to actually enforce that, because we nearly burnt out most of the team with the edition. There were people who spent months just fixing regressions and fixing bugs, and that's not sustainable, especially because most of the contributors to the compiler are volunteers.
>>Okay, I will also ask my second question. For a private repository,
of course you cannot run Crater, but how could somebody who has a private repository, a private crate setup, run Crater? Is that possible now?
PIETRO: Someone could just test with beta themselves and file bug reports if they hit compile failures. We have some ideas on how to create a Crater for enterprises, but it's just a plan, an idea, and at the moment we don't have enough development resources to actually do the implementation, testing and documentation work that such a project would require.
>>So that would be necessary to support Crater in a private environment?
PIETRO: Yes.
>>Okay.
>>Any other questions?
>>A lot of crates have peculiar requirements for their build environments. Can you talk about how Crater handles that,
about their environments. Can you talk about how Crater handles that
and specifically is it possible to customise the environment in which my crates are built
on Crater? PIETRO: So the environment doesn’t have any
kind of network access for obvious security reasons, so you can’t really – you can’t install
the dependency yourself but the build environment runs inside Docker. We have these big Docker images, 4GB, which
just has our usual packages installed, so if you know your crate doesn’t work you can
easily check with Docks LS because since recently it uses the same build code as Crater, so
if it builds on Docks LS it builds on Crater as well. And if it doesn’t build, you can file an issue,
probably on Docks LS is the best place, and if I want 18 of 4 packages available, we are
just going to install the package on the build environment and then your package will work.>>Hello, thanks for the great talk. How long does it take to run Crater on all
of the –
PIETRO: Okay, so that actually varies a lot, because we are making constant changes to the build environment, changes to the machines and such. I think at the moment running cargo test on the entire ecosystem takes a week. Running cargo check instead, which we actually do for some pull requests – if there is a pull request that we know is risky, we usually run it beforehand – is faster and takes about three days, but that really varies, mostly because we make a lot of changes to the machines.
>>Any more?
>>Thank you. Is it possible to supply the Crater run with
more runners to speed up the process?
PIETRO: I think we could. At the moment we are just in a sweet spot, because we have enough experiments that we fill up the servers, but we don't have any idle time, and the queue is not that long. If we had more servers, the end result is that for a bunch of the time a server is going to be idle, so we would just be wasting resources. We actually have a sponsorship from a corporation, so if we reach a point where the queue is not sustainable anymore we are going to actually get funding, though Crater is really heavy on resources: at the moment I think we have 24 cores, 38GB of RAM and 4 terabytes of disk space, so it's not something where you can throw some old machine at it and get meaningful results.
>>All right, we are out of time for questions. We are going on to the next talk. Thank you.
