Planet Mozilla - https://planet.mozilla.org/
Updated: 1 week 1 day ago

Tantek Çelik: Life Happens: Towards Community Support For Prioritizing Life Events And Mutual Care, Starting With The #IndieWeb

Saturday 20th of February 2021 10:37:00 PM
Foreword

A couple of weeks ago I noticed one of our newer IndieWeb community participants found an example on the IndieWeb wiki that no longer worked, and it was from someone who hasn’t been around for a while.

I knew that person had had various things come up in their personal life, and had thus left without warning, unable to maintain their site (with said example) as well. This wasn’t the first time this had happened. I noted in our community chat that there were community care, repair, and supportiveness issues worth discussing, summarized as “tl;dr: life happens”, and expressed a goal:

I’d like to figure out how we as a community can 1) provide support to folks who have “life happens” events and not feel “guilty” about being absent or abruptly having to stop participating, and 2) do “repair” on our pages in a kind and respectful way that doesn't exacerbate their guilt/shame, and ideally makes it clear they are welcome back any time

What followed was my stream of consciousness and braindump on the subject matter, which, after seeing it resonate with several members of the community and being encouraged by them, I collected into an IndieWeb wiki page: life happens. I’m blogging most of what I wrote there because I think it’s worth its own post and I wanted to capture my thoughts & feelings on this matter while I remember the context.

“Life Happens” is an acknowledgement that there are numerous things that people experience in their actual physical lives that suddenly take higher priority than nearly anything else (like participation in volunteer-based communities), and those communities (like the IndieWeb) should acknowledge, accept, and be supportive of community members experiencing such events.

What Happens

What kind of events? Off the top of my head I came up with several that I’ve witnessed community members (including a few myself) experience, like:

  • getting married — not having experienced this myself, I can only imagine that for some folks it causes a priorities reset
  • having a child — from what I've seen this pretty much causes nearly everything else that isn’t essential to get dropped, acknowledging that there are many family shapes, without judgment of any
  • going through a bad breakup or divorce — the trauma, depression etc. experienced can make you want to not show up for anything, sometimes not even get out of bed
  • starting a new job — that takes up all your time, and/or polices what you can say online, or where you may participate
  • becoming an essential caregiver — caring for an aging, sick, or critically ill parent, family member, or other person
  • buying a house — often associated with a shift in focus of personal project time (hat tip: Marty McGuire)
  • home repairs or renovations — similar to “new house” project time, or urgent repairs. This is one that I’ve been personally both “dealing with” and somewhat embracing since December 2019 (with maybe a few weeks off at times), due to an infrastructure failure the previous month, which turned into an inspired series of renovations
  • death of a family member, friend, pet
  • … more examples of how life happens on the wiki
Values, People, and Making It Explicit

When these things happen, as a community, I feel we should respond with kindness, support, and understanding when someone steps back from community participation or projects. We should not shame or guilt them in any way, and ideally act in such a way that welcomes their return whenever they are able to do so.

Many projects (especially open source software) often talk about their “bus factor” (or more positively worded “lottery factor”). However that framing focuses on the robustness of the project (or company) rather than those contributing to it. Right there in IndieWeb’s motto is an encouragement to reframe: be a “people-focused alternative to the corporate […]”.

The point of “life happens” is to decenter the corporation or project when it comes to such matters, and instead focus on the good of the people in the community. Resiliency of humanity over resiliency of any particular project or organization.

Adopting such values and practices explicitly is more robust than depending on accidental good faith or informal cultural support. Such emotional care should be the clearly written default, rather than something everyone has to notice and figure out on their own. I want to encourage more mutual care-taking as a form of community-based resiliency, and make it less work for folks experiencing “life happens” moments. Through such care, I believe you get actually sustainable community resiliency, without having to sacrifice or burn people out.

Acknowledging Life Happens And You Should Take Care

It’s important to communicate to community members, and especially to new community members, that a community believes in mutual care-taking. That yes, if and when “life happens” to you:

  • we want you to take care of what you need to take care of
  • you are encouraged to prioritize those things most important to you, and that the community will not judge or shame you in any way
  • you should not feel guilty about being absent, or abruptly having to stop participating
  • it is ok to ask for help in the community with any of your community projects or areas of participation, no matter what size or importance
  • the community will be here for you when you’re able to and want to return

It’s an incomplete & imperfect list, yet hopefully captures the values and general feeling of support. More suggestions welcome.

How to Help

Similarly, if you notice someone active in the community is missing, if you feel you know them well enough, you’re encouraged to reach out and unobtrusively check on them, and ask (within your capacity) if there’s anything you can do to help out with any community projects or areas of participation.

Thanks to Chris Aldrich for expanding upon How to help and encouraging folks to keep in mind that on top of these life changes and stresses, the need to make changes to social activities (like decreasing or ceasing participation in the IndieWeb community) can be an additional compounding stress on top of the others. Our goal should be to mitigate this additional stress as much as possible.

How to Repair

Absence(s) from the community can result in shared resources or projects falling behind or breaking. It’s important to provide guidance to the community on how to help repair such things, especially in a caring way, without any shame or guilt. Speaking in a second-person voice:

You might notice that one or more projects, wiki pages, or sections appear to be abandoned or in disrepair. This could be for any number of reasons, so it’s best to ask about it in a discussion channel to see if anyone knows what’s going on. If it appears someone is missing (for any reason), you may do kind and respectful repairs on related pages (wikifying), in a manner that attempts to minimize or avoid any guilt or shame, and ideally makes it clear they are welcome back any time.

If you come across an IndieWeb Examples section on a page where the links either don’t work (404, broken in some other way, or support appears to have been dropped), move that specific IndieWeb Example to a “Past Examples” section, and fix the links with Internet Archive versions, perhaps at a point in time of when the links were published (e.g. permalinks with dates in their paths), or by viewing history on the wiki page and determining when the broken links were added.

Encouraging More Communities To Be Supportive When Life Happens

When I shared these thoughts with the IndieWeb chat and wiki a couple of weeks ago, no one knew of any other open (source, standards, etc.) communities that had such an explicit “Life Happens” statement or otherwise explicitly captured such a sentiment.

My hope is that the IndieWeb community can set a good example here for making a community more humane and caring (rather than the “just work harder” capitalist default, or quiet unemotional detached neglect of an abandoned GitHub repo).

That being said, we’re definitely interested in knowing about other intentional creative communities with any similar explicit sentiments or statements of community care, especially those that acknowledge that members of a community may experience things which are more important to them than their participation in that community, and being supportive of that.

This blog post is a snapshot in time and my own expression, most of which is shared freely on the IndieWeb wiki.

If this kind of statement resonates with you and your communities, you’re encouraged to write one of your own, borrowing freely from the latest (and CC0 licensed) version on the wiki: life happens. Attribution optional. Either way, let us know, as it would be great to collect other examples of communities with explicit “life happens” statements.

Thanks

Thanks to early feedback & review in chat from Kevin Marks, Jacky Alcine, Anthony Ciccarello, Ben Werdmüller, and gRegor Morrill. On the wiki page, thanks for excellent additions from Chris Aldrich, and proofreading & precise fixes from Jeremy Cherfas. Thanks for the kind tweets Ana Rodrigues and Barry Frost.

Now back to some current “life happens” things… (also posted on IndieNews)

Cameron Kaiser: TenFourFox FPR30 SPR2 available

Friday 19th of February 2021 09:19:10 PM
TenFourFox Feature Parity Release "30.2" (SPR 2) is now available for testing (downloads, hashes, release notes). The reason this is another security-only release is my work schedule, and also that I spent a lot of time spinning my wheels on issue 621, which is the highest priority JavaScript concern because it is an outright crash. The hope was that I could recreate one of the Apple discussion pages locally and mess with it and maybe understand what is unsettling the parser, but even though I thought I had all the components, it still won't load or reproduce in a controlled environment. I've spent too much time on it, and even if I could do more diagnostic analysis I still don't know if I can do anything better than "not crash" (and SPR2 has a better "don't crash" fix, just one that doesn't restore any functionality). Still, if you are desperate to see this fixed, see if you can create a completely local Apple discussions page or clump of files that reliably crashes the browser. If you do this, either attach the archive to the GitHub issue or open a Tenderapp ticket and let's have a look. No promises, but if the community wants this fixed, the community will need to do some work on it.

In the meantime, I want to get back to devising a settings tab to allow the browser to automatically pick appropriate user agents and/or start reader mode by domain so that sites that are expensive or may not work with TenFourFox's older hacked-up Firefox 45 base can automatically select a fallback. Our systems are only getting slower compared to all the crap modern sites are adding, after all. I still need to do some plumbing work on it, so the fully-fledged feature is not likely to be in FPR31, but I do intend to have some more pieces of the plumbing documented so that you can at least mess with that. The user-agent selector will be based on the existing functionality that was restored in FPR17.

Assuming no major issues, FPR30 SPR2 goes live Monday evening Pacific as usual.

Daniel Stenberg: “I will slaughter you”

Friday 19th of February 2021 10:20:43 AM

You might know that I’ve posted funny emails I’ve received on my blog several times in the past. The kind of emails people send me when they experience problems with some device they own (like a car) and they contact me because my email address happens to be visible somewhere.

People sometimes say I should get a different email address or use another one in the curl license file, but I’ve truly never had a problem with these emails, as they mostly remind me of the tough challenges that modern technical life brings to people, and they give me insights into what things run curl.

But not all of these emails are “funny”.

Category: not funny

Today I received the following email:

From: Al Nocai <[redacted]@icloud.com>
Date: Fri, 19 Feb 2021 03:02:24 -0600
Subject: I will slaughter you

That subject.

As an open source maintainer for over twenty years, I know flame wars and personal attacks, I have a fairly thick skin and I don’t let words get to me easily. It took me a minute to absorb and realize it was actually meant as a direct physical threat. It found its way through and got to me. This level of aggressiveness is not what I’m prepared for.

Attached to this email were seven images and no text at all. The images all look like screenshots from a phone, and the first one clearly shows source code I wrote and my copyright line:

The other images showed other source code and related build/software info of other components, but I couldn’t spot how they were associated with me in any way.

No explanation, just that subject and the seven images and I was left to draw my own conclusions.

I presume the name in the email is made up and the email account is probably a throw-away one. The time zone used in the Date: string might imply US central standard time but could of course easily be phony as well.

How I responded

Normally I don’t respond to these confused emails because the distance between me and the person writing them is usually almost interplanetary. This time though, it was so far beyond what’s acceptable to me and in any decent society I couldn’t just let it slide. After I took a little pause and walked around my house for a few minutes to cool off, I wrote a really angry reply and sent it off.

This was a totally and completely utterly unacceptable email and it hurt me deep in my soul. You should be ashamed and seriously reconsider your manners.

I have no idea what your screenshots are supposed to show, but clearly something somewhere is using code I wrote. Code I have written runs in virtually every Internet connected device on the planet and in most cases the users download and use it without even telling me, for free.

Clearly you don’t deserve my code.

I don’t expect that it will be read or make any difference.

Update below, added after my initial post.

Al Nocai’s response

Contrary to my expectations above, he responded. It’s not even worth commenting on, but for transparency I’ll include it here.

I do not care. Your bullshit software was an attack vector that cost me a multimillion dollar defense project.

Your bullshit software has been used to root me and multiple others. I lost over $15k in prototyping alone from bullshit rooting to the charge arbitrators.

I have now since October been sandboxed because of your bullshit software so dipshit google kids could grift me trying to get out of the sandbox because they are too piss poor to know shat they are doing.

You know what I did to deserve that? I tried to develop a trade route in tech and establish project based learning methodologies to make sure kids aren’t left behind. You know who is all over those god damn files? You are. Its sickening. I got breached in Oct 2020 through federal server hijacking, and I owe a great amount of that to you.

Ive had to sit and watch as i reported:

  1. fireeye Oct/2020
  2. Solarwinds Oct/2020
  3. Zyxel Modem Breach Oct/2020
  4. Multiple Sigover attack vectors utilizing favicon XML injection
  5. JS Stochastic templating utilizing comparison expressions to write to data registers
  6. Get strong armed by $50billion companies because i exposed bullshit malware

And i was rooted and had my important correspondence all rerouted as some sick fuck dismantled my life with the code you have your name plastered all over. I cant even leave the country because of the situation; qas you have so effectively built a code base to shit all over people, I dont give a shit how you feel about this.

You built a formula 1 race car and tossed the keys to kids with ego problems. Now i have to deal with Win10 0-days because this garbage.

I lost my family, my country my friends, my home and 6 years of work trying to build a better place for posterity. And it has beginnings in that code. That code is used to root and exploit people. That code is used to blackmail people.

So no, I don’t feel bad one bit. You knew exactly the utility of what you were building. And you thought it was all a big joke. Im not laughing. I am so far past that point now.

/- Al

Al continues

Nine hours after I first published this blog post, Al replied again with two additional emails. His third and fourth emails to me.

Email 3:

https://davidkrider.com/i-will-slaughter-you-daniel-haxx-se/
Step up. You arent scaring me. What led me here? The 5th violent attempt on my life. Apple terms of service? gtfo, thanks for the platform.

Amusingly he has found a blog post about my blog post.

Email 4:

There is the project: MOUT Ops Risk Analysis through Wide Band Em Spectrum analysis through different fourier transforms.
You and whoever the fuck david dick rider is, you are a part of this.
Federal server breaches-
Accomplice to attempted murder-
Fraud-
just a few.

I have talked to now: FBI FBI Regional, VA, VA OIG, FCC, SEC, NSA, DOH, GSA, DOI, CIA, CFPB, HUD, MS, Convercent, as of today 22 separate local law enforcement agencies calling my ass up and wasting my time.

You and dick ridin’ dave are respinsible. I dont give a shit, call the cops. I cuss them out wheb they call and they all go silent.

I’ve kept his peculiar formatting and typos. In email 4 there was also a PDF file attached named BustyBabes 4.pdf. It is apparently a 13 page document about the “NERVEBUS NERVOUS SYSTEM” described in the first paragraph as “NerveBus Nervous System aims to be a general utility platform that provides comprehensive and complex analysis to provide the end user with cohesive, coherent and “real-time” information about the environment it monitors.”. There’s no mention of curl or my name in the document.

Since I don’t know the status of this document I will not share it publicly, but here’s a screenshot of the front page:

Related

This topic on hacker news and reddit.

I have reported the threat to the Swedish police (where I live).

The Mozilla Blog: Expanding Mozilla’s Boards

Thursday 18th of February 2021 05:41:48 PM

I’m delighted to share that the Mozilla Foundation and Corporation Boards are each welcoming a new member.

Wambui Kinya is Vice President of Partner Engineering at Andela, a Lagos-based global talent network that connects companies with vetted, remote engineers from Africa and other emerging markets. Andela’s vision is a world where the most talented people can build a career commensurate with their ability – not their race, gender, or geography. Wambui joins the Mozilla Foundation Board and you can read more from her, here, on why she is joining. Motivated by the intersection of Africa, technology and social impact, Wambui has led business development and technology delivery, digital technology implementation, and marketing enablement across Africa, the United States, Europe and South America. In 2020 she was selected as one of the “Top 30 Most Influential Women” by CIO East Africa.

Laura Chambers is Chief Executive Officer of Willow Innovations, which addresses one of the biggest challenges for mothers, with the world’s first quiet, all-in-one, in-bra, wearable breast pump. She joins the Mozilla Corporation Board. Laura holds a wealth of knowledge in internet product, marketplace, payment, and community engagement from her time at AirBnB, eBay, PayPal, and Skype, as well as her current role at Willow. Her experience also includes business operations, marketing, shipping, global customer trust and community engagement. Laura brings a clear understanding of the challenges we face in building a better internet, coupled with strong business acumen, and an acute ability to hone in on key issues and potential solutions. You can read more from Laura about why she is joining here.

At Mozilla, we invite our Board members to be more involved with management, employees and volunteers than is generally the case, as I’ve written about in the past. To ready them for this, Wambui and Laura met with existing Board members, members of the management team, individual contributors and volunteers.

We know that the challenges of the modern internet are big, and that expanding our capacity will help us develop solutions to those challenges. I am sure that Laura and Wambui’s insights and strategic thinking will be a great addition to our boards.

The post Expanding Mozilla’s Boards appeared first on The Mozilla Blog.

The Mozilla Blog: Why I’m Joining Mozilla’s Board of Directors

Thursday 18th of February 2021 05:37:24 PM

Laura Chambers

Like many of us I suspect, I have long been a fairly passive “end-user” of the internet. In my daily life, I’ve merrily skipped along it to simplify and accelerate my life, to be entertained, to connect with far-flung friends and family, and to navigate my daily life. In my career in Silicon Valley, I’ve happily used it as a trusty building block to help build many consumer technologies and brands – in roles leading turnarounds and transformations at market-creating companies like eBay, PayPal, Skype, Airbnb, and most recently as CEO of Willow Innovations Inc.

But over the past few years, my relationship with the internet has significantly changed. We’ve all had to face up to the cracks and flaws … many of which have been there for a while, but have recently opened into gaping chasms that we can’t ignore. The impact of curated platforms and data misuse on families, friendships, communities, politics and the global landscape has been staggering. And it’s hit close to home … I have three young children, all of whom are getting online much faster and earlier than expected, due to the craziness of homeschooling, and my concerns about their safety and privacy are tremendous. All of a sudden, my happy glances at the internet have been replaced with side-eyes of mistrust.

So last year, in between juggling new jobs, home-offices full of snoring dogs, and home schooling, I started to think about what I could do to help. In that journey, I was incredibly fortunate to connect with the team at Mozilla. As I learned more about the team, met the talented people at the helm, and dove into their incredible mission to ensure the internet is free, open and accessible to all, I couldn’t think of a better way to do something practical and meaningful to help than through joining the Board.

The opportunity ahead is astounding … using the power of the open internet to make the world a better, freer, more positively connected place. Mozilla has an extraordinary legacy of leading that charge, and I couldn’t be more thrilled to join the exceptional group driving toward a much better future. I look forward to us all once again being able to merrily skip along our daily lives, with the internet as our trusty guide and friend along the way.

The post Why I’m Joining Mozilla’s Board of Directors appeared first on The Mozilla Blog.

Armen Zambrano: Making pip installations faster with wheel

Thursday 18th of February 2021 04:24:18 PM

I recently noticed the following message in Sentry’s pip installation step:

Using legacy ‘setup.py install’ for openapi-core, since package ‘wheel’ is not installed.

Upon some investigation, I noticed that the package wheel was not being installed. After making some changes, I can now guarantee that our development environment installs it by default, and it has given us about a 40–50% speed gain.

[Figure: Timings from before and after installing wheel]

The screenshot above shows the steps from two different GitHub workflows; each installs Sentry’s Python packages inside a fresh virtualenv with the pip cache available.

If you see a message saying that the wheel package is not installed, make sure to attend to it!

The Mozilla Blog: Why I’m Joining Mozilla’s Board of Directors

Thursday 18th of February 2021 02:46:14 PM

Wambui Kinya

My introduction to Mozilla was when Firefox was first launched. I was starting my career as a software developer in Boston, MA at the time. My experience was that Firefox was a far superior browser. I was also deeply fascinated by the notion that, as an open community, we could build and evolve a product for the greater good.

You have probably deduced from this, that I am also old enough that growing up in my native country, Kenya, most of my formative years were under the policies of “poverty reduction programs” dictated and enforced by countries and institutions in the northern hemisphere. My firsthand experience of many of these programs was observing my mother, a phenomenal environmental engineer and academic, work tirelessly to try to convince donor organizations to be more inclusive of the communities they sought to serve and benefit.

This drive to have greater inclusion and representation was deepened over ten years of being a woman and person of color in technology in corporate America. I will spare you the heartache of recounting my experiences of being the first or the only one. But I must also acknowledge, I was fortunate enough to have leaders who wanted to help me succeed and grow. As my professional exposure became more global, I felt an urgency to have more representation and greater voice from Africa.

When I moved back to Kenya, ten years ago, I was excited about the advances in access to technology. However, I was disheartened that it was primarily as consumers rather than creators of technology products. We were increasingly distanced from the concentration of power influencing our access, our data and our ability to build and compete in this internet age.

My professional journey has since been informed by the culmination of believing in the talent that is in Africa, the desire to build for Africa and by extension the digital sovereignty of citizens of the global south. I was greatly influenced by the audacity of organizations like ThoughtWorks that thought deeply about the fight against digital colonialism and invested in free and open source products and communities. This is the context in which I was professionally reintroduced to Mozilla and its manifesto.

Mozilla’s commitment and reputation to “ensure the internet remains a public resource that is open and accessible to us all” has consistently inspired me. However, there is an increased urgency to HOW this is done given the times we live in. We must not only build, convene and enable technology and communities on issues like disinformation, privacy, trustworthy AI and digital rights, but it is imperative that we consider:

  • how to rally citizens and ensure greater representation;
  • how we connect leaders and enable greater agency to produce; and finally,
  • how we shape an agenda that is more inclusive.

This is why I have joined the Mozilla board. I am truly honored and look forward to contributing but also learning alongside you.

Onwards ever, backwards never!

The post Why I’m Joining Mozilla’s Board of Directors appeared first on The Mozilla Blog.

Benjamin Bouvier: A primer on code generation in Cranelift

Wednesday 17th of February 2021 06:00:00 PM

Cranelift is a code generator written in the Rust programming language that aims to be fast, while outputting machine code that runs at reasonable speeds.

The Cranelift compilation model consists of compiling functions one by one, holding extra information about external entities, like external functions, memory addresses, and so on. This model allows for concurrent and parallel compilation of individual functions, which supports the goal of fast compilation. It was designed this way to allow for just-in-time (JIT) compilation of WebAssembly binary code in Firefox, although its scope has broadened a bit. Nowadays it is used in a few different WebAssembly runtimes, including Wasmtime and Wasmer, but also as an alternative backend for Rust debug compilation, thanks to cg_clif.

A classic compiler design usually involves running a parser to translate the source into some form of intermediate representation, then running optimization passes on it, and finally feeding the result to the machine code generator.

This blog post focuses on the final step, namely the concepts that are involved in code generation, and what they map to in Cranelift. To make things more concrete, we'll take a specific instruction, and see how it's translated, from its creation down to code generation. At each step of the process, I'll provide a short (ahem) high-level explanation of the concepts involved, and I'll show what they map to in Cranelift, using the example instruction. While this is not a tutorial detailing how to add new instructions in Cranelift, this should be an interesting read for anyone who's interested in compilers, and this could be an entry point if you're interested in hacking on the Cranelift codegen crate.

This is our plan for this blog post: each squared box represents data, each rounded box is a process. We're going to go through each of them below.

[Diagram: Optimized CLIF --> (Lowering) --> VCode --> (Register allocation) --> Final VCode --> (Machine code generation) --> Machine code artifacts]

Intermediate representations

Compilers use intermediate representations (IR) to represent source code. Here we're interested in representations of the data flow, that is, the instructions themselves and only that. The IRs contain information about the instructions themselves, their operands, type specialization information, and any additional metadata that might be useful. IRs usually map to a certain level of abstraction, and as such, they are useful for solving different problems that require different levels of abstraction. Their shape (which data structures they use) and their number often have a huge impact on the performance of the compiler itself (that is, how fast it is at compiling).

In general, most programming languages use IRs internally, and yet, these are invisible to the programmers. The reason is that source code is usually first parsed (tokenized, verified) and then translated into an IR. The abstract syntax tree, aka AST, is one such IR representing the source code itself, in a format that's very close to the source code itself. Since the raison d'être of Cranelift is to be a code generator, having a text format is secondary, and only useful for testing and debugging purposes. That's why embedders directly create and manipulate Cranelift's IR.

At the time of writing, Cranelift has two IRs to represent the function's code:

  • one external, high-level intermediate representation, called CLIF (for Cranelift IR format),
  • one internal, low-level intermediate representation called VCode (for virtual-registerized code).
CLIF IR

CLIF is the IR that Cranelift embedders create and manipulate. It consists of high-level typed operations that are convenient to use and/or can be simply translated to machine code. It is in static single assignment (SSA) form: each value referenced by an operation (SSA value) is defined only once, and may have as many uses as desired. CLIF is practical to use and manipulate for classic compiler optimization passes (e.g. LICM), as it is generic over the target architecture we're compiling to.

let x = builder.ins().iconst(types::I64, 42);
let y = builder.ins().iconst(types::I64, 1337);
let sum = builder.ins().iadd(x, y);

An example of Rust code that would generate CLIF IR: using an IR builder, two constant 64-bits integer SSA values x and y are created, and then added together. The result is stored into the sum SSA value, which can then be consumed by other instructions.

The code for the IR builder we're manipulating above is automatically generated by the cranelift-codegen build script. The build script uses a domain specific meta language (DSL)1 that defines the instructions, their input and output operands, which input types are allowed, how the output type is inferred, etc. We won't take a look at this today: this is a bit too far from code generation, but this could be material for another blog post.

As an example of a full-blown CLIF generator, there is a crate in the Cranelift project that allows translating from the WebAssembly binary format to CLIF. The Cranelift backend for Rustc uses its own CLIF generator that translates from one of the Rust compiler's IRs.

Finally, it's time to reveal what's going to be our running example! The Chosen One is the iadd CLIF operation, which allows adding two integers of any width together, with wrapping semantics. It is both simple to understand what it does, and it exhibits interesting behaviors on the two architectures we're interested in. So, let's continue down the pipeline!

VCode IR

Later on, the CLIF intermediate representation is lowered, i.e. transformed from a high-level form into a lower-level one. Here, lower level means a form more specialized for a machine architecture. This lower IR is called VCode in Cranelift. The values it references are called virtual registers (more on the virtual bit below). They're not in SSA form anymore: each virtual register may be redefined as many times as we want. This IR is used to encode register allocation constraints and it guides machine code generation. As a matter of fact, since this information is tied to the machine code's representation itself, this IR is also target-specific: there's one flavor of VCode for each CPU architecture we're compiling to.

Let's get back to our example, which we're going to compile for two instruction set architectures:

  • ARM 64-bits (aka aarch64), which is used in most mobile devices but is starting to become mainstream on laptops (Apple's M1 Macs, some Chromebooks)
  • Intel's x86 64-bits (aka x86_64, also abbreviated x64), which is used in most desktop and laptop machines.

An integer addition machine instruction on aarch64 takes three operands: two input operands (one of which must be a register), and a third, output register operand. On the x86_64 architecture, by contrast, the equivalent instruction involves a total of two registers: one is a read-only source register, and the other is an in-out modified register, acting as both the second source and the destination. We'll get back to this.
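
To make the difference concrete, here is a rough Rust rendering of the two shapes (purely illustrative, not Cranelift code): the aarch64 form writes a destination that is distinct from both sources, while the x86_64 form reads and writes the same register.

// Three-operand (aarch64-style) add: the destination is separate from both sources.
fn add_rrr(rn: u64, rm: u64) -> u64 {
    rn.wrapping_add(rm) // rd = rn + rm, with wrapping semantics
}

// Two-operand (x86_64-style) add: the destination doubles as the second source.
fn add_rm(dst: &mut u64, src: u64) {
    *dst = dst.wrapping_add(src); // dst += src
}

fn main() {
    let (b, c) = (40u64, 2u64);
    let a = add_rrr(b, c); // a = b + c
    let mut d = b;
    add_rm(&mut d, c); // d += c; b is untouched only because we copied it first
    assert_eq!((a, d), (42, 42));
}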

So, considering iadd, let's look at (one of2) the VCode instructions used to represent integer additions on aarch64 (as defined in cranelift/codegen/src/isa/aarch64/inst/mod.rs):

/// An ALU operation with two register sources and a register destination.
AluRRR {
    alu_op: ALUOp,
    rd: Writable<Reg>,
    rn: Reg,
    rm: Reg,
},

Some details here:

  • alu_op defines the sub-opcode used in the ALU (Arithmetic Logic Unit). It will be ALUOp::Add64 for a 64-bits integer addition.
  • rn and rm are the conventional aarch64 names for the two input registers.
  • rd is the destination register. See how it's marked as Writable, while the two others are not? Writable is a plain Rust wrapper that makes sure that we can statically differentiate read-only registers from writable registers; a neat trick that allows us to catch more issues at compile-time.

All this information is directly tied to the machine code representation of an addition instruction on aarch64: each field is later used to select some bytes that will be generated during code generation.
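
As an aside, the Writable marker mentioned above is essentially a newtype wrapper; here is a minimal sketch of the idea (an assumption about its shape, not Cranelift's exact definition):

// A register can only be used as a destination once it has been explicitly
// wrapped, so read-only and writable mentions are distinguished at compile time.
#[derive(Clone, Copy)]
struct Reg(u8);

#[derive(Clone, Copy)]
struct Writable<T>(T);

impl<T: Copy> Writable<T> {
    fn from_reg(reg: T) -> Self {
        Writable(reg)
    }
    fn to_reg(self) -> T {
        self.0
    }
}

fn main() {
    let rd = Writable::from_reg(Reg(0)); // may be written to
    let rn = Reg(1);                     // read-only mention
    let _ = (rd.to_reg(), rn);
}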

As said before, the VCode is specific to each architecture, so x86_64 has a different VCode representation for the same instruction (as defined in cranelift/codegen/src/isa/x64/inst/mod.rs):

/// Integer arithmetic/bit-twiddling: (add sub and or xor mul adc? sbb?) (32 64) (reg addr imm) reg
AluRmiR {
    is_64: bool,
    op: AluRmiROpcode,
    src: RegMemImm,
    dst: Writable<Reg>,
},

Here, the sub-opcode is defined as part of the AluRmiROpcode enum (the comment hints at which other x86 machine instructions are generated by this same VCode). See how there's only one src (source) register (or memory or immediate operand), while the instruction conceptually takes two inputs? That's because the dst (destination) register is expected to be modified, that is, both read (so it's the second input operand) and written to (so it's the result register). In equivalent C code, x86's add instruction doesn't actually do a = b + c. What it does is a += b, that is, one of the sources is consumed by the instruction. This is an artifact inherited from the design of older x86 machines in the 1970s, when instructions were designed around an accumulator model (and efficiently representing three operands in a CISC architecture would make the encoding larger and harder than it already is).

Instruction selection (lowering)

As said before, converting from the high-level IR (CLIF) to the low-level IR (VCode) is called lowering. Since VCode is target-dependent, this process is also target-dependent. That's where we consider which machine instructions eventually get used for a given CLIF opcode. There are many ways to achieve the same machine state for given semantics, but some of these ways are faster than others, and/or require fewer code bytes. The problem can be summed up like this: given some CLIF, which VCode can we create to generate the fastest and/or smallest machine code that carries out the desired semantics? This is called instruction selection, because we're selecting the VCode instructions among a set of different possible instructions.

How do these IRs map to each other? A given CLIF node may be lowered into 1 to N VCode instructions. A given VCode instruction may lead to the code generation of 1 to M machine instructions. There is no rule governing the maximum number of entities mapped. For instance, the integer addition CLIF opcode iadd on 64-bits inputs maps to a single VCode instruction on aarch64. That VCode instruction then causes a single machine instruction to be generated.

Other CLIF opcodes may eventually generate more than a single machine instruction. Consider the CLIF opcode for signed integer division, idiv. Its semantics define that it traps on a zero divisor and in case of integer overflow3. On aarch64, this is lowered into:

  • one VCode instruction that checks whether the divisor is zero and traps if it is
  • two VCode instructions for comparing the input values against the minimal integer value and -1
  • one VCode instruction to trap if the two input values match what we checked against
  • and one VCode instruction that does the actual division operation.

Each of these VCode instructions then generates one or more machine code instructions, resulting in a somewhat longer sequence.

Let's look at the lowering of iadd on aarch64 (in cranelift/codegen/src/isa/aarch64/lower_inst.rs), edited and simplified for clarity. I've added comments in the code, explaining what each line does:

Opcode::Iadd => {
    // Get the destination register.
    let rd = get_output_reg(ctx, outputs[0]).only_reg().unwrap();
    // Get the controlling type of the addition (32-bits int or 64-bits int or
    // int vector, etc.).
    let ty = ty.unwrap();
    // Force one of the inputs into a register, not applying any signed- or
    // zero-extension.
    let rn = put_input_in_reg(ctx, inputs[0], NarrowValueMode::None);
    // Try to see if we can encode the second operand as an immediate on
    // 12-bits, maybe by negating it;
    // Otherwise, put it into a register.
    let (rm, negated) = put_input_in_rse_imm12_maybe_negated(
        ctx,
        inputs[1],
        ty_bits(ty),
        NarrowValueMode::None,
    );
    // Select the ALU subopcode, based on possible negation and controlling
    // type.
    let alu_op = if !negated {
        choose_32_64(ty, ALUOp::Add32, ALUOp::Add64)
    } else {
        choose_32_64(ty, ALUOp::Sub32, ALUOp::Sub64)
    };
    // Emit the VCode instruction in the VCode stream.
    ctx.emit(alu_inst_imm12(alu_op, rd, rn, rm));
}

In fact, the alu_inst_imm12 wrapper can create one VCode instruction among a set of possible ones (since we're trying to select the best one). For the sake of simplicity, we'll assume that AluRRR is going to be generated, i.e. the selected instruction is the one using only register encodings for the input values.

Register allocation

[Diagram: VCode with virtual registers --> (Register allocation) --> VCode with real registers --> (Code generation) --> Machine code]

VCode, registers and stack slots

Hey, ever wondered what the V in VCode meant? Back to the drawing board. While a program may reference a theoretically unlimited number of instructions, each referencing a theoretically unlimited number of values as inputs and outputs, the physical machine only has a fixed set of containers for those values:

  • either they must live in machine registers: very fast to access in the CPU, take some CPU real estate, thus are costly, so there are usually few of them.
  • or they must live in the process' stack memory: it's slower to access, but we can have virtually any amount of stack slots.
mov %edi,-0x4(%rbp)
mov %rsi,-0x10(%rbp)
mov -0x4(%rbp),%eax

In this example of x86 machine code, %edi, %rsi, %rbp, %eax are all registers; stack slots are memory addresses computed as the frame pointer (%rbp) plus an offset value (which happens to be negative here). Note that stack slots may be referred to by the stack pointer (%rsp) in general.

Defining the register allocation problem

The problem of mapping the IR values (in VCode these are the Reg) to machine "containers" is called register allocation (aka regalloc). Inputs to register allocation can be as numerous as we want them, and map to "virtual" values, hence we call them virtual registers. And... that's where the V from VCode comes from: the instructions in VCode reference values that are virtual registers before register allocation, so we say the code is in virtualized register form. The output of register allocation is a set of new instructions, where the virtual registers have been replaced by real registers (the physical ones, limited in quantity) or stack slots references (and other additional metadata).

// Before register allocation, with unlimited virtual registers:
v2 = v0 + v1
v3 = v2 * 2
v4 = v2 + 1
v5 = v4 + v3
return v5

// One possible register allocation, on a machine that has 2 registers %r0, %r1:
%r0 = %r0 + %r1
%r1 = %r0 * 2
%r0 = %r0 + 1
%r1 = %r0 + %r1
return %r1

When all is well, the virtual registers don't conceptually live at the same time, and they can all be put into physical registers. Issues arise when there aren't enough physical registers to contain all the virtual registers that live at the same time, which is the case for... a very large majority of programs. Then, register allocation must decide which values continue to live in registers at a given program point, and which should be spilled into a stack slot, effectively storing them onto the stack for later use. This later reuse implies reloading them from the stack slot, using a load machine instruction. The complexity resides in choosing which registers should be spilled, at which program point they should be spilled, and at which program points we should reload them, if we need to do so. Making good choices there has a large impact on the speed of the generated code, since memory accesses to the stack imply an additional runtime cost. For instance, a variable that's frequently used in a hot loop should live in a register for the whole loop's lifetime, and not be spilled/reloaded in the middle of the loop.
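
As a toy illustration of the kind of heuristic involved (my own sketch, not what Cranelift's register allocator actually does), a classic choice when a register must be freed is to spill the live value whose next use is furthest away:

// Toy "furthest next use" spill heuristic: given, for each live virtual
// register, the position of its next use, pick the one to spill.
fn pick_spill_candidate(next_use: &[(u32, usize)]) -> Option<u32> {
    next_use
        .iter()
        .max_by_key(|&&(_, next)| next) // the furthest next use hurts least to reload later
        .map(|&(vreg, _)| vreg)
}

fn main() {
    // v0 is next used at instruction 3, v1 at 10, v2 at 5: spill v1.
    let live = [(0, 3), (1, 10), (2, 5)];
    assert_eq!(pick_spill_candidate(&live), Some(1));
}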

// Before register allocation, with unlimited virtual registers:
v2 = v0 + v1
v3 = v0 + v2
v4 = v3 + v1
return v4

// One possible register allocation, on a machine that has 2 registers %r0, %r1.
// We need to spill one value, because there's a point where 3 values are live at the same time!
spill %r1 --> stack_slot(0)
%r1 = %r0 + %r1
%r1 = %r0 + %r1
reload stack_slot(0) --> %r0
%r1 = %r1 + %r0
return %r1

And, since we like to have our cake and eat it too, the register allocator itself should be fast: it should not take an unbounded amount of time to make these allocation decisions. Register allocation has the good taste to be an NP-complete problem. Concretely, this means that implementations cannot find the best solutions given arbitrary inputs, but they'll estimate good solutions based on heuristics, in worst-case quadratic time over the size of the input. All of this makes it so that register allocation has its own whole research field, and has been extensively studied for some time now. It is a fascinating problem.

Register allocation in Cranelift

Back to Cranelift. The register allocation contract is that if a value must live in a real register at a given program point, then it does live where it should (unless register allocation is impossible). At the start of code generation for a VCode instruction, we are guaranteed that the input values live in real registers, and that the output real register is available before the next VCode instruction.

You might have noticed that the VCode instructions only refer to registers, and not stack slots. But where are the stack slots, then? The trick is that the stack slots are invisible to VCode. Register allocation may create an arbitrary number of spills, reloads, and register moves4 around VCode instructions, to ensure that their register allocation constraints are met. This is why the output of register allocation is a new list of instructions, that includes not only the initial instructions filled with the actual registers, but also additional spill, reload and move (VCode) instructions added by regalloc.

As said before, this problem is sufficiently complex, involved, and independent from the rest of the code (assuming the right set of interfaces!) that its code lives in a separate crate, regalloc.rs, with its own fuzzing and testing infrastructure. I hope to shed some light on it at some point too.

What's interesting to us today is the register allocation constraints. Consider the aarch64 integer add instruction add rd, rn, rm: rd is the output virtual register that's written to, while rn and rm are the inputs, thus read from. We need to inform the register allocation algorithm about these constraints. In regalloc jargon, "read from" is known as used, while "written to" is known as defined. Here, the aarch64 VCode instruction AluRRR does use rn and rm, and it defines rd. This usage information is collected in the aarch64_get_regs function (cranelift/codegen/src/isa/aarch64/inst/mod.rs):

fn aarch64_get_regs(inst: &Inst, collector: &mut RegUsageCollector) {
    match inst {
        &Inst::AluRRR { rd, rn, rm, .. } => {
            collector.add_def(rd);
            collector.add_use(rn);
            collector.add_use(rm);
        }
        // etc.

Then, after register allocation has assigned the physical registers, we need to instruct it how to replace virtual register mentions by physical register mentions. This is done in the aarch64_map_regs function (same file as above):

fn aarch64_map_regs<RUM: RegUsageMapper>(inst: &mut Inst, mapper: &RUM) {
    // ...
    match inst {
        &mut Inst::AluRRR {
            ref mut rd,
            ref mut rn,
            ref mut rm,
            ..
        } => {
            map_def(mapper, rd);
            map_use(mapper, rn);
            map_use(mapper, rm);
        }
        // etc.

Note this is reflecting quite precisely what the usage collector did: we're replacing the virtual register mention for the defined register rd with the information (which real register) provided by the RegUsageMapper. These two functions must stay in sync, otherwise here be dragons! (and bugs very hard to debug!)

Register allocation on x86

On Intel's x86, register allocation may be a bit trickier: in some cases, the lowering needs to be carefully written so it satisfies some register allocation constraints that are very specific to this architecture. In particular, x86 has fixed register constraints as well as tied operands.

For this specific part, we'll look at the integer shift-left instruction, which is equivalent to C's x << y. Why this particular instruction? It exhibits both properties that we're interested in studying here. The lowering of iadd is similar, albeit slightly simpler, as it only involves tied operands.

Fixed register constraints

On the one hand, some instructions expect their inputs to be in fixed registers, that is, specific registers arbitrarily predefined by the architecture manual. For the example of the shift instruction, if the count is not statically known at compile time (it's not a shift by a constant value), then the amount by which we're shifting must be in the rcx register5.

Now, how do we make sure that the input value actually is in rcx? We can mark rcx as used in the get_regs function so regalloc knows about this, but nothing ensures that the input resides in it at the beginning of the instruction. To resolve this, we'll introduce a move instruction during lowering, that is going to copy the input value into rcx. Then we're sure it lives there, and register allocation knows it's used: we're good to go!

In a nutshell, this shows how lowering and register allocation play together:

  • during lowering, we introduce a move from a dynamic shift input value to rcx before the actual shift
  • in the register usage function, we mark rcx as used
  • (nothing to do in the register mapping function: rcx is a real register already)
Tied operands

On the other hand, some instructions have operands that are both read and written at the same time: we call them modified in Cranelift and regalloc.rs, but they're also known as tied operands in the compiler literature. It's not just that there's a register that must be read, and a register that must be written to: they must be the same register. How do we model this, then?

Consider a naive solution. We take the input virtual register, and decide it's allocated to the same register as the output (modified) register. Unfortunately, if the chosen virtual register was going to be reused by another, later VCode instruction, then its value would be overwritten (clobbered) by the current instruction. This would result in incorrect code being generated, so this is not acceptable. In general, we can't clobber the value that was in an input virtual register during lowering, because it's the role of regalloc to make this kind of decision.

// Before register allocation, with virtual registers:
v2 = v0 + v1
v3 = v0 + 42

// After register allocation, on a machine with two registers %r0 and %r1:
// assign v0 to %r0, v1 to %r1, v2 to %r0
%r0 += %r1
... = %r0 + 42 // ohnoes! the value in %r0 is v2, not v0 anymore!

The right solution is, again, to copy this input virtual register into the output virtual register, right before the instruction. This way, we can still reuse the untouched input register in other instructions without modifying it: only the copy is written to.

Pfew! We can now look at the entire lowering for the shift left instruction, edited and commented for clarity:

// Read the instruction operand size from the output's type.
let size = dst_ty.bytes() as u8;
// Put the left hand side into a virtual register.
let lhs = put_input_in_reg(ctx, inputs[0]);
// Put the right hand side (shift amount) into either an immediate (if it's
// statically known at compile time), or into a virtual register.
let (count, rhs) = if let Some(cst) = ctx.get_input_as_source_or_const(insn, 1).constant {
    // Mask count, according to Cranelift's semantics.
    let cst = (cst as u8) & (dst_ty.bits() as u8 - 1);
    (Some(cst), None)
} else {
    (None, Some(put_input_in_reg(ctx, inputs[1])))
};
// Get the destination virtual register.
let dst = get_output_reg(ctx, outputs[0]).only_reg().unwrap();
// Copy the left hand side into the (modified) output operand, to satisfy the
// mod constraint.
ctx.emit(Inst::mov_r_r(true, lhs, dst));
// If the shift count is statically known: nothing particular to do. Otherwise,
// we need to put it in the RCX register.
if count.is_none() {
    let w_rcx = Writable::from_reg(regs::rcx());
    // Copy the shift count (which is in rhs) into RCX.
    ctx.emit(Inst::mov_r_r(true, rhs.unwrap(), w_rcx));
}
// Generate the actual shift instruction.
ctx.emit(Inst::shift_r(size, ShiftKind::ShiftLeft, count, dst));

And this is how we tell the register usage collector about our constraints:

Inst::ShiftR { num_bits, dst, .. } => {
    if num_bits.is_none() {
        // if the shift count is dynamic, mark RCX as used.
        collector.add_use(regs::rcx());
    }
    // In all the cases, the destination operand is modified.
    collector.add_mod(*dst);
}

Only the modified register needs to be mapped to its allocated physical register:

Inst::ShiftR { ref mut dst, .. } => {
    map_mod(mapper, dst);
}

Virtual register copies and performance

Do these virtual register copies sound costly to you? In theory, they could lead to the code generation of move instructions, increasing the size of the generated code and adding a small runtime cost. In practice, register allocation, through its interface, knows how to identify move instructions, their source and their destination. By analyzing them, it can see when a source isn't used after a given move instruction, and thus allocate the same register for the source and the destination of the move. Then, when Cranelift generates the code, it will avoid generating a move from a physical register to the same one6. As a matter of fact, creating a VCode copy doesn't necessarily mean that a machine code move instruction will be generated later: it is present just in case regalloc needs it, but it can be avoided when it's spurious.
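
A minimal sketch of that emission-time check (illustrative only, not Cranelift's actual code): when the source and destination ended up in the same physical register, the move simply produces no bytes.

// Skip emitting a register-to-register move when register allocation assigned
// the same physical register to both sides.
#[derive(PartialEq, Clone, Copy)]
struct RealReg(u8);

fn emit_move(sink: &mut Vec<u8>, dst: RealReg, src: RealReg) {
    if dst == src {
        return; // no-op move: emit nothing
    }
    // A real backend would encode an actual move instruction here; a single
    // placeholder byte stands in for it in this sketch.
    sink.push(0x90);
}

fn main() {
    let mut sink = Vec::new();
    emit_move(&mut sink, RealReg(3), RealReg(3)); // elided
    emit_move(&mut sink, RealReg(3), RealReg(4)); // emitted
    assert_eq!(sink.len(), 1);
}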

Code generation

Oh my, we're getting closer to actually being able to run the code! Once register allocation has run, we can generate the actual machine code for the VCode instructions. Cool kids call this step of the pipeline codegen, for code generation. This is the part where we decipher the architecture manuals provided by the CPU vendors, and generate the raw machine bytes for our machine instructions. In Cranelift, this means filling a code buffer (there's a MachBuffer sink interface for this!), returned along with some internal relocations7 and additional metadata. Let's see what happens for our integer addition, when the time comes to generate the code for its VCode equivalent AluRRR on aarch64 (in cranelift/codegen/src/isa/aarch64/inst/emit.rs):

// We match on the VCode's identity here:
&Inst::AluRRR { alu_op, rd, rn, rm } => {
    // First select the top 11 bits based on the ALU subopcode.
    let top11 = match alu_op {
        ALUOp::Add32 => 0b00001011_000,
        ALUOp::Add64 => 0b10001011_000,
        // etc
    };
    // Then decide the bits 10 to 15, based on the ALU subopcode as well.
    let bit15_10 = match alu_op {
        // other cases
        _ => 0b000000,
    };
    // Then use an helper and pass forward the allocated physical registers
    // values.
    sink.put4(enc_arith_rrr(top11, bit15_10, rd, rn, rm));
}

And what's this enc_arith_rrr doing, then?

fn enc_arith_rrr(bits_31_21: u32, bits_15_10: u32, rd: Writable<Reg>, rn: Reg, rm: Reg) -> u32 {
    (bits_31_21 << 21)
        | (bits_15_10 << 10)
        | machreg_to_gpr(rd.to_reg())
        | (machreg_to_gpr(rn) << 5)
        | (machreg_to_gpr(rm) << 16)
}
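
As a quick sanity check of the formula above (my own worked example, not from the original post), plugging in the Add64 constants and the register numbers for add x0, x1, x2 (rd=0, rn=1, rm=2) yields the expected aarch64 encoding:

fn main() {
    // Constants from the AluRRR emission arm shown earlier, for a 64-bit add.
    let top11: u32 = 0b10001011_000;
    let bits_15_10: u32 = 0b000000;
    // add x0, x1, x2 -> rd = 0, rn = 1, rm = 2.
    let (rd, rn, rm) = (0u32, 1u32, 2u32);
    let word = (top11 << 21) | (bits_15_10 << 10) | rd | (rn << 5) | (rm << 16);
    assert_eq!(word, 0x8B02_0020); // the standard encoding of `add x0, x1, x2`
    println!("{word:#010x}");
}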

Encoding the instruction parts (operands, register mentions) is a lot of bit twiddling and fun. We do so for each VCode instruction, until we've generated the whole function's body. As you may recall, at this point register allocation may have added some spill/reload/move instructions. From the codegen's point of view, these are just regular instructions with precomputed operands (either real registers, or memory operands involving the stack pointer), so they're not treated specially and are generated the same way other VCode instructions are.

The codegen backend then does more work to optimize block placement, compute final branch offsets, etc. If you're interested in this, I strongly encourage you to go read this blog post by Chris Fallin. After this, we're finally done: we've produced a code buffer, as well as external relocations (to other functions, memory addresses, etc.) for a single function. The code generator's task is complete: the final steps consist of linking and, optionally, producing an executable binary.

Mission accomplished!

So, we're done for today! Thanks for reading this far; I hope it has been a useful and pleasant read! Feel free to reach out to me on the twitterz if you have additional remarks/questions, and to go contribute to Wasmtime/Cranelift if this sort of thing is interesting to you.

This Week In Rust: This Week in Rust 378

Wednesday 17th of February 2021 05:00:00 AM

Hello and welcome to another issue of This Week in Rust! Rust is a systems language pursuing the trifecta: safety, concurrency, and speed. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

No newsletters this week.

Official Project/Tooling Updates

Observations/Thoughts

Rust Walkthroughs

Miscellaneous

Crate of the Week

Despite having no nominations, this week's crate is firestorm, a fast intrusive flamegraph profiling library.

llogiq is pretty pleased anyway with the suggestion.

Submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from Rust Core

340 pull requests were merged in the last week

Rust Compiler Performance Triage

A mostly quiet week, though with an excellent improvement in bootstrap times, shaving a couple of percent off the total and 10% off rustc_middle, due to changes in the code being compiled.

Triage done by @simulacrum. Revision range: ea09825..f1c47c7

1 Regressions, 2 Improvements, 1 Mixed

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

Tracking Issues & PRs

New RFCs

Upcoming Events

Online

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

Have you seen someone juggle several items with one hand? That's the point of async. Blocking (non-async) is like writing - it requires constant work from each hand. If you want to write twice as fast you'll need two hands and write with both at the same time. That's multithreading. If you juggle, the moment the item leaves your hand and is in the air, you have that hand left with nothing to do. That's similar to network IO - you make a request and are just waiting for the server to respond. You could be doing something in the meantime, like catching another item and throwing it back up again. That's what "await" does - it says I threw an item into the air, so I want my current thread / hand to switch over to catch something else now.

/u/OS6aDohpegavod4 on /r/rust

Thanks to Jacob Pratt for the suggestion.

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Daniel Stenberg: Transfers vs connections spring cleanup

Tuesday 16th of February 2021 10:39:10 AM

Warning: this post is full of libcurl internal architectural details and not much else.

Within libcurl there are two primary objects being handled; transfers and connections. The transfers objects are struct Curl_easy and the connection counterparts are struct connectdata.

This is a separation and architecture as old as libcurl, even if the internal struct names have changed a little through the years. A transfer is associated with none or one connection object and there’s a pool with potentially several previously used, live, connections stored for possible future reuse.

A simplified schematic picture could look something like this:

[Figure: One transfer, one connection and three idle connections in the pool.]

Transfers to connections

These objects are protocol agnostic so they work like this no matter which scheme was used for the URL you’re transferring with curl.

Before the introduction of HTTP/2 into curl, which landed for the first time in September 2013, there was a fixed relationship: one transfer always used (none or) one connection, and that connection was then used by a single transfer. libcurl stored the association in both objects: the transfer object got a pointer to the current connection, and the connection object got a pointer to the current transfer.

Multiplexing shook things up

Lots of code in libcurl passed around the connection pointer (conn) because, well, it was convenient. We could find the transfer object from that (conn->data) just fine.

When multiplexing arrived with HTTP/2, we could suddenly have multiple transfers sharing a single connection. Since we passed the conn pointer as input to so many functions internally, we had to update the conn->data pointer in lots of places so that it always pointed to the transfer currently driving the connection.
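
To picture the shape of the problem, here is a Rust-flavored sketch with invented names (libcurl itself is C and its real structs are much richer): once one connection drives many transfers, a single conn->data-style back-pointer has no single correct value, while each transfer pointing at its one connection stays unambiguous.

struct Transfer {
    id: u32,
    conn: Option<usize>, // index of the connection driving this transfer, or None
}

struct Connection {
    // Pre-multiplexing there was exactly one "current" transfer, so a single
    // back-pointer worked. With HTTP/2, several transfers share the connection,
    // so that back-pointer would need constant rewriting to stay correct.
    attached_transfers: Vec<u32>,
}

fn main() {
    let conn = Connection { attached_transfers: vec![1, 2, 3] };
    let transfers: Vec<Transfer> = conn
        .attached_transfers
        .iter()
        .map(|&id| Transfer { id, conn: Some(0) })
        .collect();
    // Each transfer knows its single connection; the connection does not have
    // to pretend it has one single "data" transfer.
    for t in &transfers {
        println!("transfer {} is driven over connection {:?}", t.id, t.conn);
    }
    assert_eq!(transfers.len(), conn.attached_transfers.len());
}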

This was always awkward and a source of agony and bugs over the years. At least twice I started to clean this up, but each time it quickly became a really large piece of work that was hard to land in a single big blow, and I abandoned it. Both times.

Third time’s the charm

This architectural “wart” kept bugging me and on January 8, 2021 I emailed the curl-library list to start a more organized effort to clean this up:

Conclusion: we should stop using ‘conn->data’ in libcurl

Status: there are 939 current uses of this pointer

Mission: reduce the use of this pointer, aiming to reach a point in the future when we can remove it from the connection struct.

Little by little

With the help of fellow curl maintainer Patrick Monnerat I started to submit pull requests that would remove the use of this pointer.

Little by little we changed functions and logic to be anchored on the transfer rather than the connection (data->conn is still fine, as it can only ever be NULL or a single connection). I made a wiki page to keep an updated count of the number of references. After the first ten pull requests we were down to just over a hundred from the initial 919 – yeah, the mail quote says 939, but it turned out the grep pattern was slightly wrong!

We decided to hold off a bit when we got closer to the 7.75.0 release so that we wouldn’t risk doing something close to the ship date that would jeopardize it. Once the release had been pushed out the door we could continue the journey.

Gone!

As of today, February 16, 2021, the internal pointer formerly known as conn->data doesn't exist in libcurl anymore, so it can't be used, and this refactor is complete. It took at least 20 separate commits to get the job done.

I hope this new order will help us make fewer mistakes, since we don't have to keep this pointer updated anymore.

I’m very happy we could do this revamp without it affecting the API or ABI in any way. These are all just internal artifacts that are not visible to the outside.

One of a thousand little things

This is just a tiny detail, but the internals of a project like curl consist of a thousand little details, and this is one way we make sure the code remains in good shape. We identify improvements and we perform them. One by one. We never stop and we're never done. Together we take this project into the future and help the world do Internet transfers properly.

Mozilla Open Policy & Advocacy Blog: Mozilla Mornings: Unpacking the DSA’s risk-based approach

Monday 15th of February 2021 03:12:36 PM

On 25 February, Mozilla will host the next installment of Mozilla Mornings – our regular event series that brings together policy experts, policymakers and practitioners for insight and discussion on the latest EU digital policy developments.

This installment of Mozilla Mornings will focus on the DSA’s risk-based approach, specifically the draft law’s provisions on risk assessment, risk mitigation, and auditing for very large online platforms. We’ll be looking at what these provisions seek to solve for; how they’re likely to work in practice; and what we can learn from related proposals in other jurisdictions.

Speakers

Carly Kind
Director, Ada Lovelace Institute

Ben Scott
Executive Director, Reset

Owen Bennett
Senior Policy Manager, Mozilla Corporation

Moderated by Brian Maguire
EU journalist and broadcaster

 

Logistical information

25 February, 2021

11:00-12:00 CET

Zoom Webinar (conferencing details to be provided on morning of event)

Register your attendance here

The post Mozilla Mornings: Unpacking the DSA’s risk-based approach appeared first on Open Policy & Advocacy.

Karl Dubost: Capping macOS User Agent String on macOS 11

Monday 15th of February 2021 08:43:00 AM

This is to keep track of and document the sequence of events related to macOS 11 and yet another cascade of breakages related to the change of user agent strings. There is no good solution. One more time, it shows how sniffing User Agent strings is both dangerous (future fail) and a source of issues.

Brace for impact!

Capping macOS 11 version in User Agent History
  • 2020-06-25 OPENED WebKit 213622 - Safari 14 - User Agent string shows incorrect OS version

    A reporter claims it breaks many websites, but without giving details about which ones. There's a mention of VP9:

    browser supports vp9

    I left a comment there to get more details.

  • 2020-09-15 OPENED WebKit 216593 - [macOS] Limit reported macOS release to 10.15 series.

    if (!osVersion.startsWith("10")) osVersion = "10_15_6"_s;

    With some comments in the review:

    preserve original OS version on older macOS at Charles's request

    I suspect this is Charles, the proxy app.

    2020-09-16 FIXED

  • 2020-10-05 OPENED WebKit 217364 - [macOS] Bump reported current shipping release UA to 10_15_7

    On macOS Catalina 10.15.7, Safari reports platform user agent with OS version 10_15_7. On macOS Big Sur 11.0, Safari reports platform user agent with OS version 10_15_6. It's a bit odd to have Big Sur report an older OS version than Catalina. Bump the reported current shipping release UA from 10_15_6 to 10_15_7.

    The issue here is that macOS 11 (Big Sur) reports an older version number than macOS 10.15 (Catalina), because the previous bug hardcoded the version string.

    if (!osVersion.startsWith("10")) osVersion = "10_15_7"_s;

    This is still hardcoded, as explained in this comment:

    Catalina quality updates are done, so 10.15.7 is the last patch version. Security SUs from this point on won’t increment the patch version, and does not affect the user agent.

    2020-10-06 FIXED

  • 2020-10-11 Unity [WebGL][macOS] Builds do not run when using Big Sur

    UnityLoader.js is the culprit.

    They fixed it in January 2021(?). But there is a lot of legacy code running out there which cannot be updated.

    Ironically, there's no easy way to detect the Unity library in order to create a site intervention that would apply to all games with the issue. Capping the UA string will fix that.

  • 2020-11-30 OPENED Webkit 219346 - User-agent on macOS 11.0.1 reports as 10_15_6 which is older than latest Catalina release.

    It was closed as a duplicate of 217364, but there's an interesting description:

    Regression from 216593. That rev hard codes the User-Agent header to report MacOS X 10_15_6 on macOS 11.0+ which breaks Duo Security UA sniffing OS version check. Duo security check fails because latest version of macOS Catalina is 10.15.7 but 10.15.6 is being reported.

  • 2020-11-30 OPENED Gecko 1679929 - Cap the User-Agent string's reported macOS version at 10.15

    There is a patch for Gecko to cap the user agent string the same way that Apple does for Safari. This will solve the issue for Unity games which have been unable to update their source code to the new version of Unity.

    // Cap the reported macOS version at 10.15 (like Safari) to avoid breaking
    // sites that assume the UA's macOS version always begins with "10.".
    int uaVersion = (majorVersion >= 11 || minorVersion > 15) ? 15 : minorVersion;

    // Always return an "Intel" UA string, even on ARM64 macOS like Safari does.
    mOscpu = nsPrintfCString("Intel Mac OS X 10.%d", uaVersion);

    It should land very soon, this week (week 8, February 2021), in Firefox Nightly 87. We can then monitor whether anything breaks with this change.

  • 2020-12-04 OPENED Gecko 1680516 - [Apple Chip - ARM64 M1] Game is not loaded on Gamearter.com

    Older versions of the Unity JS runtime used to run games break when the macOS version in the browser's user agent string is 11_0_0.

    The Mozilla webcompat team proposed to fix this with a site intervention for gamearter specifically. This doesn't solve the breakage for other games.

  • 2020-12-14 OPENED Gecko 1682238 - Override navigator.userAgent for gamearter.com on macOS 11.0

    A quick way to fix the issue in Firefox for gamearter was for the Mozilla webcompat team to release a site intervention:

    "use strict"; /* * Bug 1682238 - Override navigator.userAgent for gamearter.com on macOS 11.0 * Bug 1680516 - Game is not loaded on gamearter.com * * Unity < 2021.1.0a2 is unable to correctly parse User Agents with * "Mac OS X 11.0" in them, so let's override to "Mac OS X 10.16" instead * for now. */ /* globals exportFunction */ if (navigator.userAgent.includes("Mac OS X 11.")) { console.info( "The user agent has been overridden for compatibility reasons. See https://bugzilla.mozilla.org/show_bug.cgi?id=1680516 for details." ); let originalUA = navigator.userAgent; Object.defineProperty(window.navigator.wrappedJSObject, "userAgent", { get: exportFunction(function() { return originalUA.replace(/Mac OS X 11\.(\d)+;/, "Mac OS X 10.16;"); }, window), set: exportFunction(function() {}, window), }); }
  • 2020-12-16 OPENED WebKit 219977 - WebP loading error in Safari on iOS 14.3

    In this comment, Cloudinary explains that they try to work around the system bug by using UA detection.

    Cloudinary is attempting to work around this issue by turning off WebP support to affected clients.

    If this is indeed about the underlying OS frameworks, rather than the browser version, as far as we can tell it appeared sometime after MacOS 11.0.1 and before or in 11.1.0. All we have been able to narrow down on the iOS side is ≥14.0.

    If you have additional guidance on which versions of the OSes are affected, so that we can prevent Safari users from receiving broken images, it would be much appreciated!

    Eric Portis (Cloudinary) created some tests:

      • WebPs that break in iOS ≥ 14.3 & MacOS ≥ 11.1
      • Tiny WebP

    The issue seems to affect CloudFlare

  • 2021-01-05 OPENED WebKit WebP failures [ Big Sur ] fast/images/webp-as-image.html is failing

  • 2021-01-29 OPENED Blink 1171998 - Nearly all Unity WebGL games fail to run in Chrome on macOS 11 because of userAgent

  • 2021-02-06 OPENED Blink 1175225 - Cap the reported macOS version in the user-agent string at 10_15_7

    Colleagues at Mozilla, on the Firefox team, and Apple, on the Safari team, report that there are a long tail of websites broken from reporting the current macOS Big Sur version, e.g. 11_0_0, in the user agent string:

    Mac OS X 11_0_0

    and for this reason, as well as slightly improving user privacy, have decided to cap the reported OS version in the user agent string at 10.15.7:

    Mac OS X 10_15_7

  • 2021-02-09 Blink Intent to Ship: User Agent string: cap macOS version number to 10_15_7

    Ken Russell sends an intent to cap the macOS version in the Chrome (Blink) user agent string to 10_15_7, following in the footsteps of Apple and Mozilla. In the intent to ship, there is a discussion about solving the issue with Client Hints. Sec-CH-UA-Platform-Version would be a possibility, but Client Hints is not yet deployed and there is not yet a full consensus about it. It is a specification pushed by Google.

    Masataka Yakura shared with me (Thanks!) two threads on the Webkit-dev mailing-list. One from May 2020 and another one from November 2020.

    In May, Maciej said:

    I think there’s a number of things in the spec that should be cleaned up before an implementation ships enabled by default, specifically around interop, privacy, and protection against UA lockouts. I know there are PRs in flight for some of these issues. I think it would be good to get more of the open issues to resolution before actually shipping this.

    And in November, Maciej did another round of spec review with one decisive issue.

    Note that Google has released, last year, on January 14, 2020 an Intent to Ship: Client Hints infrastructure and UA Client Hints and this was enabled a couple of days ago on February 11, 2021.

And I'm pretty sure the story is not over. There will probably be more breakages and more unknown bugs.

Otsukare!

Project Tofino: Introducing Project Mentat, a flexible embedded knowledge store

Sunday 14th of February 2021 04:42:38 AM

Edit, January 2017: to avoid confusion and to better follow Mozilla’s early-stage project naming guidelines, we’ve renamed Datomish to Project Mentat. This post has been altered to match.

For several months now, a small team at Mozilla has been exploring new ways of building a browser. We called that effort Tofino, and it’s now morphed into the Browser Futures Group.

As part of that, Nick Alexander and I have been working on a persistent embedded knowledge store called Project Mentat. Mentat is designed to ship in client applications, storing relational data on disk with a flexible schema.

It’s a little different to most of the storage systems you’re used to, so let’s start at the beginning and explain why. If you’re only interested in the what, skip down to just above the example code.

As we began building Tofino’s data layer, we observed a few things:

  • We knew we’d need to store new types of data as our product goals shifted: page metadata, saved content, browsing activity, location. The set of actions the user can take, and the data they generate, is bound to grow over time. We didn’t (don’t!) know what these were in advance.
  • We wanted to support front-end innovation without being gated on some storage developer writing a complicated migration. We’ve seen database evolution become a locus of significant complexity and risk — “here be dragons” — in several other applications. Ultimately it becomes easier to store data elsewhere (a new database, simple prefs files, a key-value table, or JSON on disk) than to properly integrate it into the existing database schema.
  • As part of that front-end innovation, sometimes we’d have two different ‘forks’ both growing the data model in two directions at once. That’s a difficult problem to address with a tool like SQLite.
  • Front-end developers were interested in looser approaches to accessing stored data than specialized query endpoints: e.g., Lin Clark suggested that GraphQL might be a better fit. Only a month or two into building Tofino we already saw the number of API endpoints, parameters, and fields growing as we added features. Specialized API endpoints turn into ad hoc query languages.
  • Syncability was a constant specter hovering at the back of our minds: getting the data model right for future syncing (or partial hosting on a service) was important.

Many of these concerns happen to be shared across other projects at Mozilla: Activity Stream, for example, also needs to store a growing set of page summary attributes for visited pages, and join those attributes against your main browsing history.

Nick and I started out supporting Tofino with a simple store in SQLite. We knew it had to adapt to an unknown set of use cases, so we decided to follow the principles of CQRS.

CQRS — Command Query Responsibility Segregation — recognizes that it’s hard to pick a single data storage model that works for all of your readers and writers… particularly the ones you don’t know about yet.

As you begin building an application, it’s easy to dive head-first into storing data to directly support your first user experience. As the experience changes, and new experiences are added, your single data model is pulled in diverging directions.

A common second system syndrome for this is to reactively aim for maximum generality. You build a single normalized super-flexible data model (or key-value store, or document store)… and soon you find that it’s expensive to query, complex to maintain, has designed-in capabilities that will never be used, and you still have tensions between different consumers.

The CQRS approach, at its root, is to separate the ‘command’ from the ‘query’: store a data model that’s very close to what the writer knows (typically a stream of events), and then materialize as many query-side data stores as you need to support your readers. When you need to support a new kind of fast read, you only need to do two things: figure out how to materialize a view from history, and figure out how to incrementally update it as new events arrive. You shouldn’t need to touch the base storage schema at all. When a consumer is ripped out of the product, you just throw away their materialized views.

Viewed through that lens, everything you do in a browser is an event with a context and a timestamp: “the user bookmarked page X at time T in session S”, “the user visited URL X at time T in session S for reason R, coming from visit V1”. Store everything you know, materialize everything you need.
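
As a toy illustration of that idea, here is a minimal Rust-flavored sketch with invented types (not Tofino's actual JS-over-SQLite code): the write side is an append-only event log, and the read side is a materialized view that can be rebuilt from history or updated incrementally.

use std::collections::HashMap;

#[allow(dead_code)]
enum Event {
    Visited { url: String, at: u64 },
    Bookmarked { url: String, at: u64 },
}

#[derive(Default)]
struct LastVisitView {
    last_visit: HashMap<String, u64>,
}

impl LastVisitView {
    // Incremental update: fold one new event into the view as it arrives.
    fn apply(&mut self, event: &Event) {
        if let Event::Visited { url, at } = event {
            let entry = self.last_visit.entry(url.clone()).or_insert(0);
            if *at > *entry {
                *entry = *at;
            }
        }
    }

    // Rebuild from scratch: materializing the view is just replaying the log.
    fn materialize(events: &[Event]) -> Self {
        let mut view = Self::default();
        for event in events {
            view.apply(event);
        }
        view
    }
}

fn main() {
    let log = vec![
        Event::Visited { url: "https://mozilla.org/".into(), at: 10 },
        Event::Bookmarked { url: "https://mozilla.org/".into(), at: 11 },
        Event::Visited { url: "https://mozilla.org/".into(), at: 42 },
    ];
    let view = LastVisitView::materialize(&log);
    assert_eq!(view.last_visit["https://mozilla.org/"], 42);
}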

We built that with SQLite.

This was a clear and flexible concept, and it allowed us to adapt, but the implementation in JS involved lots of boilerplate and was somewhat cumbersome to maintain manually: the programmer does the work of defining how events are stored, how they map to more efficient views for querying, and how tables are migrated when the schema changes. You can see this starting to get painful even early in Tofino’s evolution, even without data migrations.

Quite soon it became clear that a conventional embedded SQL database wasn’t a direct fit for a problem in which the schema grows organically — particularly not one in which multiple experimental interfaces might be sharing a database. Furthermore, being elbow-deep in SQL wasn’t second-nature for Tofino’s webby team, so the work of evolving storage fell to just a few of us. (Does any project ever have enough people to work on storage?) We began to look for alternatives.

We explored a range of existing solutions: key-value stores, graph databases, and document stores, as well as the usual relational databases. Each seemed to be missing some key feature.

Most good storage systems simply aren’t suitable for embedding in a client application. There are lots of great storage systems that run on the JVM and scale across clusters, but we need to run on your Windows tablet! At the other end of the spectrum, most webby storage libraries aren’t intended to scale to the amount of data we need to store. Most graph and key-value stores are missing one or more of full-text indexing (crucial for the content we handle), expressive querying, defined schemas, or the kinds of indexing we need (e.g., fast range queries over visit timestamps). ‘Easy’ storage systems of all stripes often neglect concurrency, or transactionality, or multiple consumers. And most don’t give much thought to how materialized views and caches would be built on top to address the tension between flexibility and speed.

We found a couple of solutions that seemed to have the right shape (which I’ll discuss below), but weren’t quite something we could ship. Datomic is a production-grade JVM-based clustered relational knowledge store. It’s great, as you’d expect from Cognitect, but it’s not open-source and we couldn’t feasibly embed it in a Mozilla product. DataScript is a ClojureScript implementation of Datomic’s ideas, but it’s intended for in-memory use, and we need persistent storage for our datoms.

Nick and I try to be responsible engineers, so we explored the cheap solution first: adding persistence to DataScript. We thought we might be able to leverage all of the work that went into DataScript, and just flush data to disk. It soon became apparent that we couldn’t resolve the impedance mismatch between a synchronous in-memory store and asynchronous persistence, and we had concerns about memory usage with large datasets. Project Mentat was born.

Mentat is built on top of SQLite, so it gets all of SQLite’s reliability and features: full-text search, transactionality, durable storage, and a small memory footprint.

On top of that we’ve layered ideas from DataScript and Datomic: a transaction log with first-class transactions so we can see and annotate a history of events without boilerplate; a first-class mutable schema, so we can easily grow the knowledge store in new directions and introspect it at runtime; Datalog for storage-agnostic querying; and an expressive strongly typed schema language.

Datalog queries are translated into SQL for execution, taking full advantage of both the application’s rich schema and SQLite’s fast indices and mature SQL query planner.

You can see more comparisons between Project Mentat and those storage systems in the README.

A proper tutorial will take more space than this blog post allows, but you can see a brief example in JS. It looks a little like this:

// Open a database.
let db = await datomish.open("/tmp/testing.db");

// Make sure we have our current schema.
await db.ensureSchema(schema);

// Add some data. Note that we use a temporary ID (the real ID
// will be assigned by Mentat).
let txResult = await db.transact([
  {"db/id": datomish.tempid(),
   "page/url": "https://mozilla.org/",
   "page/title": "Mozilla"}
]);

// Let's extend our schema. In the real world this would
// typically happen across releases.
schema.attributes.push({"name": "page/visitedAt",
                        "type": "instant",
                        "cardinality": "many",
                        "doc": "A visit to the page."});
await db.ensureSchema(schema);

// Now we can make assertions with the new vocabulary
// about existing entities.
// Note that we simply let Mentat find which page
// we're talking about by URL -- the URL is a unique property
// -- so we just use a tempid again.
await db.transact([
  {"db/id": datomish.tempid(),
   "page/url": "https://mozilla.org/",
   "page/visitedAt": (new Date())}
]);

// When did we most recently visit this page?
let date = (await db.q(
  `[:find (max ?date) .
    :in $ ?url
    :where
    [?page :page/url ?url]
    [?page :page/visitedAt ?date]]`,
  {"inputs": {"url": "https://mozilla.org/"}}));

console.log("Most recent visit: " + date);

Project Mentat is implemented in ClojureScript, and currently runs on three platforms: Node, Firefox (using Sqlite.jsm), and the JVM. We use DataScript’s excellent parser (thanks to Nikita Prokopov, principal author of DataScript!).

Addition, January 2017: we are in the process of rewriting Mentat in Rust. More blog posts to follow!

Nick has just finished porting Tofino’s User Agent Service to use Mentat for storage, which is an important milestone for us, and a bigger example of Mentat in use if you’re looking for one.

What’s next?

We’re hoping to learn some lessons. We think we’ve built a system that makes good tradeoffs: Mentat delivers schema flexibility with minimal boilerplate, and achieves similar query speeds to an application-specific normalized schema. Even the storage space overhead is acceptable.

I’m sure Tofino will push our performance boundaries, and we have a few ideas about how to exploit Mentat’s schema flexibility to help the rest of the Tofino team continue to move quickly. It’s exciting to have a solution that we feel strikes a good balance between storage rigor and real-world flexibility, and I can’t wait to see where else it’ll be a good fit.

If you’d like to come along on this journey with us, feel free to take a look at the GitHub repo, come find us on Slack in #mentat, or drop me an email with any questions. Mentat isn’t yet complete, but the API is quite stable. If you’re adventurous, consider using it for your next Electron app or Firefox add-on (there’s an example in the GitHub repository)… and please do send us feedback and file issues!

Acknowledgements

Many thanks to Lina Cambridge, Grisha Kruglov, Joe Walker, Erik Rose, and Nicholas Alexander for reviewing drafts of this post.

Introducing Project Mentat, a flexible embedded knowledge store was originally published in Project Tofino on Medium, where people are continuing the conversation by highlighting and responding to this story.

Karl Dubost: Browser Wish List - Bookmark This Selection

Friday 12th of February 2021 02:14:00 AM

Some of us are keeping notes of bread and crumbs fallen everywhere. A dead leaf, a piece of string, a forgotten note washed away on a beach, and things read in a book. We collect memories and inspiration.

All browsers have a feature called "Bookmark This Page". It is essentially the same poor, barely manageable tool in every browser. If you do not want to rely on a third-party service or an add-on, what the browser has to offer is not very satisfying.

Firefox offers the possibility to change the name, choose where to put the bookmark, and add tags at the moment we save it.

Edge follows the same conventions without the tagging.

Safari offers something slightly more evolved with a Description field.

But none of them is satisfying for the Web drifters, the poets collecting memories, the archivists and the explorers. And it's unfortunate, because it looks like such low-hanging fruit. It ties in very much with my previous post about Browser Time Machine.

Bookmark This Selection

What I would like from the bookmark feature in the browser is the ability not only to bookmark the full page, but also to select a piece of the page that is reflected in the bookmark, be it through the normal menu as we have seen above or through the contextual menu of the browser.

Then, once the bookmarks are collected, I can do full-text searches on all the collected texts.

And yes, some add-ons exist, but I just wish the feature were native to the browser. And I do not want to rely on a third-party service. My quotes are mine only and should not necessarily be shared with a server on someone else's machine.

Memex has very interesting features, but it is someone else's service. Pocket (even if it belongs to Mozilla) does not answer my needs: I would need to open an account, and it is someone else's server.

Otsukare!

Mozilla Localization (L10N): L10n Report: February 2021 Edition

Thursday 11th of February 2021 10:17:38 PM
Welcome!

New localizers

  • Ibrahim of Hausa (ha) drove the Common Voice web part to completion shortly after he joined the community.
  • Crowdsource Kurdish, and Amed of Kurmanji Kurdish (kmr) teamed up to finish the Common Voice site localization.
  • Saltykimchi of Malay (ms) joins us from the Common Voice community.
  • Ibrahimi of Pashto (ps) completed the Common Voice site localization in a few days!
  • Reem of Swahili (sw) has been laser focused on the Terminology project.

Are you a locale leader and want us to include new members in our upcoming reports? Contact us!

New community/locales added
  • Mossi (mos)
  • Pashto (ps)
New content and projects What’s new or coming up in Firefox desktop

First of all, let’s all congratulate the Silesian (szl) team for making their way into the official builds of Firefox. After spending several months in Nightly, they’re now ready for a general audience and will ride the trains to Beta and Release with Firefox 87.

Upcoming deadlines:

  • Firefox 86 is currently in Beta and will be released on February 23. The deadline to update localizations is on February 14.
  • Firefox 87 is in Nightly and will move to Beta on February 22.

This means that, as of February 23, we’ll be only two cycles away from the next big release of Firefox (89), which will include the UI redesign internally called Proton. Several strings have already been exposed for localization, and you can start testing them – always in a new profile! – by manually setting these preferences to true in about:config:

  • browser.proton.appmenu.enabled
  • browser.proton.enabled
  • browser.proton.tabs.enabled

It’s a constant work in progress, so expect the UI to change frequently, as new elements are added every few days.

One important thing to note: English will change several elements of the UI from Title Case to Sentence case. These changes will not require locales to retranslate all the strings, but they do expect each locale to have clearly defined rules in its style guide about the correct capitalization to use for each part of the UI. If your locale is following the same capitalization rules as en-US, then you’ll need to manually change these strings to match the updated version.

We’ll have more detailed follow-ups in the coming week about Proton, highlighting the key areas to test. In the meantime, make sure that your style guides are in good shape, and get in touch if you don’t know how to work on them in GitHub.

What’s new or coming up in mobile

You may have noticed some changes to the Firefox for Android (“Fenix”) release schedule – that affects in turn our l10n schedule for the project.

In fact, Firefox for Android is now mirroring the Firefox Desktop schedule (as much as possible). While you will notice that the Pontoon l10n deadlines are not quite the same between Firefox Android and Firefox Desktop, their release cadence will be the same, and this will help streamline our main products.

Firefox for iOS remains unchanged for now – although the team is aiming to streamline the release process as well. However, this also depends on Apple, so this may take more time to implement.

Concerning the Proton redesign (see section above about Desktop), we still do not know to what extent it will affect mobile. Stay tuned!

What’s new or coming up in web projects Firefox Accounts:

The payment settings feature is going to be updated later this month through a Beta release. It will be open for localization at a later date. Stay tuned!

mozilla.org

Migration to the Fluent format continues, and the webdev team aims to wrap up the migration by the end of February. We kindly remind all the communities to check the migrated files for warnings and fix them right away. Otherwise, the strings will appear in English on an activated page in production, or the page may fall back to English because it can’t meet the activation threshold of 80% completion. Please follow the priority order of the pages and work through them one at a time.

Common Voice

The project will be moved to Mozilla Foundation later this year. More details will be shared as soon as they become available.

Fairly small release as the transition details are being finalized.

  • Fixed bug where “Voices Online” wasn’t tracking activity anymore
  • Redirected language request modal to Github issue template
  • Updated average seconds based on corpus 6.1
  • Increased leaderboards “load more” function from 5 additional records to 20
  • Localization/sentence updates
What’s new or coming up in SuMo

Since the beginning of 2021, SUMO has been supporting Firefox 85. You can see the full list of articles that we added and updated for Firefox 85 in the SUMO Sprint wiki page here.

We also have good news from the Dutch team, who have been changing their team formation and finally managed to localize 100% of the support articles in SUMO. This is a huge milestone for a team that has been a little bit behind in the past couple of years.

There are a lot more interesting changes coming up in our pipeline. Feel free to join SUMO Matrix room to discuss or just say hi.

Friends of the Lion

Image by Elio Qoshi

  • The Frisian (fy-NL) community hit the national news with the Voice Challenge; thanks to Wim for leading the effort. It was a competition between the Frisian and Dutch languages, a campaign to encourage more people to donate their voices through different platforms and capture the broadest demographics. The ultimate goal is to collect about 300 hours of Frisian speech.
  • Dutch team (nl) in SUMO, especially Tim Maks, Wim Benes, Onno Ekker, and Mark Heijl for completing 100% localization of the support articles in SUMO.

Know someone in your l10n community who’s been doing a great job and should appear here? Contact one of the l10n-drivers and we’ll make sure they get a shout-out (see list at the bottom)!

Useful Links

Hacks.Mozilla.Org: MDN localization update, February 2021

Thursday 11th of February 2021 04:06:23 PM

In our previous post, An update on MDN Web Docs’ localization strategy, we explained our broad strategy for moving forward with allowing translation edits on MDN again. The MDN localization communities are waiting for news of our progress on unfreezing the top-tier locales, and here we are. In this post we’ll look at where we’ve got to so far in 2021, and what you can expect moving forward.

Normalizing slugs between locales

Previously on MDN, we allowed translators to localize document URL slugs as well as the document title and body content. This sounds good in principle, but has created a bunch of problems. It has resulted in situations where it is very difficult to keep document structures consistent.

If you want to change the structure or location of a set of documentation, it can be nearly impossible to verify that you’ve moved all of the localized versions along with the en-US versions — some of them will be under differently-named slugs both in the original and new locations, meaning that you’d have to spend time tracking them down, and time creating new parent pages with the correct slugs, etc.

As a knock-on effect, this has also resulted in a number of localized pages being orphaned (not being attached to any parent en-US pages), and a number of en-US pages being translated more than once (e.g. localized once under the existing en-US slug, and then again under a localized slug).

For example, the following table shows the top-level directories in the en-US locale as of Feb 1, 2021, compared to those of the fr locale.

en-US:
games, glossary, learn, mdn, mozilla, plugins, related, tools, web, webassembly

fr:
accessibilité, adaptation_des_applications_xul_pour_firefox_1.5, améliorations_dom_dans_firefox_3, améliorations_svg_dans_firefox_3, améliorations_xul_dans_firefox_3, apprendre, astuces_css, bugs_importants_corrigés_dans_firefox_3, changements_dans_gecko_1.9_affectant_les_sites_web, chrome, comment_créer_un_arbre_dom, compilation_et_installation, contrôles_dhtml_personnalisés_navigables_au_clavier, css, dhtml, dom, développement_web, explorer_un_tableau_html_avec_des_interfaces_dom_et_javascript, faq_sur_les_transformations_xsl_dans_mozilla, fuel, games, glossaire, glossary, html, inset-block-end, inset-block-start, inset-inline-end, inset-inline-start, inspecteur_dom, introduction_(alternative), introduction_à_la_cryptographie_à_clef_publique, javascript, jeux, la_sécurité_dans_firefox_2, learn, localization, mdn, mdn_a_dix_ans, mise_à_jour_des_applications_web_pour_firefox_3, mise_à_jour_des_extensions_pour_firefox_2, mise_à_jour_des_extensions_pour_firefox_3, mozilla, navigatorusermedia.getusermedia, npapi, outils, référence_dom_gecko, sgml, svg_dans_firefox, tosource, tostring, type_mime_incorrect_pour_les_fichiers_css, un_raycaster_basique_avec_canvas, utilisation_de_xpath, utilisation_du_cache_de_firefox_1.5, web, webapi, webassembly, webrtc, xhtml, xmlserializer, xpcom, xslt_dans_gecko, xsltprocessor, zoom_pleine_page, à_propos_du_document_object_model

To make the non-en-US locales consistent and manageable, we are going to move to having en-US slugs only — all localized pages will be moved under their equivalent location in the en-US tree. In cases where that location cannot be reliably determined — e.g. where the documents are orphans or duplicates — we will put those documents into a specific storage directory, give them an appropriate prefix, and ask the maintenance communities for each unfrozen locale to sort out what to do with them.

  • Every localized document will be kept in a separate repo from the en-US content, but will have a corresponding en-US document with the same slug (folder path).
  • At first this will be enforced during deployment — we will move all the localized documents so that their locations are synchronized with their en-US equivalents, and every document that does not have a corresponding en-US document will be prefixed with orphaned (a minimal sketch of this check follows this list). We plan to further automate this so it runs whenever a PR is created against the repo. We will also funnel back changes from the main en-US content repo, i.e. if an en-US page is moved, the localized equivalents will be automatically moved too.
  • All locales will be migrated; unfortunately, some documents will be marked as orphaned and some others will be marked as conflicting (that is, a conflicting prefix is added to their slug). Conflicting documents have a corresponding en-US document but multiple translations in the same locale.
  • We plan to delete, archive, or move out orphaned/conflicting content.
  • Nothing will be lost since everything is in a git repo (even if something is deleted, it can still be recovered from the git history).
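
As mentioned in the list above, here is a minimal sketch of the deployment-time check (hypothetical code with invented names, not the actual MDN tooling): a localized slug is kept when an en-US document exists at the same path, and parked under an orphaned prefix otherwise.

use std::collections::HashSet;

fn normalized_slug(localized_slug: &str, en_us_slugs: &HashSet<&str>) -> String {
    if en_us_slugs.contains(localized_slug) {
        // A matching en-US document exists: keep the slug as-is.
        localized_slug.to_string()
    } else {
        // No en-US counterpart: park the document under an "orphaned" prefix.
        format!("orphaned/{}", localized_slug)
    }
}

fn main() {
    let en_us: HashSet<&str> = ["web/css", "web/javascript", "glossary"]
        .iter()
        .copied()
        .collect();
    assert_eq!(normalized_slug("web/css", &en_us), "web/css");
    assert_eq!(normalized_slug("astuces_css", &en_us), "orphaned/astuces_css");
}
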
Processes for identifying unmaintained content

The other problem we have been wrestling with is how to identify what localized content is worth keeping, and what isn’t. Since many locales have been largely unmaintained for a long time, they contain a lot of content that is very out-of-date and getting further out-of-date as time goes on. Many of these documents are either not relevant any more at all, incomplete, or simply too much work to bring up to date (it would be better to just start from nothing).

It would be better for everyone involved to just delete this unmaintained content, so we can concentrate on higher-value content.

The criteria we have identified so far to indicate unmaintained content is as follows:

  • Pages that should have compat tables, which are missing them.
  • Pages that should have interactive examples and/or embedded examples, which are missing them.
  • Pages that should have a sidebar, but don’t.
  • Pages where the KumaScript is breaking so much that it’s not really renderable in a usable way.

These criteria are largely measurable; we ran some scripts on the translated pages to calculate which ones could be marked as unmaintained (they match one or more of the above). The results are as follows:

If you look for compat, interactive examples, live samples, orphans, and all sidebars:

  • Unmaintained: 30.3%
  • Disconnected (orphaned): 3.1%

If you look for compat, interactive examples, live samples, orphans, but not sidebars:

  • Unmaintained: 27.5%
  • Disconnected (orphaned):  3.1%

This would allow us to get rid of a large number of low-quality pages, and make dealing with localizations easier.

We created a spreadsheet that lists all the pages that would be put in the unmaintained category under the above rules, in case you were interested in checking them out.

Stopping the display of non-tier 1 locales

After we have unfrozen the “tier 1” locales (fr, ja, zh-CN, zh-TW), we are planning to stop displaying other locales. If no-one has the time to maintain a locale, and it is getting more out-of-date all the time, it is better to just not show it rather than have potentially harmful unmaintained content available to mislead people.

This makes sense considering how the system currently works. If someone has their browser language set to say fr, we will automatically serve them the fr version of a page, if it exists, rather than the en-US version — even if the fr version is old and really out-of-date, and the en-US version is high-quality and up-to-date.

Going forward, we will show en-US and the tier 1 locales that have active maintenance communities, but we will not display the other locales. To get a locale displayed again, we require an active community to step up and agree to have responsibility for maintaining that locale (which means reviewing pull requests, fixing issues filed against that locale, and doing a reasonable job of keeping the content up to date as new content is added to the en-US docs).

If you are interested in maintaining an unmaintained locale, we are more than happy to talk to you. We just need a plan. Please get in touch!

Note: Not showing the non-tier 1 locales doesn’t mean that we will delete all the content. We are intending to keep it available in our archived-content repo in case anyone needs to access it.

Next steps

The immediate next step is to get the tier 1 locales unfrozen, so we can start to get those communities active again and make that content better. We are hoping to get this done by the start of March. The normalizing slugs work will happen as part of this.

After that we will start to look at stopping the display of non-tier 1 localized content — that will follow soon after.

Identifying and removing unmaintained content will be a longer game to play — we want to involve our active localization communities in this work for the tier 1 locales, so this will be done after the other two items.

The post MDN localization update, February 2021 appeared first on Mozilla Hacks - the Web developer blog.

Karl Dubost: Whiteboard Reactionaries

Thursday 11th of February 2021 12:21:00 PM

The eminent Mike Taylor has dubbed us with one of his knightly tweets. Something something about

new interview question: on a whiteboard, re-implement the following in React (using the marker color of your choice)

Sir Bruce Lawson OM (Oh My…), a never-ending disco knight, has commented on Mike's tweet, pointing out that:

the real test is your choice of marker colour. So, how would you go about making the right choice? Obviously, that depends where you’re interviewing.

I simply and firmly disagree and throw my gauntlet at Bruce's face. Choose your weapons, time and witnesses.

The important part of this tweet is how Mike Taylor points out that the Sillycon Valley industry is just a pack of die-hard stick-in-the-mud reactionaries who have promoted the whiteboard to the pinnacle of one's dull abilities to regurgitate the most devitalizing Kardashianesque answers to stackoverflow problems. Young programmers! Rise! In front of the whiteboard, just walk out. Refuse the tyranny of the past, the chalk of ignorance.

Where are the humans, the progress? Where are the shores of the oceans, the Célestin Freinet, Maria Montessori and A. S. Neill, the lichens, the moss and the humus, the sap of imagination, the liberty of our creativity?

Otsukare!

Daniel Stenberg: curl –fail-with-body

Thursday 11th of February 2021 08:00:27 AM

That’s --fail-with-body, using two dashes in front of the name.

This is a brand new command line option added to curl, to appear in the 7.76.0 release. It works like --fail but with one little addition, and I hope the name implies it well enough: it also provides the response body. The --fail option has turned out to be surprisingly popular, but users have often repeated the request to also make it possible to get the body stored. --fail makes curl stop immediately after having received the response headers – if the response code says so.

--fail-with-body will instead first save the body per normal conventions and then return an error if the HTTP response code was 400 or larger.

To be used like this:

curl --fail-with-body -o output https://example.com/404.html

If the page is missing on that HTTPS server, curl will return exit code 22 and save the error message response in the file named ‘output’.

Not complicated at all. But has been requested many times!

This is curl’s 238th command line option.

The Rust Programming Language Blog: Announcing Rust 1.50.0

Thursday 11th of February 2021 12:00:00 AM

The Rust team is happy to announce a new version of Rust, 1.50.0. Rust is a programming language that is empowering everyone to build reliable and efficient software.

If you have a previous version of Rust installed via rustup, getting Rust 1.50.0 is as easy as:

rustup update stable

If you don't have it already, you can get rustup from the appropriate page on our website, and check out the detailed release notes for 1.50.0 on GitHub.

What's in 1.50.0 stable

For this release, we have improved array indexing, expanded safe access to union fields, and added to the standard library. See the detailed release notes to learn about other changes not covered by this post.

Const-generic array indexing

Continuing the march toward stable const generics, this release adds implementations of ops::Index and IndexMut for arrays [T; N] for any length of const N. The indexing operator [] already worked on arrays through built-in compiler magic, but at the type level, arrays didn't actually implement the library traits until now.

fn second<C>(container: &C) -> &C::Output
where
    C: std::ops::Index<usize> + ?Sized,
{
    &container[1]
}

fn main() {
    let array: [i32; 3] = [1, 2, 3];
    assert_eq!(second(&array[..]), &2); // slices worked before
    assert_eq!(second(&array), &2); // now it also works directly
}

const value repetition for arrays

Arrays in Rust can be written either as a list [a, b, c] or a repetition [x; N]. For lengths N greater than one, repetition has only been allowed for xs that are Copy, and RFC 2203 sought to allow any const expression there. However, while that feature was unstable for arbitrary expressions, its implementation since Rust 1.38 accidentally allowed stable use of const values in array repetition.

fn main() {
    // This is not allowed, because `Option<Vec<i32>>` does not implement `Copy`:
    // let array: [Option<Vec<i32>>; 10] = [None; 10];

    const NONE: Option<Vec<i32>> = None;
    const EMPTY: Option<Vec<i32>> = Some(Vec::new());

    // However, repeating a `const` value is allowed!
    let nones = [NONE; 10];
    let empties = [EMPTY; 10];
}

In Rust 1.50, that stabilization is formally acknowledged. In the future, to avoid such "temporary" named constants, you can look forward to inline const expressions per RFC 2920.

Safe assignments to ManuallyDrop<T> union fields

Rust 1.49 made it possible to add ManuallyDrop<T> fields to a union as part of allowing Drop for unions at all. However, unions don't drop old values when a field is assigned, since they don't know which variant was formerly valid, so safe Rust previously limited this to Copy types only, which never Drop. Of course, ManuallyDrop<T> also doesn't need to Drop, so now Rust 1.50 allows safe assignments to these fields as well.
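
For example (a hypothetical union, not taken from the release notes), the assignment below needs no unsafe block on Rust 1.50, while reading the field and dropping the value still do:

use std::mem::ManuallyDrop;

#[allow(dead_code)]
union Slot {
    text: ManuallyDrop<String>,
    number: u32,
}

fn main() {
    let mut slot = Slot { number: 7 };
    // Safe since Rust 1.50: assigning never drops the old value, so there is
    // nothing for safe code to get wrong here.
    slot.text = ManuallyDrop::new(String::from("hello"));
    unsafe {
        // Reading a union field is still unsafe...
        println!("{}", &*slot.text);
        // ...and so is the manual drop we now owe, to avoid leaking the String.
        ManuallyDrop::drop(&mut slot.text);
    }
}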

A niche for File on Unix platforms

Some types in Rust have specific limitations on what is considered a valid value, which may not cover the entire range of possible memory values. We call any remaining invalid value a niche, and this space may be used for type layout optimizations. For example, in Rust 1.28 we introduced NonZero integer types (like NonZeroU8) where 0 is a niche, and this allowed Option<NonZero> to use 0 to represent None with no extra memory.

On Unix platforms, Rust's File is simply made of the system's integer file descriptor, and this happens to have a possible niche as well because it can never be -1! System calls which return a file descriptor use -1 to indicate that an error occurred (check errno) so it's never possible for -1 to be a real file descriptor. Starting in Rust 1.50 this niche is added to the type's definition so it can be used in layout optimizations too. It follows that Option<File> will now have the same size as File itself!
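
A quick way to check the layout claim yourself, assuming a Unix platform and Rust 1.50 or later, is to compare the sizes directly:

use std::fs::File;
use std::mem::size_of;
use std::num::NonZeroU8;

fn main() {
    println!("File:         {} bytes", size_of::<File>());
    println!("Option<File>: {} bytes", size_of::<Option<File>>());
    // The older NonZero example: 0 is the niche, so the Option costs nothing.
    assert_eq!(size_of::<Option<NonZeroU8>>(), size_of::<u8>());
    // On Unix with Rust 1.50+, -1 is File's niche, so Option<File> is free too.
    assert_eq!(size_of::<Option<File>>(), size_of::<File>());
}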

Library changes

In Rust 1.50.0, there are nine new stable functions:

And quite a few existing functions were made const:

See the detailed release notes to learn about other changes.

Other changes

There are other changes in the Rust 1.50.0 release: check out what changed in Rust, Cargo, and Clippy.

Contributors to 1.50.0

Many people came together to create Rust 1.50.0. We couldn't have done it without all of you. Thanks!

Mike Hoye: Text And Context

Wednesday 10th of February 2021 09:02:51 PM


This image is a reference to the four-square Drake template – originally Drake holding up a hand and turning away from something disapprovingly in the top half, while pointing favorably to something else in the lower half – featuring Xzibit rather than Drake, himself meme-famous for “yo dawg we heard you like cars, so we put a car in your car so you can drive while you drive”, to whose recursive nature this image is of course an homage. In the upper left panel, Xzibit is looking away disappointedly from the upper right, which contains a painting by Pieter Bruegel the Elder of the biblical Tower Of Babel. In the lower left, Xzibit is now looking favorably towards an image of another deeply nested meme.

This particular meme features the lead singer from Nickelback holding up a picture frame, a still from the video of their song “Photograph”. The “you know I had to do it to ’em” guy is in the distant background. Inside, the frame is cut in four by a two-axis graph, with “authoritarian/libertarian” on the Y axis and “economic-left/economic-right” on the X axis, overlaid with the words “young man, take the breadsticks and run, I said young man, man door hand hook car gun“, a play on both an old bit about bailing out of a bad conversation while stealing breadsticks, the lyrics to The Village People’s “YMCA”, and adding “gun” to the end of some sentence to shock its audience. These lyrics are arranged within those four quadrants in a visual reference to “loss.jpg”, a widely derided four-panel webcomic from 2008.

Taken as a whole the image is an oblique comment on the Biblical “Tower Of Babel” reference, specifically Genesis 11, in which “… the Lord said, Behold, the people is one, and they have all one language; and this they begin to do: and now nothing will be restrained from them, which they have imagined to do. Go to, let us go down, and there confound their language, that they may not understand one another’s speech” and the proliferation of deeply nested and frequently incomprehensible memes as a form of explicitly intra-generational communication.

So, yeah, there’s a lot going on in there.

I asked about using alt-text for captioning images like that in a few different forums the other day, to learn what the right thing is with respect to memes or jokes. If the image is the joke, is it useful (or expected) that the caption is written to try to deliver the joke, rather than be purely descriptive?

On the one hand, I’d expect you want the punchline to land, but I also want the caption to be usable and useful, and I assume that there are cultural assumptions and expectations in this space that I’m unaware of.

As intended, the question I asked wasn’t so much about “giving away” the punchline as it is about ensuring its delivery; either way you have to give away the joke, but does an image description phrased as a joke help, or hinder (or accidentally insult?) its intended audience?

I’m paraphrasing, but a few of the answers all said sort of the same useful and insightful thing: “The tool is the description of the image; the goal is to include people in the conversation. Use the tool to accomplish the goal.”

Which I kind of love.

And in what should have stopped surprising me ages ago but still consistently does, I was reminded that accessibility efforts support people far outside their intended audience. In this case, maybe that description makes the joke accessible to people who have perfectly good eyesight but haven’t been neck deep in memetics since they can-hazzed their first cheezeburgers and don’t quite know why this deep-fried, abstract level-nine metareference they’re seeing is hilarious.
