Programming Leftovers

  • Joey Hess: 2020 hindsight

    Ten years ago, I'd been in an increasingly stale job for several years too long. I was tired of living in the city, and had a yurt as a weekend relief valve. I had the feeling a big change was coming.

    Four months on and I quit my job, despite the ongoing financial crisis making prospects poor for other employment, especially work on free software.

    I tried to start a business, Branchable, with liw, based on my earlier ikiwiki project, but it never really took off. However, I'm proud it's still serving the users it did find, 10 years later.

    Then, through luck and connections, I found a patch of land in a blank spot in the map with the most absurd rent ever ($5/acre/month). It had a house on it, no running water, barely solar power, a phone line, no cell service or internet, total privacy.

    This proved very inspiring. Once again I was hauling water, chopping wood, poking at web pages on the other end of a dialup modem. Just like it was 2000 again. Now I was also hacking by lantern-light until the ancient batteries got so depleted I could hear the voltage regulator crackle with every surge of CPU activity.

  • Introduction to the Unicode Collation Algorithm

    Programmers love to sort things. We discuss sorting algorithms, big-O notation, when sorting pointers or values is better, parallelism, whether being able to discuss sorting algorithms and big-O notation makes you a better programmer or not… but today, I'm going to be talking about the comparison function. Usually, we sort of take it for granted, even when sorting strings, but for anything but the most trivial examples, the standard memcmp()-like algorithm (lexical comparison of bytes) will produce undesired results. What we need is the Unicode Collation Algorithm, or UCA.

    A word of warning first: Superficial knowledge of Unicode and collations gives a high risk of being a loud and boring person who wants to flaunt their own superficial knowledge of Unicode and collation. Don't be that guy.

    With that out of the way… Let's first discuss what we want. We want a universal and consistent way of comparing strings that matches what users intuitively expect. (Note that comparison includes both “is before” and “is equal”.) Of course, different users expect different things, so we must also be able to parametrize the algorithm (so-called tailoring), but I won't be talking much about it.

    Just comparing Unicode code points (which, for UTF-8, is exactly the same thing as comparing bytes) will inevitably end in disaster. For instance, most users will accept that “René” should sort before “Renoir”, but é is U+00E9 and o is U+006F, so you'd get the opposite order. Similarly, even in a case-sensitive collation, where “linux” and “Linux” are unequal, they should probably sort together (i.e., they should not be split by “Windows”, even though W is between L and l). (Don't think about hacks like NFD normalization, removing accents or folding case before sorting, because you're not likely to get it right. Stick with the UCA.)
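
    The failure described above is easy to reproduce. The following is a minimal Python sketch, not part of the original post: it sorts the same example words by code point (Python's default for strings) and by UTF-8 bytes, and shows that both give the order users do not want. The variable names are illustrative only, and the sketch does not implement the UCA itself.

```python
# Sorting by code point (Python's default for str) and by UTF-8 bytes.
# Neither is the UCA; this only demonstrates the problem.
names = ["Ren\u00e9", "Renoir"]  # "\u00e9" is the precomposed é (U+00E9)

by_code_point = sorted(names)
by_utf8_bytes = sorted(names, key=lambda s: s.encode("utf-8"))

# Case example: W (U+0057) falls between L (U+004C) and l (U+006C).
mixed_sorted = sorted(["linux", "Windows", "Linux"])

print(by_code_point)   # ['Renoir', 'René']
print(by_utf8_bytes)   # ['Renoir', 'René'] (same order as code points)
print(mixed_sorted)    # ['Linux', 'Windows', 'linux']
```

    In both cases “René” lands after “Renoir”, and “Windows” splits “Linux” from “linux”, which is exactly the behaviour a proper collation is meant to avoid.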

  • Google's IREE To Demonstrate Machine Learning Via Vulkan With MLIR

    One of the new open-source compiler IR advancements of 2019 has been Google/TensorFlow's MLIR, the Multi-Level Intermediate Representation designed for machine learning models and frameworks. With Google's "IREE" project, MLIR can be accelerated by Vulkan, allowing machine learning via this high-performance graphics/compute API.

    MLIR is becoming an LLVM sub-project and has growing industry support as a machine learning IR. Google's IREE is an experimental execution environment for MLIR that makes use of modern hardware acceleration APIs; in other words, it gets MLIR running on the likes of Vulkan and other hardware abstraction layers. IREE also has a CPU interpreter for running on traditional x86/ARM CPUs.

  • Playing around with XOR (my fav operation) during the holidays
  • Python Data Weekly Roundup – Dec 27 2019

  • Coding from Russia’s countryside: A group of Moscow programmers has launched a crowdfunded project to bring metropolitan expertise to remote towns

    Several times a year, “Kruzhok” programmers from Moscow visit towns and villages to hold free coding workshops for local teenagers. Before finishing these lessons, students create websites where they share videos and photographs of their hometowns, describing life in Russia’s countryside. As it’s grown more popular, the Kruzhok project has also become more diverse. It’s not just programmers going into towns anymore; there are now musicians, architects, journalists, and astronomers. Meduza explains how professionals sick of the “Moscow bubble” are using their fatigue to fuel an effort to help young people in Russia’s remote regions.


More in Tux Machines

today's leftovers

  • Adam Young: Building Linux tip-of-tree on an Ampere based system

    I have an Ampere Altra-Max/INGRASYS Yushan Server System running CentOS 8 Stream.

  • How to Kill a Process in Linux Command Line

    It has been an awesome day on your Linux system, and suddenly a process starts to slow down the whole computer. It is not that important, and you want to stop its execution.

  • SWO: An ARM Printf By Any Other Name

    I’ll confess. Although printf-style debugging has a bad rep, I find myself turning to it on occasion. Sure, printf is expensive and brings in a lot of code, but if you have the space and time to use it while debugging you can always remove it before you are finished. However, what if you don’t have an output device or you are using it for something else? If you are using most modern ARM chips, you have another option — a dedicated output channel that is used for several things, including debugging output. I decided I wanted to try that on the Blackpill running mbed, and found out it isn’t as easy as you might think. But it is possible, and when you are done reading, you’ll be able to do it, too.

Proprietary and Microsoft Leftovers

  • Data ordering attacks | Light Blue Touchpaper

    Most deep neural networks are trained by stochastic gradient descent. Now “stochastic” is a fancy Greek word for “random”; it means that the training data are fed into the model in random order. So what happens if the bad guys can cause the order to be not random? You guessed it: all bets are off. Suppose for example a company or a country wanted to have a credit-scoring system that’s secretly sexist, but still be able to pretend that its training was actually fair. Well, they could assemble a set of financial data that was representative of the whole population, but start the model’s training on ten rich men and ten poor women drawn from that set, then let initialisation bias do the rest of the work.
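
    Why ordering matters at all can be seen in a toy Python sketch. It is not the attack from the post, only an illustration under simple assumptions: a single parameter is fitted by one pass of SGD over the same data set in three different orders. The function `sgd_mean` and all values are hypothetical, chosen for illustration.

```python
import random

def sgd_mean(samples, lr=0.1, w=0.5):
    """One pass of SGD on the loss (w - x)^2 / 2, one sample at a time."""
    for x in samples:
        w -= lr * (w - x)  # gradient step towards the current sample
    return w

# Same data in every run: fifty 0.0s and fifty 1.0s, true mean 0.5.
data = [0.0] * 50 + [1.0] * 50

shuffled = data[:]
random.Random(0).shuffle(shuffled)  # fixed seed, for repeatability

w_shuffled = sgd_mean(shuffled)                      # random order
w_zeros_first = sgd_mean(sorted(data))               # all 0.0s, then all 1.0s
w_ones_first = sgd_mean(sorted(data, reverse=True))  # all 1.0s, then all 0.0s

print(w_shuffled, w_zeros_first, w_ones_first)
```

    With a constant learning rate the two sorted runs end close to 1 and close to 0 respectively, although the data are identical. In this convex toy the most recent samples dominate; the post's point about early batches concerns deep networks, where the first updates can shape what the model learns afterwards.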

  • FTC fines Twitter $150M for using 2FA info for targeted advertising [Ed: "2FA" is very often just fake security; there are many technical issues with it as well, set aside privacy issue]
  • FTC Politely Asks Education Companies If They Would Maybe Stop Spying On Kids

    If you hadn’t noticed, the U.S. doesn’t give much of a shit about this whole privacy thing. Our privacy regulators are comically and intentionally understaffed and underfunded, we still have no meaningful privacy law for the Internet era, and when regulators do act, it’s generally months after the fact with penalties that are easily laughed off by companies rich from data over-collection.

  • DuckDuckGo’s “agreement with Microsoft” allows trackers to bypass privacy settings of DuckDuckGo “Privacy Browser”. – BaronHK's Rants

    DuckDuckGo’s “agreement with Microsoft” allows trackers to bypass privacy settings of DuckDuckGo “Privacy Browser”. Why would they block them? They’re hosted on Microsoft Azure and get their search results from Microsoft Bing. DuckDuckGo is just a thrall of Microsoft, and it lets them sell their products while hiding who they really are, from people who know that the Microsoft brand is toxic.

  • Warning: You should stop using Tails Linux NOW! [Ed: Brian Fagioli with another idiotic clickbait like Microsoft media operatives I saw hours ago; lots of actively exploited holes in Microsoft products are revealed by the dozens each day by CISA, so a distraction is sorely needed]
  • Blizzard: No Piracy Filters? That's Evidence of Intentional Infringement

    A recent DMCA notice sent by Blizzard to Github demands the takedown of an avatar depicting the gaming company's character 'Chef Nomi'. While legally sound up to this point, Blizzard's notice goes on to inform the coding platform that its failure to deploy piracy filtering technologies is "evidence of intentional facilitation of copyright infringement." In Github's case? Not even close.

Android Leftovers

Garuda Linux: All-Rounder Distro Based on Arch Linux

A review of the Arch Linux based Garuda Linux, which brings a collection of desktop environments, window managers, and tools for general users and gamers.