Store and share data safely with IBM Data Privacy Passports for developers
IBM Data Privacy Passports is a new data privacy and security solution that brings protection to the data itself. With its data-centric and policy-based protection, Data Privacy Passports builds on top of pervasive encryption by giving users a way to control how data is stored and shared. At any time, eligible data is protected and future access can be revoked using Data Privacy Passports — for data that originates on z15, as well as data from hybrid cloud environments.
Securing data can be complicated when data moves from platform to platform. As data travels, it needs to be on secure networks with secure protocols, and when that data reaches its destination, it needs to be secured again by that system. If data is moving across multiple networks to multiple destinations, each network and destination needs to be configured for consistent security. When a policy change requires that privileges be altered, each system needs to be adjusted individually. If each system is responsible for securing the data but one system fails to do so, that one system can compromise the entire chain. Data Privacy Passports is designed to solve this complex challenge.
With Data Privacy Passports, the individual fields of eligible data are protected, and this is done with the introduction of the Trusted Data Object (TDO). A TDO is encrypted and must be read through Data Privacy Passports (the current version only supports SQL structured data sources accessed via JDBC) in order to be decrypted into a usable format. Therefore, when data is protected as a TDO and moves between environments, the protection moves with it. This prevents complete reliance on the security of individual systems. With policies that are configurable at the user and group level, Data Privacy Passports provides the control to show different users different views of the same data based on that user’s need to know. Because of this, developers can write code using real data and data administrators can manage data warehouses, without seeing the same data. This allows for the integration of Data Privacy Passports into existing applications.
Andy Wingo: understanding webassembly code generation throughput
Greets! Today's article looks at browser WebAssembly implementations from a compiler throughput point of view. As I wrote in my article on Firefox's WebAssembly baseline compiler, web browsers have multiple wasm compilers: some that produce code fast, and some that produce fast code. Implementors are willing to pay the cost of having multiple compilers in order to satisfy these conflicting needs. So how well do they do their jobs? Why bother?
In this article, I'm going to take the simple path and just look at code generation throughput on a single chosen WebAssembly module. Think of it as X-ray diffraction to expose aspects of the inner structure of the WebAssembly implementations in SpiderMonkey (Firefox), V8 (Chrome), and JavaScriptCore (Safari).
[...]
I'll express results in nanoseconds per WebAssembly code byte. Of the 40 megabytes or so in the Zen Garden demo, only 23 891 164 bytes are actually function code; the rest is mostly static data (textures and so on). So I'll divide the total time by this code byte count.
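The metric described above is a plain division, sketched here in Python. The code-byte count is the one quoted in the excerpt; the function name and the one-second timing are hypothetical placeholders for illustration, not measurements from the article.

```python
# Size of the function-code portion of the Zen Garden demo, as quoted above.
CODE_BYTES = 23_891_164


def ns_per_code_byte(total_seconds, code_bytes=CODE_BYTES):
    """Convert a total compile time in seconds to nanoseconds per code byte."""
    return total_seconds * 1e9 / code_bytes


# Hypothetical total compile time of one second (an assumption, not a result).
example = ns_per_code_byte(1.0)
print(f"{example:.1f} ns per code byte")
```

A lower figure means a faster compiler. Dividing by code bytes rather than by the full module size keeps the static data (textures and so on) from skewing the comparison.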
Dirk Eddelbuettel: gettz 0.0.4
A minor routine update 0.0.4 of gettz arrived on CRAN overnight.
gettz provides a possible fallback in situations where Sys.timezone() fails to determine the system timezone. That happened when e.g. the file /etc/localtime somehow is not a link into the corresponding file with zoneinfo data in, say, /usr/share/zoneinfo. Since the package was written (in the fall of 2016), R added a similar extended heuristic approach itself.
This release adds registration of the compiled routine via R_registerRoutines() and R_useDynamicSymbols(), adds .registration=TRUE to useDynLib() in NAMESPACE, and uses an unquoted symbol in .Call(). Two new badges were added to the README.md as well. And as in the previous release in 2016: No new code, or new features.
How I containerize a build system
A build system comprises the tools and processes used to transition from source code to a running application. This transition also involves changing the code's audience from the software developer to the end user, whether the end user is a colleague in operations or a deployment system.
After creating a few build systems using containers, I think I have a decent, repeatable approach that's worth sharing. These build systems were used for generating loadable software images for embedded hardware and compiling machine learning algorithms, but the approach is abstract enough to be used in any container-based build system.
This approach is about creating or organizing the build system in a way that makes it easy to use and maintain. It's not about the tricks needed to deal with containerizing any particular software compilers or tools. It applies to the common use case of software developers building software to hand off a maintainable image to other technical users (whether they are sysadmins, DevOps engineers, or some other title). The build system is abstracted away from the end users so that they can focus on the software.
Using OpenBSD Relayd to Block Bad Robots
Unfortunately, there are a good number of bad web crawlers, also known as bots. Well-behaved robots first query for the existence of a robots.txt file and if their user agent string is in this file, disconnect and go away. As it so happens, the Chinese bots don’t obey the robots.txt file and a few of the little ones do not as well. Given the pervasive cyber crime that the Chinese engage in, all known Chinese IP addresses are simply blocked at the firewall as well as some other countries that engage in nefarious activities. OpenBSD has a powerful tool to deal with these guys and it is called relayd. Relayd is most used as proxy software but it can do many powerful things. Here is the configuration file that I use for my website, sanitized of course.
Dirk Eddelbuettel: RcppArmadillo 0.9.860.2.0
Armadillo is a powerful and expressive C++ template library for linear algebra aiming towards a good balance between speed and ease of use with a syntax deliberately close to Matlab. RcppArmadillo integrates this library with the R environment and language–and is widely used by (currently) 706 other packages on CRAN.
A new upstream release 9.860.2 of Armadillo was just released. The theme of “convergence” continues; the previous release increased the minor from 800 to 850, now we are at 860. We first wrapped this up as version 0.9.859.1.0, but it turned out to have been held back by a buglet between R 4.0.0 and Rcpp which the recent patch release fixed (along with other woes on old R or non-CRAN-alike macOS). It then turns out that the new (upstream) version 9.860.1 had a minor bug which I missed as I reverse-depends checked the prior version. Doh. My thanks, as always, to CRAN for spotting this. The fix was added upstream and we have 9.860.2 as RcppArmadillo 0.9.860.2.0.
Introduction to R and RStudio for Data Science
This is a short introductory training session on the use of R in data science.
R is a statistical programming language that can be used for data manipulation, visualisation of data and statistical analysis. The R language consists of a set of tokens and keywords and a grammar that you can use to explore and understand data from many different sources.
We focus on a common task in data science: import a data set, manipulate its structure, and then visualise the data. We shall use R and RStudio to accomplish this task.
RStudio is an integrated development environment (IDE) that can be used to carry out data science tasks using R. It contains an editor for R scripts, a console to interact directly with the R interpreter, and a file manager similar to that available in your operating system.
This is an interactive training session, so you should try to follow along with the tutorial.
Emmanuel Kasper: Recommended keyboard settings for Productivity and Usability, for European Programmers
If you’re working on Unix / Linux, or C based programming languages, it can make sense to switch to the qwerty(us) keyboard layout. Why?
Unix, C, Perl, Java, and most programming languages were conceived on QWERTY keyboards.
So when the designers chose special characters to use for the language syntax, they simply chose what was easy to access on their own keyboard. This has been historically documented for the vi editor.
To give an example, using a Unix shell you have to type the dot . and slash / symbols quite often to navigate the filesystem. The two keys producing these symbols are nicely aligned on a QWERTY layout and do not require a key combination to be entered. So you can quickly enter something like ‘../..’ using a single hand.
Now using a QWERTZ layout, like in Germany / Austria, you have the ‘.’ symbol easily accessible, but you need to combine two keys ( Shift + 7 ) to get a ‘/’.
And if you are a poor soul using an AZERTY layout, you need a key combo every time to get either the ‘.’ or the ‘/’ symbol.
The need for key combos is bad not only for speed (multiple keys to look up) but also for usability, as you have to stretch your fingers to reach the key if using a single hand, which can provoke repetitive strain injury. You might be smiling but this is commonly known amongst Emacs users, due to the prominent use of commands using Ctrl and Alt combos, and it led to the creation of an Emacs Ergonomic wiki.
2020.15 An eASTer Surprise
Jonathan Worthington tweeted that they finally found the time and the voice to record the presentation they had planned for the German Perl and Raku Workshop. You can watch the video, look through the slides, or both. It basically touches on these four subjects...
EuroPython 2020: Talk voting is open
The talk voting page lists all submitted proposals, including talks, helpdesks and posters. The proposals are sorted in random order.
Using Twisted to Massively Parallelize Web Clients
The Twisted Requests (treq) package is an HTTP client built on the popular Twisted library that is used for web requests. Async libraries offer the ability to do large amounts of network requests in parallel with relatively little CPU impact. This can be useful in HTTP clients that need to make several requests before they have all the information they need.
This post shows an example of a problem like this, and how to solve it using treq.
I enjoy playing the real-time strategy game Clash Royale. Clash Royale is a mobile strategy player-vs-player game where players play cards in an arena to win. Each card has different strengths and weaknesses, and different players prefer different cards. Clash Royale remembers which card a player plays the most; this is their "favorite" card. Players come together in clans where they can help each other. Supercell, Clash Royale's developer, released an HTTP-based API where different statistics can be queried.
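The idea of issuing many requests concurrently and collecting the answers can be sketched without any network access. Note the swap: this sketch uses asyncio from the Python standard library rather than Twisted and treq, so that it runs without extra packages. The function names, the player tags, and the returned fields are hypothetical stand-ins, not the Clash Royale API.

```python
import asyncio


async def fake_fetch(player_tag, delay=0.05):
    """Stand-in for one HTTP request; it sleeps instead of using the network."""
    await asyncio.sleep(delay)
    # Hypothetical response shape, for illustration only.
    return {"tag": player_tag, "favorite_card": f"card-for-{player_tag}"}


async def fetch_all(tags):
    # gather() starts every coroutine before waiting on any of them,
    # and returns the results in the same order as the inputs.
    return await asyncio.gather(*(fake_fetch(tag) for tag in tags))


results = asyncio.run(fetch_all(["A", "B", "C"]))
for entry in results:
    print(entry["tag"], entry["favorite_card"])
```

Because the waits overlap, the three simulated requests take roughly as long as one. That overlap is what an async client is after when it has to make several requests before it has all the information it needs.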
An Overview of Profiling Tools for Python
What does it mean to profile one's code? The main idea behind benchmarking or profiling is to figure out how fast your code executes and where the bottlenecks are. The main reason to do this sort of thing is for optimization. You will run into situations where you need your code to run faster because your business needs have changed. When this happens, you will need to figure out which parts of your code are slowing it down.
This article will only cover how to profile your code using a variety of tools. It will not go into actually optimizing your code. Let’s get started!
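As a small illustration of finding where time goes, here is a minimal sketch using cProfile and pstats from the standard library. The excerpt does not say which tools the article covers, so the choice of cProfile is an assumption, and slow_sum and workload are hypothetical functions made up for the example.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so it shows up in the profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def workload():
    # Call the slow function repeatedly to give the profiler something to see.
    return [slow_sum(20_000) for _ in range(20)]


# Record only the workload call.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Write the report to a string, sorted by cumulative time, top five entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists call counts and per-function timings, which is usually enough to see which function deserves attention first.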
