  1. eBPF for Advanced Linux Infrastructure Monitoring

    A year has passed since the pandemic left us spending the better part of our days sheltering inside our homes. It has been a challenging time for developers, sysadmins, and entire IT teams, who had to monitor and troubleshoot an influx of data within their systems and infrastructures as the world was forced online. Free, open-source technologies like Linux have become increasingly attractive for this work, especially amongst Ops professionals and sysadmins in charge of maintaining growing and complex environments. Engineers are also using more open-source technologies, largely because of the flexibility and openness they offer compared with commercial offerings that come with high prices and stringent feature lock-ins.

    One emerging technology in particular, eBPF, has made its appearance in multiple projects, including commercial and open-source offerings. Before discussing more about the community surrounding eBPF and its growth during the pandemic, it’s important to understand what it is and how it’s being utilized. eBPF, or extended Berkeley Packet Filter, was originally introduced as BPF back in 1992 in a paper by Lawrence Berkeley Laboratory researchers as a rule-based mechanism to filter and capture network packets. Filters would be implemented to run inside a register-based Virtual Machine (VM), which itself would exist inside the kernel. After several years of inactivity, BPF was extended to eBPF, featuring a full-blown VM to run small programs inside the Linux Kernel. Since these programs run from inside the Kernel, they can be attached to a particular code path and executed whenever it is traversed, making them well suited to applications for packet filtering, performance analysis, and monitoring.

    Originally, it was not easy to create eBPF programs, as the programmer needed to know an extremely low-level language. However, the community around that technology has evolved considerably through their creation of tools and libraries to simplify and speed up the process of developing and loading an eBPF program inside the Kernel. This was crucial for creating a large number of tools that can trace system and application activity down to a very granular level. The image that follows demonstrates this, showing the sheer number of tools that exist to trace various parts of the Linux stack.
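
    As a rough illustration of how such tooling hides the low-level details, the sketch below uses bpftrace, which is only one possible front end and is not named in the text above. It attaches a tiny program to the openat() system call tracepoint and counts calls per process name. The helper name count_opens is hypothetical, and the sketch assumes bpftrace is installed and that the user can obtain root privileges.

```shell
#!/bin/sh
# A one-line eBPF program in bpftrace syntax: every time the openat()
# tracepoint is hit, increment a counter keyed by the process name.
OPENAT_PROBE='tracepoint:syscalls:sys_enter_openat { @opens[comm] = count(); }'

# count_opens (hypothetical helper): load the program into the kernel and
# keep counting until interrupted with Ctrl-C, when bpftrace prints the map.
count_opens() {
    # Loading eBPF programs requires root privileges.
    sudo bpftrace -e "$OPENAT_PROBE"
}
```

    Calling count_opens and pressing Ctrl-C after a few seconds should print one counter per process that opened files in the meantime.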

  2. How to set up a CrowdSec multi-server installation


    CrowdSec is an open-source & collaborative security solution built to secure Internet-exposed Linux services, servers, containers, or virtual machines with a server-side agent. It is a modernized version of Fail2ban which was a great source of inspiration to the project founders.

    CrowdSec is free (under an MIT License) and its source code is available on GitHub. The solution leverages a log-based IP behavior analysis engine to detect attacks. When the CrowdSec agent detects any aggression, it offers different types of remediation to deal with the IP behind it (access prohibition, captcha, 2FA, etc.). The report is curated by the platform and, if legitimate, shared across the CrowdSec community so users can also protect their assets from this IP address.

    A few months ago, we added some interesting features to CrowdSec when releasing v1.0.x. One of the most exciting ones is the ability of the CrowdSec agent to act as an HTTP REST API to collect signals from other CrowdSec agents. Thus, it is the responsibility of this special agent to store and share the collected signals. We will call this special agent the LAPI server from now on.

    Another noteworthy feature is that mitigation no longer has to take place on the same server as detection. Mitigation is done using bouncers. Bouncers rely on the HTTP REST API served by the LAPI server.


    In this article we’ll describe how to deploy CrowdSec in a multi-server setup with one server sharing signals.

    CrowdSec Goals Infographic

    Both server-2 and server-3 are meant to host services. You can take a look at our Hub to see which services CrowdSec can help you secure. Last but not least, server-1 is meant to host the following local services:

    • the local API needed by bouncers

    • the database fed by both the three local CrowdSec agents and the online CrowdSec blocklist service. As server-1 is serving the local API, we will call it the LAPI server.

    We chose a PostgreSQL backend for the CrowdSec database in order to allow high availability. This topic will be covered in future posts. If you are OK with no high availability, you can skip step 2.

  3. Develop a Linux command-line Tool to Track and Plot Covid-19 Stats

    It’s been over a year and we are still fighting the pandemic in almost every aspect of our lives. Thanks to technology, we have various tools and mechanisms to track Covid-19 related metrics, which help us make informed decisions. This introductory-level tutorial discusses developing one such tool from scratch, using only the Linux command line.

    We will start by introducing the most important parts of the tool: the APIs and the commands. We will be using 2 APIs for our tool (the COVID19 API and the Quickchart API) and 2 key commands (curl and jq). In simple terms, the curl command is used for data transfer and the jq command to process JSON data.

    The complete tool can be broken down into 2 key steps:

    1. Fetching (GET request) data from the COVID19 API and piping the JSON output to jq to extract only the global data (or, similarly, country-specific data).

    $ curl -s --location --request GET 'https://api.covid19api.com/summary' | jq -r '.Global'
    {
      "NewConfirmed": 561661,
      "TotalConfirmed": 136069313,
      "NewDeaths": 8077,
      "TotalDeaths": 2937292,
      "NewRecovered": 487901,
      "TotalRecovered": 77585186,
      "Date": "2021-04-13T02:28:22.158Z"
    }

    2. Storing the output of step 1 in variables and calling the Quickchart API with those variables to plot a chart, then piping the JSON output to jq to extract only the link to our chart.

    $ curl -s -X POST \
           -H 'Content-Type: application/json' \
           -d '{"chart": {"type": "bar", "data": {"labels": ["NewConfirmed ('''${newConf}''')", "TotalConfirmed ('''${totConf}''')", "NewDeaths ('''${newDeath}''')", "TotalDeaths ('''${totDeath}''')", "NewRecovered ('''${newRecover}''')", "TotalRecovered ('''${totRecover}''')"], "datasets": [{"label": "Global Covid-19 Stats ('''${datetime}''')", "data": ['''${newConf}''', '''${totConf}''', '''${newDeath}''', '''${totDeath}''', '''${newRecover}''', '''${totRecover}''']}]}}}' \
           https://quickchart.io/chart/create | jq -r '.url'
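
    The request above relies on shell variables such as newConf and datetime already holding the values from step 1. A minimal sketch of that intermediate step might look like the following. It assumes jq is installed, uses a hard-coded sample payload (copied from the step 1 output) in place of the live API response, and introduces a hypothetical helper named get_global.

```shell
#!/bin/sh
# Sample payload standing in for the live response of
#   curl -s 'https://api.covid19api.com/summary'
# (values copied from the step 1 output; replace with the real call as needed).
summary='{"Global":{"NewConfirmed":561661,"TotalConfirmed":136069313,"NewDeaths":8077,"TotalDeaths":2937292,"NewRecovered":487901,"TotalRecovered":77585186,"Date":"2021-04-13T02:28:22.158Z"}}'

# get_global FIELD (hypothetical helper): print one field of the "Global" object.
get_global() {
    printf '%s' "$summary" | jq -r --arg field "$1" '.Global[$field]'
}

# Variable names match the ones used in the Quickchart request.
newConf=$(get_global NewConfirmed)
totConf=$(get_global TotalConfirmed)
newDeath=$(get_global NewDeaths)
totDeath=$(get_global TotalDeaths)
newRecover=$(get_global NewRecovered)
totRecover=$(get_global TotalRecovered)
datetime=$(get_global Date)
```

    Fetching the summary once and reading every field from the stored copy avoids calling the API seven times.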

    That’s it! Now we have our data plotted out in a chart:

    LJ Global-Stats-Track-And-Plot-Covid19-Stats

  4. LibrePlanet 2021 Free Software Conference

    On Saturday and Sunday, March 20th and 21st, 2021, free software supporters from all over the world will log in to share knowledge and experiences, and to socialize with others within the free software community. This year’s theme is “Empowering Users,” and the keynote speakers will be Julia Reda, Nathan Freitas, and Nadya Peek. Free Software Foundation (FSF) associate members and students attend gratis at the Supporter level.

    You can see the schedule and learn more about the conference at https://libreplanet.org/2021/, and participants are encouraged to register in advance at https://u.fsf.org/lp21-sp

    The conference will also include workshops, community-submitted five-minute Lightning Talks, Birds of a Feather (BoF) sessions, and an interactive “exhibitor hall” and “hallway” for socializing.

  5. weLees Visual LVM Manager

    Maintenance of the storage system is a daily job for system administrators. Linux provides users with a wealth of storage capabilities and powerful built-in maintenance tools. However, these tools are hardly friendly to system administrators, and considerable effort is generally required to master them.

    As the built-in Linux storage model, LVM provides users with plenty of flexible management modes to fit various needs. For users who can fully utilize its functions, LVM can meet almost all needs. But the premise is a thorough understanding of the LVM model and of dozens of commands with their accompanying parameters.

    A graphical interface would dramatically simplify both learning and operating LVM, in a similar way to the partition tools that are widely used on Windows/Linux platforms. Although scripted commands are suitable for daily, automated tasks, scripts cannot handle all LVM functions. For instance, many tasks still require manual calculation and processing.

    Significant effort has been spent on this problem. Nowadays, several graphical LVM management tools are already available on the Internet; some of them are built into Linux distributions and others are developed by third parties. But there remains a critical problem: the needs of remote machines and headless servers are completely ignored.

    This is now solved by Visual LVM Remote. The front end of this tool is built on the HTTP protocol. With any smart device that can connect to the storage server, users can perform management operations.

    Visual LVM is developed by weLees Corporation and supports all Linux distributions. In addition to working with remote/headless servers, it also supports more advanced features of LVM than various off-the-shelf graphical LVM management tools.

    Dependencies of Visual LVM Remote

    Visual LVM Remote can work on any Linux distribution that includes the two components below:

    • LVM2

    • libstdc++.so

    UI of Visual LVM Remote

    With a concise UI, partitions/physical volumes/logical volumes are displayed by disk layout. At a glance, disk/volume group information can be obtained immediately. In addition, detailed information about an object is displayed in the information bar below when the mouse hovers over it.

  6. Nvidia Linux Drivers

    The recent fiasco with Nvidia trying to block Hardware Unboxed from future GPU review samples for the content of their review is one example of how they choose to play this game. This hatred is not only shared by reviewers, but also developers and especially Linux users.

    The infamous Torvalds videos still traverse the web today as Nvidia conjures up another evil plan to suck up more of your money and market share. This is not just a one-off case; oh how much I wish it was. I just want my computer to work.

    If anyone has used Sway-WM with an Nvidia GPU, I’m sure they would remember the --my-next-gpu-wont-be-nvidia option.

    These are a few examples of many.

    The Nvidia Linux drivers have never been good but whatever has been happening at Nvidia for the past decade has to stop today. The topic in question today is this bug: [https://forums.developer.nvidia.com/t/bug-report-455-23-04-kernel-panic-due-to-null-pointer-dereference]

    This bug causes hard irrecoverable crashes from driver 440+. This issue is still happening 5+ months later with no end in sight. At first, users could work around this by using an older DKMS driver along with an LTS kernel. However, today this is no longer possible. Many distributions of Linux are now dropping the old kernels. DKMS cannot build. The users are now FORCED into this “choice”:

    {Use an older driver and risk security implications} or {“use” the new drivers that cause random irrecoverable crashes.}

    This issue is only going to get more and more prevalent as the kernel is a core dependency by definition. This is just another example of the implications of an unsafe older kernel causing issues for users: https://archlinux.org/news/moving-to-zstandard-images-by-default-on-mkinitcpio/

    If you use Linux or care about the implications of a GPU monopoly, consider AMD. Nvidia is already rearing its ugly head and AMD is actually putting up a fight this year.

  7. Parallel Shells With xargs Unix


    One particular frustration with the UNIX shell is the inability to easily schedule multiple, concurrent tasks that fully utilize the CPU cores present on modern systems. The example of focus in this article is file compression, but the problem arises with many computationally intensive tasks, such as image/audio/media processing, password cracking and hash analysis, database Extract, Transform, and Load, and backup activities. It is understandably frustrating to wait for gzip * running on a single CPU core, while most of a machine's processing power lies idle.

    This can be understood as a weakness of the first decade of Research UNIX which was not developed on machines with SMP. The Bourne shell did not emerge from the 7th edition with any native syntax or controls for cohesively managing the resource consumption of background processes.

    Utilities have haphazardly evolved to perform some of these functions. The GNU version of xargs is able to exercise some primitive control in allocating background processes, which is discussed at some length in the documentation. While the GNU extensions to xargs have proliferated to many other implementations (notably BusyBox, including the release for Microsoft Windows, example below), they are not POSIX.2-compliant, and likely will not be found on commercial UNIX.

    Historic users of xargs will remember it as a useful tool for directories that contained too many files for echo * or other wildcards to be used; in this situation xargs is called to repeatedly batch groups of files with a single command. As xargs has evolved beyond POSIX, it has assumed a new relevance which is useful to explore.
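
    To make the contrast with a single-core gzip * concrete, here is a minimal sketch of the parallel approach. It assumes an xargs that supports the non-POSIX -P (maximum processes) and -0 options, such as GNU or BusyBox xargs, and a find that supports -print0. The function name compress_parallel and its default of 4 processes are hypothetical choices for illustration.

```shell
#!/bin/sh
# compress_parallel DIR [MAX_PROCS] (hypothetical helper):
# gzip every regular, not yet compressed file under DIR, running up to
# MAX_PROCS gzip processes at the same time.
compress_parallel() {
    dir=$1
    max_procs=${2:-4}
    # -print0 / -0 keep file names with spaces or newlines intact.
    # -n 1 hands one file to each gzip; -P caps the concurrent processes.
    find "$dir" -type f ! -name '*.gz' -print0 |
        xargs -0 -n 1 -P "$max_procs" gzip
}
```

    For example, compress_parallel /var/tmp/logs 8 would keep up to eight gzip processes busy, whereas gzip * handles the files one after another on a single core.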

    Why is POSIX.2 this bad?

    A clear understanding of the lack of cohesive job scheduling in UNIX requires some history of the evolution of these utilities.

  8. Bypassing Deep Packet Inspection

    In some countries, network operators employ deep packet inspection techniques to block certain types of traffic. For example, Virtual Private Network (VPN) traffic can be analyzed and blocked to prevent users from sending encrypted packets over such networks.

    By observing that HTTPS works all over the world (it is configured on an extremely large number of web servers) and cannot be easily analyzed (the payload is usually encrypted), we argue that VPN tunneling can be organized in the same manner: by masquerading the VPN traffic with TLS, or its older version SSL, we can build a reliable and secure network. Packets sent over such tunnels can cross multiple domains that have various (strict and not so strict) security policies. Although SSH could potentially be used to build such a network, we have evidence that in certain countries connections made over such tunnels are analyzed statistically: if the network utilization of such tunnels is high, bursts exist, or connections are long-lived, then the underlying TCP connections are reset by network operators.

    Thus, we make an experimental effort in this direction: first, we describe different VPN solutions that exist on the Internet; second, we describe our experimental work with Python-based software on Linux, which allows users to create VPN tunnels using the TLS protocol and to send small office/home office (SOHO) traffic through such tunnels.


    Virtual private networks (VPNs) are crucial in the modern era. By encapsulating a client’s traffic and sending it inside protected tunnels, users can obtain network services that would otherwise be blocked by a network operator. VPN solutions are also useful when accessing a company’s intranet. For example, corporate employees can access the internal network in a secure way by establishing a VPN connection and directing all traffic through the tunnel towards the corporate network. This way they can get services that would otherwise be impossible to reach from the outside world.


    There are various solutions that can be used to build VPNs. One example is the Host Identity Protocol (HIP) [7]. HIP is a layer 3.5 solution (it is in fact located between the transport and network layers) and was originally designed to split the dual role of IP addresses as identifier and locator. For example, a company called Tempered Networks uses HIP to build secure networks (for sampling see [4]).

  9. Knapsack Pro Ruby JavaScript Tests

    Automated tests are part of many programming projects and help ensure the software works as intended. The bigger the project, the larger the test suite can be. This can result in automated tests taking a lot of time to run. In this article you will learn how to run automated tests faster with parallel Continuous Integration (CI) machines and what problems can be encountered. The article covers common parallel testing problems, based on Ruby & JavaScript tests.

    Slow automated tests

    Automated tests can be considered slow when programmers stop running the whole test suite on their local machine because it is too time consuming. Most of the time you use CI servers such as Jenkins, CircleCI, or GitHub Actions to run your tests on an external machine instead of your own. When you have a test suite that runs for an hour, it’s not efficient to run it on your computer. Browser end-to-end tests for your web project can take a really long time to execute. Running tests on a CI server for an hour is also not efficient. You as a developer need a fast feedback loop to know if your software works fine. Automated tests should help you with that.

    Split tests between many CI machines to save time

    A way to save time is to make the CI build as fast as possible. When your tests take, for example, 1 hour to run, you could use your CI server config to set up parallel jobs (parallel CI machines/nodes). Each of the parallel jobs can run a chunk of the test suite.

    You need to divide your tests between parallel CI machines. When you have a 60-minute test suite, you can run 20 parallel jobs where each job runs a small set of tests, and this should save you time. In an optimal scenario you would run tests for 3 minutes per job (60 minutes / 20 jobs).

    How to make sure each job runs for 3 minutes? As a first step you can apply a simple solution: sort all of your test files alphabetically and divide them evenly among the parallel jobs. However, each test file can have a different execution time, depending on how many test cases it contains and how complex each test case is, so you can end up with test files divided in a suboptimal way, and this is problematic. The image below illustrates a suboptimal split of tests between parallel CI jobs where one job runs too many tests and ends up being a bottleneck.
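
    The simple alphabetical split described above can be sketched in a few lines of shell. This is only an illustration of the naive approach, not how any particular CI service or tool implements it. The function name split_tests, the zero-based node index, and the file names in the usage note are all hypothetical.

```shell
#!/bin/sh
# split_tests NODE_INDEX NODE_TOTAL (hypothetical helper):
# read test file paths on stdin, sort them alphabetically, and print the
# contiguous slice that belongs to the given zero-based node index.
split_tests() {
    node_index=$1
    node_total=$2
    sort | awk -v idx="$node_index" -v total="$node_total" '
        { files[NR] = $0 }
        END {
            # Ceiling division so that every file lands in some chunk.
            chunk = int((NR + total - 1) / total)
            first = idx * chunk + 1
            last  = first + chunk - 1
            if (last > NR) last = NR
            for (i = first; i <= last; i++) print files[i]
        }'
}
```

    On a CI node this might be used as find spec -name '*_spec.rb' | split_tests 0 20 to list the files for the first of 20 jobs. The split balances the number of files per job, not their run time, which is exactly why one job can still end up as the bottleneck.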

  10. KISS Framework

    Perhaps the most popular platform for applications is the web. There are many reasons for this including portability across platforms, no need to update the program, data backup, sharing data with others, and many more. This popularity has driven many of us to the platform.

    Unfortunately, the platform is a bit complex. Rather than developing in a particular environment, with web applications it is necessary to create two halves of a program utilizing vastly different technologies. On top of that, there are many additional challenges such as the communications and security between the two halves.

    A typical web application would include all of the following building blocks:

    1. Front-end layout (HTML/CSS)
    2. Front-end functionality (JavaScript)
    3. Back-end server code (Java, C#, etc.)
    4. Communications (REST, etc.)
    5. Authentication
    6. Data persistence (SQL, etc.)

    All these don't even touch on all the other pieces that are not part of your application proper, such as the server (Apache, Tomcat, etc.), the database server (PostgreSQL, MySQL, MongoDB, etc.), the OS (Linux, etc.), domain name, DNS, yadda, yadda, yadda.

    The tremendous complexity notwithstanding, most application developers mainly have to concern themselves with the six items listed above. These are their main concerns.

    Although there are many fine solutions available for these main concerns, in general, these solutions are siloed, complex, and incongruent. Let me explain.

    Many solutions are siloed because they are single-solution packages that are complete within themselves and disconnected from other pieces of the system.

    Some solutions are so complex that they can take years to learn well. Developers can struggle more with the framework they are using than the language or application they are trying to write. This is a major problem.

    Lastly, by incongruent I mean that the siloed tools do not naturally fit well together. A bunch of glue code has to be written, learned, and supported to fit the various pieces together. Each tool has a different feel, a different approach, a different way of thinking.

    Being frustrated with all of these problems, I wrote the KISS Web Development Framework. At first it was just various solutions I had developed. But later it evolved into a single, comprehensive web development framework. KISS, an open-source project, was specifically designed to solve these exact challenges.

    KISS is a single, comprehensive, fully integrated web development framework that includes integrated solutions for:


    1. Custom HTML controls
    2. Easy communications with the back-end with built-in authentication
    3. Browser cache control (so the user never has to clear their cache)
    4. A variety of general purpose utilities