Updated Common Prefix Notation Transformation

This is an update to my previous article on determining the common prefixes of a set of strings (passed a sequence of lines) and printing them in common prefix notation (CPN). The problem of reconstructing strings from the original input that are complete prefixes of other input strings is addressed.

The code is now in a Git repository:

Gitea project trieLast 3 commits: by Tilman Kranz: 69fc4506 add test, escape spaces in output by Tilman Kranz: 0ae62285 include formal definition by Tilman Kranz: 6099f6d2 Initial commitThis project has currently no releases.

Lesen Sie mehr »

Find Files by Size given in Bytes

Some examples:

Find files in current directory that have a size of 400 bytes or more:

sfind -min 400

Find files in /etc that have a size of 50 kilobytes (1 kilobyte = 1024 bytes) or more:

sfind -dir /etc -min 50k

Find files in /var with size between 100 and 500 megabytes, suppress warnings, print sizes in human-readable format:

sfind -dir /var -min 100m -max 500m -quiet -human

Note: Yes, i know, Oracle distributes a tool that is also called „sfind“, but anyway …

Lesen Sie mehr »

Print XDG Desktop Definition for Application

Update Nov 3 2024:

For an application given by „application name“ or „executable name“, output the corresponding .desktop file, if any:

#!/bin/sh
IFS=":"
xdg_data_dirs=${XDG_DATA_DIRS:-/usr/local/share:/usr/share}
search=$1

for i in $xdg_data_dirs ; do
  a="$i/applications"

  [ -d $a ] && for d in "$a"/*.desktop ; do
    grep -q -e "^Name=.*$search" -e "^Exec=.*$search" "$d" && {
      echo "# $d:"
      grep -Ev '^(Comment|GenericName|Keywords|Name\[)' "$d"
    }
  done
done

To try this code out, save it to /usr/local/bin/xdg-desktop-search, make it executable and test it, for example, as follows:

xdg-desktop-search gnome-terminal

Transforming Sets of Strings to their Common Prefix Notation

Update Aug 20 2022:

Any two strings s1, s2 have a common prefix cp, which is the string of characters that s1 and s2 have in common up from the start. If s1, s2 have no such common characters, cp is the empty string. Let 1, 2 be the remainders or suffixes of s1 and s2 if cp is removed from them. Then, the common prefix notation (CPN) of s1, s2 – in short, cpn(s1,s2) – shall be:

cpn(s1,s2) := cp{1,2}

Given the definition above, for any given CPN of a set of strings s1, s2, …, sn, the CPN of the set extended by an additional string sn+1 is defined as follows:

Case 1: If the suffix of sn+1 with cp removed, n+1, has no nonempty common prefix with any of the suffixes 1, 2, …, n:

cpn(s1,s2,…,sn+1) := cp{1,2,…,n+1}

Case 2: Otherwise, since the first characters of 1, 2, …, n are distinct, there can only be one element m that has a nonempty common prefix with n+1. Given m to be that element:

cpn(s1,s2,…,sn+1) := cp{1,2,…,cpn(m,n+1),…,n}

Examples:

cpn("a", "b") = "{a,b}"
cpn("aa", "ab") = "a{a,b}"
cpn("aa", "ab", "abc", "abd") = "a{a,b{c,d}}"
cpn("a", "ab", "abc") = "{a{b{c}}}"

Note: For the sake of reasoning, strings containing the CPN’s reserved characters {, } and , shall be considered invalid input. In a practical implementation, a syntax for „escaping“ these reserved characters should be available.

Given a set of strings, one per line, on standard input, the following Python program prints the CPN of these strings on standard output: http://tk-sls.de/ref/trie1.py

Try the examples:

./trie1.py << EOF
a
b
EOF
./trie1.py << EOF
aa
ab
EOF
./trie1.py << EOF
aa
ab
abc
abd
EOF

The following example renders the wrong output, because a trie does not actually store the common prefix (this is what a suffix tree would accomplish), it just stores different prefix paths, and a, ab and abc have no different prefix paths, it’s just a degenerate tree a → b → c:

./trie1.py << EOF
a
ab
abc
EOF

So my guess is, i must implement one of these fancy suffix tree algorithms to get it done right. (see the update instead) 🙂

In the meantime, try this:

find /etc -type d | ./trie1.py

🙂

Pinguin-Patenschaft

Mein Arbeitgeber, B1 Systems GmbH hat in meinem Namen eine Patenschaft für einen afrikanischen Pinguin gesponsert. 🙂

Lesen Sie mehr »

Pi-KVM Hat v3

Ich freue mich, zu den Glücklichen zu gehören, die den Pi-KVM Hat v3 mit passendem Stahlgehäuse ergattert haben (https://pikvm.org/). Das Warten hat sich gelohnt.

Lesen Sie mehr »

CentOS/8 Stream on libvirt/KVM with Kickstart and virt-install

This article describes using Kickstart to automate the CentOS installer and virt-install to automate the creation of a VM.

The following setup is assumed:

  • There is a libvirt hypervisor called virthost.
  • ssh to virthost as „root“ is possible.

Lesen Sie mehr »

WordPress-Plugin to Embed Gitlab Project Information

For my personal use, i wrote a small WordPress plugin that allows me to embed a link to a Gitlab repository, a list of commits and a link to the releases of that project in a WP post.

Example (linebreaks added to shortcode for readability):

[​gitlab-show-project
    url="https://tk-sls.de/gitlab"
    project_id=43
    max=3
    author="none"
    releases="latest"​]

Output:

Project information not readable.

Lesen Sie mehr »