include formal definition
parent 6099f6d242
commit 0ae62285f0

68 README.md
@@ -1,6 +1,65 @@
# Transform Set of Strings to Common Prefix Notation

## 1 Definition of a Common Prefix Notation

### 1.1 Formal Definition

Any two strings s1, s2 have a common prefix cp: the longest string of characters that s1 and s2 share from the start. If they share no leading characters, cp is the empty string. Let s̅1, s̅2 be the remainders (suffixes) of s1 and s2 after cp is removed. If s1 is a prefix of s2, then s̅1 is the empty string; if s2 is a prefix of s1, then s̅2 is the empty string.

Given the definitions above, the common prefix notation (CPN) of s1, s2 (in short, cpn(s1,s2)) shall be:

* Case 1.1: If neither s̅1 nor s̅2 is empty:

```
cpn(s1,s2) := cp{s̅1,s̅2}
```

* Case 1.2: If s̅1 is empty and s̅2 is not empty:

```
cpn(s1,s2) := cp{,s̅2}
```

* Case 1.3: If s̅1 is not empty and s̅2 is empty:

```
cpn(s1,s2) := cp{s̅1,}
```

* Case 1.4: If both s̅1 and s̅2 are empty, then s1 and s2 are equal (s1 = s2 = s):

```
cpn(s,s) := s
```

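The four cases above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function name `cpn_pair` is hypothetical, and `os.path.commonprefix` from the standard library is used here only because it compares strings character by character. Cases 1.1 to 1.3 collapse into a single format string, since an empty remainder simply leaves its slot between the braces empty.

```python
import os


def cpn_pair(s1: str, s2: str) -> str:
    """Return the CPN of two strings, following Cases 1.1 to 1.4."""
    # Longest common prefix cp, compared character by character.
    cp = os.path.commonprefix([s1, s2])
    # Remainders of s1 and s2 after cp is removed.
    r1, r2 = s1[len(cp):], s2[len(cp):]
    if not r1 and not r2:
        # Case 1.4: both remainders are empty, so s1 == s2.
        return cp
    # Cases 1.1 to 1.3: an empty remainder yields an empty alternative.
    return f"{cp}{{{r1},{r2}}}"


print(cpn_pair("aa", "ab"))  # a{a,b}
print(cpn_pair("a", "ab"))   # a{,b}
```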
Given the definition above and the CPN of a set of strings s1, s2, …, sn, the CPN of the set extended by an additional string sn+1 is defined as follows:

* Case 2.1: If s̅n+1, the suffix of sn+1 with cp removed, has no nonempty common prefix with any of the suffixes s̅1, s̅2, …, s̅n:

```
cpn(s1,s2,…,sn+1) := cp{s̅1,s̅2,…,s̅n+1}
```

* Case 2.2: Otherwise, since the first characters of s̅1, s̅2, …, s̅n are distinct, exactly one element s̅m has a nonempty common prefix with s̅n+1. With s̅m being that element:

```
cpn(s1,s2,…,sn+1) := cp{s̅1,s̅2,…,cpn(s̅m,s̅n+1),…,s̅n}
```

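For a whole set of strings, the incremental definition above can be approximated by a recursive, batch-style sketch. The code below is an assumption-laden illustration rather than the algorithm of `trie.py`: the function name `cpn` and the grouping-by-first-character strategy are chosen here for brevity. It relies on the observation from Case 2.2 that remainders sharing a first character belong to the same nested CPN, and it keeps alternatives in first-seen order.

```python
import os


def cpn(strings):
    """Return the CPN of a collection of strings (recursive sketch)."""
    # Drop duplicates while keeping first-seen order.
    items = list(dict.fromkeys(strings))
    if not items:
        raise ValueError("cpn() needs at least one string")
    if len(items) == 1:
        return items[0]
    # Common prefix of all strings, compared character by character.
    cp = os.path.commonprefix(items)
    # Group the remainders by their first character; the empty
    # remainder (if any) forms a group of its own under the key "".
    groups = {}
    for s in items:
        rest = s[len(cp):]
        groups.setdefault(rest[:1], []).append(rest)
    # Each group becomes one alternative, itself written as a CPN.
    return cp + "{" + ",".join(cpn(g) for g in groups.values()) + "}"


print(cpn(["aa", "ab", "abc", "abd"]))  # a{a,b{,c,d}}
```

Because the strings in each group are distinct and share a first character, every recursive call works on a strictly smaller problem, so the recursion terminates.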
### 1.2 Examples

```
cpn("a", "b") = "{a,b}"
cpn("aa", "ab") = "a{a,b}"
cpn("aa", "ab", "abc", "abd") = "a{a,b{,c,d}}"
cpn("a", "ab", "abc") = "a{,b{,c}}"
```

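One way to sanity-check such expressions is to expand them back into the set of strings they denote. The sketch below is a hypothetical helper (`expand` is not part of the reference implementation) and assumes well-formed input without escaped characters, in which a literal prefix is followed by at most one brace group, as in the definition of Section 1.1.

```python
def expand(expr: str) -> list[str]:
    """Expand a well-formed, unescaped CPN expression into its strings."""

    def parse(pos):
        # Read the literal prefix up to the next reserved character.
        start = pos
        while pos < len(expr) and expr[pos] not in "{},":
            pos += 1
        prefix = expr[start:pos]
        if pos == len(expr) or expr[pos] != "{":
            # No brace group follows: the prefix is a complete string.
            return [prefix], pos
        results = []
        pos += 1  # skip "{"
        while True:
            # Each alternative is itself a CPN expression.
            alternatives, pos = parse(pos)
            results.extend(prefix + a for a in alternatives)
            if expr[pos] == "}":
                return results, pos + 1
            pos += 1  # skip ","

    words, _ = parse(0)
    return words


print(expand("a{a,b{,c,d}}"))  # ['aa', 'ab', 'abc', 'abd']
```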
*Note:* For the sake of reasoning, strings containing the CPN's reserved characters `{`, `}` and `,` shall be considered invalid input. In a practical implementation, a syntax for "escaping" these reserved characters should be available. For example, in the reference implementation, the `print_trie()` function inserts a backslash character `\` in front of every literal `{`, `,` and `}`.

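
The escaping rule described in the note can be illustrated as follows. This is only a sketch of the idea, not the code of `print_trie()`: the helper name `escape_cpn` is hypothetical, and how a literal backslash in the input should be treated is not covered by the note, so it is left untouched here.

```python
# Map each reserved character to its backslash-escaped form.
_ESCAPES = str.maketrans({"{": "\\{", "}": "\\}", ",": "\\,"})


def escape_cpn(text: str) -> str:
    """Prefix every literal '{', '}' and ',' with a backslash."""
    return text.translate(_ESCAPES)


print(escape_cpn("a{b,c}"))  # a\{b\,c\}
```

Escaping would be applied to the literal string fragments before they are joined with the structural braces and commas of the CPN expression.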
## 2 Implementation

### 2.1 Reference Implementation

Presented with a series of lines on standard input, the program will
print an expression on standard output that denotes the line strings
@@ -8,7 +67,7 @@ in a syntax described in <https://tk-sls.de/wp/6071>.

The expression is suitable for Bourne Again Shell brace expansion.

### 2.2 Test

```
$ python3 trie.py << EOF
@@ -36,3 +95,8 @@ Expected output:
a ab abc
```

## Appendix A: References

* [Initial announcement](https://tk-sls.de/wp/6071)
* [Update](https://tk-sls.de/wp/6144)
* [Git repository](https://tk-sls.de/gitlab/tilman/trie)