Janis Papanagnou
2023-10-12 12:04:56 UTC
In a recent thread I posted an Awk code pattern that finds words
matching a pattern and conditionally transforms them; it relied only
on POSIX Awk features. It is, though, a generally usable code
pattern. With standard Awk you can substitute the entity pattern and
the transformation function as necessary.
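As a rough sketch of that POSIX-only variant, the pattern and the
transformation are simply written into the loop. The bracket()
function and the sample input are made up for illustration, and the
bracket expression [A-Za-z] is used only to stay portable to older
Awk implementations:

```shell
# POSIX-only sketch: pattern and transform are hard-coded in the loop.
result=$(printf '%s\n' 'foo 42 bar' | awk '
function bracket(str) { return "[" str "]" }   # hypothetical transform
{
    line = $0; out = ""
    # consume the line match by match, transforming each matched word
    while (match(line, /[A-Za-z]+/)) {
        out = out substr(line, 1, RSTART-1) bracket(substr(line, RSTART, RLENGTH))
        line = substr(line, RSTART+RLENGTH)
    }
    print out line
}')
printf '%s\n' "$result"
```

For the sample input this prints "[foo] 42 [bar]". Changing the
entities or the transformation means editing the regexp literal and
the function call in the loop body.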
GNU Awk supports a couple of newer features that make this
generalization explicit: strongly typed regexp constants (usable as
first-class patterns) and indirect function calls.
# generic function to transform specified data entities
# (out is a local variable; pattern must not match the empty string)
function trent (line, pattern, transform,    out)
{
    # work on the line argument rather than re-reading $0
    for (; match(line, pattern); line = substr(line, RSTART+RLENGTH))
    {
        out = out substr(line, 1, RSTART-1) \
              @transform(substr(line, RSTART, RLENGTH))
    }
    return out line
}
With a transformation function like
# wrap str in ANSI escape sequences for reverse video
function highlight (str)
{
    return "\033[7m" str "\033[0m"
}
a sample usage can be
BEGIN { words = @/[[:alpha:]]+/ }
{
    print trent($0, words, "highlight")
}
Applied to the task from the other thread you can provide
function isogram_highlight (str)
{
    return (isogram(str) ? "\033[7m" str "\033[0m" : str)
}
using Mike's (only slightly changed by me) isogram() algorithm
# return 1 if no character occurs twice in str (case-sensitive), else 0
function isogram(str,    c, x, y) {
    y = length(str)
    for (x = 1; x < y; x++) {
        c = substr(str, x, 1)
        if (index(substr(str, x + 1), c)) return 0
    }
    return 1
}
in a context like
BEGIN { words = @/[[:alpha:]]+/ }
{
    print trent($0, words, "highlight")
    print trent($0, words, "isogram_highlight")
}
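The isogram() helper itself needs no GNU Awk extensions, so it can be
checked in isolation with any Awk. A minimal sketch, with two sample
words chosen only for illustration:

```shell
# Quick check of isogram() alone, one word per input line.
result=$(printf '%s\n' planet letter | awk '
function isogram(str,    c, x, y) {
    y = length(str)
    for (x = 1; x < y; x++) {
        c = substr(str, x, 1)
        # a later occurrence of c means str is not an isogram
        if (index(substr(str, x + 1), c)) return 0
    }
    return 1
}
{ print $0, isogram($0) }')
printf '%s\n' "$result"
```

This prints "planet 1" and "letter 0". The comparison is
case-sensitive, so an upper-case and a lower-case form of the same
letter count as different characters.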
Note again that this solution, based on a generalized algorithm,
uses GNU Awk specific features and does not conform to POSIX!
Janis