Display Isograms Using Awk and Inverse ANSI
Mike Sanders
2023-10-11 00:51:07 UTC
# Michael Sanders 2023
# https://busybox.neocities.org/notes/isogram.txt
#
# awk script that displays isograms using inverse ANSI
# escapes (meaning fore & background colors are swapped)
# requires an ANSI capable terminal, rename this file
# and invoke script as:
#
# awk -f isogram.awk file
#
# isogram test block...
#
# aberration lucrative concurrent espouse obfuscate
# garrulous promenade epiphany requiem juxtapose
# languid ephemeral abscond extricate circumvent
# obstinate vivacious corroborate attenuate paragon
# penchant serendipity superfluous immutable mitigate
# aplomb concatenate ethereal diaphanous demagogue
# cogitate pervasive anathema juxtaposition memento
# disparate oscillate ennui perfunctory parabola
# mellifluous recumbent ephemeral sycophant timorous
# voracious quixotic serenade conundrum vicarious
# insipid ornate camaraderie cogent introspection
# sanguine deleterious impeccable extraneous loquacious

BEGIN { print "\nisograms...\n" }

function hilite(str) { return "\033[7m" str "\033[0m" }

function isogram(str, c, x, y) {
    y = length(str)
    for (x = 1; x <= y; x++) {
        c = substr(str, x, 1)
        if (index(substr(str, x + 1), c) > 0) return 0 # !isogram
    }
    return 1 # isogram
}

{
    word = ""
    line = ""
    for (x = 1; x <= length($0); x++) {
        c = substr($0, x, 1)
        if (c ~ /[[:space:]]/ || x == length($0)) {
            if (x == length($0) && c !~ /[[:space:]]/) word = word c
            line = (isogram(word) ? line hilite(word) : line word)
            if (c ~ /[[:space:]]/) line = line c
            word = ""
        } else {
            word = word c
        }
    }
    print line
}

# eof
--
:wq
Mike Sanders
Janis Papanagnou
2023-10-11 08:15:37 UTC
A quick glimpse at your code gives the impression that you
are parsing the line character-wise to identify "words".
In Awk it is usually better to use the built-in field
splitting and operate on $1, $2, etc. Even in cases where
punctuation and other characters may get in your way you
can just define the FS regular expression so that it fits
your needs. That should make your program much simpler and
also easier to understand and maintain.
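
A minimal sketch of the field-based approach described above, assuming that a "word" is simply one of awk's default whitespace-separated fields. The wrapper name `list_isograms` is hypothetical. It prints each isogram on its own line rather than reproducing the input line, because assigning to a field would make awk rebuild the record with single-space separators.

```sh
# list_isograms: print every isogram in the input, one per line.
# Assumption: words are awk's default whitespace-separated fields.
list_isograms() {
    awk '
    # Return 1 when no character of str occurs twice, else 0.
    function isogram(str,    c, x, y) {
        y = length(str)
        for (x = 1; x <= y; x++) {
            c = substr(str, x, 1)
            if (index(substr(str, x + 1), c) > 0) return 0
        }
        return 1
    }
    {
        # $1 .. $NF are already split on runs of blanks,
        # so no character-by-character parsing is needed.
        for (i = 1; i <= NF; i++)
            if (isogram($i)) print $i
    }' "$@"
}
```

Usage would look like `list_isograms file`, or `printf 'lucrative aberration\n' | list_isograms`.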

Janis
Mike Sanders
2023-10-11 10:12:13 UTC
Hi Janis.

Sure enough, I think you're 100% correct on this.
In fact, I'm now working on a variant that does use $1, $2,
etc. One issue I'm groping to understand is how *not* to
destroy the layout of a given file on output. In other
words, I want the output to equal the input, the only
difference being that isograms are shown in inverse color. The
only way I've worked out so far is to not assume
any file structure other than words...

It's an interesting problem to think about =)
--
:wq
Mike Sanders
Janis Papanagnou
2023-10-11 16:41:51 UTC
Yes. A solution may also depend on the Awk version you are
allowed to use. With GNU Awk you can preserve the formatting
by using its newer features (array of separators!).
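
A rough sketch of what the GNU Awk route might look like, assuming the "array of separators" refers to the optional fourth argument of gawk's `split()`. The wrapper name `hilite_isograms_gawk` is hypothetical, and this only runs where `gawk` is installed.

```sh
# hilite_isograms_gawk: highlight isograms, keeping all whitespace as-is.
# Assumption: gawk is available (4-argument split() is a gawk extension).
hilite_isograms_gawk() {
    gawk '
    function isogram(str,    c, x, y) {
        y = length(str)
        for (x = 1; x <= y; x++) {
            c = substr(str, x, 1)
            if (index(substr(str, x + 1), c) > 0) return 0
        }
        return 1
    }
    function hilite(str) { return "\033[7m" str "\033[0m" }
    {
        # With " " as the separator, seps[0] holds any leading blanks
        # and seps[i] holds the blanks that followed word[i].
        n = split($0, word, " ", seps)
        out = seps[0]
        for (i = 1; i <= n; i++)
            out = out (isogram(word[i]) ? hilite(word[i]) : word[i]) seps[i]
        print out
    }' "$@"
}
```

Because the original separators are glued back in unchanged, the output differs from the input only by the added escape sequences.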

And with standard Awk you can work on patterns and preserve
formatting e.g. with a frame like

function predicate (s) { ...here's your isogram function... }

function escape (s) { return predicate(s) ? "E" s "E" : s }
# replace the two "E" by your ANSI escape code strings

{
    out = ""
    for (line=$0; match(line, /[[:alpha:]]+/);
line=substr(line,RSTART+RLENGTH)) {
        out = out substr(line,1,RSTART-1)
escape(substr(line,RSTART,RLENGTH))
    }
    out = out line
    print out
}

This code specifies the 'alpha' words as entities to consider;
change as desired. (I saw that your code also highlights a '#'
for example; not sure this is intended, though.)

Janis
Janis Papanagnou
2023-10-11 16:47:17 UTC
Post by Janis Papanagnou
[...]
{
out = ""
for (line=$0; match(line, /[[:alpha:]]+/);
line=substr(line,RSTART+RLENGTH)) {
out = out substr(line,1,RSTART-1)
escape(substr(line,RSTART,RLENGTH))
}
out = out line
print out
}
[...]
(Sorry, my newsreader split two of the long lines.)
Each of the two lines in the code above that starts at column 1
belongs on the end of the line before it:

for (line=$0; match(...); line=substr(...)) {

out = out substr(...) escape(substr(...))
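
For reference, here is a sketch of the frame with the long lines rejoined and the placeholders filled in from the original post (the `isogram` function as the predicate, inverse video as the escape strings). The wrapper name `hilite_isograms` is hypothetical, and `[A-Za-z]` stands in for `[[:alpha:]]` on the assumption that some older awks lack POSIX character classes.

```sh
# hilite_isograms: highlight alphabetic isograms, preserving layout.
hilite_isograms() {
    awk '
    function isogram(str,    c, x, y) {
        y = length(str)
        for (x = 1; x <= y; x++) {
            c = substr(str, x, 1)
            if (index(substr(str, x + 1), c) > 0) return 0
        }
        return 1
    }
    function escape(s) { return isogram(s) ? "\033[7m" s "\033[0m" : s }
    {
        out = ""
        # Each pass copies the text before the match verbatim, then the
        # (possibly highlighted) word, then drops both from line.
        for (line = $0; match(line, /[A-Za-z]+/); line = substr(line, RSTART + RLENGTH))
            out = out substr(line, 1, RSTART - 1) escape(substr(line, RSTART, RLENGTH))
        # Whatever is left in line contains no further words.
        print out line
    }' "$@"
}
```

Punctuation, digits and runs of blanks never match the pattern, so they pass through untouched.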


Janis
Mike Sanders
2023-10-12 18:54:30 UTC
Post by Janis Papanagnou
Yes. A solution may also depend on the Awk version you are
allowed to use. With GNU Awk you can preserve the formatting
by using its newer features (array of separators!).
Very much so for me; I can't always install the things I'd like
to, but it's okay, I'll work it out =)
Post by Janis Papanagnou
function predicate (s) { ...here's your isogram function... }
Excellent name for a function.
Post by Janis Papanagnou
function escape (s) { return predicate(s) ? "E" s "E" : s }
# replace the two "E" by your ANSI escape code strings
{
out = ""
for (line=$0; match(line, /[[:alpha:]]+/);
line=substr(line,RSTART+RLENGTH)) {
out = out substr(line,1,RSTART-1)
escape(substr(line,RSTART,RLENGTH))
}
out = out line
print out
}
This code specifies the 'alpha' words as entities to consider;
change as desired. (I saw that your code also highlights a '#'
for example; not sure this is intended, though.)
A single char... isogram or not? Probably not really, and then
there's the case of 'mixed' strings as your snippet deals with,
'abc-321'.

But back to the single-character issue, I'm going with:

function isogram(str, c, x, y) {
    y = length(str)
    if (y < 2) return 0 # !isogram <-- added this

    for (x = 1; x <= y; x++) {
        c = substr(str, x, 1)
        if (index(substr(str, x + 1), c) > 0) return 0 # !isogram
    }
    return 1 # isogram
}

Thanks for your input Janis.
--
:wq
Mike Sanders
Janis Papanagnou
2023-10-13 07:22:53 UTC
Post by Mike Sanders
Post by Janis Papanagnou
function predicate (s) { ...here's your isogram function... }
Excellent name for a function.
It's meant as generic name for the code pattern I wanted to show.
Post by Mike Sanders
Post by Janis Papanagnou
[...]
This code specifies the 'alpha' words as entities to consider;
change as desired. (I saw that your code also highlights a '#'
for example; not sure this is intended, though.)
A single char... isogram or not? Probably not really, and then
there's the case of 'mixed' strings as your snippet deals with,
'abc-321'.
Oh, my question was more whether a non-alpha character shall be
considered a possible isogram.
Post by Mike Sanders
function isogram(str, c, x, y) {
y = length(str)
if (y < 2) return 0 # !isogram <-- added this
[...]
That's why I also think a pattern based approach has advantages.
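
One way to illustrate that advantage is the small sketch below, which only lists the strings a given pattern would hand to the predicate. The helper name `candidates` is hypothetical, and the pattern is assumed never to match the empty string (otherwise the loop would not terminate).

```sh
# candidates PATTERN [FILE...]: print each substring that PATTERN selects.
# Assumption: PATTERN is an ERE that cannot match the empty string.
candidates() {
    pat=$1; shift
    awk -v pat="$pat" '
    {
        # Same match() loop as in the frame, but printing only the words.
        for (line = $0; match(line, pat); line = substr(line, RSTART + RLENGTH))
            print substr(line, RSTART, RLENGTH)
    }' "$@"
}
```

With the input `abc-321 # a`, the pattern `[^ ]+` selects `abc-321`, `#` and `a`, whereas `[A-Za-z][A-Za-z]+` selects only `abc`. In the second case the pattern itself excludes non-alpha and single-character strings, so the predicate would need no extra length check.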

Janis
Mike Sanders
2023-10-14 00:34:34 UTC
Post by Janis Papanagnou
That's why I also think a pattern based approach has advantages.
Yes, anything the human mind can conceive is valid.
--
:wq
Mike Sanders
Kpop 2GM
2023-10-27 23:14:03 UTC
if you want an ultra quick ANSI color chart :

jot - 16 231 | mawk ' BEGIN { print (ORS = _)
                      _ *= __ = RS RS
} $NF = sprintf("\33[38;5;%dm%3d%.*s%.*s",
                    $_, $_, NR % 6^2 == _, RS,
                         NR % 6^3 == _, __)'

the result is a VERY wide table spanning 6 rows, but that's the only way I could get the colors to properly line up with each other without complicated math.
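
For readers who find the one-liner hard to follow, a plainer sketch of the same idea is given here: print color numbers 16 through 231 in their own foreground color, 36 per row, which gives the 6 rows mentioned. The wrapper name `color_chart` is hypothetical, the 4-column cell width is a choice of this sketch, and a terminal with 256-color support is assumed.

```sh
# color_chart: print colors 16..231 as 6 rows of 36 numbered cells.
color_chart() {
    awk 'BEGIN {
        for (c = 16; c <= 231; c++) {
            # 38;5;N selects foreground color N from the 256-color palette.
            printf "\033[38;5;%dm%4d", c, c
            # End the row after every 36th color and reset attributes.
            if ((c - 15) % 36 == 0) printf "\033[0m\n"
        }
    }'
}
```

Each row is 144 columns wide, so a wide terminal window is still needed.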

— The 4Chan Teller

Mike Sanders
2023-10-12 22:33:25 UTC
# requires an ANSI capable terminal...
for your notes:

tags: ANSI, escapes, colors, code

invert fore/background color: "\033[7m" str "\033[0m"

clear screen: "\033[H\033[2J"

hide cursor: "\033[?25l"

show cursor: "\033[?25h"

output to row & column: "\033[<ROW>;<COLUMN>H"

set titlebar (for terminals that support it): "\033]0;Your Title Here\007"

reset colors: "\033[0m"

foreground colors...

"\033[30mBlack"
"\033[31mRed"
"\033[32mGreen"
"\033[33mYellow"
"\033[34mBlue"
"\033[35mMagenta"
"\033[36mCyan"
"\033[37mWhite"

background colors...

"\033[40mBlack"
"\033[41mRed"
"\033[42mGreen"
"\033[43mYellow"
"\033[44mBlue"
"\033[45mMagenta"
"\033[46mCyan"
"\033[47mWhite"
--
:wq
Mike Sanders
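
A short sketch of how a few of the sequences from these notes might be combined in awk. The names `ansi_demo`, `at` and `fg` are hypothetical helpers, and `fg` assumes its first argument is 0..7 in the order listed under "foreground colors" (0 = Black ... 7 = White).

```sh
# ansi_demo: clear the screen, then print positioned, colored text.
ansi_demo() {
    awk '
    # Move the cursor to the given row and column.
    function at(row, col) { return sprintf("\033[%d;%dH", row, col) }
    # Wrap str in foreground color 30+n, then reset.
    function fg(n, str)   { return "\033[" (30 + n) "m" str "\033[0m" }
    BEGIN {
        printf "%s", "\033[H\033[2J"    # clear screen
        printf "%s%s", at(2, 4), fg(1, "Red at row 2, column 4")
        printf "%s%s\n", at(4, 4), "\033[7m" "inverse" "\033[0m"
    }'
}
```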