Discussion:
{} Questions
Mike Sanders
2024-08-21 05:35:01 UTC
Assuming any awk variant...

1. Is this valid? (it works with mawk, gawk, BusyBox awk)

BEGIN { debug = 1 }

(debug) { code_here }

(!debug) { code_here }

END { ... }

2. What is the name or accepted term in AWK for unnamed functions, main()?
--
:wq
Mike Sanders
Ben Bacarisse
2024-08-21 08:06:51 UTC
Post by Mike Sanders
Assuming any awk variant...
1. Is this valid? (it works with mawk, gawk, BusyBox awk)
BEGIN { debug = 1 }
(debug) { code_here }
(!debug) { code_here }
END { ... }
Yes, that's valid. You don't need the ()s round the expressions.
Post by Mike Sanders
2. What is the name or accepted term in AWK for unnamed functions, main()?
I don't know what you mean. Can you give an example?

It occurs to me that maybe you think

(debug) { code }

is a function? It's not. It's just a normal pattern/action AWK pair.
An AWK pattern can just be an expression. That expression is evaluated
for every input line and, if true, the corresponding action is executed.
You could have written

debug != 0 { code }

instead.
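
A minimal sketch of that behaviour, assuming a POSIX-style awk on the PATH. The two-line input and the rule bodies are made up for illustration; only the variable name debug comes from the original post:

```shell
# Each top-level pattern is an expression tested once per input record.
# With debug=1 only the first rule fires; with debug=0 only the second.
prog='
debug  { print "dbg: " $0 }
!debug { print "out: " $0 }
'
on=$(printf 'a\nb\n' | awk -v debug=1 "$prog")
off=$(printf 'a\nb\n' | awk -v debug=0 "$prog")
printf '%s\n%s\n' "$on" "$off"
```

Neither rule is a function: there is nothing to call, and both patterns are simply re-evaluated for each record.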
--
Ben.
Mike Sanders
2024-08-21 11:48:31 UTC
Post by Ben Bacarisse
Post by Mike Sanders
Assuming any awk variant...
1. Is this valid? (it works with mawk, gawk, BusyBox awk)
BEGIN { debug = 1 }
(debug) { code_here }
(!debug) { code_here }
END { ... }
Yes, that's valid. You don't need the ()s round the expressions.
Thanks Ben.
Post by Ben Bacarisse
Post by Mike Sanders
2. What is the name or accepted term in AWK for unnamed functions, main()?
I don't know what you mean. Can you give an example?
It occurs to me that maybe you think
(debug) { code }
is a function? It's not. It's just a normal pattern/action AWK pair.
An AWK pattern can just be an expression. That expression is evaluated
for every input line and, if true, the corresponding action is executed.
And yet there's still more implied nuance somehow. Let me try to articulate
my thoughts...

- These types of constructs are run automatically per line of input
(assuming they're not located within another user-written function);
that much I get. Perhaps a potential efficiency hit to be aware of.

- There's also the scope of variables to consider... Because any
variable located outside of a user-written function or conversely
located within a 'bare naked' construct is global, or at least
exposed to the entire script's 'world', I want to be careful here...

I'm guessing that any construct that lacks a function signature
is globally scoped:

{ im_global = 42 }

while those with a function signature (located within parentheses)
are locally scoped:

function private(private_var1) { private_var1 = "foo" }

- And 2 more...

$ awk '{ v="im_global" }'

$ awk -f foo -v global=55

Anything within the above 2 are global as well?

Just thinking aloud here so, I'll let you long time posters describe
it more lucidly.
--
:wq
Mike Sanders
Janis Papanagnou
2024-08-21 12:08:08 UTC
On 21.08.2024 13:48, Mike Sanders wrote:
[ Concerning the basic awk syntax: condition { action } ]
Post by Mike Sanders
And yet there's still more implied nuance somehow. Let me try to articulate
my thoughts...
- These types of constructs are run automatically per line of input
(assuming they're not located within another user-written function);
that much I get.
You cannot have these constructs with their given semantics inside a
function. You'd have to formulate them explicitly (in the imperative
form) with 'if', as in

function f ()
{
    if (condition) action
}
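
As a rough, runnable illustration of that imperative form (the function name report, the threshold and the input numbers are all hypothetical):

```shell
# The per-record test lives in an explicit if inside the function;
# a pattern-less top-level rule then calls it for every record.
out=$(printf '1\n5\n9\n' | awk '
function report(x) {
    if (x > 4) print "big: " x     # same test a pattern would perform
}
{ report($1 + 0) }                 # no pattern: runs for every record
')
printf '%s\n' "$out"
```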
Post by Mike Sanders
Perhaps a potential efficiency hit to be aware of.
- There's also the scope of variables to consider... Because any
variable located outside of a user-written function or conversely
located within a 'bare naked' construct is global, or at least
exposed to the entire script's 'world', I want to be careful here...
All variables have global scope, with the exception of those specified
in a function argument list along with the real arguments, as in

function f (arg1, arg2, local1, local2) { global = arg1 ; ... }
f ("Hello", 42);

(There's some caveat with arrays in the function argument list.)
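
A small sketch of the extra-parameter idiom, assuming any POSIX-style awk; the names tmp and total are invented for the example:

```shell
# tmp appears in the parameter list, so it is local to f;
# total does not, so assigning it inside f sets a global.
out=$(awk '
function f(arg,    tmp) {
    tmp = arg * 2       # local: gone when f returns
    total = tmp         # not a parameter, so global
}
BEGIN {
    tmp = "untouched"   # the global tmp, unrelated to the local one
    f(21)
    print tmp, total
}')
printf '%s\n' "$out"
```

The wide gap before tmp in the parameter list is only a common convention for separating real arguments from locals; awk itself does not require it.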

Janis
Post by Mike Sanders
I'm guessing that any construct that lacks a function signature
{ im_global = 42 }
while those with a function signature (located within parentheses)
function private(private_var1) { private_var1 = "foo" }
- And 2 more...
$ awk '{ v="im_global" }'
$ awk -f foo -v global=55
Anything within the above 2 are global as well?
Just thinking aloud here so, I'll let you long time posters describe
it more lucidly.
Janis Papanagnou
2024-08-21 12:10:00 UTC
Post by Janis Papanagnou
function f (arg1, arg2, local1, local2) { global = arg1 ; ... }
f ("Hello", 42);
f("Hello", 42);

Of course no space between the identifier and the parenthesis with
user-defined functions.

Janis
Mike Sanders
2024-08-21 18:57:32 UTC
Post by Janis Papanagnou
[ Concerning the basic awk syntax: condition { action } ]
Post by Mike Sanders
And yet there's still more implied nuance somehow. Let me try to articulate
my thoughts...
- These types of constructs are run automatically per line of input
(assuming they're not located within another user-written function);
that much I get.
You cannot have these constructs with their given semantics inside a
function. You'd have to formulate them explicitly (in the imperative
form) with 'if', as in
function f ()
{
if (condition) action
}
Of course I tried & failed earlier today...

function foo() {

!boo { code }
}
Post by Janis Papanagnou
Post by Mike Sanders
Perhaps a potential efficiency hit to be aware of.
- There's also the scope of variables to consider... Because any
variable located outside of a user-written function or conversely
located within a 'bare naked' construct is global, or at least
exposed to the entire script's 'world', I want to be careful here...
All variables have global scope, with the exception of those specified
in a function argument list along with the real arguments, as in
function f (arg1, arg2, local1, local2) { global = arg1 ; ... }
f ("Hello", 42);
(There's some caveat with arrays in the function argument list.)
Janis
Not sure why that would be, can you offer more detail?

At any rate, as always thanks for the brainfood.
--
:wq
Mike Sanders
Janis Papanagnou
2024-08-21 22:51:15 UTC
Post by Mike Sanders
Post by Janis Papanagnou
[...]
(There's some caveat with arrays in the function argument list.)
Not sure why that would be, can you offer more detail?
Assuming you meant the statement in parentheses... - The most prominent
one is the difference in parameter passing depending on the parameter
type; arrays are passed by reference, scalars by value. So any change
of array elements inside the function will change the array object on
the caller's side.

(There was also a case where awk had trouble differentiating an
array from a scalar in functions, but the details evade my memory at
the moment.)
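
The by-reference versus by-value difference can be sketched like this (the function name bump and the variable names are hypothetical):

```shell
# Arrays are passed by reference, scalars by value.
out=$(awk '
function bump(arr, n) {
    arr["k"]++      # modifies the array owned by the caller
    n++             # modifies only the local copy of the scalar
}
BEGIN {
    a["k"] = 1; s = 1
    bump(a, s)
    print a["k"], s
}')
printf '%s\n' "$out"
```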

Janis

Ben Bacarisse
2024-08-21 15:38:03 UTC
Post by Mike Sanders
Post by Ben Bacarisse
Post by Mike Sanders
Assuming any awk variant...
1. Is this valid? (it works with mawk, gawk, BusyBox awk)
BEGIN { debug = 1 }
(debug) { code_here }
(!debug) { code_here }
END { ... }
Yes, that's valid. You don't need the ()s round the expressions.
Thanks Ben.
Post by Ben Bacarisse
Post by Mike Sanders
2. What is the name or accepted term in AWK for unnamed functions, main()?
I don't know what you mean. Can you give an example?
It occurs to me that maybe you think
(debug) { code }
is a function? It's not. It's just a normal pattern/action AWK pair.
An AWK pattern can just be an expression. That expression is evaluated
for every input line and, if true, the corresponding action is executed.
And yet there's still more implied nuance somehow. Let me try to articulate
my thoughts...
- These types of constructs are run automatically per line of input
(assuming they're not located within another user-written function);
that much I get.
Perhaps a potential efficiency hit to be aware of.
AWK does not have the usual nested syntax you have probably come to
expect. In, for example, Algol 68, blocks are made up of statements that
can include procedure declarations that include blocks made up of
statements and so on to whatever level of nesting you might want.

AWK's top-level syntax -- a list of function declarations and
pattern/action pairs -- is never nested. It's the outer syntax of an AWK
program and that's it. None of it can appear inside any other AWK
construct.
Post by Mike Sanders
- There's also the scope of variables to consider... Because any
variable located outside of a user-written function or conversely
located within a 'bare naked' construct is global, or at least
exposed to the entire script's 'world', I want to be careful here...
I think your "conversely" is misplaced. And some variables inside a
user-defined function are also global; the exceptions are names written
as parameters to a function, which I think is what you are saying
below.
Post by Mike Sanders
I'm guessing that any construct that lacks a function signature
{ im_global = 42 }
This does not "lack a function signature", or at least that's a very odd
way of putting it. It is, presumably, the action of a pattern/action
pair (with maybe an empty pattern).
Post by Mike Sanders
while those with a function signature (located within parentheses)
function private(private_var1) { private_var1 = "foo" }
Yes. The names listed as parameters in the function declaration are
local to the function and denote local variables. That's why AWK
programmers have to play tricks like adding extra parameters to get
local variables.
Post by Mike Sanders
- And 2 more...
$ awk '{ v="im_global" }'
$ awk -f foo -v global=55
Anything within the above 2 are global as well?
Yes, everything is global except for the names listed as parameters in
a function declaration.
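
For instance, a variable set with -v is readable from inside a function without being passed in. A minimal sketch, where the function name show is made up and global=55 is taken from the question:

```shell
# global is set on the command line and read directly inside show().
out=$(awk -v global=55 '
function show() { return global + 1 }
BEGIN { print show() }
')
printf '%s\n' "$out"
```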
Post by Mike Sanders
Just thinking aloud here so, I'll let you long time posters describe
it more lucidly.
--
Ben.
Mike Sanders
2024-08-21 19:01:27 UTC
Post by Ben Bacarisse
AWK does not have the usual nested syntax you have probably come to
expect. In, for example, Algol 68, blocks are made up of statements that
can include procedure declarations that include blocks made up of
statements and so on to whatever level of nesting you might want.
Have seen this elsewhere too.
--
:wq
Mike Sanders