Storing just the match in a line, Was: Re: How do you do this in awk?
This is related to the previous discussion in this thread, but it is
different enough that it's less confusing for me to start from scratch
in this post. The main problem is that the regular expression does not
always find the correct match.
Background:
I'm using gawk in a Win32 console window, though that may not be
relevant. I need to extract, in this case, the names of .c or .cpp
source files that may appear at random in an ASCII text file.
Fortunately only one instance of a source file name may appear in any
one line.
Given this function:
====
# Extract a substring that matches the regular expression
function extract(str, regexp)
{
    RMATCH = (match(str, regexp) ? substr(str, RSTART, RLENGTH) : "")
    return RSTART
}
====
.... I have this snippet of code (note the source file name is always in
quotation marks in the text file):
====
if (line ~ /[^\"]*\.cp*/) {
    extract(line, "[^\"]+\.cp*");
    print "The source file is named " RMATCH;
}
====
.... to try to get the quoted name of the source file. When the line is
as follows, the match is "foo.c", which is what I want:
RelativePath="foo.c"
.... but when the name of the source file begins with a c, the match is
either:
RelativePath="c
or
RelativePath="cp
.... (including all the leading whitespace) depending on whether the
source file is "coo.c" or "coo.cpp".
I do not understand why the file name starting with a "c" will change
the behavior, since the regular expression specifies the match must
have a period before the c. Help, please?
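One thing that may be worth checking (a sketch, not a confirmed diagnosis): the "if" test uses a regexp constant, /.../, but extract() is passed a string constant. awk processes backslash escapes in a string constant before the text is used as a regular expression, and gawk, as far as I know, reduces "\." in a string to a plain "." (usually with a warning). The period would then match any character, including the quotation mark in front of "coo". Doubling the backslash in the string should keep the period literal. The file name extract_demo.awk and the three sample lines are made up for illustration, and the program is written to a file from a POSIX shell only to keep shell quoting out of the awk code.

```shell
# Write the awk program to a file so that shell quoting cannot alter it.
cat > extract_demo.awk <<'EOF'
# Extract a substring that matches the regular expression
function extract(str, regexp)
{
    RMATCH = (match(str, regexp) ? substr(str, RSTART, RLENGTH) : "")
    return RSTART
}

{
    # In a string constant, "\\." reaches the regexp engine as \.
    # (a literal period) and \" becomes a plain quotation mark.
    if (extract($0, "[^\"]+\\.cp*"))
        print "The source file is named " RMATCH
}
EOF

# Hypothetical sample lines, each with leading whitespace.
printf '%s\n' \
    '    RelativePath="foo.c"' \
    '    RelativePath="coo.c"' \
    '    RelativePath="coo.cpp"' |
    awk -f extract_demo.awk
```

With the doubled backslash, the three lines should report foo.c, coo.c and coo.cpp respectively, with no leading whitespace and no RelativePath= prefix in the match.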
Thank you,
JP