Grep: using a variable in a regexp - Programmers Heaven


Grep: using a variable in a regexp

fu2zi Posts: 3 Member
Hi,
I'm trying to use grep, but I can't get a variable to be substituted inside the regular expression I'm using.
This is what I'm trying:

@fils2 = grep {/^$thing[.*]/} @files;

I can't get $thing to be substituted so the expression ends up returning everything. @files is a list of files in a directory and I'm trying to get another list of files that start with $thing.

Any help would be much appreciated!
Pete

Comments

  • mdw1982 Posts: 124 Member
    Hi Pete,

    It really would be much easier if you used a regex to search the array you're listing here. You'll want to do that with a foreach() statement so you can go through it line by line and, if the expression is found, return the line containing the string you're searching for.

    Using grep as you're showing here, when done correctly returns an integer value:

    0 if the string is found
    -1 if it is not found

    if you want to use a system command such as grep then you're going to have to do it in one of a few ways. one way to do it would be this:

    exec(`grep "$thing" @fils2`);

    however...

    the correct manner in which to do this would be something like this:

    [code]
    foreach $line (@fils2) {
        chomp($line);
        if ( $line =~ /^$thing[.*]/ ) {
            print "$line\n";
        }
    }
    [/code]

    Using a regular expression inside the program like this is far preferable to going outside the program and using a system command to perform a function that Perl is more than able to handle natively.

    However, if you really want a solution using grep, give me a little time and I'll see if I can get one together.

    Mark
  • Jonathan Posts: 2,914 Member
    Hi,

    : if you want to use a system command such as grep then you're going to
    : have to do it in one of a few ways. one way to do it would be this:
    :
    : exec(`grep "$thing" @fils2`);
    The grep function is built into Perl. It takes a pattern (or block) as its first parameter, tests each value of the list passed as the second parameter against it, and returns a list of the values that match, which you can assign to the array on the left.

    : however...
    :
    : the correct manner in which to do this would be something like this:
    There's more than one way to do almost anything in Perl. Though I'd suggest the grep method is more appropriate for this problem.... ;-)

    As for why it won't work, I suggest you get rid of the [] in your expression. [] defines a character class, whereas . on its own matches any character and * is a quantifier meaning "any number of the preceding item", so those probably shouldn't be inside a character class anyway. (Unless you're somehow using Perl 6, in which case I believe they are changing [] to be non-capturing brackets.) At the moment Perl may be treating what is in the [] after $thing as a subscript into the @thing array, and maybe that's why it's failing.
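
    A minimal sketch of the built-in grep with a variable in the pattern, for anyone following along. The sample values in @files and $thing are made up, and the \Q...\E escaping and the ${thing} braces are assumptions of this sketch rather than anything from the original post: the braces mark where the variable name ends so a following [ cannot be read as a subscript, and \Q...\E keeps the variable's contents literal.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a directory listing.
my @files = ('test.txt.001', 'test.txt.002', 'testing.log', 'other.txt');
my $thing = 'test';

# ${thing} ends the variable name explicitly; \Q...\E escapes any regex
# metacharacters in its value. The trailing \. requires a literal dot.
my @files2 = grep { /^\Q${thing}\E\./ } @files;

print "$_\n" for @files2;
```

    With this data only the two test.txt.* names are kept; testing.log is rejected because the character after "test" is not a dot.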

    Of course, I may be completely wrong, and reserve the right to be so. ;-)

    Later,

    Jonathan



    -------------------------------------------
    Count your downloads:
    http://www.downloadcounter.com/
    And host your site:
    http://www.incrahost.com/
    Don't say I never give you anything... ;-)

  • mdw1982 Posts: 124 Member
    Jonathan,

    I'm not so sure that's the problem with this expression. It's been my experience that when using system commands from inside perl scripts you've either got to use the exec() function or the system() function...errr operator. (sometimes it's hard shifting gears between java and perl).

    Since he was setting an array = to the grep command, which I'm not all that sure is legal, or at least desirable, using it in that manner would place a zero (0) or negative 1 (-1) as the value of the variable to the left of the statement, rather than placing the results of the command into the variable. I've used similar methods in the past to see if something has worked.

    Secondly, and most importantly, there were no ticks (`) surrounding the grep command, which are crucial if you want to use system commands. Without those perl will likely ignore the command, if not toss an error for that line.

    I haven't tested it yet, but I shall... but this method may work for what he's attempting:

    [code]
    $results = `grep /$thing[.*]/ @files`;
    [/code]

    And I really hope they're not going to make that change you mentioned in perl 6. I rather like the way [ ] works in perl now!

    Mark
  • Jonathan Posts: 2,914 Member
    Hi,

    : I'm not so sure that's the problem with this expression. It's been my
    : experience that when using system commands from inside perl scripts
    : you've either got to use the exec() function or the system()
    : function...errr operator. (sometimes it's hard shifting gears between
    : java and perl).
    Yes, but Perl actually has a built-in grep function, part of the language itself. I believe it is not invoking the external "grep" program at all. You could do it that way, as you suggest, but why create the overhead when Perl can do it for you?

    : Since he was setting an array = to the grep command, which I'm not
    : all that sure is legal, or at least desirable,
    It's legal AND desirable. The grep function takes a list and returns a list containing the elements of the original that matched the pattern. See:-
    http://aspn.activestate.com/ASPN/Products/ActivePerl/lib/Pod/perlfunc.html#item_grep
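
    To make the return value concrete, here is a small sketch (the file names are invented for illustration). In list context the built-in grep returns the matching elements; in scalar context it returns how many matched, not 0 or -1.

```perl
use strict;
use warnings;

my @files = ('a.txt', 'b.log', 'c.txt');   # hypothetical sample list

# List context: grep returns the matching elements themselves.
my @txt = grep { /\.txt$/ } @files;

# Scalar context: the same expression yields the number of matches.
my $count = grep { /\.txt$/ } @files;

print "matches: @txt\n";
print "count:   $count\n";
```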

    : using it in that manner would place a zero (0) or negative 1 (-1) as
    : the value of the variable to the left of the statement, rather than
    : placing the results of the command into the variable. I've used
    : similar methods in the past to see if something has worked.
    It might even throw an error if you attempt to set an array to a scalar value. The scalar value of an array is the number of elements in it, but that may be a read-only property...

    : Secondly, and most importantly, there were no ticks (`) surrounding
    : the grep command, which are crucial if you want to use system
    : commands. Without those perl will likely ignore the command, if not
    : toss an error for that line.
    But I think the person who posted never intended to use the external system command, but rather the built-in Perl function. :-)

    : I haven't tested it yet, but I shall... but this method may work for
    : what he's attempting:
    :
    : [code]
    : $results = `grep /$thing[.*]/ @files`;
    : [/code]
    I'm not sure it would: don't you have to pipe the data into the external grep command? Backticks capture its STDOUT output, but I think we need to pipe the data into its STDIN. As for the return value, if the array separator is set to \n, you could use an array in place of $results and it may put each line of the external program's output into its own element. Maybe. You might have to use the split function instead, though; I can't say off the top of my head.
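
    For what it's worth, backticks evaluated in list context do return one element per output line, so no explicit split is needed. The sketch below assumes a shell that accepts double-quoted arguments and a perl path without spaces; it runs the current perl ($^X) as a stand-in for any external command so that it does not depend on a particular grep being installed.

```perl
use strict;
use warnings;

# $^X is the path of the running perl; it stands in here for any external
# command that writes several lines to STDOUT (an assumption of this sketch).
my @lines = `$^X -le "print for 1..3"`;

chomp(@lines);                 # strip the trailing newline from each line
print scalar(@lines), " lines: @lines\n";
```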

    : And I really hope they're not going to make that change you mentioned
    : in perl 6. I rather like the way [ ] work in perl now!
    'fraid the entire way regexes work in Perl 6 is set to change. See:-
    http://dev.perl.org/perl6/apocalypse/5
    The section "Brave New World". For practical examples, see:-
    http://dev.perl.org/perl6/exegesis/5

    Later,

    Jonathan



  • mdw1982 Posts: 124 Member
    : : And I really hope they're not going to make that change you mentioned
    : : in perl 6. I rather like the way [ ] work in perl now!
    : 'fraid the entire way regex work in Perl 6 is to change. See:-
    : http://dev.perl.org/perl6/apocalypse/5
    : The section "Brave New World". For practical examples, see:-
    : http://dev.perl.org/perl6/exegesis/5
    :
    : Later,
    :
    : Jonathan

    Hi Jonathan,

    Holy hanna! I had no idea! It would appear I've been living in a cave where Perl is concerned. I really need to get up to date. Thanks for the links. That is some awesome information, but won't those new regex functions make coders lazy, and possibly ignorant?

    Mark
  • Jonathan Posts: 2,914 Member
    Hi,

    : Holy hanna! I had no idea! I've been living in a cave it would
    : appear where Perl is concerned. I really need to get up to date.
    I've been following the Perl 6 development for a while now. Each week on http://www.perl.com/ a summary is written of what is going on. Some of it (e.g. the internals stuff) goes right over my head. But it's interesting to follow it and see where Perl is headed. I hope to adopt it and play with it as soon as I can.

    : thanks for the links. that is some awesome information, but won't
    : those new regex functions make coders lazy, and possibly ignorant?
    Lazy and ignorant in what way? I think it will actually make regular expressions better, as the default now is to ignore whitespace and allow comments, which will leave programmers with fewer excuses not to space out and comment their expressions. In Perl, regexes are a central part of the language, and anything that lets them look less arcane is a bonus, I guess.
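
    In Perl 5 the same spaced-out, commented style is available today, but only if you opt in with the /x modifier. A small sketch, using a made-up pattern for backup names such as "test.txt.001" (the pattern and the $backup_name variable are illustrative only):

```perl
use strict;
use warnings;

# Match names such as "test.txt.001": a base name, an extension,
# and a three-digit generation number.
my $backup_name = qr{
    ^ ([^.]+)       # base name: everything up to the first dot
    \. ([^.]+)      # extension
    \. (\d{3}) \z   # generation number, exactly three digits
}x;

if ('test.txt.001' =~ $backup_name) {
    print "base=$1 ext=$2 gen=$3\n";
}
```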

    If anything it will be a pain to those of us who know the old system well, as so much has changed that we've got a lot to re-learn. I'm not entirely convinced about . now matching a newline, so that we have to use \N to match any character but a newline; \N doesn't occur to me as being an "any character" thing, I guess. Well, it does, as I know that \W is not an alphanumeric, \D is not a digit etc., but I don't think it's crystal clear. But IMHO Larry Wall has got a lot right with Perl before now, and while I'm not all that comfortable with the changes just yet, I'm willing to trust him that this very radical shake-up is for the best. I guess it's better to have one radical overall fix than to try little hacks to improve it here and there.

    Later,

    Jonathan


  • fu2zi Posts: 3 Member
    Hi,
    Thanks for all the help, guys. I managed to sort it out last week, so I didn't really get the benefit of most of your replies...oh well :)
    I've pasted in my program below in the hope that it might be helpful to someone else out there (I imagine that it's not that uncommon to want to remove the oldest backup copies of a file). The code is not efficient or pretty, but there might be something in it to help the newbie perl coder (like myself). Apologies if the formatting gets screwed up!
    Pete.
    [code]
    ########################################################################
    ## ##
    ## PROGRAM: FILEKILL.PL ##
    ## DESC: File remover for old generations of files ##
    ## AUTHOR: Pete Johnson ##
    ## DATE: 12/03/03 ##
    ## ##
    ########################################################################
    ##Filekill loops through all the items on a drive starting from $dir. ##
    ##If the item is a file it is pushed into an array (@filearr), if it ##
    ##is a directory, the contents of @filearr are processed (before ##
    ##resetting the array). Filenames are stripped of their extensions ##
    ##(e.g. "test.txt.001" becomes "test") and then the duplicates are ##
    ##removed (e.g. if there are three files "test.txt.001","test.txt.002"##
    ##and "test.txt.003" one stripped value "test" is kept and the other ##
    ##two are removed). These values are stored in @arr2. A loop iterates ##
    ##through the values of @arr2 and searches @filearr for filenames that##
    ##start with each value. These are put in the array @matches (e.g. ##
    ##@arr2 value = "test", matching values in @filearr= "test.txt.001", ##
    ##"test.txt.002","test.txt.003"). The mtime for each filename is found##
    ##and added to a hash array,which is sorted to find the oldest files. ##
    ##If there are more files than the value of $maxgens then the oldest ##
    ##ones are added to @killarr and deleted using the unlink command ##
    ##(e.g. $maxgens=2 @filearr contains 3 files that start with "test". ##
    ##The oldest one is added to @killarr and unlinked leaving ##
    ##"test.txt.001" and "test.txt.002"). ##
    ########################################################################

    #!/usr/bin/perl
    use File::Find;

    $dir=".";
    @filearr=(); # filenames in a directory
    @arr2 = (); # stripped filenames
    $size = 0;
    $maxgens = 2; # change to 10 generations. 2 is for testing
    @killarr = (); # list of files to delete
    $olddir = $dir;
    $newdir = $dir;
    find(\&lister, $dir); # recursively call lister() on all files

    ######## final run of processing for last directory...
    print("'$curdir'\n");
    @arr2 = @filearr;

    foreach $thing (@arr2) { # strip file extensions
    $thing =~ s/\..*//; # drop everything from the first literal dot onwards
    }

    @arr2 = grep{ $numWords{$_}++ == 0 } @arr2; # remove duplicates

    foreach $thing3 (@arr2) {

    @matches = grep{/^$thing3[.*]/} @filearr;
    # all files starting with $thing3
    %hash = (); # initialize arrays: %hash stores filenames with mtimes
    @killarr = (); # killarr is the array of files to remove

    foreach $thing4 (@matches){ # for each file starting with $thing

    $hash{$thing4} = (stat($thing4))[9];
    # add to hash array with the mtime (field 9 from stat function)
    }

    $counter = 0;

    foreach $key (sort by_value keys %hash) {# sort the files by date
    $counter = ($counter + 1);
    if ($counter > $maxgens) {
    push(@killarr,$key);
    }
    }

    print("$thing3, @killarr");

    for $kill (@killarr) {
    unlink("$curdir/$kill"); # delete files
    }
    }
    ######### end of final run

    sub lister() { # each call receives one item on the drive in the variable $_

    $foob=(stat($_))[2]; # file mode: roughly 33000+ for a file, 16000+ for a dir
    if ($foob < 33000) { # if item is a directory (numeric comparison)
    $olddir = $curdir;
    $curdir = $File::Find::name; # full path of current directory
    $size = @filearr; # No. of items in directory
    @arr2 = @filearr; # copy list of files to @arr2

    foreach $thing (@arr2) { # remove file extensions
    $thing =~ s/\..*//; # drop everything from the first literal dot onwards
    }

    @arr2 = grep{ $numWords{$_}++ == 0 } @arr2;
    # remove duplicate entries
    print("\n'$olddir' $size\n");

    foreach $thing (@arr2) {

    @matches = grep{/^$thing[.*]/} @filearr;
    # All filenames that start with value of $thing
    %hash = (); # initialize arrays: %hash stores filenames with mtimes
    @killarr = (); # killarr is the array of files to remove

    foreach $thing2 (@matches){ # for each file starting with $thing
    $hash{$thing2} = (stat($thing2))[9];
    # add to hash array with the mtime (field 9 from stat function)
    }
    # (this is necessary to sort files by date)

    $counter = 0;
    foreach $key (sort by_value keys %hash) { # sort the files by date
    $counter = ($counter + 1);

    if ($counter > $maxgens) {
    # if there are more than $maxgens files that start with $thing
    push(@killarr,$key); # then add the oldest ones to @killarr
    }
    }

    print("$thing, @killarr");

    for $kill (@killarr) {
    unlink("$olddir/$kill"); # delete files
    }

    }

    @filearr = (); # reset arrays for next loop
    @arr2 = ();
    @killarr = ();

    }else{ # item found is a file.
    push(@filearr,$_); # Add to list of files in current directory
    }

    $foob=();

    }

    sub by_value { $hash{$b} <=> $hash{$a}; } # numeric sort on mtime, newest first, so files past $maxgens are the oldest
    [/code]
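
    The "keep the newest $maxgens, delete the rest" step can also be pulled out into a small function that is easy to test without touching the disk. This is only a sketch under stated assumptions: files_to_prune, %mtime_of and the mtime values are hypothetical names and data invented for illustration, not part of the program above.

```perl
use strict;
use warnings;

# Return the names to delete: everything except the $keep newest entries.
# %$mtime_of maps a filename to its modification time (epoch seconds).
sub files_to_prune {
    my ($mtime_of, $keep) = @_;

    # Numeric sort, newest first, so the oldest names end up at the tail.
    my @by_age = sort { $mtime_of->{$b} <=> $mtime_of->{$a} } keys %$mtime_of;

    return @by_age > $keep ? @by_age[$keep .. $#by_age] : ();
}

# Made-up mtimes for illustration; a real script would fill these from stat().
my %mtime_of = (
    'test.txt.001' => 300,
    'test.txt.002' => 200,
    'test.txt.003' => 100,
);

my @doomed = files_to_prune(\%mtime_of, 2);
print "would delete: @doomed\n";
```

    Printing the list before calling unlink on it is a cheap way to check the selection on a real directory first.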

