
Paragraph search - Regular expressions

Hi,

This example is in PHP, but the problem is about regular expressions (RE), and the 'thinking' is universal.

I have to 'take' paragraphs from an HTML file, where a 'paragraph' is not only

...

paragraph is and so if file is:

test


test2 test3
test2

bla bla



the result has to be:

test
test2 test3
test2
bla bla

That means everything between

and

, and and between

and start of new paragraph

or

. I know there are some 'holes' and bugs in this request, and I know this script can be 'broken' by an unfriendly HTML file, but I want a starting solution for 'friendly' files :))))

Because I am not an expert in RE, my idea was to replace all

,

, and with some unusual word and then take the parts between those words. I know it isn't a good solution, but I hope it will solve the problem.

I made something like this:

<?php /* Paragraph search */
// Take a source
$File = implode ("", file("page.html"));
// Remove all styles (not necessary):
$File = eregi_replace ("<style(.*)</style>", "", $File);
// Remove all scripts (not necessary):
$File = eregi_replace ("", "", $File);
// Remove 'baggage' (not necessary):
$File = eregi_replace ("", "", $File);
// Replace 'targets' with some unusual word:
$File = eregi_replace ("", " ", $File);
$File = eregi_replace ("", " ", $File);
$File = eregi_replace ("

", " ", $File);
$File = eregi_replace ("

", " ", $File);
// Take everything between i
if (preg_match_all("/(.*)/", $File, $matches)) {
for ($i=0; $i < count($matches[0]); $i++) {
echo "Paragraph" . $i . ": " . $matches[1][$i] . "
";
echo "--------------
";
}
}
?>

but I got no results :(((
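Two PCRE behaviours may be relevant to a pattern like `(.*)` run over a whole file: without the `s` modifier, `.` does not match a newline, and `.*` is greedy. The sketch below only illustrates those two points; the marker `@@PARA@@` and the sample text are assumptions made up for the demonstration, not the original data.

```php
<?php
// Assumed marker and sample text, for illustration only.
$text = "@@PARA@@first\nline@@PARA@@second@@PARA@@";

// Without the s modifier "." stops at "\n", so a chunk that spans
// two lines cannot be matched by ".*" at all.
preg_match_all('/@@PARA@@(.*)@@PARA@@/', $text, $greedy);

// Lazy ".*?" with the s modifier, plus a lookahead so that a closing
// marker can also serve as the opening marker of the next chunk.
preg_match_all('/@@PARA@@(.*?)(?=@@PARA@@)/s', $text, $lazy);

print_r($greedy[1]); // only the single-line chunk is found
print_r($lazy[1]);   // both chunks, including the one with a newline
```

With this sample the first pattern finds only "second", while the second finds both chunks.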

What could the problem be, what can I do, and what do you think about this solution? Does somebody have a better one?
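A minimal sketch of the marker idea described above, written with `preg_replace` and `explode` instead of the `eregi_*` functions. The target tags (`<p>`, `<td>`, a double `<br>`), the marker string `@@PARA@@`, the helper name `split_paragraphs` and the sample markup are all assumptions for illustration; it is meant for 'friendly' files only.

```php
<?php
// Hypothetical helper: split "friendly" HTML into paragraph-like chunks.
// Assumed targets: <p>, </p>, <td>, </td> and two or more <br> in a row.
function split_paragraphs(string $html): array
{
    $marker = '@@PARA@@'; // assumed "unusual word", unlikely to occur in real text

    // Replace each assumed target tag with the marker (case-insensitive).
    $html = preg_replace('~</?(?:p|td)\b[^>]*>~i', $marker, $html);
    $html = preg_replace('~(?:<br\s*/?>\s*){2,}~i', $marker, $html);

    // Drop any remaining tags, then cut the text at the markers.
    $text = strip_tags($html);
    $parts = array_map('trim', explode($marker, $text));

    // Keep only non-empty chunks and reindex from 0.
    return array_values(array_filter($parts, 'strlen'));
}

// Assumed sample markup, for illustration only.
$sample = "<p>test</p><table><tr><td>test2 test3<br><br>test2</td></tr></table><p>bla bla</p>";
print_r(split_paragraphs($sample));
```

For this sample the helper returns the four chunks test, test2 test3, test2 and bla bla. Splitting on the marker avoids the greedy and multi-line matching issues of `(.*)`, but it will still break on nested or malformed markup.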

Thanks and regards,

Aleksandar Ljubojevic - LJUBA
ljubas@yahoo.com
http://ljubas.tripod.com