1 20th April 20:47
genphp
Regex to catch <p>s

Hey all!

To say I **** at regex is an understatement, so I really need any help I can get on this. I have a page of text with different HTML tags in it, but each "block" of text has a <p> or a <p class="something"> tag. Does anybody have a regex that will catch each of these paragraphs and put them into an array?
Example:
$array[0] = '<p> first block </p>';
$array[1] = '<p class="blah"> block X</p>';

Thanks!
R


2 20th April 20:48
eric.butera
If you're using php5 you can use DOM's getElementsByTagName.

If you still think you need to do some sort of regex it is possible
but it will be buggy at best.
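
A minimal sketch of the DOM route mentioned above, in case it helps. The sample markup and variable names are assumptions for illustration, and passing a node to saveHTML() needs PHP 5.3.6 or later.

```php
<?php
// Hypothetical sample input; the markup is an assumption for illustration.
$html = '<div><p> first block </p><b>skip me</b><p class="blah"> block X</p></div>';

// Collect libxml warnings internally instead of printing them,
// since real-world pages are often not well-formed.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$paragraphs = array();
foreach ($doc->getElementsByTagName('p') as $p) {
    // saveHTML($node) serializes just that element, tags included.
    $paragraphs[] = $doc->saveHTML($p);
}

print_r($paragraphs);
```

Each array entry should hold one complete <p> element, with or without a class attribute, so the class name never needs to be known in advance.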
3 20th April 20:48
genphp
<clip>

If you're using php5 you can use DOM's getElementsByTagName.

If you still think you need to do some sort of regex it is possible
but it will be buggy at best.


</clip>

Nope, I need a regex. I guess I have no choice: it's either a chancy regex or nothing. I know for a fact that the first paragraph tag won't contain a class. For the <p> tags that do contain class="blah", does it matter whether I know exactly what the class name is?

4 20th April 20:48
aschwin
Hi,

Maybe the example is overkill, but here is a quick setup that can
save you some time finding HTML tags with a certain attribute.

<?php
$html = <<<END_OF_HTML
<b>hello</b>
<b class="blah">hello</b>
<p>hello</p>
<p class="blah">hello</p>
<a>hello</a>
<a href="url">hello</a>
END_OF_HTML;

$tags = array();
$tags[] = 'p';
$tags[] = 'a';

$tags = implode('|', $tags);
// \b keeps 'p' and 'a' from also matching tags such as <pre> or <abbr>.
$pattern = '/<('.$tags.')\b[^>]*>/i';

echo $pattern."\n";

preg_match_all($pattern, $html, $matches);
var_dump($matches);
?>

I'm not a regular expression guru either, but I think it works OK. I had to
find 'link', 'img', 'a' and other tags in HTML and used a more complex
expression for it, which worked like a charm.

It's just an example. In your case, leave the 'a' tag out of the
$tags array to get what you want.
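
The pattern above matches only the opening tags. A rough sketch of how the same implode-based setup could capture whole elements, closing tag included, is shown here. The function name collectElements and the sample markup are hypothetical, not part of the original code.

```php
<?php
// Hypothetical helper: returns every complete element whose tag name is in $tags.
function collectElements($html, array $tags)
{
    // preg_quote guards against regex metacharacters in the tag names.
    $names = implode('|', array_map('preg_quote', $tags));

    // \b stops 'p' from matching <pre>; \1 requires the same tag name to close;
    // the s flag lets . span newlines and .*? keeps each match as short as possible.
    $pattern = '/<(' . $names . ')\b[^>]*>.*?<\/\1>/is';

    preg_match_all($pattern, $html, $matches);

    // Index 0 holds the full matches, tags included.
    return $matches[0];
}

// Assumed sample input for illustration.
$html = "<b>hello</b>\n<p>hello</p>\n<p class=\"blah\">hello</p>\n<a href=\"url\">hello</a>";
$blocks = collectElements($html, array('p'));

print_r($blocks);
```

This still assumes paragraphs are not nested inside one another; nested elements of the same name would be cut off at the first closing tag.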

Hope it helps!
--

Aschwin Wesselius

/'What you would like to be done to you, do that to the other....'/
5 20th April 20:48
aschwin
Hi,

I'm sorry, I didn't read your request properly. Below is a
corrected solution:

<?php
$html = <<<END_OF_HTML
<b>hello</b>
<b class="blah">hello</b>
<p>hello</p>
<p class="blah">hello</p>
<a>hello</a>
<a href="url">this</a>
<a>hello</a>
<a href="regex yo">hello</a>
<a>hello</a>
<a id="2" href="regex yo">hello</a>
<p>that</p>
<p class="blah" title="whatever">hello</p>
END_OF_HTML;

$tags = array();
$tags[] = 'p';
$tags[] = 'a';

$attr = array();
$attr[] = 'class';
$attr[] = 'href';

$vals = array();
$vals[] = 'blah';
$vals[] = 'url';
$vals[] = 'yo';

$text = array();
$text[] = 'hello';
$text[] = 'this';
$text[] = 'that';

$tags = implode('|', $tags);
$attr = implode('|', $attr);
$vals = implode('|', $vals);
$text = implode('|', $text);

$pattern = '/<('.$tags.')[^>]*('.$attr.')[^>]*('.$vals.')[^>]*>('.$text.')[^<\/]*<\/\1>/i';

echo $pattern."\n";
echo "--------------------\n";

preg_match_all($pattern, $html, $matches);
var_dump($matches);
?>
6 20th April 20:48
aschwin
Hi,

It is obvious I haven't had my caffeine yet. This is my last try to get
the pattern straight:

<?php
$html = <<<END_OF_HTML
<b>hello</b>
<b class="blah">hello</b>
<p>those</p>
<p class="blah">hello</p>
<a>hello</a>
<a href="url">this</a>
<a>rose</a>
<a href="regex yo">hello</a>
<a>nose</a>
<a id="2" href="regex yo">hello</a>
<p>that</p>
<p class="blah" title="whatever">hello</p>
END_OF_HTML;

$tags = array();
$tags[] = 'p';
$tags[] = 'a';

$attr = array();
$attr[] = 'class';
$attr[] = 'href';

$vals = array();
$vals[] = 'blah';
$vals[] = 'url';
$vals[] = 'yo';

$text = array();
$text[] = 'hello';
$text[] = 'this';
$text[] = 'that';

$tags = implode('|', $tags);
$attr = implode('|', $attr);
$vals = implode('|', $vals);
$text = implode('|', $text);

$pattern = '/<('.$tags.')[^>]*('.$attr.')?[^>]*('.$vals.')?[^>]*>('.$text.')[^<\/]*<\/\1>/i';

echo $pattern."\n";
echo "--------------------\n";

preg_match_all($pattern, $html, $matches);
var_dump($matches);
?>
7 20th April 22:42
Or you could simplify and just do this:

$html = <<<END_OF_HTML
<b>hello</b>
<b class="blah">hello</b>
<p>those</p>
<p class="blah">hello</p>
<a>hello</a>
<a href="url">this</a>
<a>rose</a>
<a href="regex yo">hello</a>
<a>nose</a>
<a id="2" href="regex yo">hello</a>
<p>that</p>
<p class="blah" title="whatever">hello</p>
END_OF_HTML;

// This will give you any element, from its opening tag to the matching closing tag
preg_match_all('/<(\w+)\b[^>]*>[\s\S]*?<\/\1>/', $html, $matches);
print_r($matches);

// This will give you any p element; \b keeps tags such as <pre> from matching
preg_match_all('/<p\b[^>]*>[\s\S]*?<\/p>/i', $html, $matches);
print_r($matches);
8 20th April 22:42
nospam
preg_match_all('|<p[^>]*>(.*)</p>|Ui', $myText, $myArray);
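
For what it's worth, here is a small sketch of that one-liner in use, with two tweaks that are my own assumptions and not part of the line above: \b so that <pre> is not picked up, and the s flag so that a paragraph can span several lines. The sample text is made up.

```php
<?php
// Hypothetical input for illustration only.
$myText = "<p> first block </p>\n<pre>not a paragraph</pre>\n<p class=\"blah\"> block\nX</p>";

// \b stops <pre> from matching, s lets . cross newlines, U makes quantifiers lazy.
preg_match_all('|<p\b[^>]*>(.*)</p>|Uis', $myText, $myArray);

// $myArray[0] holds the full matches (tags included),
// $myArray[1] holds only the text between the tags.
print_r($myArray[0]);
```

Because [^>]* accepts any attributes, the class name does not have to be known beforehand.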
9 20th April 22:42
s4nnym
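$tag_regex = array(
    // Plain <p> tags: replace the element with its inner content ($2).
    '/\<p(\s*)\>(.*?)\<\/p\>/si' => '$2',
    // Any tag with a class attribute: replace the element with its inner content ($4).
    '/\<(\s*)(.*?)class\=(.*?)\>(.*?)\<\/(.*?)\>/si' => '$4'
);

$paragraphs = preg_replace(array_keys($tag_regex), array_values($tag_regex), $page);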

I am not sure which tag you mean by <class="something">, but this
RE should capture any <p> tags (the first element of the array) and
any tags that have a class attribute (the second element of the array).

You can find other examples of this kind of HTML parsing in PHP; try
googling it.

HTH


--
View this message in context: http://www.nabble.com/Re%3A-Regex-to-catch-%3Cp%3Es-tp17075329p17089906.html
Sent from the PHP - General mailing list archive at Nabble.com.