Trouble finding: Extracting tags from xhtml content

Wednesday, August 4, 2010

Extracting tags from xhtml content

There are two ways to do it. The quick and dirty way is to slice the string by hand, using the Mid function in VB.NET or the IndexOf string method. The more appropriate way is to use regular expressions.

The following code gets the title using a regular expression:

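// Requires: using System.Text.RegularExpressions;
// Singleline lets '.' match newlines, so a title that wraps across lines is still captured.
Regex titleRegex = new Regex("<title>(?<title>.*?)</title>", RegexOptions.IgnoreCase | RegexOptions.Singleline);
Match titleMatch = titleRegex.Match(html);
// Groups["title"].Value is an empty string when the page has no <title> element.
string title = titleMatch.Groups["title"].Value;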

The following code gets the content of the meta tag named My_Meta:

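// Expects the NAME attribute before CONTENT, both double-quoted; IgnoreCase also matches lowercase markup.
Regex metaRegex = new Regex("<META +NAME=\"(?<name>My_Meta)\" +CONTENT=\"(?<content>.*?)\" */?>", RegexOptions.IgnoreCase);
Match metaMatch = metaRegex.Match(html);
// Empty string when no matching meta tag is present.
string metaContent = metaMatch.Groups["content"].Value;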
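
The two snippets above can be wrapped into small reusable helpers. The following is only a sketch: the class name TagExtractor and the method names GetTitle and GetMetaContent are made up for illustration and are not part of any library. It assumes the meta tag has its name attribute before content and that both values are double-quoted, as in the pattern above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; names are illustrative only.
public static class TagExtractor
{
    // Returns the trimmed text inside <title>...</title>,
    // or an empty string when no title element is found.
    public static string GetTitle(string html)
    {
        Match m = Regex.Match(
            html,
            "<title>(?<title>.*?)</title>",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Groups["title"].Value.Trim();
    }

    // Returns the content attribute of <meta name="NAME" content="...">,
    // or an empty string when no such tag is found.
    // Assumption: name precedes content and both are double-quoted.
    public static string GetMetaContent(string html, string name)
    {
        string pattern =
            "<meta\\s+name=\"" + Regex.Escape(name) +
            "\"\\s+content=\"(?<content>[^\"]*)\"\\s*/?>";
        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return m.Groups["content"].Value;
    }
}
```

With these helpers, TagExtractor.GetTitle(html) and TagExtractor.GetMetaContent(html, "My_Meta") replace the inline regex code. Regex.Escape keeps a tag name that contains special characters from being read as part of the pattern. For markup that is not this regular (attributes in a different order, single quotes), a real HTML or XML parser is likely the safer choice.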
