How to remove HTML tags in content or string?
We can remove HTML tags from a string quite easily using Regex in .NET. Here is the code for removing HTML tags from content or a string. The code is in C#.
First, import the System.Text.RegularExpressions namespace.
Code:-
using System.Text.RegularExpressions;
Then create a method like the one below.
Code:-
#region reusable regex's
// Matches any HTML tag, or the &nbsp; entity.
protected static Regex htmlRegex = new Regex(@"<[^>]+>|&nbsp;", RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.Compiled);
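// Matches whole <script> and <style> blocks together with their contents (assumed intent of this pattern).
protected static Regex inlineTextHtmlRegex = new Regex(@"<script[^>]*>.*?</script>|<style[^>]*>.*?</style>", RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.Compiled);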
// Matches runs of two or more whitespace characters.
protected static Regex spacer = new Regex(@"\s{2,}", RegexOptions.Compiled);
#endregion
public static string RemoveHtml(string html)
{
    if (string.IsNullOrEmpty(html))
        return string.Empty;
    string nonhtml = spacer.Replace(htmlRegex.Replace(inlineTextHtmlRegex.Replace(html, ""), " ").Trim(), " ");
    return nonhtml;
}
We can call this method as shown below.
Code:-
string htmlStr = "<a href='http://shareourideas.wordpress.com/' title='Share our ideas'> share our ideas</a>";
string textOnly = RemoveHtml(htmlStr);
It returns just the text: share our ideas.
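If you want to try everything in one go, here is a small console sketch that puts the pieces together. The class name HtmlStripper, the field names and the sample string are only my assumptions for the demo, and the script/style pattern is one plausible way to drop those blocks, not necessarily the only one. The nested Replace call is split into three steps so each stage is easier to follow.

```csharp
using System;
using System.Text.RegularExpressions;

// HtmlStripper is a hypothetical class name used only for this sketch.
public static class HtmlStripper
{
    // Step 1 pattern: whole <script>/<style> blocks, contents included (assumed intent).
    static readonly Regex InlineBlocks = new Regex(
        @"<script[^>]*>.*?</script>|<style[^>]*>.*?</style>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.Compiled);

    // Step 2 pattern: any remaining tag, plus the &nbsp; entity.
    static readonly Regex Tags = new Regex(
        @"<[^>]+>|&nbsp;",
        RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.Compiled);

    // Step 3 pattern: runs of two or more whitespace characters.
    static readonly Regex Spacer = new Regex(@"\s{2,}", RegexOptions.Compiled);

    public static string RemoveHtml(string html)
    {
        if (string.IsNullOrEmpty(html))
            return string.Empty;

        // 1. Drop script/style blocks so their contents do not end up in the text.
        string noBlocks = InlineBlocks.Replace(html, "");
        // 2. Replace each tag or &nbsp; with a space, then trim the ends.
        string noTags = Tags.Replace(noBlocks, " ").Trim();
        // 3. Collapse the leftover whitespace runs into single spaces.
        return Spacer.Replace(noTags, " ");
    }

    public static void Main()
    {
        // Sample input invented for this demo.
        string sample = "<p>Hello&nbsp;<b>world</b></p><script>alert('x');</script>";
        Console.WriteLine(RemoveHtml(sample)); // prints "Hello world"
    }
}
```

Keep in mind that a regex like this only works well for simple markup. A ">" inside an attribute value can confuse it, and entities other than &nbsp; are left in the output as they are.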
Enjoy while coding..!
Thanks,
Naga Harish.