Converting HTML to Plain Text in C#:
Below is a simple C# program that shows this conversion:
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        // Sample HTML with a nested tag.
        string htmlContent = "<h1>Hello, <em>world</em>!</h1>";
        string plainText = StripHtmlTags(htmlContent);
        Console.WriteLine(plainText);
    }

    // Removes everything between '<' and the nearest '>' (lazy match),
    // leaving only the text that sat between the tags.
    static string StripHtmlTags(string html)
    {
        return Regex.Replace(html, "<.*?>", string.Empty);
    }
}
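The program defines a method called StripHtmlTags, which takes an HTML string as input and uses a regular expression to remove every HTML tag from it. The lazy pattern <.*?> matches each tag from its opening < to the nearest >, and replacing each match with an empty string leaves only the text between the tags. The Main method shows how to use this method by passing in sample HTML content and printing the resulting plain text.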
Output:
Hello, world!
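The pattern above only removes tags. It leaves HTML entities such as &amp; untouched, and because . does not match a newline by default, it misses any tag that spans more than one line. The following is a minimal sketch of one way to handle both cases, assuming that decoding entities is wanted. The names HtmlText, ToPlainText, and Demo are hypothetical and not part of the original program.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper class, introduced here for illustration only.
static class HtmlText
{
    public static string ToPlainText(string html)
    {
        // Singleline lets '.' match newlines, so a tag whose attributes
        // wrap onto a second line is still removed as one unit.
        string withoutTags = Regex.Replace(
            html, "<.*?>", string.Empty, RegexOptions.Singleline);

        // Turn entities such as &amp; and &lt; back into literal characters.
        return WebUtility.HtmlDecode(withoutTags);
    }
}

class Demo
{
    static void Main()
    {
        // The <a> tag deliberately contains a line break.
        string html = "<p>Tom &amp; Jerry <a\n href=\"#\">link</a></p>";
        Console.WriteLine(HtmlText.ToPlainText(html)); // prints: Tom & Jerry link
    }
}
```

This is still a regular-expression approach, so it does not understand document structure. For example, the text inside script or style elements is kept, and a > inside an attribute value ends the match early. For HTML that is untrusted or complex, a dedicated HTML parser is likely the safer choice.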