Validating Rich TextArea in ASP.NET MVC 3

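There are a few things to consider when validating data from a rich textarea. After setting it up (here’s a blog post about that), we have to allow the property to accept HTML. Otherwise, ASP.NET’s request validation rejects the post with the error “A potentially dangerous Request.Form value was detected from the client”.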

That would normally be a good thing for ordinary textboxes, since it guards against XSS. Here, however, we have to accept HTML. So assuming our model has a property called Message, we add the AllowHtml attribute:

  [AllowHtml]
  public string Message { get; set; }

This is all great and dandy. However, once you do that, it won’t allow other validations, like Range(). So we have to create our own attribute. To do this, we create a class that inherits from ValidationAttribute:

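using System.ComponentModel.DataAnnotations;

class AllowHtmlRangeLength : ValidationAttribute
{
  public override bool IsValid(object value)
  {
    // Treat a null value as an empty string.
    string property = (string)value ?? "";

    // Valid only when the raw length (markup included) is within 3..2000.
    return property.Length >= 3 && property.Length <= 2000;
  }
}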

In this case, the attribute accepts only strings from 3 to 2000 characters long. Then we use it like this in our model:

  [AllowHtml]
  [AllowHtmlRangeLength(ErrorMessage="Please have at least 3 to 2000 characters.")]
  public string Message { get; set; }
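
One thing worth checking about this first version is what it actually measures. The following is a minimal sketch, using a made-up sample string (not from the original post), showing that string.Length counts the markup along with the visible text:

```csharp
using System;

public static class RawLengthDemo
{
    // Hypothetical sample input: one visible character wrapped in markup.
    public const string Sample = "<h1><b><u>a</u></b></h1>";

    public static void Main()
    {
        // string.Length counts every character, tags included.
        Console.WriteLine(Sample.Length); // prints 24
    }
}
```

A single visible character therefore passes a 3 to 2000 check on the raw string.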

The problem with this setup is that we don’t want HTML tags to count as characters, so we need to strip out the HTML. Because parsing HTML with a single regular expression is error-prone, it’s best to use a library like HTML Agility Pack, which is available from NuGet (the package is named HtmlAgilityPack).

Once it’s installed in your project, we can reuse the sample code from its CodePlex site to strip out the HTML.
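
The sample itself isn’t reproduced in this post. As a rough idea of what such a converter does, here is a simplified sketch written against HTML Agility Pack’s HtmlNode API. It is an approximation for illustration, not the sample’s exact code; the class name HtmlToText and the method name ConvertTo are taken from how they are used later in this post.

```csharp
using System.IO;
using HtmlAgilityPack;

public static class HtmlToText
{
    // Writes the visible text under `node` to `outText`, skipping markup.
    public static void ConvertTo(HtmlNode node, TextWriter outText)
    {
        switch (node.NodeType)
        {
            case HtmlNodeType.Comment:
                break; // comments are never visible

            case HtmlNodeType.Text:
                // Ignore the contents of script and style blocks.
                string parent = node.ParentNode.Name;
                if (parent == "script" || parent == "style")
                    break;

                // Skip whitespace-only nodes; decode entities like &amp;.
                string text = ((HtmlTextNode)node).Text;
                if (text.Trim().Length > 0)
                    outText.Write(HtmlEntity.DeEntitize(text));
                break;

            default: // Document and Element nodes: recurse into children
                if (node.Name == "p")
                    outText.Write("\r\n"); // keep paragraphs apart
                foreach (HtmlNode child in node.ChildNodes)
                    ConvertTo(child, outText);
                break;
        }
    }
}
```

Walking the parsed node tree like this means tags never reach the output, whatever their attributes or nesting look like.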

For simplicity’s sake, I made the class and its methods static. I also added the following method for some extra scrubbing:

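    // Requires: using System.IO; using System.Text.RegularExpressions; using HtmlAgilityPack;
    public static string ConvertHTMLToCleanText(string html)
    {
      HtmlDocument doc = new HtmlDocument();
      doc.LoadHtml(html);

      // Walk the parsed document and write only its text content.
      using (StringWriter sw = new StringWriter())
      {
        ConvertTo(doc.DocumentNode, sw);
        sw.Flush();

        // Trim the result and collapse runs of whitespace into a single
        // space, so neighbouring words are not glued together.
        return Regex.Replace(sw.ToString().Trim(), @"\s{2,}", " ");
      }
    }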
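
The replacement string in that Regex.Replace call affects the character count, so it is worth seeing both options side by side. This is a small stand-alone sketch; the class name WhitespaceDemo and the sample text are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Replaces every run of two or more whitespace characters.
    public static string Collapse(string text, string replacement)
    {
        return Regex.Replace(text.Trim(), @"\s{2,}", replacement);
    }

    public static void Main()
    {
        string text = "Hello \r\n  world";       // hypothetical extracted text
        Console.WriteLine(Collapse(text, ""));   // Helloworld
        Console.WriteLine(Collapse(text, " "));  // Hello world
    }
}
```

An empty replacement joins the neighbouring words, while a single space keeps them apart and counts as one character.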

Once that’s set up, let’s tweak the AllowHtmlRangeLength attribute we started with to use the new ConvertHTMLToCleanText method:

  // Requires: using System; using System.ComponentModel.DataAnnotations; using System.Globalization;
  class AllowHtmlRangeLength : ValidationAttribute
  {
    public int minHtmlLength { get; set; }
    public int maxHtmlLength { get; set; }

    public override bool IsValid(object value)
    {
      string property = (string)value ?? "";

      // Strip the markup once and measure only the remaining text.
      int length = HtmlToText.ConvertHTMLToCleanText(property).Length;

      return length >= minHtmlLength && length <= maxHtmlLength;
    }

    // The name argument (the field's display name) is unused because the
    // message template only has placeholders for the two bounds.
    public override string FormatErrorMessage(string name)
    {
      return String.Format(CultureInfo.CurrentCulture, ErrorMessageString, minHtmlLength, maxHtmlLength);
    }
  }

We override the FormatErrorMessage method so the error message can show the min and max values of the range. The min and max also need to be public properties so they can be set as named arguments on the AllowHtmlRangeLength attribute. So this is how it’s called now:

    [AllowHtml]
    [AllowHtmlRangeLength(ErrorMessage="Please have at least {0} to {1} characters.", minHtmlLength=3, maxHtmlLength=255)]
    public string Message { get; set; }
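
To make the placeholder substitution concrete, here is a small stand-alone sketch that runs the same String.Format call outside of MVC. The class name ErrorMessageDemo and its Format helper are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Globalization;

public static class ErrorMessageDemo
{
    // Mirrors the String.Format call made in FormatErrorMessage.
    public static string Format(string template, int min, int max)
    {
        return String.Format(CultureInfo.CurrentCulture, template, min, max);
    }

    public static void Main()
    {
        Console.WriteLine(Format("Please have at least {0} to {1} characters.", 3, 255));
        // prints: Please have at least 3 to 255 characters.
    }
}
```

{0} receives the minimum and {1} the maximum, so the message stays in sync with whatever bounds the attribute is given.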

So let’s see. Text that is too short throws an error, even though the HTML that bolds it, underlines it, and wraps it in a level 1 heading makes the raw string longer than the minimum.

Text within the range produces no error.

Text with 3000 characters fails.

Check out the source files.
