It is occasionally useful for server-side code, whether in View templates, Ajax request handlers, or Model classes, to search HTML markup, extract information from it, or even modify it. On the client this can be done with jQuery. On the server, the same operations are now possible using AngleSharp together with some utility methods in A4DN.Core.MVC.CMS.Infrastructure.Helpers.

// In controller or model:
var document = HtmlHelpers.am_GetParsedDocument(content);

// In view:
var document = Html.am_GetParsedDocument(content);

Both methods return an AngleSharp IHtmlDocument, which has many methods for querying and modifying the content. By default the parsed document is cached for 10 seconds, so repeated calls with the same content string do not re-parse it. An optional second argument can be passed to change the cache idle timeout. The HtmlHelpers version uses HttpContext.Current.Cache as the cache, while the Html version uses Html.ViewContext.HttpContext.Cache. In most scenarios these are equivalent.
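To give a feel for what can be done with the returned IHtmlDocument, here is a minimal sketch that calls AngleSharp directly instead of going through the helpers. It assumes AngleSharp 0.10 or later (older releases use different namespaces and a Parse method), and the ParsedDocumentSketch class is hypothetical. It does no caching, so it is not a reproduction of am_GetParsedDocument.

```csharp
using System;
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;

// Hypothetical class for illustration; not part of A4DN.
public static class ParsedDocumentSketch
{
    // Parses markup with AngleSharp directly. Assumption: the am_GetParsedDocument
    // helpers do something comparable and add caching on top.
    public static IHtmlDocument Parse(string content)
    {
        var parser = new HtmlParser();
        return parser.ParseDocument(content);
    }

    public static void Main()
    {
        var document = Parse("<h1 id=\"title\">Hello</h1><p class=\"intro\">First paragraph.</p>");

        // QuerySelector returns the first matching element, or null if nothing matches.
        var heading = document.QuerySelector("#title");
        Console.WriteLine(heading.TextContent);

        var intro = document.QuerySelector("p.intro");
        Console.WriteLine(intro.ClassName);
    }
}
```

Inside a view, the document returned by Html.am_GetParsedDocument(content) can be queried the same way.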

The rest of the methods will be shown using the Html.am_* syntax that's used inside views, but they can all be called using the HtmlHelpers.am_* syntax from controllers and models as well.

var elements = Html.am_GetElements(content, selector);

This is similar to jQuery's $("...") function and takes the same kind of selector, except for jQuery's additions to the standard CSS selector syntax. The return value is an IHtmlCollection<IElement>, which supports foreach and LINQ methods.
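As an illustration of working with an IHtmlCollection<IElement>, the sketch below collects link targets using LINQ and then prints them with foreach. It uses AngleSharp's QuerySelectorAll directly (assuming AngleSharp 0.10 or later) rather than am_GetElements. The ElementQuerySketch class and GetLinkTargets method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using AngleSharp.Dom;
using AngleSharp.Html.Parser;

// Hypothetical class for illustration; not part of A4DN.
public static class ElementQuerySketch
{
    // Returns the href of every element matching the selector,
    // skipping elements that have no href attribute.
    public static List<string> GetLinkTargets(string content, string selector)
    {
        var document = new HtmlParser().ParseDocument(content);
        IHtmlCollection<IElement> elements = document.QuerySelectorAll(selector);

        return elements
            .Select(e => e.GetAttribute("href"))   // null when the attribute is absent
            .Where(href => !string.IsNullOrEmpty(href))
            .ToList();
    }

    public static void Main()
    {
        var content =
            "<ul class=\"nav\">" +
            "<li><a href=\"/home\">Home</a></li>" +
            "<li><a href=\"/about\">About</a></li>" +
            "<li><a>No link</a></li>" +
            "</ul>";

        foreach (var href in GetLinkTargets(content, "ul.nav a"))
        {
            Console.WriteLine(href);
        }
    }
}
```

In a view, the collection returned by Html.am_GetElements(content, "ul.nav a") could be fed through the same Select and Where calls.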

var value = Html.am_GetDataAttributeValue(content, elementSelector, dataAttributeName);

This method finds the first element in the content that matches elementSelector, and returns the value of that element's data-{dataAttributeName} attribute. Looking up data attributes is a common use case for server-side parsing.
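The lookup described above can be sketched with AngleSharp alone, assuming AngleSharp 0.10 or later. The DataAttributeSketch class is hypothetical, and returning null when no element matches is this sketch's own choice; the source does not say what am_GetDataAttributeValue returns in that case.

```csharp
using System;
using AngleSharp.Html.Parser;

// Hypothetical class for illustration; not part of A4DN.
public static class DataAttributeSketch
{
    // Finds the first element matching elementSelector and returns the value of its
    // data-{dataAttributeName} attribute. Returns null when no element matches or
    // the attribute is absent (an assumption of this sketch).
    public static string GetDataAttributeValue(string content, string elementSelector, string dataAttributeName)
    {
        var document = new HtmlParser().ParseDocument(content);
        var element = document.QuerySelector(elementSelector);
        return element == null ? null : element.GetAttribute("data-" + dataAttributeName);
    }

    public static void Main()
    {
        var content = "<div class=\"widget\" data-widget-id=\"42\" data-mode=\"compact\"></div>";

        // Note that the "data-" prefix is not part of the name passed in.
        Console.WriteLine(GetDataAttributeValue(content, "div.widget", "widget-id")); // prints "42"
        Console.WriteLine(GetDataAttributeValue(content, "div.widget", "mode"));      // prints "compact"
    }
}
```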

var content = Html.am_RenderDocument(document);

If you use AngleSharp to make modifications to the DOM inside of document, am_RenderDocument() can be used to get an updated content string.
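A modify-then-render round trip might look like the following sketch, which adds a CSS class to every image and serializes the result. It uses AngleSharp directly (assuming 0.10 or later), and the ModifyAndRenderSketch class is hypothetical. It serializes only the contents of the body element; whether am_RenderDocument returns the whole document or just the body is not stated here, so treat that part as an assumption.

```csharp
using System;
using AngleSharp.Html.Parser;

// Hypothetical class for illustration; not part of A4DN.
public static class ModifyAndRenderSketch
{
    // Adds className to every <img> element and returns the updated markup.
    public static string AddClassToImages(string content, string className)
    {
        var document = new HtmlParser().ParseDocument(content);

        foreach (var image in document.QuerySelectorAll("img"))
        {
            image.ClassList.Add(className);
        }

        // The parser wraps fragments in <html><head></head><body>...</body></html>,
        // so this returns only the body contents (a choice made for this sketch).
        return document.Body.InnerHtml;
    }

    public static void Main()
    {
        var content = "<p>Logo: <img src=\"logo.png\"></p>";
        Console.WriteLine(AddClassToImages(content, "responsive"));
    }
}
```

With the helpers, the same change would be made on the document returned by am_GetParsedDocument and the result passed to am_RenderDocument(document).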