Version: 1.25

Clean HTML

Audience: Citizen Developers

Skill Prerequisites: Tokens

This action extracts the main content from raw HTML and returns it as clean HTML, discarding surrounding page elements such as navigation and footers.

Typical Use Cases

  • Clean up raw HTML content before processing or displaying it
  • Extract the main content of an HTML page for further analysis

Don't use it to

  • Parse non-HTML content
  • Modify the original HTML structure


Input Parameter Reference

Parameter | Description | Supports Tokens | Default | Required
Raw HTML | The raw HTML content to be cleaned. | Yes | empty string | Yes
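
Because Raw HTML supports tokens, the input does not have to be a literal string. The following is a minimal sketch rather than a verified configuration: it assumes that a token named PageHtml (a hypothetical name, not defined anywhere in this page) was filled by an earlier action in the workflow, and that tokens are referenced inside parameter values with the same square-bracket notation this page uses for [CleanedHtml].

```json
[
  {
    "Id": -1,
    "ActionErrorMessage": null,
    "ActionType": "Custom.Actions.CleanHtml",
    "Condition": null,
    "Description": "Clean HTML held in a token (PageHtml is a hypothetical token name)",
    "IsDeleted": false,
    "Parameters": {
      "RawHtml": "[PageHtml]",
      "StoreCleanHtml": "CleanedHtml"
    }
  }
]
```

If these assumptions hold, later actions could read the result through the [CleanedHtml] token.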

Output Parameter Reference

Parameter | Description
Store Clean HTML | The name of the token that receives the cleaned HTML content.

Examples

1. Clean Raw HTML

This example demonstrates how to clean raw HTML content and store the result in an output token.

[
  {
    "Id": -1,
    "ActionErrorMessage": null,
    "ActionType": "Custom.Actions.CleanHtml",
    "Condition": null,
    "Description": null,
    "IsDeleted": false,
    "Parameters": {
      "RawHtml": "<html><head><title>Sample Page</title></head><body><nav>Navigation</nav><article><h1>Title</h1><p>Content</p></article><footer>Footer</footer></body></html>",
      "StoreCleanHtml": "CleanedHtml"
    }
  }
]

After executing this action, the [CleanedHtml] token will contain the following cleaned HTML content:

<h1>Title</h1><p>Content</p>

You can import this JSON content into a workflow using the "Import Actions" button.