31 December 2023

Stripping or replacing HTML tags in a string is a common task in PHP. Removing tags from user-generated content reduces the risk of cross-site scripting (XSS) attacks, although it does not replace proper validation and output escaping of user input. It also helps with data processing, since HTML-formatted text usually has to be cleaned before analysis or further manipulation. Replacing tags lets you adjust markup to match specific styling requirements or bring content in line with a particular format or standard, and extracting plain text is useful where search engines or other consumers expect unformatted content. Whether to remove or replace tags depends on the requirements of the project. In the example snippet below, we replace <b> tags with <strong> tags.
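For the stripping case described above, PHP's built-in strip_tags() and htmlspecialchars() cover the two most common needs. The following is a minimal sketch, not a complete sanitizer; the $comment value is a made-up example input.

```php
<?php
// Hypothetical user input, used only for illustration.
$comment = '<p>Hello <b>world</b><script>alert(1)</script></p>';

// strip_tags() removes the tags but keeps the text between them,
// including the text that was inside <script>.
$plain = strip_tags($comment);
echo $plain, PHP_EOL; // Hello worldalert(1)

// htmlspecialchars() keeps the markup visible as text by escaping it,
// which is the usual choice when echoing untrusted input into a page.
$escaped = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');
echo $escaped, PHP_EOL;

// The second argument of strip_tags() lists tags to keep. Attributes on
// kept tags are not cleaned, so this alone is not an XSS defence.
$onlyBold = strip_tags($comment, '<b>');
echo $onlyBold, PHP_EOL; // Hello <b>world</b>alert(1)
```

The replacement case, swapping <b> for <strong>, needs a real parser and is handled with DOMDocument in the snippet that follows.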

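// Load the HTML as a DOMDocument. $html is assumed to hold the HTML fragment to process.
$dom = new DOMDocument();

// Collect warnings about possibly bad HTML syntax instead of suppressing them with @.
$previous = libxml_use_internal_errors(true);
$dom->loadHTML(
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"><html xmlns="http://www.w3.org/1999/xhtml"><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body>' .
    $html .
    '</body></html>'
);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// Replace HTML tags in the string. The node list is live, so copy it to an
// array first; replacing nodes while iterating the list would skip elements.
foreach (iterator_to_array($dom->getElementsByTagName('b')) as $boldElement) {
    $newElement = $dom->createElement('strong');
    // Move the children across so nested markup and special characters survive.
    while ($boldElement->firstChild) {
        $newElement->appendChild($boldElement->firstChild);
    }
    $boldElement->parentNode->replaceChild($newElement, $boldElement);
}

// Save the changed HTML. Serialize only the children of <body>, so the wrapper
// added above (including its <meta> tag) is left out of the result.
$result = '';
foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
    $result .= $dom->saveHTML($child);
}
$result = trim($result);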
Programming Language: PHP
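
The same steps can be wrapped in a reusable function. The sketch below is one possible arrangement, not part of the original snippet: the name replaceTag() and its parameters are hypothetical, the wrapper markup is shortened to a plain <html> shell, and attributes are copied to the new element as an extra assumption about what callers want.

```php
<?php
// Hypothetical helper: replaces every <$from> element in an HTML fragment with <$to>.
function replaceTag(string $html, string $from, string $to): string
{
    $dom = new DOMDocument();

    // Collect parser warnings for malformed input instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML(
        '<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body>'
        . $html . '</body></html>'
    );
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Copy the live node list to an array before modifying the tree.
    foreach (iterator_to_array($dom->getElementsByTagName($from)) as $old) {
        $new = $dom->createElement($to);
        // Carry over attributes such as class or id.
        foreach ($old->attributes as $attribute) {
            $new->setAttribute($attribute->nodeName, $attribute->nodeValue);
        }
        // Move (not copy) the children, keeping nested markup intact.
        while ($old->firstChild) {
            $new->appendChild($old->firstChild);
        }
        $old->parentNode->replaceChild($new, $old);
    }

    // Serialize only what sits inside <body>.
    $result = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $result .= $dom->saveHTML($child);
    }
    return trim($result);
}

// Usage with a made-up fragment containing nested markup and an entity.
$output = replaceTag('<p>Hello <b class="x">big <i>world</i></b> &amp; <b>more</b></p>', 'b', 'strong');
echo $output, PHP_EOL;
```

Moving the child nodes, rather than passing nodeValue to createElement(), is what keeps the nested <i> element and the escaped ampersand intact in the result.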