4 February 2024

This regex validates and extracts Roman numerals (I through MMMCMXCIX, that is, 1 to 3999) in a given text or input. It's particularly handy in applications that need to identify and work with Roman numerals, such as parsing historical documents, validating user input, or processing data that contains Roman numerals.

Explanation:

  • \b: Word boundary to ensure that the Roman numeral is a standalone word.
  • (?:M{0,3}): Non-capturing group for the thousands place - matches from 0 to 3 occurrences of the Roman numeral 'M'.
  • (CM|CD|D?C{0,3}): Matches combinations for hundreds place - CM, CD, or an optional D followed by 0 to 3 occurrences of 'C'.
  • (XC|XL|L?X{0,3}): Matches combinations for tens place - XC, XL, or an optional L followed by 0 to 3 occurrences of 'X'.
  • (IX|IV|V?I{0,3}): Matches combinations for ones place - IX, IV, or an optional V followed by 0 to 3 occurrences of 'I'.
  • \b: Word boundary to ensure the Roman numeral is a standalone word.
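
As a rough illustration of the breakdown above, the following Python sketch shows which part of a numeral each group captures. It assumes Python's re module in place of PCRE (the pattern uses only syntax that both engines share), and split_places is a hypothetical helper name introduced here for illustration.

```python
import re

# Same pattern as in this snippet; only the thousands group is non-capturing.
ROMAN = re.compile(
    r"\b(?:M{0,3})(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b"
)


def split_places(numeral):
    """Return (thousands, hundreds, tens, ones) or None if there is no full match.

    Hypothetical helper: the thousands part is recovered by slicing, because
    the M{0,3} group does not capture.
    """
    m = ROMAN.fullmatch(numeral)
    if m is None:
        return None
    hundreds, tens, ones = m.groups()
    thousands = numeral[:m.start(1)]  # everything before the hundreds group
    return thousands, hundreds, tens, ones


print(split_places("MCMXCIV"))  # ('M', 'CM', 'XC', 'IV')
print(split_places("XIV"))      # ('', '', 'X', 'IV')
```

A place that is absent from the numeral shows up as an empty string, since every group is allowed to match nothing.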

Pattern:

  \b(?:M{0,3})(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b
Programming Language: PCRE
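
A minimal sketch of the two uses mentioned earlier, extraction and validation, again assuming Python's re module rather than PCRE. The helper names find_numerals and is_numeral are hypothetical. One caveat worth handling: every part of the pattern is optional, so it can also produce a zero-length match at any word boundary, and the sketch filters those out.

```python
import re

ROMAN = re.compile(
    r"\b(?:M{0,3})(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})\b"
)


def find_numerals(text):
    """Extract Roman numerals from free text (hypothetical helper).

    finditer also yields zero-length matches at word boundaries,
    so only non-empty matches are kept.
    """
    return [m.group(0) for m in ROMAN.finditer(text) if m.group(0)]


def is_numeral(s):
    """Validate that the whole string is a non-empty Roman numeral."""
    return bool(s) and ROMAN.fullmatch(s) is not None


print(find_numerals("Chapter XIV was printed in MCMXCIV."))  # ['XIV', 'MCMXCIV']
print(is_numeral("XLII"))  # True
print(is_numeral("IIII"))  # False
```

The pattern is case-sensitive and cannot tell a numeral from an ordinary uppercase word that happens to be a valid one (for example "MIX"), so extraction from free text may need additional context checks.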