Monday, June 1, 2020

[VB][Resolved] vb.net regex exclude characters

The solution is to use a negative lookahead, written (?!...).
Input:
<a href="https://www.google.com">Google</a>
<a href="1.html">test</a>
<a href="2.html"></a>

First, here is the regular expression that selects every hyperlink (<a>) tag in the string:
<a.*?>.*?<\/a>

Matched examples :
<a href="https://www.google.com">Google</a>
<a href="1.html">test</a>
<a href="2.html"></a>
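
To try this in VB.NET, a minimal sketch could look like the following. The module name and the way the three input lines are joined into one string are assumptions made for illustration; only Regex.Matches from System.Text.RegularExpressions is relied on.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorDemo
    Sub Main()
        ' The three sample links from the Input section, one per line.
        Dim input As String = String.Join(Environment.NewLine, {
            "<a href=""https://www.google.com"">Google</a>",
            "<a href=""1.html"">test</a>",
            "<a href=""2.html""></a>"})

        ' The lazy quantifiers (.*?) stop each match at the first closing </a>.
        Dim pattern As String = "<a.*?>.*?<\/a>"

        ' Print every match found in the input.
        For Each m As Match In Regex.Matches(input, pattern)
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

With this input, the loop prints all three links, including the empty one.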

And this is the regular expression that selects only the hyperlink tags without content:
<a.*?><\/a>

Matched examples :
<a href="2.html"></a>

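By putting the "without content" pattern inside a negative lookahead (?!...), we can select the hyperlink tags with content and exclude the tags without content. The lookahead uses [^>]* instead of .*? so that it cannot run past the end of the opening tag:
(?!<a[^>]*><\/a>)<a.*?>.*?<\/a>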

Matched examples :
<a href="https://www.google.com">Google</a>
<a href="1.html">test</a>
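
The exclusion can be sketched in VB.NET as below, assuming it is written as a negative lookahead (?!...), which is the construct .NET provides for "not followed by". The module name and the joined input string are illustrative, not part of the original post.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LookaheadDemo
    Sub Main()
        ' The three sample links from the Input section, one per line.
        Dim input As String = String.Join(Environment.NewLine, {
            "<a href=""https://www.google.com"">Google</a>",
            "<a href=""1.html"">test</a>",
            "<a href=""2.html""></a>"})

        ' (?!<a[^>]*><\/a>) rejects any position where an empty link starts;
        ' the remainder is the general link pattern.
        Dim pattern As String = "(?!<a[^>]*><\/a>)<a.*?>.*?<\/a>"

        ' Print every match found in the input.
        For Each m As Match In Regex.Matches(input, pattern)
            Console.WriteLine(m.Value)
        Next
    End Sub
End Module
```

Here only the "Google" and "test" links are printed; the link to 2.html is skipped because the lookahead fails at its starting position.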



Remarks:

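Simpler pattern: we can use \w+ to select the hyperlink tags with content. Note that \w+ matches only letters, digits, and underscores, so link text containing spaces or punctuation is not matched.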
<a.*?>\w+<\/a>

Matched examples :
<a href="https://www.google.com">Google</a>
<a href="1.html">test</a>
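
A small sketch of where \w+ stops working is shown below. The third sample link and the [^<]+ variant are assumptions added for comparison; they do not come from the original input.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module WordPatternDemo
    Sub Main()
        ' The last sample is hypothetical; it is not part of the original input.
        Dim samples As String() = {
            "<a href=""1.html"">test</a>",
            "<a href=""2.html""></a>",
            "<a href=""3.html"">two words</a>"}

        ' Pattern from the remark above: link text made of word characters only.
        Dim wordOnly As String = "<a.*?>\w+<\/a>"
        ' Assumed alternative: any non-empty run of characters other than "<".
        Dim anyText As String = "<a.*?>[^<]+<\/a>"

        ' Test each sample on its own against both patterns.
        For Each s As String In samples
            Console.WriteLine("{0}  \w+: {1}  [^<]+: {2}",
                              s, Regex.IsMatch(s, wordOnly), Regex.IsMatch(s, anyText))
        Next
    End Sub
End Module
```

Both patterns accept "test" and reject the empty link, but only the [^<]+ variant accepts the link whose text contains a space.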

 
