Looked at what the Net has to offer, but no one has offered a solution that can be generalized. The problem is simple. There is a string of characters s which should produce a 'non-match', i.e. any string containing the substring s should not match this RE. Can any bright sparks produce this RE?
Simple example: porteus.org ends up in a blanket blacklist, so I wish to whitelist everything porteus except for strings with 'mchat', using an RE, as 'mchat' quite often throws the browser into a tight loop.
Would be grateful for an elegant solution.
Regular Expression Exclusion
-
- Full of knowledge
- Posts: 2564
- Joined: 25 Jun 2014, 15:21
- Distribution: 3.2.2 Cinnamon & KDE5
- Location: London
Linux porteus 4.4.0-porteus #3 SMP PREEMPT Sat Jan 23 07:01:55 UTC 2016 i686 AMD Sempron(tm) 140 Processor AuthenticAMD GNU/Linux
NVIDIA Corporation C61 [GeForce 6150SE nForce 430] (rev a2) MemTotal: 901760 kB MemFree: 66752 kB
- brokenman
- Site Admin
- Posts: 6105
- Joined: 27 Dec 2010, 03:50
- Distribution: Porteus v4 all desktops
- Location: Brazil
Re: Regular Expression Exclusion
Any particular reason you need a REGEX for this? Depending on the tool you are using there may be other more elegant ways. In any case with a REGEX you could use negative look ahead.
Code: Select all
(?!^mchat$)(^.*$)
How do i become super user?
Wear your underpants on the outside and put on a cape.
Re: Regular Expression Exclusion
brokenman wrote: Any particular reason you need a REGEX for this? Depending on the tool you are using there may be other more elegant ways.
Using Silent Block, an excellent Moz add-on from Japan. It has a blacklist, which is processed first, before the whitelist. All matches are REGEX. You have no idea the amount of rubbish that gets logged as blocked. Not having the luxury of 8 processors, I have to protect a single CPU from the onslaught of scripts, so had already blocked doubleclick.net long before it achieved NSA notoriety. 8) After instituting blocks, was amazed by the almost instantaneous nature of system responses. Final RE in my blacklist is now '\.js$'.
brokenman wrote: In any case with a REGEX you could use negative look ahead.
Code: Select all
(?!^mchat$)(^.*$)
Thanks, I'll try that.
Re: Regular Expression Exclusion
brokenman wrote: In any case with a REGEX you could use negative look ahead.
Code: Select all
(?!^mchat$)(^.*$)
In trying to understand the suggested RE, trawled the Net for 'look ahead'. Came across a cookbook recipe for excluding strings containing a certain substring, in this case 'invalid':
Code: Select all
^(?!.*invalid.*).*
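So, as memo to myself and anyone else:
Wisdom of the Web wrote: Negative look ahead is used to match something not followed by something else.
i.e. if something is not followed by something else we have a match.
^[^u] => Start of String followed by one character that is not 'u'. (Strictly this is a negated character class rather than a look ahead, which would be written ^(?!u), but for these two test strings the result is the same.)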
Code: Select all
usa: no match
eu: match
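To see how the character class and the look ahead compare, here is a minimal sketch in Python. Python's re module is my assumption for illustration (Silent Block itself takes JavaScript regexes), and the empty-string test case is added by me. For these two constructs I would expect the behaviour to be the same in both engines.

```python
import re

# Negated character class: must consume one character, and it may not be 'u'.
negated_class = re.compile(r"^[^u]")
# Negative look ahead: consumes nothing, only asserts the next character is not 'u'.
lookahead = re.compile(r"^(?!u)")

for text in ["usa", "eu", ""]:
    by_class = negated_class.search(text) is not None
    by_lookahead = lookahead.search(text) is not None
    print(f"{text!r:6} class={by_class} lookahead={by_lookahead}")
```

The two agree on 'usa' and 'eu'. They differ only on the empty string, where the character class has nothing to consume but the look ahead is still satisfied.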
Extrapolating: ^(?!.*mchat) => Start of String not followed by generalized string '.*mchat.*'
Code: Select all
No match: http://archive.linuxfromscratch.org/lfs-museumchat/2.3.1/LFS-BOOK-2.3.1-HTML/index.html
Match: http://media7.fast-torrent.ru/media/js/jquery-ui-1.10.3.custom1.min.js
Expression (?!^mchat$)(^.*$):
Code: Select all
Match: http://archive.linuxfromscratch.org/lfs-museumchat/2.3.1/LFS-BOOK-2.3.1-HTML/index.html
Match: http://media7.fast-torrent.ru/media/js/jquery-ui-1.10.3.custom1.min.js
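The difference between the two expressions can be reproduced with a short sketch. Python's re module is assumed here purely for illustration, and the bare string 'mchat' is an extra test case I added. It suggests that (?!^mchat$)(^.*$) only rejects a string that is exactly 'mchat', while ^(?!.*mchat) rejects any string containing it.

```python
import re

exact_only = re.compile(r"(?!^mchat$)(^.*$)")  # look ahead tests the whole string
substring = re.compile(r"^(?!.*mchat)")        # look ahead scans for 'mchat' anywhere

tests = [
    "http://archive.linuxfromscratch.org/lfs-museumchat/2.3.1/LFS-BOOK-2.3.1-HTML/index.html",
    "http://media7.fast-torrent.ru/media/js/jquery-ui-1.10.3.custom1.min.js",
    "mchat",  # added test case: the only string the first expression rejects
]

def matches(pattern, text):
    """True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None

for text in tests:
    print(f"exact_only={matches(exact_only, text)} "
          f"substring={matches(substring, text)} {text}")
```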
All tests done at http://www.regular-expressions.info/jav ... ample.html.
So, ^(?!.*mchat) is not very far from the cookbook example ^(?!.*invalid.*).*, but someone commenting there suggested that ^(?!invalid)(.(?!invalid))*$, although more complex, would lead to a simpler result (fewer matches). This defeats me.
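For what it is worth, the two 'invalid' expressions can be compared side by side. This is only a sketch with Python's re module and sample strings I made up. On these single-line samples they accept and reject the same strings. The second form repeats the look ahead after every character instead of doing it once at the start.

```python
import re

cookbook = re.compile(r"^(?!.*invalid.*).*")            # one look ahead at the start
tempered = re.compile(r"^(?!invalid)(.(?!invalid))*$")  # look ahead repeated per character

# Made-up sample strings, all single-line.
samples = ["valid input", "this is invalid", "invalid at start", "ends with invalid", ""]

for text in samples:
    a = cookbook.search(text) is not None
    b = tempered.search(text) is not None
    print(f"{text!r:22} cookbook={a} tempered={b}")
```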
IMHO ^(?!.*mchat) would just be one scan for 'mchat', which would decide the match.
Nonetheless it appears that the Regular Expression to exclude a string containing a substring s would be: ^(?!.*s)
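A possible way to generalize this for any s is sketched below in Python. The helper name excluding is hypothetical. The escaping step is my addition, needed only when s contains characters that are special in an RE, such as a dot.

```python
import re

def excluding(substring):
    """Build ^(?!.*s) for a literal substring s (hypothetical helper).

    Assumes single-line input such as a URL, since '.' does not
    cross newlines by default.
    """
    # re.escape makes characters like '.' in s match literally.
    return re.compile(r"^(?!.*" + re.escape(substring) + r")")

no_mchat = excluding("mchat")
print(no_mchat.search("http://forum.porteus.org/mchat/jquery_cookie_mini.js") is not None)
print(no_mchat.search("http://forum.porteus.org/styles/prosilver/template/forum_fn.js") is not None)
```

Without the escaping, an s such as 'a.b' would also exclude strings containing 'axb', which is probably not what is wanted.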
Real Life whitelist example: '.*porteus\.org(?!.*mchat)'
Code: Select all
No match: http://forum.porteus.org/mchat/jquery_cookie_mini.jsg/styles/prosilver/template/forum_fn.js
Match: http://forum.porteus.org/styles/prosilver/template/forum_fn.js
Match: http://forum.porteus.org/chat/jquery_cookie_mini.jshttp://forum.porteus.org/styles/prosilver/template/forum_fn.js
As opposed to 'porteus\.org(?!.*mchat)'
Code: Select all
No match: http://forum.porteus.org/mchat/jquery_cookie_mini.js
Match: http://forum.porteus.org/styles/prosilver/template/forum_fn.js
Match: http://forum.porteus.org/chat/jquery_cookie_mini.jshttp://forum.porteus.org/styles/prosilver/template/forum_fn.js
porteus.org
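The two whitelist variants can be checked with a small sketch, again assuming Python's re module for illustration. As far as I can tell, the leading '.*' changes only the text that is matched, not whether a match is found. The last URL is a made-up one, included because the look ahead only inspects what comes after 'porteus.org'.

```python
import re

with_prefix = re.compile(r".*porteus\.org(?!.*mchat)")
plain = re.compile(r"porteus\.org(?!.*mchat)")

urls = [
    "http://forum.porteus.org/mchat/jquery_cookie_mini.js",
    "http://forum.porteus.org/styles/prosilver/template/forum_fn.js",
    # Hypothetical URL: 'mchat' appears BEFORE 'porteus.org', so it is not excluded.
    "http://mchat.example.com/?u=forum.porteus.org",
]

for url in urls:
    m1 = with_prefix.search(url)
    m2 = plain.search(url)
    print(url)
    print("  with '.*':", m1.group(0) if m1 else "no match")
    print("  plain    :", m2.group(0) if m2 else "no match")
```

If 'mchat' must be excluded wherever it appears in the URL, the look ahead would have to be anchored at the start of the string instead, along the lines of ^(?!.*mchat).*porteus\.org.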
Epilog
Code: Select all
SilentBlock: /\.js$/i
blocked http://forum.porteus.org/mchat/jquery-1.5.0.min.js
SilentBlock: /\.js$/i
blocked http://forum.porteus.org/styles/prosilver/template/forum_fn.js
but unblocked by /porteus\.org(?!.*mchat)/i
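The log above can be modelled roughly as follows. This is only a sketch of the blacklist-then-whitelist behaviour as I understand it, not Silent Block's actual code. The names BLACKLIST, WHITELIST and is_blocked are hypothetical.

```python
import re

# Assumed model: a URL is blocked if a blacklist RE matches
# and no whitelist RE unblocks it afterwards.
BLACKLIST = [re.compile(r"\.js$", re.IGNORECASE)]
WHITELIST = [re.compile(r"porteus\.org(?!.*mchat)", re.IGNORECASE)]

def is_blocked(url):
    blacklisted = any(p.search(url) for p in BLACKLIST)
    whitelisted = any(p.search(url) for p in WHITELIST)
    return blacklisted and not whitelisted

for url in [
    "http://forum.porteus.org/mchat/jquery-1.5.0.min.js",
    "http://forum.porteus.org/styles/prosilver/template/forum_fn.js",
]:
    print("blocked  " if is_blocked(url) else "unblocked", url)
```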