Daniel Cazzulino's Blog : .NET Regex bug causes application to hang!!!

.NET Regex bug causes application to hang!!!

Today I reported a bug on the new Microsoft Connect site (the replacement for Product Feedback), which I had had in the pipeline for quite a while.

Turns out that if you have a fairly complex regex (like the ones typically used for parsing small custom languages), you can easily kill your application, because the regex engine will hang completely while evaluating even fairly small strings. Here's my repro regex:

    static readonly Regex ReferenceExpression = new Regex(@"
            # Matches invalid empty brackets #
            (?<empty>\$\(\)) |
            # Matches a valid argument reference with potential method calls and indexer accesses #
            (?<reference>\$\(([^\(]+([\(\[][^\)\]]*[\)\]])?)+\)) |
            # Matches opened brackets that are not properly closed #
            (?<opened>\$\([^\)\$\(]*(?!\)))",
        RegexOptions.Compiled | RegexOptions.Multiline | RegexOptions.IgnorePatternWhitespace);

and this is the string I'm parsing (part of the Mobile Client Software Factory guidance package, which uses this kind of pseudo-MSBuild syntax):

    static void Main(string[] args)
    {
        string hangString = @"DisconnectedAgents\$(CurrentItem.Name)\$(ProxyType.Name)AgentCallback.cs\$(ProxyType.Name)AgentCallbackBase.cs";

        // The call below is the one that hangs.
        Console.WriteLine(ReferenceExpression.IsMatch(hangString));
    }

Any site that lets users evaluate arbitrary regex patterns with the .NET engine should be careful, as this can easily bring the site down.
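One possible guard, sketched here under the assumption that you are on .NET Framework 4.5 or later (newer than the runtime this post was written against): the `Regex` constructor there accepts a match timeout, and a match that exceeds it throws `RegexMatchTimeoutException`. The `SafeRegex` class and `TryIsMatch` helper are hypothetical names for illustration, not part of the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SafeRegex
{
    // Hypothetical helper: evaluates a pattern with a time budget.
    // Returns true/false for a completed match, or null if the engine
    // did not finish within the timeout.
    // An invalid pattern still throws ArgumentException to the caller.
    public static bool? TryIsMatch(string pattern, string input, TimeSpan timeout)
    {
        try
        {
            // The three-argument constructor (pattern, options, matchTimeout)
            // is available from .NET Framework 4.5 onwards.
            var regex = new Regex(pattern, RegexOptions.None, timeout);
            return regex.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            // The engine gave up after the timeout instead of spinning indefinitely.
            return null;
        }
    }
}
```

A caller would treat `null` as "pattern too expensive for this input" and reject the request, for example `SafeRegex.TryIsMatch(userPattern, userInput, TimeSpan.FromSeconds(1))`. The one-second budget is an arbitrary choice for the sketch.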

Please vote for the bug if you also think it's critical.

posted on Friday, June 09, 2006 8:21 AM by kzu

# re: .NET Regex bug causes application to hang!!! @ Sunday, June 11, 2006 11:49 PM

Are you sure it is hung, and not just stuck in O(e^n) backtracking?

Having one quantifier over a group that contains another quantifier can easily lead to this. The answer can be to use a minimal match ("*?" rather than "*") or otherwise to refactor the regex.

See the usual references for all the details.

Richard
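
Richard's point about a quantifier applied to an already quantified group can be illustrated with a reduced form of the `reference` alternative from the post. This is only a sketch: the two patterns below are simplifications (the optional method-call/indexer part is dropped), the `BacktrackingDemo` class and its members are hypothetical names, and the match timeout assumes .NET Framework 4.5 or later. `Nested` and `Flat` accept the same strings, but `Nested` gives a backtracking engine many more ways to split the same run of characters before it can report a failure.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class BacktrackingDemo
{
    // A quantified group whose body is itself quantified: ( ... + )+
    public const string Nested = @"\$\(([^\(]+)+\)";

    // Same language with a single quantifier, so there is only one way
    // to consume a run of non-"(" characters.
    public const string Flat = @"\$\([^\(]+\)";

    // Returns how long IsMatch took, or null if it hit the timeout.
    public static TimeSpan? TimeMatch(string pattern, string input, TimeSpan timeout)
    {
        var regex = new Regex(pattern, RegexOptions.None, timeout);
        var watch = Stopwatch.StartNew();
        try
        {
            regex.IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
        return watch.Elapsed;
    }

    // Formats one timing result for display.
    public static string Describe(string pattern, string input, TimeSpan timeout)
    {
        TimeSpan? elapsed = TimeMatch(pattern, input, timeout);
        string result = elapsed.HasValue
            ? elapsed.Value.TotalMilliseconds.ToString("F1") + " ms"
            : "timed out";
        return pattern + " -> " + result;
    }
}
```

To try it, pass an input whose `$(` is never closed, so the match has to fail, such as `"$(" + new string('a', 30)`, and call `BacktrackingDemo.Describe` once with `Flat` and once with `Nested` using a timeout of a couple of seconds. On a backtracking engine the nested form would be expected to take far longer or time out, though the exact behaviour depends on the runtime version and its optimizations.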