Algorithm to check if any of a set of strings matches the input text? -
so, have set of multiple string , want optimal algorithm check if of strings can found in input text. important thing we're not interested in finding of matching strings, suffices find one.
i've found this: http://en.wikipedia.org/wiki/aho%e2%80%93corasick_string_matching_algorithm seems good, finds matching patterns. there way faster if don't need information?
of course, can terminate aho corasick algorithm when finds first matching pattern, perhars there approach faster type of problem?
you can't make much lower, because complexity
the complexity of algorithm linear in length of patterns plus length of searched text plus number of output matches.
and obviously, must go on each of patterns , text.
the thing reduce number of output matches. simple. in original algorithm, once accepting state met, outputs dictionary items matching it. variation, mark accepting state accepting , output "accept" symbol", or choose 1 dictionary item arbitrarily when constructing fsm , output it.
Comments
Post a Comment