How to efficiently match hundreds of thousands of substrings in one string using Elasticsearch
My problem is simple: I have a database containing 400,000 substrings (movie and TV show titles). I'd like to match these titles in a message such as:
i love game of thrones , suits, spotlight awesome movie.
What I need is to match game of thrones, suits, and spotlight in that string.
I tried sending all the titles to wit.ai, but it seems it can't handle 100,000 substrings.
I'm wondering if Elasticsearch could do the job?
If that's a common problem, sorry; just point me in the right direction to search.
Thanks!
One of the best algorithms for finding strings from a dictionary in a text is the Aho-Corasick one:
It is a dictionary-matching algorithm that locates elements of a finite set of strings (the "dictionary") within an input text. It matches all strings simultaneously. The complexity of the algorithm is linear in the length of the strings plus the length of the searched text plus the number of output matches.
But I wonder whether the database engine provides any facility for this kind of search... maybe it can, I don't know.
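To make the idea concrete, here is a minimal pure-Python sketch of Aho-Corasick applied to the example message from the question. The function names `build_automaton` and `find_titles` are made up for illustration, and the sketch assumes titles and message are already lowercased. It is not tuned for 400,000 titles; for that size a dedicated library would probably be a better fit.

```python
from collections import deque


def build_automaton(titles):
    """Build a trie over the titles, then add failure links (hypothetical helper)."""
    goto = [{}]   # goto[node][char] -> child node
    fail = [0]    # fail[node] -> node for the longest proper suffix that is in the trie
    out = [[]]    # out[node] -> titles that end at this node

    # Phase 1: insert every title into the trie.
    for title in titles:
        node = 0
        for ch in title:
            nxt = goto[node].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto[node][ch] = nxt
                goto.append({})
                fail.append(0)
                out.append([])
            node = nxt
        out[node].append(title)

    # Phase 2: breadth-first pass to compute failure links.
    queue = deque(goto[0].values())  # depth-1 nodes keep fail = 0 (the root)
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit the titles that end at the failure target.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_titles(text, automaton):
    """Return (start_index, title) for every occurrence of any title in text."""
    goto, fail, out = automaton
    node = 0
    matches = []
    for i, ch in enumerate(text):
        # Follow failure links until some state accepts this character.
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for title in out[node]:
            matches.append((i - len(title) + 1, title))
    return matches


automaton = build_automaton(["game of thrones", "suits", "spotlight"])
message = "i love game of thrones , suits, spotlight awesome movie."
for start, title in find_titles(message, automaton):
    print(start, title)
```

Note that this matches raw substrings, so "suits" would also be found inside "pursuits". If only whole-word matches are wanted, one option is to check that the characters just before and after each match are not letters.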