find_similar_str
- pyhelpers.text.find_similar_str(input_str, lookup_list, n=1, ignore_punctuation=True, engine='difflib', **kwargs)
Finds n strings that are similar to input_str from a sequence of candidates.
- Parameters:
input_str (str) – The string to find similar matches for.
lookup_list (Iterable) – A sequence of strings to search for matches.
n (int | None) – Number of similar strings to return; defaults to 1; when n=None, the function returns the entire lookup_list sorted by similarity in descending order.
ignore_punctuation (bool) – Whether to ignore punctuation in the comparison; defaults to True.
engine (str | Callable) –
Method for finding similarities; options include:
'difflib' (default), which uses difflib.get_close_matches().
'rapidfuzz' (or 'fuzz'), which uses rapidfuzz.fuzz.QRatio().
kwargs – [Optional] Additional parameters for the chosen engine; for instance, cutoff for 'difflib' and score_cutoff for 'rapidfuzz'.
- Returns:
A string or list of strings similar to input_str, depending on n and the engine used.
- Return type:
str | list | None
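To make the role of the 'difflib' engine and its cutoff keyword more concrete, the following is a minimal sketch built only on the standard library's difflib.get_close_matches(). The helper name closest_matches is hypothetical, and lower-casing both sides before comparing is an assumption made for this illustration; it is not taken from the pyhelpers source and always returns a list, unlike find_similar_str.

```python
import difflib

lookup_list = ['Anglia', 'East Coast', 'East Midlands', 'North and East',
               'London North Western', 'Scotland', 'South East', 'Wales',
               'Wessex', 'Western']


def closest_matches(word, candidates, n=1, cutoff=0.6):
    """Hypothetical helper: case-insensitive wrapper around get_close_matches."""
    # Map each lower-cased candidate back to its original spelling
    # (assumption: comparison is done on lower-cased text).
    lowered = {c.lower(): c for c in candidates}
    hits = difflib.get_close_matches(
        word.lower(), list(lowered), n=n, cutoff=cutoff)
    return [lowered[h] for h in hits]


print(closest_matches('angle', lookup_list))            # ['Anglia']
print(closest_matches('x', lookup_list))                # [] (nothing reaches 0.6)
print(closest_matches('x', lookup_list, cutoff=0.25))   # ['Wessex']
```

Lowering cutoff admits weaker matches, which is why 'x' only finds 'Wessex' once the threshold drops to 0.25.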
Note
Examples:
>>> from pyhelpers.text import find_similar_str
>>> lookup_list = ['Anglia',
...                'East Coast',
...                'East Midlands',
...                'North and East',
...                'London North Western',
...                'Scotland',
...                'South East',
...                'Wales',
...                'Wessex',
...                'Western']
>>> find_similar_str('angle', lookup_list)
'Anglia'
>>> find_similar_str('angle', lookup_list, n=2)
['Anglia', 'Wales']
>>> find_similar_str('angle', lookup_list, engine='fuzz')
'Anglia'
>>> find_similar_str('angle', lookup_list, n=2, engine='fuzz')
['Anglia', 'Wales']
>>> find_similar_str('x', lookup_list) is None
True
>>> find_similar_str('x', lookup_list, cutoff=0.25)
'Wessex'
>>> find_similar_str('x', lookup_list, n=2, cutoff=0.25)
'Wessex'
>>> find_similar_str('x', lookup_list, engine='fuzz')
'Wessex'
>>> find_similar_str('x', lookup_list, n=2, engine='fuzz')
['Wessex', 'Western']
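The n=None behaviour (returning the whole lookup_list ordered by similarity) and the ignore_punctuation option can be approximated with difflib.SequenceMatcher. This is only a sketch under stated assumptions: the names strip_punctuation and rank_by_similarity are hypothetical, and treating "ignore punctuation" as "delete ASCII punctuation, then lower-case" is an assumption rather than the documented implementation.

```python
import difflib
import string

lookup_list = ['Anglia', 'East Coast', 'East Midlands', 'North and East',
               'London North Western', 'Scotland', 'South East', 'Wales',
               'Wessex', 'Western']


def strip_punctuation(text):
    """Hypothetical helper: drop ASCII punctuation characters from text."""
    return text.translate(str.maketrans('', '', string.punctuation))


def rank_by_similarity(word, candidates):
    """Hypothetical helper: return all candidates, most similar first."""
    target = strip_punctuation(word).lower()

    def score(candidate):
        # Similarity ratio in [0, 1] on the normalised strings.
        normalised = strip_punctuation(candidate).lower()
        return difflib.SequenceMatcher(None, normalised, target).ratio()

    return sorted(candidates, key=score, reverse=True)


ranked = rank_by_similarity('angle', lookup_list)
print(ranked[0])    # 'Anglia'
print(len(ranked))  # every candidate is kept, only the order changes
```

Because the full list is returned, no cutoff applies here; callers would slice the result if they only need the top few entries.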