text
Manipulation of textual data.
Textual data (pre)processing
Removes punctuation from textual data.
Generates an acronym (in capital letters) from textual data.
Extracts words from a string by splitting it at occurrences of uppercase letters.
Converts a number written in English words into its equivalent numerical value represented in Arabic numerals.
Counts the occurrences of each word in the given text.
Calculates Inverse Document Frequency (IDF) for a sequence of textual documents.
Calculates TF-IDF (Term Frequency-Inverse Document Frequency) for the given textual documents.
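To make several of the operations in this section concrete, here is a minimal standard-library sketch. All function names (`strip_punctuation`, `make_acronym`, `split_at_uppercase`, `word_counts`, `inverse_document_frequency`, `tf_idf`) are hypothetical and chosen for illustration; they are not this module's API. The IDF uses the plain `log(N / df)` definition, which is an assumption, since the module may apply a smoothed variant.

```python
import math
import re
import string
from collections import Counter


def strip_punctuation(text):
    """Replace ASCII punctuation with spaces, then collapse whitespace."""
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    return " ".join(text.translate(table).split())


def make_acronym(text):
    """Join the capitalised first letter of every word."""
    return "".join(word[0].upper() for word in strip_punctuation(text).split())


def split_at_uppercase(text):
    """Split a string so that each uppercase letter starts a new piece."""
    return re.findall(r"[A-Z][^A-Z]*|[^A-Z]+", text)


def word_counts(text):
    """Count case-insensitive word occurrences after removing punctuation."""
    return Counter(strip_punctuation(text).lower().split())


def inverse_document_frequency(documents):
    """Unsmoothed IDF: log(N / df), where df is the number of documents containing the term."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(word_counts(doc)))  # count each term once per document
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


def tf_idf(documents):
    """Raw term count multiplied by IDF, one dictionary per document."""
    idf = inverse_document_frequency(documents)
    return [
        {term: count * idf[term] for term, count in word_counts(doc).items()}
        for doc in documents
    ]


docs = ["The cat sat.", "The dog barked!"]
print(make_acronym("Natural Language Processing"))
print(split_at_uppercase("ExtractWordsFromString"))
print(tf_idf(docs)[0])
```

A term that appears in every document (here, "the") gets an IDF of zero under this definition, so its TF-IDF score vanishes; smoothed variants avoid that at the cost of a slightly different scale.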
Textual data similarity
Computes the Euclidean distance between two sentences.
Calculates the cosine similarity between two sentences.
Finds all strings (in a sequence) that match a given string or regex pattern.
Finds |
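
As an illustration of the similarity measures listed in this section, the sketch below represents each sentence as a bag-of-words count vector and compares the vectors. The function names (`bag_of_words`, `euclidean_distance`, `cosine_similarity`, `find_matches`) and the tokenisation rule are assumptions made for this example, not this module's API; the module may vectorise sentences differently.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercase the sentence and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def euclidean_distance(sentence_a, sentence_b):
    """Euclidean distance between the two word-count vectors."""
    a, b = bag_of_words(sentence_a), bag_of_words(sentence_b)
    vocabulary = a.keys() | b.keys()
    return math.sqrt(sum((a[term] - b[term]) ** 2 for term in vocabulary))


def cosine_similarity(sentence_a, sentence_b):
    """Cosine of the angle between the two word-count vectors (0.0 if either is empty)."""
    a, b = bag_of_words(sentence_a), bag_of_words(sentence_b)
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_matches(pattern, candidates, ignore_case=True):
    """Return every candidate string in which the regex pattern occurs."""
    regex = re.compile(pattern, re.IGNORECASE if ignore_case else 0)
    return [candidate for candidate in candidates if regex.search(candidate)]


print(euclidean_distance("the cat sat", "the cat ran"))
print(cosine_similarity("the cat sat", "the cat ran"))
print(find_matches(r"^data", ["Database", "metadata", "data frame"]))
```

To match a literal string rather than a regex with this sketch, pass it through `re.escape` first so that characters such as `.` or `(` are not interpreted as pattern syntax.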