SLiCK: Exploiting Subsequences for Length-Constrained Keyword Spotting
User-defined keyword spotting on a resource-constrained edge device is challenging. However, keywords are often bounded by a maximum keyword length, a property that prior work has largely under-leveraged. Our analysis of the keyword-length distribution shows that user-defined keyword spotting can be treated as a length-constrained problem, eliminating the need for aggregation over variable text length. This leads to our proposed method for efficient keyword spotting, SLiCK (exploiting Subsequences for Length-Constrained Keyword spotting). We further introduce a subsequence-level matching scheme to learn audio-text relations at a finer granularity, thus distinguishing similar-sounding keywords more effectively through enhanced context. In SLiCK, the model is trained with a multi-task learning approach using two modules: Matcher (an utterance-level matching task and a novel subsequence-level matching task) and Encoder (a phoneme recognition task). The proposed method improves the baseline results on the Libriphrase hard dataset, increasing AUC from 88.52 to 94.9 and reducing EER from 18.82 to 11.1.
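The two reported metrics, AUC (area under the ROC curve) and EER (equal error rate), can both be derived from per-pair match scores. The text above does not describe the evaluation code, so the following is only a minimal sketch of the standard definitions. It assumes each audio-keyword pair has a binary label (1 for a true match) and a scalar score where higher means "more likely a match". The function names `roc_points` and `auc_and_eer` and the toy data are hypothetical.

```python
import numpy as np

def roc_points(labels, scores):
    """Sweep a threshold over all distinct scores; return (FPR, TPR) arrays."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")   # highest score first
    labels, scores = labels[order], scores[order]
    tp = np.cumsum(labels)                       # true positives accepted so far
    fp = np.cumsum(~labels)                      # false positives accepted so far
    # Keep only the last index of each run of equal scores (one threshold per value).
    distinct = np.r_[np.diff(scores) != 0, True]
    tpr = np.r_[0.0, tp[distinct] / labels.sum()]
    fpr = np.r_[0.0, fp[distinct] / (~labels).sum()]
    return fpr, tpr

def auc_and_eer(labels, scores):
    """Return (AUC, EER) as fractions in [0, 1]."""
    fpr, tpr = roc_points(labels, scores)
    # Trapezoidal area under the ROC curve.
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
    # EER: operating point where false-positive and false-negative rates are closest.
    fnr = 1.0 - tpr
    i = int(np.argmin(np.abs(fpr - fnr)))
    eer = float((fpr[i] + fnr[i]) / 2.0)
    return auc, eer

# Toy example (made-up scores, not results from the paper).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
auc, eer = auc_and_eer(toy_labels, toy_scores)
print(f"AUC = {auc:.3f}, EER = {eer:.3f}")  # AUC = 0.889, EER = 0.333
```

The functions return fractions; multiplying by 100 gives percentages on the same scale as the figures quoted above. The EER here is taken at the nearest threshold on the empirical ROC curve, without interpolation, so it is approximate on small datasets.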