
A new voice of wisdom

A semantic prediction system and supporting hardware, based on a multimodal large model, that encourage and help aphasia patients achieve more pleasant communication

  • Product rendering
  • Interactive process
  • Hardware interaction mode
  • Industrial design
  • Product family

What it does

"Zhiyi Xinsing" is an intelligent alternative or auxiliary communication agent device based on a large language model, which can motivate and help patients with aphasia after stroke to actively express their intentions.


Your inspiration

About a third of stroke survivors develop aphasia: a clot blocks blood flow to the brain regions associated with language, leaving the patient's intellect unchanged but greatly reducing the ability to listen, speak, read, and write. Even patients who have otherwise recovered well often cannot arrange words in the correct logical order, which blocks communication and further undermines their overall physical and mental health. Many clinical rehabilitation therapies now exist, but they still have significant shortcomings...


How it works

We designed both the hardware device used by patients and the supporting software system. The hardware integrates a camera, a voice recognition module, an IMU module, and other components, so users can provide input by shooting video, speaking, or moving, while receiving feedback on the screen. When patients want to record, they pick up the device from the table with one hand and use the IMU and infrared modules for shake-based selection and confirmation. Once recording begins, the camera at the bottom of the device captures the scene in front of them, recognizes their gestures, and records their voice. Afterwards, the AI analyzes the information in the video and audio and proposes basic keywords plus associative keywords, which the patient filters on the device. Finally, the large model connects the confirmed keywords into the complex sentence the patient wants to express.
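The flow above amounts to a three-stage pipeline: capture multimodal input, propose keywords for the patient to filter, then compose a sentence from the confirmed keywords. A minimal sketch of that logic follows; every name and the stub model calls are illustrative assumptions, since the entry does not specify the models or APIs actually used:

```python
# Hypothetical sketch of the recording-to-sentence pipeline described above.
# The "model" functions are stand-ins for the multimodal model and LLM calls.

from dataclasses import dataclass, field

@dataclass
class Recording:
    audio_transcript: str                              # speech heard during recording
    scene_objects: list = field(default_factory=list)  # objects the camera saw
    gestures: list = field(default_factory=list)       # recognized gestures

def propose_keywords(rec: Recording) -> list[str]:
    """Stand-in for the multimodal model: merge what was seen, heard, and
    gestured into basic keywords, then append associative candidates."""
    basic = rec.scene_objects + rec.audio_transcript.split() + rec.gestures
    associative = [f"{w}-related" for w in basic[:2]]  # placeholder association
    return basic + associative

def compose_sentence(selected: list[str]) -> str:
    """Stand-in for the LLM call that connects the patient-confirmed
    keywords into the complex sentence they want to express."""
    return "I would like the " + " and ".join(selected) + "."

rec = Recording(audio_transcript="water", scene_objects=["cup"], gestures=["pointing"])
candidates = propose_keywords(rec)    # shown on screen for the patient to filter
sentence = compose_sentence(["cup", "water"])
print(candidates)  # ['cup', 'water', 'pointing', 'cup-related', 'water-related']
print(sentence)    # I would like the cup and water.
```

In the real device, the candidate list would be paged on the screen and the filtering step would use the shake-selection interaction described in the design process below.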


Design process

After confirming the device's basic functions (recording and keyword filtering), we asked: beyond a touch screen, are there interaction methods better suited to post-stroke aphasia patients? Among the touch screen, gesture recognition, and a gyroscope (IMU module), we chose the IMU: patients with post-stroke aphasia often have unilateral hemiplegia and reduced ability to control their bodies precisely, while IMU-based interaction offers a degree of operating tolerance, adapts to users in different conditions, and is easy to manufacture. We integrated the camera, voice recognition module, IMU module, infrared sensor module, screen, and vibration motor into the device, and selected an STM32 as the development board for the initial exploration stage, so the device supports multimodal input and output. For the product's form, we compared a mobile app, a neck-worn device, and other formats, and settled on a handheld product: it matches patients' behavior and usage, stimulates their subjective initiative more than an app would, and its interaction model is consistent with existing cognition, keeping the learning cost low.
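To illustrate the "operating tolerance" that motivated the IMU choice: shake selection only requires the device's tilt angle to land inside a wide window, not a precise touch target. The sketch below shows one way such logic could work, written in Python for readability (the actual device runs on an STM32, and all thresholds here are assumed values, not taken from the entry):

```python
import math

# Assumed tolerance windows; real values would be tuned with patients.
TILT_LEFT_DEG = -15.0    # tilting past this moves the highlight left
TILT_RIGHT_DEG = 15.0    # tilting past this moves the highlight right

def tilt_angle(ay: float, az: float) -> float:
    """Roll angle in degrees, derived from raw accelerometer axes."""
    return math.degrees(math.atan2(ay, az))

def step_selection(angle: float, index: int, n_items: int) -> int:
    """Move the highlighted keyword based on the tilt window. The wide
    dead zone between the thresholds forgives imprecise one-handed motion."""
    if angle <= TILT_LEFT_DEG:
        return max(0, index - 1)
    if angle >= TILT_RIGHT_DEG:
        return min(n_items - 1, index + 1)
    return index  # inside the dead zone: no change

# Example: a gentle right tilt (ay = 0.3 g, az = 0.9 g) is about 18 degrees,
# enough to move the highlight one keyword to the right. Confirmation could
# then be a short hold, acknowledged through the vibration motor.
angle = tilt_angle(0.3, 0.9)
print(round(angle, 1), step_selection(angle, index=1, n_items=4))  # 18.4 2
```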


How it is different

(1) Multimodal interaction: it improves the accuracy with which the algorithm identifies and predicts the patient's intention (one possible fusion step is sketched below); it also covers more users and allows personalized, graded rehabilitation difficulty for patients with different conditions. (2) Interaction logic developed specifically for post-stroke aphasia patients: one-handed interaction via the IMU module suits both the users and the usage scenarios. (3) Attention to communication problems in the home setting and after the formal rehabilitation period, inferring patients' intentions by combining language rehabilitation therapy with a large model: even in the home rehabilitation stage, where further recovery is difficult, we can still convey patients' intentions, stimulate their willingness to express themselves, and make communication easier.
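One way to read point (1): each modality scores the same candidate keywords independently, and combining those scores makes the intent prediction more robust than any single channel. A hedged sketch of such a late-fusion step follows; the weighting scheme and numbers are assumptions for illustration, not the system's actual method:

```python
# Illustrative late fusion of per-modality keyword confidences.
# Weights are assumed; a real system might learn them from user data.

MODALITY_WEIGHTS = {"vision": 0.4, "speech": 0.4, "gesture": 0.2}

def fuse_scores(per_modality: dict[str, dict[str, float]]) -> dict[str, float]:
    """Weighted sum of each modality's confidence per candidate keyword.
    A keyword supported by several channels outranks one seen by only one."""
    fused: dict[str, float] = {}
    for modality, scores in per_modality.items():
        w = MODALITY_WEIGHTS.get(modality, 0.0)
        for keyword, score in scores.items():
            fused[keyword] = fused.get(keyword, 0.0) + w * score
    return dict(sorted(fused.items(), key=lambda kv: -kv[1]))

scores = {
    "vision":  {"cup": 0.9, "table": 0.5},
    "speech":  {"water": 0.8, "cup": 0.4},
    "gesture": {"drink": 0.7},
}
print(fuse_scores(scores))
# {'cup': 0.52, 'water': 0.32, 'table': 0.2, 'drink': 0.14}
# "cup" ranks highest because two modalities agree on it.
```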


Future plans

Goal 1: Expand the scope of user research, focus on the user group for which the hardware and system perform best, and explore the product's real-world applications.
Goal 2: Collect behavioral data from real users, then design and train the multimodal large-model framework so that the software's output becomes more accurate and supports personalized expression for different users.
Goal 3: Further optimize the hardware and industrial design to make the product more portable, so it can become a communication partner for its users.

