Face Recognition-Based Automatic Hospital Admission with SMS Alerts


When a person injured in an accident is brought to the hospital, several official formalities (e.g., filling in the admission form) must be completed before treatment can start. In severe cases, these formalities can delay the treatment, which could be fatal to the patient. An automated system that fills these forms with the help of face recognition could considerably cut down the delays. However, in some cases, injuries and blood on the face cause facial recognition to fail. To overcome this problem, we propose a facial vector-based algorithm. In the current work, we also demonstrate automatic sending of SMS to the concerned authorities (police) and to the relatives of the patient using GSM modules. The patient’s information was retrieved from centralized databases of different hospitals that are linked through the internet. We tested the algorithm on more than 213K images from different databases such as celebA, LFW, and UCFI. The maximum accuracy of our system was 98.23%. As a proof of concept, we tested the system on 51 real-time patient images and found an accuracy of 94.11%. This automated form filling not only reduced the delay in hospital admission but also helped in treatment, because of the auto-filled medical history.


When a patient who has had an accident is brought to the hospital, it is critical to begin the treatment as soon as possible [1]. Any delay in treatment may cause the patient’s death. Informing the police is important to make sure that the accident case is genuine and that there is no conspiracy. According to the Global Status Report on Road Safety 2018 released by the World Health Organization (WHO), deaths due to road accidents have reached 1.35 million per year [2]. That is, almost 3699 people die in road crashes each day. Road accidents are the eighth leading cause of death globally.

Fig. 1

System implementation diagram. A web camera is used to capture the video of the patient on the stretcher. We record this video only for patients who enter through a special-case entrance of the hospital. The special-case entrance is a dark box-like structure, placed at the hospital entrance, that avoids the effects of external light. The motion sensor attached to the special-case entrance makes it possible to capture a video that is time-synchronized with the stretcher entering the hospital

The severity of road accidents is much greater in highly populated countries like India. According to recent road accident statistics, almost 16 deaths happen per hour in India [3]. The high death rate in India is due to delays in starting treatment because of the various formalities that are mandatory before treating accident patients. These formalities include filling in the admission form and sometimes informing the nearest police station. Informing the police is important to make sure that the accident case is genuine and that there is no conspiracy.

One solution that may reduce the time taken to complete the form-filling formalities is auto-filling of the form. Automatic form filling can be achieved by recognizing the identity of the patient. Various studies involve the use of image processing in identity recognition, the most popular being biometric scans (fingerprint, iris, etc.). However, these scanners are generally costly. Furthermore, the patient’s eyes are closed in most such cases, making iris capture difficult. A simple, low-cost alternative is face recognition using a simple webcam. Face recognition relies on facial features (skin color, moles on the face, etc.). Facial recognition becomes difficult if the face is damaged during the accident.

Various types of accidents usually end up damaging the face (e.g., two-wheeler accidents without helmets, acid attacks, and burn cases arising from domestic violence). To overcome the problem that facial damage poses for face recognition, face vectors are used. In general, a face vector is defined by the ordered set of points that appear on the face after feature extraction. For example, the simplest face vector could consist of the distances between the two eyes, the eyes and the nose, the eyes and the lips, etc., which in combination form a unique set for each individual.

Unlike facial features, face vectors are rarely damaged during accidents. When one or more facial vectors are not directly available, they can still be reconstructed because of facial symmetry. In this work, we developed a system capable of automatic hospital admission by filling the form with the help of face recognition. The automatic form filling includes the patient’s data along with the medical history (if available in the database). The proposed system can recognize the identity of the patient even if the patient has a major injury or blood spots on the face. The system has the additional advantage of sending SMS to the nearest police station and to relatives. For demonstration purposes, the system was tested on 51 patients. The system takes 1.5 min to scan the face of the patient, fill the form, and send the SMS. As the time required for admission procedures is drastically reduced, the patient may get treatment earlier.

As shown in Fig. 1, a special-case entrance (i.e., a dark box-like structure) was placed at the entry door of the hospital. Only unconscious patients were passed through this special-case entrance. A web camera attached to the special-case entrance captured real-time video of the patient. The most suitable frame was extracted automatically from this captured video. The width of the special-case entrance was 1.2 m, which allowed the stretcher to pass through smoothly. Curtains were present at both ends of the special-case entrance to avoid interference from outside light. The proposed system avoided delays in completing hospital formalities in emergencies.

Literature Review

Various studies have been carried out in the last few decades to improve face recognition algorithms. Recently, Tran et al. [4] published a work to minimize large pose discrepancies. According to the authors, conventional techniques either perform face frontalization on an image or learn a pose-invariant representation for face recognition. Singh et al. [5] conducted a study on facial injuries due to road traffic accidents. They found that significant facial damage occurred more often in male than in female patients. Black et al. [6] compared injuries during sports and found that facial guards such as helmets reduce the severity of these injuries. Canzi et al. [7] proposed facial recognition with trauma to optimize care and recovery after surgery. Trokielewicz et al. [8] performed iris-based recognition on dead bodies and were able to recognize people with 90% accuracy. Coulibaly et al. [9] provided a survey on facial damage from interpersonal violence. Majumdar et al. [10] used a VGG model to classify faces injured by domestic violence. They further extended their work with a subclass contrastive loss for injured face recognition [11]. McDonagh et al. [12] proposed a method for joint face detection using cascaded regression of the Hough transform. In this technique, they scanned the image and applied cascaded regression rather than passing the image through discriminatively trained filters. Yow and Cipolla [13] proposed a feature-based algorithm for face detection. Feature selection was based on the aspect ratio of the extracted object. Jun and Kim [14] proposed a face detection method using local gradient patterns; in their study, they found that local gradient patterns perform much better for face detection than local binary patterns. Mahajan and Paithane [15] proposed a face detection technique based on Histogram of Oriented Gradients (HOG) features. Their method used a hierarchical knowledge-based model. 
The hierarchical knowledge-based method involved three levels for improving image quality. Images were filtered initially for easy background removal. A group of researchers proposed a novel technique for facial emotion recognition [16]. In their implementation, they used the face vector technique in combination with convolutional neural networks. The proposed work implements a face vector technique that is enhanced compared to the literature. In parallel, neural networks have done an excellent job in the field of face detection and recognition. In most neural network approaches, the input image is sub-sampled in different regions and converted into standard-sized sub-images. The sub-images are passed through a neural network filter. The filter improves the quality of an image by removing noise from it. Some of the work using the neural network technique was done by Rowley et al. [17]. Their system performed very well for fronto-parallel faces, but results deteriorated when different views of the face were given as input. This technique failed for faces in different profiles. A labeled random graph matching approach for detecting faces in cluttered images was reported by Leung et al. [18]. The system included a technique for locating a quasi-frontal view of a face in a cluttered scene. The system used a statistical model of the distances between facial features. The statistical model was coupled with feature detectors. This technique was suitable for multiple views but failed to work under different imaging conditions. It was difficult to robustly detect the facial features for every subject because the structures of various facial features vary. Detecting a face in a complex background is a tedious task. Yang et al. [19] designed a system to identify human faces in a complex background. This system used a hierarchical knowledge-based method. 
The hierarchical knowledge-based method involved three levels for improving image quality by filtering the background. The first two levels were based on mosaic images at various resolutions, and the third level was based on edge detection. This system gave excellent results for a fronto-parallel view of the face but failed for face detection in different profiles. Recently, Deshpande et al. used the Viola–Jones algorithm and a fusion of PCA and ANN for face detection and recognition [20]. The system recognized the human face based on various features derived from the captured image. Their proposed method involved two stages. The first stage detected a face in an image using the Viola–Jones algorithm. The second stage recognized the face based on the fusion of a Feed Forward Neural Network and Principal Component Analysis. The system provided better accuracy in face recognition. As with face detection, many recent works have also addressed automatic form filling. In 2013, Kadry et al. developed a wireless attendance marking system based on iris recognition [21]. In this system, images of the iris were captured and pre-processed. Iris features were then extracted and compared with the database. Wagh et al. [22] developed a system for automatic attendance marking in 2015 based on Eigenface and PCA. In this technique, the image of the whole classroom was taken from a particular angle. The captured image was enhanced by histogram equalization to improve its quality. The enhanced image was the input to the face detection algorithm, which was the AdaBoost algorithm. Recently, Suri et al. [23] developed a system for hospital automation using a face detection and recognition algorithm. The technique was used to fill the hospital admission form automatically and to send an SMS to the concerned authority.
The current study is similar to the work presented by Suri et al. [23], but differs in algorithm and implementation. Suri et al. [23] used a traditional object recognition algorithm, namely Viola–Jones, which is not very accurate for tilted or turned faces. Face recognition is also sensitive to lighting conditions, which can degrade the captured image quality. In contrast, the current work proposes a facial vector-based algorithm, which eliminates the effect of lighting from the image. The proposed work also offers a major advantage: it provides for manual intervention before any drug is delivered to the patient. Although the system is automated, it does not deliver any drug to the patient merely on the basis of past medical data; it acts as a support system. Their study [23] included only 8 dummy images (printed newspaper cutouts) for testing and validating the system, whereas the proposed work involved roughly 200K images for testing the algorithm. Apart from this, the current work was also validated on 51 real-time images of patients. The current work captured a video of the patient and then extracted frames from that video (Fig. 2). This allowed us to extract as much information from the video as needed, whereas Suri et al. [23] used static images, which restrict the total available information.

Fig. 2

a System flow diagram. The video is acquired through a web camera. The face image is then extracted from the video. The face is detected and cropped from the image. The cropped face is then compared with the database image. If a match is found, the corresponding data is filled into the admission form and an SMS is sent to the police and relatives. b Flow chart of the proposed system. The patient is passed through the black passage on the stretcher. The motion of the stretcher activates the PIR sensor, and a signal is sent to the web camera. The web camera records the video of the moving stretcher. The desired image is extracted from the recorded video. The obtained image is the input for the system and is further enhanced by preprocessing. The processed image is used for face detection. The face feature vector is then compared with the database image. If a match is found, the hospital admission form is filled automatically and an SMS is sent to the concerned authority


Components and Equipment

The ATmega8 IC was purchased from Mouser Electronics, India. The GSM module SIM300 was acquired from ORKA technologies, India. The PIR motion sensor (Part no. 30121) was purchased from Indias Heart, India. A Logitech webcam was used. The USB to serial converter FT232 (Part no. FTDIBB-F) was ordered from Robo India. The 7805 voltage regulator IC was bought from Electrobot, India. Resistors, capacitors, connecting wires, and jumper wires were bought locally.

System Design

The proposed system was designed to fill the patient admission form automatically in emergency and critical conditions. The filled form includes a detailed medical history (if available). The system has the additional advantage of sending an SMS to the concerned authority. As shown in Fig. 2, image acquisition was the primary stage of the system. A USB web camera worked as the image acquisition source. The proposed system included a special-case entrance (a dark passage) at the entrance of the hospital. The patients for whom automated form filling was needed were carried on a stretcher through that passage. The ceiling of this dark passage held a USB web camera with white LED lights to scan the face of the patient. The frame was extracted from the captured video based on edges: the motion of the stretcher causes blurring, and hence stable frames have more edges than blurred frames from the video.
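The edge-based frame choice described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: frames are plain nested lists of gray levels, a simple neighbour-difference threshold stands in for a real edge detector, and the names `edge_count`, `select_key_frame`, and the threshold of 100 are hypothetical.

```python
def edge_count(frame, threshold=100):
    """Count pixels whose step to the right or lower neighbour exceeds threshold.

    Stand-in for counting white pixels in an edge map (assumption: the paper
    uses a Canny detector; this simple difference is only illustrative).
    """
    rows, cols = len(frame), len(frame[0])
    count = 0
    for r in range(rows - 1):
        for c in range(cols - 1):
            dx = abs(frame[r][c + 1] - frame[r][c])
            dy = abs(frame[r + 1][c] - frame[r][c])
            if max(dx, dy) > threshold:
                count += 1
    return count


def select_key_frame(frames, threshold=100):
    """Return the index of the frame with the largest edge count."""
    scores = [edge_count(frame, threshold) for frame in frames]
    return max(range(len(frames)), key=scores.__getitem__)


# A sharp vertical edge versus the same edge smeared by motion blur.
sharp = [[0, 0, 255, 255] for _ in range(4)]
blurred = [[0, 64, 128, 192] for _ in range(4)]
best = select_key_frame([blurred, sharp])
```

In this toy input the blurred frame has no step above the threshold, so the sharp frame is selected, which mirrors the reasoning that motion blur reduces the number of detected edges.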

Facial Vector-Based Algorithm

The system works with real-time video input. From the recorded video, the difference between successive frames (inter-frame difference) is computed. Maximally stable frames occur whenever the inter-frame difference is zero. A Canny edge detector is then applied to all these stable frames, and the aggregated sum of all white pixels is calculated for each. Once the aggregated sums for all stable frames have been calculated, they are compared with one another and the frame with the maximum aggregated sum is selected. The frame with the maximum aggregated sum contains the most detail in terms of edges. This frame is then used as the input to our face detection system. As the Viola–Jones algorithm works with gray-scale images, we converted the colored image to a gray-scale image. Viola–Jones uses Haar-like features to detect a face in a given image, and all detected parts of the face are rescaled to a square shape [24]. In the step following face detection, the facial vectors are generated by tracking down relevant facial points. Face vectors are directly related to face dimensions. The face vector consists of the values of the normalized Euclidean distances between face parts. These face parts include the nose, lips, ears, forehead, eyes, etc. All face parts (landmarks) are marked using edge detection, and nearest point clusters are mapped. For face vector extraction, every pixel in the image was compared with its surrounding pixels. The change relative to the neighboring pixels was computed, and the direction of change was noted based on the sign of the difference. This step is repeated for every pixel to generate a gradient map. The gradient map helped us to determine the change in brightness; the higher the gradient, the more information is present. Further, the entire image is divided into \(16\times 16\) blocks and the gradient of each block is found in each measured direction. 
Then, facial landmarks are estimated so that the eyes, mouth, nose, ears, etc. are determined. These facial landmarks were derived from the standard face and compared with the point cloud gradient found from the input face. A point cloud is a set of data points in space; here, the point cloud represents a 2D shape or object. Each point has its own X, Y coordinates and intensity. A total of 80 specific points were mapped onto the face using the HOG pattern. HOG is defined as a histogram of gradients, obtained by plotting the frequency of the different gradients in an image. These points are shown in Fig. 3d. If a particular point is present within the landmark face points, we mark it as logic 1 (marked in white); otherwise it is marked as logic 0 (marked in red). This gives an 80-bit (10-byte) face vector in total, as shown in Table 1.
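To illustrate how 80 presence flags fit into 10 bytes, the following sketch packs and unpacks such a vector. The bit order (most significant bit first) and the function names `pack_face_vector` and `unpack_face_vector` are assumptions made for this example; the text does not specify a storage layout.

```python
def pack_face_vector(bits):
    """Pack 80 logic values (0 or 1) into 10 bytes, most significant bit first.

    The MSB-first layout is an assumption for illustration only.
    """
    if len(bits) != 80:
        raise ValueError("a face vector needs exactly 80 logic values")
    packed = bytearray(10)
    for index, bit in enumerate(bits):
        if bit:
            packed[index // 8] |= 0x80 >> (index % 8)
    return bytes(packed)


def unpack_face_vector(packed):
    """Recover the list of 80 logic values from the 10 packed bytes."""
    return [(byte >> (7 - k)) & 1 for byte in packed for k in range(8)]


# Example: every second landmark point present.
flags = [1, 0] * 40
packed = pack_face_vector(flags)
```

A fixed 10-byte record of this kind is small enough to store and compare on a modest microcontroller or in a database column.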

Fig. 3

Results a The average time taken for each operation. The three operations that the system performs are (1) FR: recognizing the face of the patient (black), (2) DCAF: database comparison and automatic form filling (red) and (3) SMSS: SMS sending (green). The total time of the proposed system is the sum of these three components, indicated in blue. It took around 95 s to complete the process. b Bar graph of the time taken by the system for face recognition (FR), database comparison and automatic form filling (DCAF), and SMS sending (SMSS) for the first 12 patients P1–P12. c Receiver Operating Characteristic (ROC) curve. It is a plot of the true-positive rate against the false-positive rate. We found that it is very close to the ideal ROC curve in three different kinds of cases (i.e., accident, burn, and others). d Sample 80-point face vector

Video Frame Processing for Face Extraction

We applied Canny edge detection followed by normalized summation to extract a keyframe from the video for further steps. The image was enhanced to improve its quality. After the image enhancement step, the colored image was converted into a gray-scale image. The gray-scale image was further equalized using the histogram equalization technique. A filter was used to eliminate noise from the image. After all pre-processing steps, face detection was carried out. For the face detection algorithm, all the possible junctions of edges such as the nose, eyes, ears, chin, endpoints of the lips, etc. were considered to form a face vector (as shown in Table 1).
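The histogram equalization step mentioned above can be sketched with the textbook cumulative-distribution mapping. This is a generic illustration on a nested list of 8-bit gray levels, not the authors' code; the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Spread the gray levels of an image using its cumulative histogram."""
    pixels = [p for row in image for p in row]
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # A single gray level cannot be stretched; return a copy unchanged.
        return [row[:] for row in image]

    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# A low-contrast 2 x 2 patch is stretched to the full 0..255 range.
equalized = equalize_histogram([[10, 20], [30, 40]])
```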

Table 1 Face vector

All the oval-shaped objects in the extracted frame were detected using the Hough transform, and the aspect ratio of each such object was computed. In general, the aspect ratio is defined as the ratio of the major axis to the minor axis. Oval-shaped objects are objects in which the two axes (major and minor) have different lengths. For a face, the inverse of the aspect ratio (1/aspect ratio) of the rounded object lies in the range of 0.5–0.7:

$$\begin{aligned} \text {Aspect ratio} = \frac{\text {Major axis}}{\text {Minor axis}}. \end{aligned}$$
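As a small worked example of this criterion, the check below tests whether the inverse aspect ratio (minor axis divided by major axis) of a detected oval falls in the 0.5–0.7 range quoted above. The function name `is_face_candidate` and the axis lengths in the example are hypothetical.

```python
def is_face_candidate(major_axis, minor_axis, low=0.5, high=0.7):
    """Return True if 1/aspect ratio = minor/major lies within [low, high]."""
    if major_axis <= 0 or minor_axis <= 0 or minor_axis > major_axis:
        return False
    return low <= minor_axis / major_axis <= high


# Example ovals given as (major axis, minor axis) in pixels.
ovals = [(40, 24), (40, 38), (40, 10)]
candidates = [oval for oval in ovals if is_face_candidate(*oval)]
```

Here only the first oval (inverse aspect ratio 0.6) is kept; the nearly circular and the very elongated ovals are rejected.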

By applying the aspect ratio range mentioned above to all the ovals in an image, the possible face was determined. The possible face was confirmed with an area-based filter: circles that were too small or too large were neglected. For our webcam resolution, the average face area was about 600–1200 pixels. All the face vector points were then positioned on the face using k-means clustering and minimum distance decoding. All points missing for various reasons (e.g., closed eyes, face covered in blood, etc.) were regenerated using the average position of all nearby points in the point cloud. Figure 4k shows the standard face. The standard face is a face with all the facial features marked on it along with their standard distances. We obtained a point cloud, i.e., a set of data points in space representing X, Y coordinates and the corresponding intensity, for the standard face, with distances measured in pixels (px). The face features eyes, nose, ears, lip, and chin were represented by Y, N, E, P, and H, respectively, and the positions left, right, and center were represented by L, R, and C, respectively. All face vector points that did not fit a cluster were discarded. Table 1 shows all eight positions and their corresponding k-means cluster centers. In Table 1, the numbers in parentheses indicate X and Y coordinates. (400, 200) px indicates the mean position of the nose relative to the origin. The origin was taken as (0, 0) (as shown in Fig. 4k). The mean position of the left eye (LY) relative to the origin was (175, 125) px, whereas it was (175, 275) px for the right eye (RY). The mean positions of the left ear (LE) and right ear (RE) were (175, 30) px and (175, 370) px, respectively. The chin (CH) was (550, 200) px away from the origin. The mean positions of the left lip (LP) and right lip (RP) were (475, 125) and (475, 275) px, respectively. The mean height and width were 600 and 400 px, respectively. 
After the face was detected in the image, it was cropped. This cropped image of the patient’s face was compared with the database images. If the final face vector of a database image was found to be within a margin of 1% of the query face vector, then ‘match found’ was declared. If a match was confirmed after comparison, the hospital admission form was filled automatically.
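The minimum distance decoding step can be sketched with the eight cluster centers quoted in the text. This is only an illustration: the key "N" for the nose, the rejection radius `max_distance`, and the function name `nearest_landmark` are assumptions, since the text does not state how far a point may lie from a center before it is discarded.

```python
import math

# Cluster centers (px) for the standard face, as quoted in the text.
CENTERS = {
    "N": (400, 200),   # nose (key name assumed)
    "LY": (175, 125),  # left eye
    "RY": (175, 275),  # right eye
    "LE": (175, 30),   # left ear
    "RE": (175, 370),  # right ear
    "CH": (550, 200),  # chin
    "LP": (475, 125),  # left lip
    "RP": (475, 275),  # right lip
}


def nearest_landmark(point, max_distance=40.0):
    """Assign a point to the closest center, or None if it fits no cluster.

    max_distance is a hypothetical rejection radius in pixels.
    """
    label, center = min(CENTERS.items(), key=lambda kv: math.dist(point, kv[1]))
    if math.dist(point, center) > max_distance:
        return None
    return label


# A point near the left eye is decoded; a stray point (e.g. a blood spot)
# far from every center is discarded.
decoded = [nearest_landmark(p) for p in [(180, 120), (300, 200)]]
```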

Automatic admission form filling involved entering the general details and medical history stored in the database into the form. The doctor can treat or medicate the patient by considering this past medical history, but manual supervision of such actions is provided on the software front end. Doctors or related medical staff are able to change the suggested identity of the person if the result is false. This minimizes or avoids wrong treatment. Once the form has been filled, an SMS is sent to the nearest police station and to relatives (provided their number is available in the database).

Fig. 4

System inputs–outputs and standard face. a, c, e, g, i Keyframes extracted from the video while the stretcher is passing through the dark passage. b, d, f, h, j Detected face region marked with a yellow rectangular box. The snippets in b, d, f, h, j show the matching image extracted from the database. All cases except i gave the expected accurate results. In image i, the patient’s face was not facing the camera, and a large wound on the face caused a mismatch with the database. Snippet k shows the standard face. The face vectors were marked on the background-removed face. Here, lip (P), nose (N), eyes (Y) and ears (E) were marked using edge detection and nearest cluster analysis. The positions right, left and center are represented by R, L, and C, respectively

To test the system, special permissions were obtained from the Institute Ethics Committee (IEC) and the medical regulatory authority of Karve hospital. All the images in Fig. 4 are included with written authorization from the concerned patients, obtained after they had recovered. The system was validated on 51 patients from Karve hospital, Thane. The system does not interrupt the usual routine of the hospital, so the hospital authorities permitted us to install a special-case entrance.

Circuit Details

Figure 5 shows the circuit diagram used to build the system. To implement the proposed system, we chose the ATmega8 as the main microcontroller. As its clock source, the controller was connected to a 16 MHz crystal. As shown in Fig. 5, pin 1 of the ATmega8 is pulled up to 5 V (VCC) via a 10 k\(\Omega\) resistor. The system was powered by a lead-acid battery (6 V, 1300 mAh). The output of this battery was fed to the regulator IC 7805. The regulated 5 V output was further filtered through a 100 \(\upmu\)F capacitor. The ATmega8 was connected to the computer using the FT232, which converts serial communication (i.e., RS232) signals to USB signals. The microcontroller communicates with the GSM module SIM300 through RS232 at 9600 bps. The GSM module was used to send SMS to the police and relatives. Since the microcontroller had only one serial communication port, the GSM module was connected through a software serial port. The PIR motion sensor is connected to the ATmega8 controller at pin 14.
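For illustration, the byte sequence typically written to a GSM modem to send a text-mode SMS is sketched below. It assumes the standard GSM text-mode AT commands (`AT+CMGF=1`, `AT+CMGS`) and a Ctrl+Z terminator; the text does not list the exact commands used with the SIM300, and the function name, phone number, and message are hypothetical placeholders. The sketch only builds the bytes and does not open a serial port.

```python
CTRL_Z = b"\x1a"  # terminates the message body in text mode


def build_sms_commands(phone_number, message):
    """Return the byte strings to write, in order, to the modem's serial line.

    Assumes standard text-mode SMS commands; a real sender should wait for
    the modem's response ("OK" or the ">" prompt) between writes.
    """
    return [
        b"AT\r",                                              # check the link
        b"AT+CMGF=1\r",                                       # select text mode
        b'AT+CMGS="' + phone_number.encode("ascii") + b'"\r',  # recipient
        message.encode("ascii") + CTRL_Z,                     # body + Ctrl+Z
    ]


# Placeholder recipient and text, for illustration only.
commands = build_sms_commands("+911234567890", "Patient admitted to hospital")
```

On the actual hardware, the same sequence would be sent over the software serial port at 9600 bps.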

Results and Discussion

The proposed system was demonstrated and tested with 51 patients. The patients included for face verification varied in age, gender, and physique. Since the aim of the proposed work is to reduce delay, the time taken was used as the output parameter when computing the results. The total time for each operation (Fig. 3a) and the total time for each patient were considered as the performance measures of the system (Table 2). Figure 3b shows the bar graph of the total time taken by the system for the first 12 of the 51 patients.

Fig. 5

Circuit diagram of the proposed system. The ATmega8 was connected to a 16 MHz crystal and pulled up with a 10 k\(\Omega\) resistor. IC 7805 with a 100 \(\upmu\)F capacitor was used to generate the 5 V supply. The FT232 connects USB to the ATmega8. The SIM300 was connected for sending SMS

Table 2 Comparison with different face recognition and verification methods
Table 3 Accuracy of proposed method with different datasets

Overall, the system took \(\sim\) 95 s (Fig. 3a blue) to detect and recognize the face of the patient and fill the data into the hospital form automatically. The total time taken by the system varied from patient to patient (Fig. 3b). This variation was due to the time required to recognize the face, compare it with the database, and send the SMS. The average time taken by face detection was about \(32 \pm 4\) s (Fig. 3a black). Comparing the image with the database took around \(13 \pm 4.5\) s (Fig. 3a red), and sending the SMS took \(46.5 \pm 7.7\) s (Fig. 3a green). The time required for sending the SMS was high and varied widely because of its dependency on network connectivity. In one case (P3), the system took almost a minute to send the SMS because heavy rains caused a poor network connection to the SIM300. However, sending the SMS is the secondary task of the system, and it does not affect the automatic form filling and hospitalization. Another issue that we faced during actual testing, along with the network problem, was people with facial hair. Facial hair causes extra feature vectors, which makes the system slower in identifying the person. Figure 3c shows a receiver operating characteristic (ROC) curve. The ROC curve was created by plotting the true-positive rate (TPR) against the false-positive rate (FPR) at various threshold settings. In this study, we prepared the ROC curve for both training and testing datasets. For plotting the ROC curve for training, we used the celebA, LFW, and UCFI databases. The database included 200K celebrity images. For testing purposes, we included 51 real-time images of patients (48 were road accident cases, 2 were burn cases, and 1 was a stab wound to the face).
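The construction of an ROC curve from match scores can be sketched as follows. This is a generic illustration of plotting TPR against FPR over thresholds, not the authors' evaluation code; the function name `roc_points` and the example scores and labels are hypothetical and are not data from the study.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per distinct threshold, starting at (0, 0).

    scores: match scores, higher means more likely a true match.
    labels: 1 for a genuine match, 0 for an impostor.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both genuine and impostor samples are required")

    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical scores: two genuine matches and two impostors.
curve = roc_points([0.9, 0.8, 0.4, 0.2], [1, 1, 0, 0])
```

In this toy case the curve rises to a TPR of 1.0 before any false positive appears, which corresponds to the ideal ROC shape.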

Figure 4 shows practical inputs received (Fig. 4a, c, e, g, i) and the corresponding outputs (Fig. 4b, d, f, h, j) of the proposed system. The system correctly identified 48 patients out of 51 (94.11% accuracy). Only 5 of the 51 cases are discussed in Fig. 4. In the first four cases (Fig. 4a, c, e, g), the system produced accurate results, whereas in the fifth case (Fig. 4i), the system was unable to identify the patient because the feature vector extraction algorithm failed due to wounds and a non-fronto-parallel image. In the first case (Fig. 4a), a 12-year-old boy had met with an accident and was brought to the hospital unconscious. He was passed through the system. The keyframe image (Fig. 4a) was extracted from the real-time video captured. This keyframe was used to detect the face, as shown in Fig. 4b. The snippet indicates the matched database image. The system accurately identified the patient and also filled the form. The system also sent a message to the parents of the patient, who might otherwise have been unaware of the incident. The second case (Fig. 4c), a 25-year-old female with a cut vein, was also identified correctly even with closed eyes. In the third case (Fig. 4e), a 58-year-old lady was admitted, and the system took approximately 0.4 s extra to recognize the face because of the bindi on her forehead. The 15-year-old boy (Fig. 4g) with a head injury also gave a proper match. All the extra feature vectors that appeared due to blood and wounds were discarded during the k-means stage. We also report one failed case here for future researchers. In this case (Fig. 4i), a 30-year-old man who had met with a road accident was brought to the hospital. The face was wounded, and the angle at which the camera captured the face was not fronto-parallel; hence, the matching algorithm failed. We also included 2 patients from burn cases, where the face was 20–35% damaged. The system was able to recognize these patients correctly as well. 
The average time taken by the system to recognize the person and fill the form was about 2.5 and 1.8 min, respectively. Among all the admitted cases, three could not be identified because of extensive blood on the face. At the same time, we found that occlusions like a bindi and facial hair increased the processing time slightly (by less than 0.5 s), as they increased the number of points to be processed.
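As a quick check of the reported figure, assuming accuracy here means the number of correctly identified patients divided by the number of patients tested:

$$\begin{aligned} \text {Accuracy} = \frac{48}{51} \times 100\% \approx 94.1\%, \end{aligned}$$

which agrees with the reported 94.11% up to rounding.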

The system is meant only to fill the form during hospital admission in emergency cases, so that further delay in treatment is reduced. Although the form-filling procedure is completely automatic, we have made provision for manual supervision in the software. False matches may occur; in such cases, if a drug has been suggested for the patient based on the past history, the outcome could be disastrous. Manual supervision by hospital staff helps to overcome this issue. Through manual supervision, mishaps during treatment and the administration of a wrong drug to the patient can be avoided.

Incorrectly identifying someone with facial recognition can happen and could cause a catastrophe. To avoid or minimize such situations, we have provided a manual supervision facility, at least before starting drugs based on past medical history. Sending an SMS to the wrong family remains a downside of our system.

Comparison with Other Methods and Datasets

To evaluate the proposed method on different datasets, we used the CelebA [34], LFW [35], and UCFI [33] datasets. We also created our own dataset of 512 images. As shown in Table 3, the CelebA [34] dataset consisted of 200K images, whereas LFW [35] and UCFI [33] consisted of 13K and 299 images, respectively. The highest accuracy, 98.23%, was achieved on the LFW [35] dataset, and the lowest, 92%, on the CelebA [34] dataset. Our own dataset of 512 images gave an accuracy of 94.9%, and the real-time scan of 51 patients gave an accuracy of 94.11%. As shown in Table 2, the performance of the system was also compared with various existing methods. The accuracy of the proposed method was comparable to that of the other methods reported in the literature. Taigman et al. [32] achieved a maximum accuracy of 97%, and their number of images was the highest. However, their method required a 16K-long feature vector, compared with the 80-bit vector of the proposed method. Our feature vector is the smallest, yet the accuracy is within the range reported in the literature. The small feature vector and the simplicity of the algorithm make it more suitable for implementation on portable systems.
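
The dataset figures quoted in this section can be collected in one place to check the totals. The sketch below uses only the numbers stated in the text; sizes such as 200K and 13K are rounded figures, and the dictionary layout and variable names are ours. With these values, the three public datasets add up to 213,299 images, consistent with the "more than 213K images" stated in the abstract.

```python
# Figures as quoted in the text; 200K and 13K are rounded values.
datasets = {
    "CelebA": {"images": 200_000, "accuracy": 92.0},
    "LFW": {"images": 13_000, "accuracy": 98.23},
    "UCFI": {"images": 299, "accuracy": None},  # accuracy not quoted in this section
    "Own dataset": {"images": 512, "accuracy": 94.9},
    "Real-time patients": {"images": 51, "accuracy": 94.11},
}

# Total size of the three public datasets.
public_total = sum(datasets[name]["images"] for name in ("CelebA", "LFW", "UCFI"))

# Best and worst accuracy among the entries that report one.
reported = {name: d["accuracy"] for name, d in datasets.items() if d["accuracy"] is not None}
best = max(reported, key=reported.get)
worst = min(reported, key=reported.get)

print(public_total, best, worst)
```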


Avoiding delays in starting treatment could save many lives. We proposed a system that fills the admission form automatically and sends SMS alerts to the police and relatives, thus avoiding delays. The automated form filling was achieved by face detection, which was based on a novel method of calculating the face vectors. The system was tested on 51 patients in emergency conditions and worked with an accuracy of 94.11%. The system takes somewhat longer to calculate the face vectors in special cases such as a bindi on the forehead or facial hair. We also recommend using the proposed system in other time-critical processes, such as connecting flights at international airports. Another possible application is for people with disabilities, such as people who cannot speak or who are deaf.


  1. Souminen P, Kivioja A, Öhman J, Korpela R, Rintala R, Olkkola KT. Severe and fatal childhood trauma. Injury. 1998;29(6):425.

  2. World Health Organization. Global status report on road safety (2018). https://www.who.int/violence_injury_prevention/road_safety_status/report/en/. Accessed 10 Jan 2021.

  3. NDTV. Road accident statistics in India (2016). https://sites.ndtv.com/roadsafety/important-feature-to-you-in-your-car-5/. Accessed 10 Jan 2021.

  4. Tran L, Yin X, Liu X. Disentangled representation learning GAN for pose-invariant face recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2017. pp. 1415–1424.

  5. Singh V, Malkunje L, Mohammad S, Singh N, Dhasmana S, Das SK. The maxillofacial injuries: a study. Natl J Maxillofac Surg. 2012;3(2):166.

  6. Black AM, Patton DA, Eliason PH, Emery CA. Prevention of sport-related facial injuries. Clin Sports Med. 2017;36(2):257.

  7. Canzi G, De Ponti E, Novelli G, Mazzoleni F, Chiara O, Bozzetti A, Sozzi D. The CFI score: validation of a new comprehensive severity scoring system for facial injuries. J Cranio Maxillofac Surg. 2019;47(3):377.

  8. Trokielewicz M, Czajka A, Maciejewicz P. Post-mortem human iris recognition. In: 2016 International conference on biometrics (ICB), IEEE, 2016. pp. 1–6.

  9. Coulibaly TA, Béogo R, Traoré I, Kohoun HM, Ili BV. Inter personal violence-related facial injuries: a 10-year survey. J Oral Med Oral Surg. 2018;24(1):2.

  10. Majumdar P, Chhabra S, Singh R, Vatsa M. On detecting domestic abuse via faces. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops, 2018. pp. 2173–2179.

  11. Majumdar P, Chhabra S, Singh R, Vatsa M. Subclass contrastive loss for injured face recognition. In: 2019 IEEE 10th international conference on biometrics theory, applications and systems (BTAS), IEEE, 2019. pp. 1–7.

  12. McDonagh J, Tzimiropoulos G. Joint face detection and alignment with a deformable Hough transform model. In: European conference on computer vision, Springer, Cham, 2016. pp. 569–580.

  13. Yow KC, Cipolla R. Feature-based human face detection. Image Vis Comput. 1997;15(9):713.

  14. Jun B, Kim D. Robust face detection using local gradient patterns and evidence accumulation. Pattern Recognit. 2012;45(9):3304.

  15. Mahajan J, Paithane A. Face detection on distorted images by using quality HOG features. In: 2017 International conference on inventive communication and computational technologies (ICICCT), IEEE, 2017. pp. 439–444.

  16. Mehendale N. Facial emotion recognition using convolutional neural networks (FERC). SN Appl Sci. 2020;2(3):1.

  17. Rowley HA, Baluja S, Kanade T. Human face detection in visual scenes. In: Advances in neural information processing systems, 1996. pp. 875–881.

  18. Leung TK, Burl MC, Perona P. Finding faces in cluttered scenes using random labeled graph matching. In: Proceedings of IEEE international conference on computer vision, IEEE, 1995. pp. 637–644.

  19. Yang G, Huang TS. Human face detection in a complex background. Pattern Recognit. 1994;27(1):53.

  20. Deshpande NT, Ravishankar S. Face detection and recognition using Viola–Jones algorithm and fusion of PCA and ANN. Adv Comput Sci Technol. 2017;10(5):1173.

  21. Kadry S, Smaili M. Wireless attendance management system based on iris recognition. Sci Res Essays. 2013;5(12):1428.

  22. Wagh P, Thakare R, Chaudhari J, Patil S. Attendance system based on face recognition using eigen face and PCA algorithms. In: 2015 International conference on green computing and internet of things (ICGCIoT), IEEE, 2015. pp. 303–308.

  23. Suri N, Marne M, Ghotekar M, Pacharaney U. Design of facial features based hospital admission using GSM. In: 2016 International conference on inventive computation technologies (ICICT), IEEE, 2016. vol. 1, pp. 1–6.

  24. Lienhart R, Maydt J. An extended set of Haar-like features for rapid object detection. In: Proceedings international conference on image processing, IEEE, 2002. vol. 1, pp. I-I.

  25. Chen D, Cao X, Wang L, Wen F, Sun J. Bayesian face revisited: a joint formulation. In: European conference on computer vision, Springer, Berlin, Heidelberg, 2012. pp. 566–579.

  26. Sun Y, Chen Y, Wang X, Tang X. Deep learning face representation by joint identification-verification. Adv Neural Inf Process Syst. 2014;27:1988–1996.

  27. Huang C, Zhu S, Yu K. Large scale strongly supervised ensemble metric learning, with applications to face verification and retrieval. arXiv:1212.6094 (2012).

  28. Simonyan K, Parkhi OM, Vedaldi A, Zisserman A. Fisher vector faces in the wild. In: BMVC, 2013. vol. 2, no. 3, p. 4.

  29. Berg T, Belhumeur PN. Tom-vs-Pete classifiers and identity-preserving alignment for face verification. In: BMVC, 2012. vol. 2, p. 7.

  30. Chen D, Cao X, Wen F, Sun J. Blessing of dimensionality: high-dimensional feature and its efficient compression for face verification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2013. pp. 3025–3032.

  31. Chen D, Cao X, Wipf D, Wen F, Sun J. An efficient joint formulation for Bayesian face verification. IEEE Transactions on pattern analysis and machine intelligence, 2016. pp. 32–46.

  32. Taigman Y, Yang M, Ranzato M, Wolf L. Deepface: closing the gap to human-level performance in face verification. In: Proceedings of the IEEE conference on computer vision and pattern recognition, 2014. pp. 1701–1708.

  33. Sharma P, Reilly RB. A colour face image database for benchmarking of automatic face detection algorithms. In: Proceedings EC-VIP-MC 2003. 4th EURASIP conference focused on video/image processing and multimedia communications (IEEE Cat. No. 03EX667), IEEE, 2013. vol. 1, pp. 423–428.

  34. Liu Z, Luo P, Wang X, Tang X. Large-scale celebfaces attributes (celeba) dataset. 2018;15:2018.

  35. Jalal A, Tariq U. The LFW-gender dataset. In: Asian conference on computer vision, Springer, Cham, 2016. pp. 531–540.



First of all, we would like to express our sincere gratitude to the management, doctors, and supporting staff of Karve Hospital; without their support, we could not have verified the results in the field. Second, we would like to thank all the 51 anonymous patients of Karve Hospital for allowing us to use their cartoonized photos in the manuscript for a better understanding of our work. We are grateful to Mr. Ghone (stretcher boy), who helped us every time by standing at the special-case entrance to pass the patient through it. Finally, we would like to thank all the colleagues of Ninad’s Research Lab and K. J. Somaiya College of Engineering who made this work possible.

Author information



Corresponding author

Correspondence to Ninad Mehendale.

Ethics declarations

Conflict of Interest

Authors M. Parab and N. Mehendale declare that they have no conflict of interest.

Involvement of Human Participant and Animals

This article does not contain any studies with animals or humans performed by any of the authors. All the necessary permissions were obtained from the Institute Ethical Committee and the concerned authorities to use video captures of patients.


No funding was involved in the present work.

Information About Informed Consent

Informed consent was obtained from all human participants whose videos were used for this novel work.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Parab, M., Mehendale, N. Face Recognition-Based Automatic Hospital Admission with SMS Alerts. SN COMPUT. SCI. 2, 55 (2021). https://doi.org/10.1007/s42979-021-00448-4


  • Face recognition
  • Hospital admission
  • Automatic form filling