Year: 2022 | Volume: 1 | Issue: 2 | Pages: 82-87
Interdisciplinary collaboration between engineering and nursing on baby crying analyzing and classification: A biotechnology study
Serap Ozdemir1, Efe Çetin Yilmaz2
1 Department of Nursing, Yusuf Şerefoğlu Faculty of Health Sciences, Kilis 7 Aralık University, Kilis, Turkey
2 Department of Control Systems Electrical and Electronic Engineering, Kilis 7 Aralık University Faculty of Engineering and Architecture, Kilis, Turkey
Date of Submission: 22-Feb-2022
Date of Decision: 20-Mar-2022
Date of Acceptance: 29-May-2022
Date of Web Publication: 15-Jun-2022
Dr. Serap Ozdemir
Department of Nursing, Yusuf Şerefoğlu Faculty of Health Sciences, Kilis 7 Aralik University, Kilis
Source of Support: None, Conflict of Interest: None
This study presents a multidisciplinary review, spanning engineering and nursing, of signal processing and analysis applied to infant crying. Babies communicate all of their needs through crying, and it is often very difficult for caregivers to determine which need a particular cry expresses. Being able to interpret crying behavior accurately is therefore of great importance for the comfort of the baby. For this reason, the analysis of the sound signals produced by babies during crying is an active subject in engineering. In the literature, the proposed approaches capture the baby's cry signal and extract a distinctive set of features from it using Mel Frequency Cepstral Coefficients, Linear Predictive Cepstral Coefficients, and pitch. This feature set is used to distinguish between signal patterns in order to recognize the cause of crying, and the resulting classification represents different classes of causes, such as hunger, pain, sleep, and discomfort. In this study, the clinical analysis of infant crying behaviors was examined and candidate engineering solutions were evaluated. In this way, the study seeks to contribute new approaches by examining the methodology of artificial intelligence-based sound analysis.
Keywords: Crying behaviour, engineering, nursing, signal analyses
How to cite this article:
Ozdemir S, Yilmaz EÇ. Interdisciplinary collaboration between engineering and nursing on baby crying analyzing and classification: A biotechnology study. J Prev Diagn Treat Strategies Med 2022;1:82-7
How to cite this URL:
Ozdemir S, Yilmaz EÇ. Interdisciplinary collaboration between engineering and nursing on baby crying analyzing and classification: A biotechnology study. J Prev Diagn Treat Strategies Med [serial online] 2022 [cited 2022 Jun 26];1:82-7. Available from: http://www.jpdtsm.com/text.asp?2022/1/2/82/347539
Introduction
Crying is a very effective communication tool for newborns; babies use it from birth to express their needs and feelings. A baby's crying provides information about basic biological needs (hunger, cold, warmth, pain, cramps, security, pleasure, etc.), about psychological state, and about development. In addition, the baby uses crying to signal physical, physiological, and pathological problems. The cry sound is produced during the rhythmic alternation between inhalation and exhalation, when vibration of the vocal cords generates periodic air pulses. The repetition rate of these pulses is the fundamental frequency, and its typical value in healthy babies is 250-600 Hz. The cry signal is shaped by the vocal tract, which gives rise to resonant frequencies called formants. The first two formants typically occur around 1100 Hz and 3300 Hz, respectively. Crying requires a well-functioning respiratory, laryngeal, and supralaryngeal muscle network and robust neurophysiological coordination. Physiologically, several parts of the brain, especially the brain stem and limbic system, need to work in coordination during crying, which is also tied to respiratory and lung mechanics. Crying therefore provides information about the development and integrity of the central nervous system. Babies are reported to use Dunstan Baby Language (DBL) to communicate during the first three months of life. According to some studies, five sounds express needs: “Neh” (I am hungry), “Eh” (needs burping), “Owh/Oah” (fatigue), “Eair/Eargghh” (cramps), and “Heh” (physical discomfort; feeling hot or wet). Beyond interpreting needs, smart home technology also implements baby cry detection to monitor the baby. Detecting and classifying a baby's cry involves several stages: preprocessing, feature extraction, and classification. Being able to predict a baby's needs from these classifications while the baby is crying would greatly improve comfort.
Within the scope of this study, the relevant studies in this field were evaluated using the most widely used reputable databases in the literature.
Methods
Physiology of crying
The act of crying in newborn babies is a complex neurophysiological process. Crying occurs in response to a particular stimulus, such as perinatal stress, hunger, pain, anger, or sleepiness. The cry response is mainly regulated by structural or functional changes in neurological areas such as the hypothalamus, amygdala, caudal periaqueductal region, and cranial nerves. The peripheral phonatory effector organs mainly include the lungs, airways, vocal cords, and the subglottic and supraglottic spaces. Compressed air expelled from the lungs through the airways causes rhythmic oscillations of the vocal cords, and the resulting alternating compressions and rarefactions of the air stream produce sound waves. The sound waves are then amplified by the subglottic and supraglottic cavities of the larynx, which act as resonance cavities. Phonation can therefore be maintained as a result of periodic oscillation of the vocal cords and adequate amplification. A high pitch can be described as hyperphonation, while a low pitch corresponds to hypophonation, and dysphonia can be considered an irregular phonation characterized by alternating hyper- and hypophonation. From a respiratory perspective, crying can also be defined as a series of four movement phases that can be likened to the adult Valsalva maneuver. It begins with a short, rapid, strenuous inspiration known as the tension phase, followed by exhalation, known as the sigh phase, followed by a pause corresponding to the nonphonatory phase. An inspiratory breath follows before the next cry. The act of crying therefore appears to involve energy metabolism together with psycho-neurological, respiratory, musculoskeletal, cardiovascular, and genetic factors. Disturbances in one or more of these factors may be responsible for an abnormal crying process.
Crying peaks at around 6 weeks, decreases by 4 months, and then remains stable. The type, frequency, and duration of crying are highly variable, especially after early infancy. Later in life, crying may depend on the individual's temperament or attachment style. Studies indicate that early in life people exhibit a variety of crying patterns, such as protest crying (where the baby faces a loss, such as being left in a cradle, and wants to undo it), hopeless-sad crying (a low cry in recognition of the loss), and detached inhibited crying (an absence of outward crying, typically associated with a life-threatening separation from the caregiver). Sometimes, additional types such as hunger and pain crying are also considered. Even mothers long confused pain cries with hunger cries, and earlier research reported that different types of crying could not be reliably distinguished. The baby is dependent on the caregiver for basic needs. For example, babies whose mothers fail to respond consistently when they cry may be torn between clinging to the mother and resisting contact. In toddlers, crying typically stops when the caregiver responds, for example by holding, touching, or feeding. A positive response from the caregiver signals that the infant can rely on others to help meet environmental demands, while its absence signals that the infant must cope with environmental demands alone. Babies have a high adaptive capacity to survive and can adapt to certain situations.
It is known that experienced people can understand the needs of babies from the way they cry and act accordingly, but individual differences and some complex situations can make this difficult. It may be easier for parents to recognize the needs and problems of the baby in the first days of life. The recent rapid development of measuring instruments and analysis software, together with larger sample sets, has begun to contradict the belief that an infant's needs cannot be determined from crying. Studies suggest that facial expressions and cry features can reliably distinguish between cries of pain, hunger, sadness, fear, and anger. Studies have also shown that an infant's physiological needs shape the manner and acoustic characteristics of its cry. Classification of normal and pathological infant crying has been the subject of extensive research, especially in newborn infants. Nevertheless, information about pathological crying in infants remains limited. It is generally accepted that a cry signal with a fundamental frequency above 600 Hz indicates health problems. Pain cries are usually longer and often include hypertonic cries. Colic is a condition that is commonly considered pathological and is marked by excessive crying. Researchers distinguish between internal causes of colic, such as food allergies, and external causes, such as inappropriate physical contact. Studies have focused primarily on a specific pathology, such as deafness or respiratory distress syndrome, and some have investigated multiple pathologies. However, little is known about how pathological crying is associated with possible maladaptive behaviors early in life.
When a baby cries after birth, the cry indicates that the baby was born safely and normally. Premature babies and babies with neurological disorders may have different crying characteristics from term and healthy babies. Many studies have investigated differences related to newborn sex and neurological maturity, as well as the risk of brain damage in preterm infants from prolonged crying due to lack of oxygen. The automatic classification of babies' crying is a pattern recognition problem: the task is to classify the type of crying, or the pathology detected, while the baby is crying. Some studies have proposed methods for classifying pathological crying. With the latest technologies, it is possible to detect a baby's crying sound and to recognize the baby's needs (baby cry transfer). In smart home technologies, the baby's crying can be detected easily, and the system can alert the family when the baby cries. For medical reasons, it is important to know what need the baby is expressing by crying. Before crying, the baby is said to try to communicate through a specific set of sounds known as DBL, which comprises five sounds with distinct meanings that are proposed to be universal among babies. The sound of a crying baby contains a lot of information about the baby's emotional and physical state and identity. Priscilla Dunstan reported that babies in the first three months of life use a proto-language to communicate, that is, five words to express their needs. These five words are “Neh” (hungry), “Eh” (needs to burp), “Owh/Oah” (fatigue), “Eair/Eargghh” (cramps), and “Heh” (physical discomfort; feeling hot or wet). The fundamental frequency of infant crying has been reported to lie between 250 Hz and 600 Hz. Infant cry analysis is preferred as a noninvasive and safe method for evaluating physical condition, especially in preterm infants. The difference between normal and abnormal crying findings is clinically striking.
In recent years, the analysis of babies' crying and of what the cry findings express has gained importance. It has been reported that deep learning methods can be applied for successful automatic classification. Automatic methods are also used alongside classical approaches to newborn cry analysis, such as the Fourier transform, autocorrelation analysis, and parametric techniques. These methods allow the estimation of the main acoustic properties, such as vocal fold vibration frequency, vocal tract resonance frequencies, and cry duration. Clinical studies have tried to use crying to distinguish babies with certain conditions or medical risks. Risk factors such as prenatal substance exposure or preterm birth have been investigated. Cry analysis has also recently attracted attention in conditions such as hearing impairment and autism.
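As an illustration of the autocorrelation analysis mentioned above, the following minimal sketch estimates the vocal fold vibration (fundamental) frequency of a single frame. It is not taken from the reviewed studies: the function name, the 150-1000 Hz search band, the 16 kHz sampling rate, and the synthetic test tone are all assumptions chosen for illustration.

```python
import math

def estimate_f0_autocorr(frame, sample_rate, f_min=150.0, f_max=1000.0):
    """Estimate the fundamental frequency (Hz) of one frame by autocorrelation.

    f_min and f_max bound the search band; both are illustrative assumptions.
    """
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]  # remove any DC offset
    # Convert the frequency band into a range of candidate lags (in samples).
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(x) - 1)
    best_lag, best_value = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Autocorrelation at this lag: similarity of the frame to a shifted copy.
        value = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    # The lag with the strongest self-similarity is taken as the pitch period.
    return sample_rate / best_lag

# Synthetic 40 ms "cry-like" frame: a 400 Hz tone plus one weaker harmonic.
sample_rate = 16000
frame = [
    math.sin(2 * math.pi * 400.0 * n / sample_rate)
    + 0.5 * math.sin(2 * math.pi * 800.0 * n / sample_rate)
    for n in range(640)
]
f0 = estimate_f0_autocorr(frame, sample_rate)
print("estimated f0 (Hz):", round(f0, 1))
```

On real recordings such an estimate would normally be computed frame by frame and smoothed, since noise and unvoiced segments can produce spurious peaks.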
Engineering designs and solutions
Speech production can be defined as the process of creating speech signals that start in the larynx (where the vocal cords are located) and end at the mouth. Speech or voice signals are classified as voiced or unvoiced. Unvoiced is the condition in which the vocal cords do not vibrate. Voiced is the condition in which the vocal cords vibrate and produce glottal pulses. The pitch is the fundamental frequency of the glottal pulses. The human voice has a low-frequency range, with a fundamental frequency of around 220 Hz for women and 130 Hz for men, and with the first formant reported below 1000 Hz. [Figure 1] shows the spectrum of the baby cry signal and the spectrum of average human speech.
When these spectra are examined, the differences between the baby cry spectrum and the human speech spectrum can be seen in the frequency domain. There are similarities and differences between the adult voice and the voice of crying babies. Previous studies found differences in voice character in terms of the fundamental frequency (pitch), which is higher in crying babies. However, how this spectrum should be analyzed and integrated into a system has not yet been clarified by researchers. [Figure 2] shows spectrograms of the sounds a baby makes while crying. Because babies have short, thin vocal cords, the voice of a crying baby has a regular character, as can be seen from the spectrogram. Priscilla Dunstan proposed a method for determining the meaning of a baby's cry, named Dunstan Baby Language. According to this proposal, crying babies produce five types of universal sounds, whose meanings are as follows:
Figure 1: (a) Spectrogram of the baby cry signal; (b) Spectrogram of speech audio signal
Figure 2: The spectrogram of the baby cry signal: (a) Neh; (b) Heh; (c) Eh; (d) Eairh/Eargghh; (e) Owh/Oah
- “Neh:” The “Neh” sound comes from sucking and pushing the tongue against the roof of the mouth, which means the baby is hungry
- “Owh/Oah:” The “Owh” sound resembles a person yawning, which means the baby is sleepy
- “Heh:” The “Heh” sound is the baby's response to a burning or itching sensation, which means the baby is uncomfortable
- “Eairh/Eargghh:” The “Eairh” sound occurs when the baby has not burped, so that air bubbles enter the stomach and cannot be released, which means the baby has stomach problems
- “Eh:” The “Eh” sound occurs when wind is trapped in the chest and cannot escape, causing air to come out of the mouth, which means the baby wants to burp.
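Spectrograms such as those in Figures 1 and 2 are typically obtained with a short-time Fourier transform. The sketch below is one minimal way to compute such a representation using only the Python standard library; it is an illustration, not the procedure used to produce the figures. The 25 ms frame length, 10 ms hop, Hamming window, 8 kHz sampling rate, and test tone are assumptions, and the naive DFT is used only to keep the example self-contained.

```python
import cmath
import math

def power_spectrum(frame):
    """Power of DFT bins 0..N/2 for one frame (naive O(N^2) DFT, for illustration)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectrogram(signal, sample_rate, frame_ms=25.0, hop_ms=10.0):
    """Return (rows, bin_hz): one power spectrum per frame and the bin spacing in Hz."""
    frame_len = int(round(sample_rate * frame_ms / 1000.0))
    hop_len = int(round(sample_rate * hop_ms / 1000.0))
    # Hamming window tapers each frame toward zero at its edges.
    window = [
        0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_len - 1))
        for i in range(frame_len)
    ]
    rows = []
    for start in range(0, len(signal) - frame_len + 1, hop_len):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        rows.append(power_spectrum(frame))
    return rows, sample_rate / frame_len

# Synthetic 0.2 s tone at 440 Hz standing in for a recorded cry.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440.0 * n / sample_rate) for n in range(1600)]
rows, bin_hz = spectrogram(tone, sample_rate)
peak_bin = max(range(len(rows[0])), key=lambda k: rows[0][k])
peak_hz = peak_bin * bin_hz
print(len(rows), "frames; strongest component of first frame near", peak_hz, "Hz")
```

Each row is the spectrum of one time slice, so plotting the rows side by side (time on one axis, frequency on the other) gives the familiar spectrogram image. In practice an FFT routine would replace the naive DFT.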
After the spectrum analysis, the next step in a speech recognition system is feature extraction. Obtaining the distinctive features of the sound is an important process. The input signal is divided into frames 10–40 ms long, and the characteristics of the sound are extracted frame by frame, with each feature value calculated per frame. [Figure 3] shows the analysis of the signals occurring during baby crying (referenced to the amplitude axis) according to the Mel filter bank and the linear filter bank. Many speech recognition studies use Mel Frequency Cepstral Coefficient (MFCC) feature extraction because it is considered to resemble human hearing.
MFCC is a feature extraction method that converts audio into a sequence of feature vectors. This method provides a representation of the short-term power spectrum of the signal. The concept behind MFCC is modeled on human hearing, which is based on the critical bandwidths of the human ear and is approximately linear at frequencies below 1000 Hz. The MFCC process starts by splitting the audio signal into frames with a duration of 10–40 ms each; this step is called frame blocking. Each frame is then multiplied by a Hamming window to reduce the discontinuities introduced at the frame edges by framing. Equation (1) is the formula for the windowing operation, where w is the window function and N is the number of samples in a frame.
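Equation (1) is not reproduced in this text. Assuming the window in question is the standard Hamming window, the operation can be written as follows, where x(n) denotes the samples of one frame and y(n) the windowed frame (the symbols x and y are introduced here for illustration):

```latex
\begin{align}
y(n) &= x(n)\, w(n), \qquad 0 \le n \le N-1 \\
w(n) &= 0.54 - 0.46 \cos\!\left(\frac{2\pi n}{N-1}\right)
\end{align}
```

With this form the window equals 0.08 at both ends of the frame (n = 0 and n = N-1) and rises to approximately 1 at the center, so samples near the frame edges contribute least.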
The windowing operation is followed by the Fast Fourier Transform, which transforms the signal from the time domain to the frequency domain. A filter bank is then applied to the frequency-domain signal, and the frequency axis is converted to the Mel scale by Equation (2):
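Equation (2) is not reproduced in this text. A widely used form of the Hz-to-Mel mapping, given here as an assumption about what the equation expresses, is the following, where f is the frequency in Hz and m the corresponding value in Mel:

```latex
\begin{equation}
m = 2595 \,\log_{10}\!\left(1 + \frac{f}{700}\right)
\end{equation}
```

Under this form, f = 1000 Hz maps to approximately 1000 Mel, and the mapping grows more slowly than f as the frequency increases.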
In the MFCC method, a bank of triangular band-pass filters spaced on the Mel scale is used. With this design, filters at higher frequencies have wider bandwidths. The last step in the MFCC technique is to take the logarithm of the filter bank outputs and apply the Discrete Cosine Transform, which returns the result to a time-like (cepstral) domain. These results form a sequence of acoustic vectors, which are called the MFCCs. It has been reported in the literature that the pitch perception of the human ear is nonlinear. Regarding the relationship between frequency and the Mel scale, above about 500 Hz increasingly large frequency intervals are perceived as equal increments in pitch, so that four octaves on the Hz scale correspond to about two octaves on the Mel scale. Because of this nonlinear mapping, it has been argued that the Mel scale is also useful for analyzing seismic signals in cases where the seismic signal differs little from a speech signal. In speech recognition, the filter bank is applied over the frequency range 0–22050 Hz. Below 1000 Hz the mapping function is relatively linear, so MFCC provides little additional benefit at frequencies below 1000 Hz.
The Linear Frequency Cepstral Coefficient (LFCC) approach can be described as a similarly structured feature extraction method. The process starts by dividing the audio clip into a fixed number of frames. In general, LFCC follows a feature extraction procedure similar to that of MFCC. However, LFCC uses a linear filter bank instead of the Mel filter bank. Linear filter banks provide some advantages at high frequencies. Each method has advantages and disadvantages relative to the other in the analysis of crying behavior.
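The difference between the two methods can be sketched in a few lines: the same triangular filter bank, logarithm, and Discrete Cosine Transform are applied in both cases, and only the spacing of the filters changes. The sketch below is a simplified illustration, not the implementation used in the cited studies. The function names, the 20 filters, the 12 coefficients, the 16 kHz sampling rate, the flat test spectrum, and the Mel constants 2595 and 700 are assumptions.

```python
import math

def hz_to_mel(f):
    # Common Hz-to-Mel mapping (assumed form; see lead-in).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def filter_edges(n_filters, f_max, scale):
    """Return n_filters + 2 edge frequencies (Hz), spaced evenly in Mel or in Hz."""
    if scale == "mel":
        top = hz_to_mel(f_max)
        return [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    return [f_max * i / (n_filters + 1) for i in range(n_filters + 2)]

def filter_bank_energies(power, sample_rate, n_filters, scale):
    """Weight a one-sided power spectrum with triangular band-pass filters."""
    f_max = sample_rate / 2.0
    bin_hz = f_max / (len(power) - 1)
    edges = filter_edges(n_filters, f_max, scale)
    energies = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        total = 0.0
        for k, p in enumerate(power):
            f = k * bin_hz
            if lo < f <= mid:        # rising slope of the triangle
                total += p * (f - lo) / (mid - lo)
            elif mid < f < hi:       # falling slope of the triangle
                total += p * (hi - f) / (hi - mid)
        energies.append(total)
    return energies

def cepstral_coefficients(power, sample_rate, n_filters=20, n_coeffs=12, scale="mel"):
    """Log filter-bank energies followed by an unnormalized DCT-II."""
    energies = filter_bank_energies(power, sample_rate, n_filters, scale)
    log_e = [math.log(e + 1e-10) for e in energies]  # small floor avoids log(0)
    return [
        sum(log_e[j] * math.cos(math.pi * c * (j + 0.5) / n_filters)
            for j in range(n_filters))
        for c in range(n_coeffs)
    ]

# Flat test spectrum with 257 bins (as from a 512-point transform at 16 kHz).
power = [1.0] * 257
mfcc = cepstral_coefficients(power, 16000, scale="mel")
lfcc = cepstral_coefficients(power, 16000, scale="linear")
mel_edges = filter_edges(20, 8000.0, "mel")
lin_edges = filter_edges(20, 8000.0, "linear")
print("first / last Mel filter step (Hz):",
      round(mel_edges[1] - mel_edges[0], 1), round(mel_edges[-1] - mel_edges[-2], 1))
print("linear filter step (Hz):", round(lin_edges[1] - lin_edges[0], 1))
```

Printing the edge spacing shows the design choice described above: Mel-spaced filters are narrow at low frequencies and widen toward high frequencies, whereas linear filters keep the same width throughout, which is why the linear bank retains more detail at high frequencies.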
Predicting the baby's crying behavior with more than one method increases the stability of the system. Integrating more than one method into the decision-making mechanism is therefore important for system stability.
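One simple way to integrate several methods into a single decision is majority voting over their predicted labels. The sketch below is a hypothetical illustration of that idea and is not described in the reviewed studies; the function name and the example predictions are assumptions, while the labels follow the cry causes named earlier (hunger, pain, sleep, discomfort).

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted most often; ties go to the label seen first."""
    counts = Counter(predictions)
    return counts.most_common(1)[0][0]

# Hypothetical outputs of three classifiers for the same cry segment,
# for example one trained on MFCC, one on LFCC, and one on pitch features.
predictions = ["hunger", "hunger", "pain"]
decision = majority_vote(predictions)
print("combined decision:", decision)
```

A single classifier that misreads one segment is outvoted by the others, which is one sense in which combining methods can make the overall output more stable.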
Conclusion
It is often very difficult for caregivers to determine a baby's needs from crying behavior. It is of great importance for the comfort of the baby that parents can accurately interpret crying behavior and the needs behind it. For this reason, the analysis of the sound signals produced by babies during crying is an active subject in engineering. The engineering solutions realized in this area have a great impact on the comfort of the baby and the family. However, the stability and accuracy of the designed systems can only be achieved through interdisciplinary studies. Applying the designed engineering solutions in vivo and verifying the results will therefore be of great importance for system development. In future studies, it will be important to analyze the outputs of the designed engineering solutions for the cries of babies of different ages and sexes.
Financial support and sponsorship
None.
Conflicts of interest
There are no conflicts of interest.
References
Wermke K, Mende W. From emotion to notion. The importance of melody. In: Decety J, Cacioppo J, editors. Handbook of Social Neuroscience. Oxford, UK: Oxford University Press; 2011.
Butler EA, Randall AK. Emotional coregulation in close relationships. Emot Rev 2013;5:202-10.
LaGasse LL, Neal AR, Lester BM. Assessment of infant cry: Acoustic cry analysis and parental perception. Ment Retard Dev Disabil Res Rev 2005;11:83-93.
Newman JD. Neural circuits underlying crying and cry responding in mammals. Behav Brain Res 2007;182:155-65.
Kheddache Y, Tadj C. Acoustic measures of the cry characteristics of healthy newborns and newborns with pathologies. J Biomed Sci Eng 2013;6:796-803.
Cohen R, Ruinskiy D, Zickfeld J, IJzerman H, Lavner Y. Baby cry detection: Deep learning and classical approaches. In: Development and Analysis of Deep Learning Architectures. Cham: Springer; 2020. p. 171-96.
Dinwiddie R, Pitcher-Wilmott R, Schwartz JG, Shaffer TH, Fox WW. Cardiopulmonary changes in the crying neonate. Pediatr Res 1979;13:900-3.
Ludington-Hoe SM, Cong X, Hashemi F. Infant crying: Nature, physiologic consequences, and select interventions. Neonatal Netw 2002;21:29-36.
Hirschberg J. Dysphonia in infants. Int J Pediatr Otorhinolaryngol 1999;49:293-6.
Vingerhoets AJJM. Why Only Humans Weep. Unravelling the Mysteries of Tears. Oxford: Oxford University Press; 2013.
Owings DH, Zeifman D. Human infant crying as an animal communication system: Insights from an assessment/management approach. In: Evolution of Communication Systems: A Comparative Approach. The MIT Press; 2004. p. 151-70.
Lounsbury ML, Bates JE. The cries of infants of differing levels of perceived temperamental difficultness: Acoustic properties and effects on listeners. Child Dev 1982;53:677-86.
Zeskind PS, Barr RG. Acoustic characteristics of naturally occurring cries of infants with “colic”. Child Dev 1997;68:394-403.
Laan AJ, Van Assen MA, Vingerhoets AJ. Individual differences in adult crying: The role of attachment styles. Soc Behav Pers 2012;40:453-71.
Kay Nelson J. Seeing Through Tears: Crying and Attachment. New York: Routledge; 2005.
Hendriks MC, Nelson JK, Cornelius RR, Vingerhoets AJ. Why crying improves our well-being: An attachment-theory perspective on the functions of adult crying. In: Emotion Regulation. Boston, MA: Springer; 2008. p. 87-96.
Bell SM, Salter Ainsworth MD. Infant crying and maternal responsiveness. Child Dev 1973;43:1171-90.
Halpern SH, Littleford JA, Brockhurst NJ, Youngs PJ, Malik N, Owen HC. The neurologic and adaptive capacity score is not a reliable method of newborn evaluation. J Am Soc Anesthesiol 2001;94:958-62.
Wood RM, Gustafson GE. Infant crying and adults' anticipated caregiving responses: Acoustic and contextual influences. Child Dev 2001;72:1287-300.
Pal P, Iyer AN, Yantorno RE. Emotion detection from infant facial expressions and cries. In: 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings. Vol. 2. IEEE Press; 2006. p. 2.
International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06), 2005, pp. 770-775, doi: 10.1109/CIMCA.2005.1631561.
Fuller BF, Keefe MR, Curtin M. Acoustic analysis of cries from “normal” and “irritable” infants. West J Nurs Res 1994;16:243-51.
Goberman AM, Robb MP. Acoustic characteristics of crying in infantile laryngomalacia. Logoped Phoniatr Vocol 2005;30:79-84.
Green JA, Gustafson GE, McGhie AC. Changes in infants' cries as a function of time in a cry bout. Child Dev 1998;69:271-9.
Furlow FB. Human neonatal cry quality as an honest signal of fitness. Evol Hum Behav 1997;18:175-93.
Jeyaraman S, Muthusamy H, Khairunizam W, Jeyaraman S, Nadarajaw T, Yaacob S, et al. A review: Survey on automatic infant cry analysis and classification. Health Technol 2018;8:391-404.
Bellieni CV, Sisto R, Cordelli DM, Buonocore G. Cry features reflect pain intensity in term newborns: An alarm threshold. Pediatr Res 2004;55:142-6.
Michelsson K, Eklund K, Leppänen P, Lyytinen H. Cry characteristics of 172 healthy 1-to 7-day-old infants. Folia Phoniatr Logop 2002;54:190-200.
Manfredi C, Bocchi L, Orlandi S, Spaccaterra L, Donzelli GP. High-resolution cry analysis in preterm newborn infants. Med Eng Phys 2009;31:528-32.
Dewi SP, Prasasti AL, Irawan B. The study of baby crying analysis using MFCC and LFCC in different classification methods. In: 2019 IEEE International Conference on Signals and Systems (ICSigSys). Bandung, Indonesia: IEEE; 2019. p. 18-23.
Franti E, Ispas I, Dascalu M. Testing the universal baby language hypothesis-automatic infant speech recognition with CNNs. In: 2018 41st International Conference on Telecommunications and Signal Processing (TSP). Athens, Greece: IEEE; 2018. p. 1-4.
Jagtap SS, Kadbe PK, Arotale PN. System propose for be acquainted with newborn cry emotion using linear frequency cepstral coefficient. In: 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT). Chennai, India: IEEE; 2016. p. 238-42.
Esposito G, Venuti P. Developmental changes in the fundamental frequency (f0) of infant cries: A study of children with autism spectrum disorder. Early Child Dev Care 2010;180:1093-102.
Sheinkopf SJ, Iverson JM, Rinaldi ML, Lester BM. Atypical cry acoustics in 6-month-old infants at risk for autism spectrum disorder. Autism Res 2012;5:331-9.
Lei H, Lopez E. Mel, linear, and antimel frequency cepstral coefficients in broad phonetic regions for telephone speaker recognition. In: Tenth Annual Conference of the International Speech Communication Association. Vol. 1. Tenth Annual Conference; 2009. p. 2307-10.
Bano S, Ravi Kumar KM. Decoding baby talk: A novel approach for normal infant cry signal classification. In: 2015 International Conference on Soft-Computing and Networks Security (ICSNS). Coimbatore, India: IEEE; 2015. p. 1-4.
Jin G, Ye B, Wu Y, Qu F. Vehicle classification based on seismic signatures using convolutional neural network. IEEE Geosci Remote Sens Lett 2018;16:628-32.
Alam MJ, Kenny P, Gupta V. Tandem features for text-dependent speaker verification on the RedDots corpus. Interspeech 2016;1:420-4.