Browsing 2 - Theses by Department "Allgemeine Nachrichtentechnik"
- Publication (Open Access): Deep learning for image enhancement (Universitätsbibliothek der HSU/UniBw H, 2022-05-09)
Helmut-Schmidt-Universität/Universität der Bundeswehr Hamburg
Deep learning is a branch of artificial intelligence and machine learning whose primary objective is to learn and diversify the feature representation for a given system. In deep learning, a machine is able to develop large parameterized models that address a plethora of scientific problems based on a number of optimization methods. These models are capable of retrieving, representing, generating, and combining a large number of features to provide a generalized solution to the intended problems. Unlike traditional machine learning algorithms, deep learning algorithms offer an opportunity to learn, extract, and even generate very large feature spaces via densely parameterized models, which are capable of learning semantic information and an efficient input-output mapping. Hence, they are well suited to low-level computer vision applications involving multimedia enhancement problems. Deep learning has a very broad scope, but this thesis focuses primarily on artificial neural networks, convolutional neural networks, and their variants, which are some of the most powerful deep learning tools today. In this work, the neural network fundamentals are explained, the corresponding derivations are performed, and the workflows are illustrated. Important modules of convolutional neural networks are described and their functions are discussed. Various convolutional architectures are proposed for computer vision tasks related to image quality improvement, and their suitability for the particular problems is explained. Various networks, which include novel network modules and architectures, are studied and applied in the areas of image and video enhancement. Ablation studies and experiments are performed on the network architectures to analyze them. Finally, the proposed models are evaluated in terms of their performance on the aforementioned vision tasks.
- Publication (Open Access): Spatial Audio Through Headphones Based on HRTFs Approximated by Parametric IIR Filters (Universitätsbibliothek der HSU/UniBw H, 2022-06)
Helmut-Schmidt-Universität / Universität der Bundeswehr Hamburg
The subject of this dissertation is spatial audio through headphones. In the present work, an offline binaural synthesis implementation is proposed using head-related transfer functions (HRTFs) approximated by cascades of parametric infinite impulse response (IIR) filters, parameter interpolation to calculate HRTFs of intermediate directions for generating static as well as moving virtual sound sources, and simulated room effects in order to increase the perceived externalization. The first contribution to the research field lies in representing HRTFs as cascades of low-order parametric IIR filters together with a delay representing the interaural time difference (ITD). Usually, HRTFs are represented as finite impulse response (FIR) filters containing the corresponding head-related impulse responses (HRIRs) as filter coefficients. However, by using cascades of low-order parametric IIR filters, such as first-order shelving or second-order peak filters, the memory requirements of the hardware can be reduced to three parameters per filter stage (cut-off or center frequency, gain, and Q-factor). For this purpose, a two-step procedure is proposed that approximates the magnitude responses of HRTFs by parametric IIR filter cascades. In a first step, the individual filter stages are consecutively integrated, initialized, and tuned. Afterwards, the interaction between individual filter stages is post-optimized. Alternatively, an approach for HRTF magnitude response approximation based on instantaneous backpropagation is proposed. After approximating the HRTF magnitude responses, the ITDs also have to be extracted from the HRIRs or HRTFs of the two ears. From this, virtual sound sources are generated by filtering a monaural audio signal with the parametric IIR filter cascades of the desired direction and delaying the filtered audio signal of the contralateral ear by the extracted ITD.
In many practical implementations, only a finite number of measured HRTFs is available, resulting in a limited spatial resolution. For HRTFs represented as FIR filters, bilinear rectangular or triangular interpolation can be used to calculate the filter coefficients of intermediate HRTFs. However, when the HRTFs are represented as IIR filters instead, the interpolation is not as straightforward as for FIR filters due to stability considerations. Therefore, in this work, a parameter interpolation algorithm based on bilinear interpolation of the parameters of the individual filter stages together with an assignment of related peak filters is proposed. This interpolation algorithm guarantees the stability of intermediate filters. When generating moving virtual sound sources, two IIR filter cascades are combined in parallel following the cross-fading input-switching combination approach. For evaluating the proposed methods, three listening tests assessing different aspects of binaural synthesis using HRTFs approximated by parametric IIR filters are performed. In a first listening test, the validity of the proposed parametric IIR filter cascades is proven for static virtual sound sources by comparing their localization results to localization results achieved using HRIRs represented as FIR filters. Additionally, a second listening test proves that adding simulated room effects via the image source model increases the perceived externalization of static virtual sound sources generated using HRTFs approximated by parametric IIR filter cascades up to externalization levels achieved using measured binaural room impulse responses represented as FIR filters. Finally, the audio quality of moving virtual sound sources generated using minimum-phase approximated HRIRs represented as FIR filters and parametric IIR filter cascades is evaluated in a third listening test. 
By using two IIR filters in parallel following the cross-fading input-switching combination approach, audio quality ratings comparable to those of FIR filter implementations using minimum-phase approximated HRIRs are achieved. Thus, HRTFs approximated by parametric IIR filter cascades can be used to reduce the number of saved coefficients. By using two first-order shelving filters, ten second-order peak filters, a mean HRTF magnitude value, and an extracted ITD, only 36 parameters have to be saved per HRTF instead of 200 coefficients as in FIR filter implementations using conventional HRIRs.
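
As a rough illustration of the parametric representation described in this abstract, the following Python sketch builds a second-order peak filter from the three stored parameters (center frequency, gain, Q-factor), evaluates the magnitude response of a cascade, and reproduces the stated count of 36 parameters. It is a minimal sketch under stated assumptions, not the thesis implementation: the coefficient formulas are the widely used Audio EQ Cookbook peaking design, which may differ from the filter design used in the dissertation; the function names and the example parameter values are hypothetical; and the assumption that each first-order shelving filter stores two parameters (cut-off frequency and gain) is inferred from the stated total of 36.

```python
import cmath
import math


def peak_biquad(fc, gain_db, q, fs):
    """Return normalized (b, a) coefficients of a second-order peak filter.

    Assumption: Audio EQ Cookbook peaking form, used here only for illustration.
    """
    a_lin = 10.0 ** (gain_db / 40.0)        # square root of the linear peak gain
    w0 = 2.0 * math.pi * fc / fs            # normalized center frequency
    alpha = math.sin(w0) / (2.0 * q)        # bandwidth term derived from Q
    b = [1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin]
    return [x / a[0] for x in b], [x / a[0] for x in a]


def biquad_response(b, a, f, fs):
    """Complex frequency response of one biquad at frequency f (Hz)."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f / fs)  # z^-1 on the unit circle
    return (b[0] + b[1] * z1 + b[2] * z1 * z1) / (a[0] + a[1] * z1 + a[2] * z1 * z1)


def cascade_magnitude_db(stages, f, fs):
    """Magnitude in dB of a cascade of peak filters given as (fc, gain_db, q)."""
    total_db = 0.0
    for fc, gain_db, q in stages:
        b, a = peak_biquad(fc, gain_db, q, fs)
        total_db += 20.0 * math.log10(abs(biquad_response(b, a, f, fs)))
    return total_db


def parameter_count(n_shelving=2, n_peak=10):
    """Stored values per HRTF: shelving (assumed 2 each), peak (3 each),
    plus one mean magnitude value and one ITD."""
    return n_shelving * 2 + n_peak * 3 + 1 + 1


if __name__ == "__main__":
    fs = 48000.0                             # hypothetical sampling rate
    stages = [(1000.0, 6.0, 2.0)]            # hypothetical single peak stage
    print(cascade_magnitude_db(stages, 1000.0, fs))
    print(parameter_count())
```

Because the stages are connected in series, their dB magnitudes add, which is why each stage can be described by a few parameters and tuned individually before the interaction between stages is optimized.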
