
Ultrasonic Phased Arrays for

3D Sonar Imaging in Air


Dissertation by Gianni Allevato
Submitted in fulfillment of the requirements for the academic degree of Doktor-Ingenieur (Dr.-Ing.)
Approved dissertation by Gianni Allevato from Kirchheimbolanden
Date of submission: 28 March 2023, Date of defense: 25 July 2023

First assessor: Prof. Dr. mont. Mario Kupnik
Second assessor: Prof. Dr.-Ing. Marius Pesavento
Darmstadt, Technische Universität Darmstadt

Department of Electrical Engineering and Information Technology
Measurement and Sensor Technology Group

Accepted doctoral thesis by Gianni Allevato

Date of submission: 28 March 2023

Date of thesis defense: 25 July 2023

Darmstadt, Technische Universität Darmstadt

Please cite this document as:


URN: urn:nbn:de:tuda-tuprints-244256
URL: http://tuprints.ulb.tu-darmstadt.de/24425
Year of publication on TUprints: 2023

This document is provided by tuprints,
the e-publishing service of TU Darmstadt
http://tuprints.ulb.tu-darmstadt.de
tuprints@ulb.tu-darmstadt.de

This work is licensed under a Creative Commons License:
Attribution–ShareAlike 4.0 International
https://creativecommons.org/licenses/by-sa/4.0/
Declarations according to the doctoral regulations (Promotionsordnung)

§ 8 Abs. 1 lit. c PromO
I hereby declare that the electronic version of my dissertation is identical to the written version.

§ 8 Abs. 1 lit. d PromO
I hereby declare that no doctorate has been attempted at an earlier date. Otherwise, details of the date, university, dissertation topic and outcome of that attempt would have to be provided.

§ 9 Abs. 1 PromO
I hereby declare that the present dissertation was written independently and using only the sources cited.

§ 9 Abs. 2 PromO
The thesis has not yet been used for examination purposes.

Darmstadt, 28 March 2023


G. Allevato

Acknowledgements

First of all, I would like to thank Prof. Dr. Mario Kupnik for the opportunity and the secure perspective to continue the research topic I have become deeply passionate about. I highly appreciate the trust placed in me and the many freedoms that allowed me to try out and realize various research ideas. I am grateful for many valuable discussions, which, even if we did not share the same opinion on all matters, always led to a positive outcome. Thank you for the support and encouragement to grow as a new researcher and as a person, and for the guidance in learning the fine details of academia, not only the scientific ones.
In addition, I would like to thank Prof. Dr. Marius Pesavento for the numerous suggestions, ideas and particularly detailed feedback, which contributed greatly to the success of many publications and of this thesis. I really enjoyed working with him because of his motivating, genuine passion for signal processing and his down-to-earth way of explaining things, which I have greatly appreciated.
Furthermore, I would like to thank all of my colleagues at the MUST department for creating a working environment in which we can have fun together, listen to and support each other, discuss openly, feel welcome and comfortable, and, even though we are all different characters and work on different topics, in the end stick together. A special thanks goes to Matthias Rutsch for the constant exchange, motivation and support, which goes far beyond technical ultrasound physics. The hilarious filming sessions with him and his center-point-rotation camera skills will remain fond memories. I would also like to express my sincere thanks to Christoph Haugwitz for the harmonious and amusing teamwork, as well as his help in matters of all kinds. I really appreciate his analytical and precise way of thinking, his curiosity and his enthusiasm, which have opened up new perspectives for me in many conversations. Nevertheless, I hope that one day he will realize that it is called gif and not jif. Moreover, I would also like to thank Bastian Latsch, Jan Hinrichs and Helge Dörsam for their commitment to maintaining an excellent IT infrastructure, which is an easily overlooked but vital contribution for all employees. I would like to express my special thanks to Elke Maffenbeier, who not only provides great support in all administrative matters, but also contributes far beyond that to the group's cohesion by listening carefully and finding the right words, making her an indispensable part of the team.
I would also like to thank all of my supervised bachelor's and master's students for the great work they have done, which strengthened my research and made this thesis possible, and from whom I have also learned a lot. Special thanks go to Stefan Schulte and Tim Maier, who both, through their hard work, dedication and, most importantly, courage, succeeded in presenting their theses in front of an international audience.
Furthermore, I would like to thank my supervisors throughout my bachelor's and master's degrees, as well as during my internship at HBK, namely Holger Mößinger, Jan Lotichius, Axel Jäger and Dirk Brand, who shared with me their expertise and joy in engineering methodology, electronics, firmware and software development, and thereby helped to pave the way for this work.
Finally, I would like to express my deepest gratitude to my partner, friends, and family for their love, care, advice, encouragement, and support. You made it possible to start this journey in the first place, kept me going, and accompanied me through difficult times. Thank you for your patience and understanding during the writing of this thesis.

Gianni Allevato

Darmstadt, March 2023

Abstract

Next-generation autonomous mobile robots are not only required to navigate in a wide variety of challenging
environments, but also have to interact directly with humans. In order to ensure reliability and safety, the
integration of different and complementary perception sensor technologies is crucial. In particular, ultrasonic
sensors stand out due to their robust operation in difficult lighting conditions, in the presence of transparent
and reflective objects, and in smoke-filled and dusty environments, so that they ideally complement lidar
and camera systems. However, conventional one-dimensional ultrasonic range finders restrict the available
navigation capabilities of highly maneuverable robots.
Therefore, in this thesis, three-dimensional sonar perception sensors based on air-coupled ultrasonic phased arrays and beamforming are investigated, which are capable of simultaneously localizing multiple objects in terms of distance, direction and height, making it possible to form an image of the environment. The focus of this work is the conception and realization, as well as the numerical and experimental evaluation, of five sonar imaging systems, which pursue different optimization goals in order to highlight the real-world capabilities and limitations.
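The core idea behind the receive beamforming referred to here can be sketched in a few lines. The following is a generic narrow-band delay-and-sum illustration for a uniform linear array, not the thesis's actual implementation (which uses 2D apertures and GPU-accelerated frequency-domain processing); all names and parameter values are illustrative.

```python
import numpy as np

# Generic narrow-band delay-and-sum sketch for a uniform linear array.
# A plane wave arriving from angle theta produces inter-element phase
# shifts; steering applies the conjugate phases so that contributions
# add coherently only for the steered direction.

def steering_vector(n_elem, spacing, freq, c, theta):
    """Complex phase factors of a plane wave from angle theta (radians)."""
    k = 2.0 * np.pi * freq / c              # wavenumber
    x = np.arange(n_elem) * spacing         # element positions along the array
    return np.exp(1j * k * x * np.sin(theta))

def das_power(snapshot, n_elem, spacing, freq, c, theta):
    """Delay-and-sum output power when steering toward angle theta."""
    w = steering_vector(n_elem, spacing, freq, c, theta) / n_elem
    return float(np.abs(np.vdot(w, snapshot)) ** 2)  # vdot conjugates w

# Example: 8 elements at lambda/2 spacing for 40 kHz in air (c = 343 m/s),
# receiving a unit-amplitude plane wave from 20 degrees.
freq, c, n = 40e3, 343.0, 8
spacing = c / freq / 2.0
snapshot = steering_vector(n, spacing, freq, c, np.deg2rad(20.0))
p_on = das_power(snapshot, n, spacing, freq, c, np.deg2rad(20.0))    # ~1.0
p_off = das_power(snapshot, n, spacing, freq, c, np.deg2rad(-40.0))  # much smaller
```

Sweeping the steering angle over a grid and evaluating the output power for each angle yields one direction estimate per range bin, which is the principle generalized to 2D apertures in the imaging systems described here.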
Two of the sonar prototypes created consist of 64 piezoelectric ultrasonic transducers (PUTs) with a narrow-band resonance frequency of 40 kHz, all of which are utilized for both transmit and receive beamforming. The single-line-acquisition technique and the resulting array gain enable imaging at a long range of over 6 m. The corresponding transceiver electronics, FPGA and system architecture, as well as the implementation details of the GPU-accelerated frequency-domain array signal processing and visualization using Nvidia CUDA and OpenGL, are described. One of the systems utilizes a waveguide structure into which the PUTs are inserted to form a uniform dense λ/2 array geometry that allows grating-lobe-free beamforming. The other system prototype exploits a non-uniform sparse spiral array configuration to span a large aperture for achieving a high angular resolution of 2.3°, making it possible to recognize patterns and shapes of objects, e.g., a hand.
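The sunflower (Vogel/Fermat) spiral underlying such sparse arrays is straightforward to generate: element n is placed at a radius proportional to √n and an azimuth of n times the golden angle. The sketch below is a generic construction with an arbitrary scale factor, not the prototype's actual dimensioning.

```python
import numpy as np

# Generic sunflower (Vogel) spiral generator, a common construction for
# sparse apertures. The 5 mm scale factor below is illustrative only.

GOLDEN_ANGLE = np.pi * (3.0 - np.sqrt(5.0))   # ~137.5 degrees in radians

def sunflower_positions(n_elem, scale):
    n = np.arange(1, n_elem + 1)
    r = scale * np.sqrt(n)                    # radii grow as sqrt(n)
    phi = n * GOLDEN_ANGLE                    # golden-angle azimuth steps
    return np.column_stack([r * np.cos(phi), r * np.sin(phi)])

pos = sunflower_positions(64, scale=5e-3)     # 64 elements
diameter = 2.0 * np.hypot(pos[:, 0], pos[:, 1]).max()  # spanned aperture
```

The √n radius law keeps the element density roughly uniform over the disk, while the irrational golden-angle increment avoids the periodicity that would otherwise produce grating lobes despite the sparse, wider-than-λ/2 element spacing.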
Two further minimalistic embedded sonar systems are designed particularly for hardware-limited and mobile applications. These concepts use narrow-band PUTs for transmitting and wide-band digital MEMS microphone arrays for receiving. Each system is based on the multi-line-acquisition technique, requiring only a single pulse for image formation and thus providing high frame rates of 30 Hz. One of the systems consists of one PUT and a hexagonal 36-element microphone array, with the signal and 3D image processing handled by an FPGA and a GPU-accelerated single-board computer (Nvidia Jetson Nano). The other system requires only a single microcontroller and relies on a waveguided PUT line array paired with a microphone line array in a T-configuration.
Moreover, array design strategies are introduced that combine two non-uniform spiral sub-arrays featuring different element densities, which achieve a lower side-lobe level for the same main-lobe width compared to existing density-tapering modifications. Based on this geometry, a sonar system with 64 MEMS microphones is developed, which additionally uses three waveguided PUTs to sequentially transmit different frequencies; the resulting individual images are merged into a compound image. This sonar system thus achieves an advantageous trade-off between angular resolution and image contrast while maintaining a high frame rate and range.
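How the per-frequency images are merged is a design choice. As a purely hypothetical sketch (not necessarily the compounding operator used in this work), a pixel-wise geometric mean keeps reflectors that appear at every frequency while strongly attenuating side lobes that appear at only one frequency:

```python
import numpy as np

# Hypothetical image-compounding sketch: per-frequency magnitude images on
# the same pixel grid are merged by a pixel-wise geometric mean. A true
# reflector is bright at every frequency and survives; an artifact visible
# at only one frequency collapses toward zero.

def compound(images, eps=1e-12):
    mags = np.abs(np.stack(images))               # shape (n_freq, H, W)
    return np.exp(np.log(mags + eps).mean(axis=0))

# Toy example: a persistent reflector at pixel (1, 1), and an artifact
# at (0, 3) present in only one of the three frequency images.
imgs = [np.zeros((4, 4)) for _ in range(3)]
for im in imgs:
    im[1, 1] = 1.0          # reflector present at all frequencies
imgs[0][0, 3] = 1.0         # artifact present at a single frequency
out = compound(imgs)
# out[1, 1] stays ~1.0 while out[0, 3] collapses toward zero.
```

A simple pixel-wise minimum or mean would behave similarly; the common idea is that side and grating lobes shift with frequency while real reflectors do not, so combining frequencies raises image contrast.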
All of these system prototypes are analyzed with respect to their transmit, receive, and pulse-echo characteristics, as well as their achievable imaging quality in an anechoic chamber. In addition, the relative amplitude and phase errors of the different transducer technologies are investigated, the effects of these errors on the beamforming are analyzed using Monte Carlo simulations, and the improvements after experimental calibration are highlighted. Furthermore, image enhancement by post-processing is investigated using an autoencoder neural network, trained to suppress the typical transmit-pulse, main-lobe, and side-lobe characteristics.
All in all, this work highlights the broad application potential of 3D sonar systems, as they provide valuable localization information that surpasses conventional 1D range sensors, contributing to the advancement of emerging technologies in autonomous vehicles, robotics, and industrial environments.

Zusammenfassung

The next generation of autonomous mobile robots must not only find their way in a wide variety of challenging environments, but also interacts directly with humans. In order to ensure reliable operation and safety, the use of different, mutually complementary sensing principles for environment perception plays a decisive role. Ultrasonic sensors in particular stand out due to their reliability in difficult lighting conditions, with transparent and reflective objects, and in smoke-filled and dusty environments, so that they ideally complement lidar and camera systems. However, the one-dimensional ultrasonic range finders used so far restrict the navigation potential of these highly maneuverable robots.

Therefore, this thesis investigates three-dimensional sonar perception based on air-coupled ultrasonic phased arrays, which, by means of beamforming, can localize multiple objects simultaneously in distance, direction and height, and thus create an image of the environment. The focus of the work is the conception and realization, as well as the numerical and experimental evaluation, of five 3D sonar imaging systems, which pursue different optimization goals in order to demonstrate the realistic capabilities and limitations.

Two of the system prototypes built each consist of 64 piezoelectric ultrasonic transducers (PUTs) with a narrow-band resonance frequency of 40 kHz, all of which are used for both transmit and receive beamforming. The single-line-acquisition technique and the resulting array gain enable imaging at a long range of over 6 m. The required transceiver electronics, FPGA and system architecture, as well as the implementation of the GPU-accelerated frequency-domain array signal processing and the visualization using Nvidia CUDA and OpenGL are examined in detail. In one of the systems, the PUTs are inserted into the channels of a waveguide structure to form a uniform, densely populated λ/2 array geometry that enables grating-lobe-free beamforming. The other system prototype uses a non-uniform, sparsely populated spiral array arrangement to span a large aperture and thus achieve a high angular resolution (2.3°), which makes it possible to recognize patterns and shapes of objects, e.g., a hand.

Two further minimalistic embedded sonar systems are tailored particularly to hardware-limited and mobile applications. These concepts use narrow-band PUTs for transmitting and wide-band digital MEMS microphone arrays for receiving. Each is based on the multi-line-acquisition technique, which requires only a single pulse for image formation and thereby enables high frame rates (30 Hz). One of the systems consists of one PUT and a hexagonal 36-element microphone array, with the signal and 3D image processing performed by an FPGA and a GPU-accelerated single-board computer (Nvidia Jetson Nano). The other system requires only a single microcontroller for 2D imaging and is based on a waveguided PUT line array combined with a microphone line array in a T-configuration.

Furthermore, an array design strategy is introduced that combines two non-uniform spiral sub-arrays with different element densities, which achieves a lower side-lobe level for the same main-lobe width compared to existing density-tapering modifications. Based on this geometry, a sonar system with 64 MEMS microphones is built, which additionally uses three waveguided PUTs to sequentially transmit different frequencies, whose resulting individual images are merged into a compound image. The sonar system thus achieves an advantageous trade-off between angular resolution and contrast while maintaining a high frame rate and range.

All system prototypes built are characterized with respect to their transmit, receive, and pulse-echo properties, as well as their achievable imaging quality, in an anechoic chamber. In addition, the relative amplitude and phase errors of the different transducer technologies are investigated, the effects of these errors on the beamforming are analyzed by means of Monte Carlo simulations, and the improvements after an experimental calibration are highlighted. Moreover, subsequent image enhancement with an autoencoder neural network is investigated, which is trained to suppress the typical transmit-pulse, main-lobe, and side-lobe characteristics.

Overall, this work illustrates the diverse application potential of 3D sonar systems, as they provide valuable localization information that surpasses conventional 1D range sensors, thus contributing to the advancement of emerging technologies in autonomous vehicles, robotics, and industrial environments.

List of publications

2023

• G. Allevato, C. Haugwitz, M. Rutsch, R. Müller, M. Pesavento, and M. Kupnik, "Two-Scale Sparse Spiral Array Design", IEEE Open Journal of Ultrasonics, Ferroelectrics, and Frequency Control, under review

2022

• G. Allevato, M. Rutsch, J. Hinrichs, C. Haugwitz, R. Müller, M. Pesavento, and M. Kupnik, "Air-Coupled Ultrasonic Spiral Phased Array for High-Precision Beamforming and Imaging", IEEE Open Journal of Ultrasonics, Ferroelectrics, and Frequency Control, vol. 2, pp. 40-54, 2022

• C. Haugwitz, C. Hartmann, G. Allevato, M. Rutsch, J. Hinrichs, J. Brötz, D. Bothe, P. F. Pelz, and M. Kupnik, "Multipath Flow Metering of High-Velocity Gas Using Ultrasonic Phased-Arrays", IEEE Open Journal of Ultrasonics, Ferroelectrics, and Frequency Control, vol. 2, pp. 30-39, 2022

• G. Allevato, T. Frey, C. Haugwitz, M. Rutsch, J. Hinrichs, R. Müller, M. Pesavento, and M. Kupnik, "Calibration of Air-Coupled Ultrasonic Phased Arrays. Is it worth it?", in Proc. IEEE International Ultrasonics Symposium (IUS), 2022

• S. Schulte, G. Allevato, C. Haugwitz, and M. Kupnik, "Deep-Learned Air-Coupled Ultrasonic Sonar Image Enhancement and Object Localization", in Proc. IEEE Sensors Conference, 2022

• M. Rutsch, G. Allevato, J. Hinrichs, C. Haugwitz, R. Augenstein, T. Kaindl, and M. Kupnik, "A compact acoustic waveguide for air-coupled ultrasonic phased arrays at 40 kHz", in Proc. IEEE International Ultrasonics Symposium (IUS), 2022

• M. Rutsch, L. Schultz-Fademrecht, G. Allevato, C. Haugwitz, J. Hinrichs, and M. Kupnik, "Simulation of acoustic losses in waveguides for air-coupled ultrasonic phased arrays", in Proc. IEEE International Ultrasonics Symposium (IUS), 2022

• M. Rutsch, O. Ben Dali, P. Downing, G. Allevato, C. Haugwitz, J. Hinrichs, and M. Kupnik, "Optimization of thin film protection for waveguided ultrasonic phased arrays", in Proc. IEEE International Ultrasonics Symposium (IUS), 2022

• C. Haugwitz, J. Hinrichs, M. Rutsch, G. Allevato, J. H. Dörsam, and M. Kupnik, "Lamb Wave Reflection and Transmission in Bent Steel Sheets at Low Frequency", in Proc. IEEE International Ultrasonics Symposium (IUS), 2022

• J. Hinrichs, C. Haugwitz, M. Rutsch, G. Allevato, J. H. Dörsam, and M. Kupnik, "Simulation of Lamb Waves Excited by an Air-Coupled Ultrasonic Phased Array for Non-Destructive Testing", in Proc. IEEE International Ultrasonics Symposium (IUS), 2022

2021

• M. Rutsch, A. Unger, G. Allevato, J. Hinrichs, A. Jäger, T. Kaindl, and M. Kupnik, "Waveguide for air-coupled ultrasonic phased-arrays with propagation time compensation and plug-in assembly", Journal of the Acoustical Society of America, vol. 150, pp. 3228-3237, 2021

• T. Maier, G. Allevato, M. Rutsch, and M. Kupnik, "Single Microcontroller Air-coupled Waveguided Ultrasonic Sonar System", in Proc. IEEE Sensors Conference, 2021

• M. Rutsch, F. Krauß, G. Allevato, J. Hinrichs, C. Hartmann, and M. Kupnik, "Simulation of protection layers for air-coupled waveguided ultrasonic phased-arrays", in Proc. IEEE International Ultrasonics Symposium (IUS), 2021

• C. Hartmann, C. Haugwitz, G. Allevato, M. Rutsch, J. Hinrichs, J. Brötz, D. Bothe, P. F. Pelz, and M. Kupnik, "Ray-tracing simulation of sound drift effect for multi-path ultrasonic high-velocity gas flow metering", in Proc. IEEE International Ultrasonics Symposium (IUS), 2021

• J. Hinrichs, M. Sachsenweger, M. Rutsch, G. Allevato, W. M. D. Wright, and M. Kupnik, "Lamb waves excited by an air-coupled ultrasonic phased array for non-contact, non-destructive detection of discontinuities in sheet materials", in Proc. IEEE International Ultrasonics Symposium (IUS), 2021

2020

• G. Allevato, J. Hinrichs, M. Rutsch, J. Adler, A. Jäger, M. Pesavento, and M. Kupnik, "Real-time 3D imaging using an air-coupled ultrasonic phased-array", IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 68, no. 3, pp. 796-806, 2020

• G. Allevato, M. Rutsch, J. Hinrichs, M. Pesavento, and M. Kupnik, "Embedded Air-coupled Ultrasonic 3D Sonar System with GPU Acceleration", in Proc. IEEE Sensors Conference, 2020

• G. Allevato, M. Rutsch, J. Hinrichs, E. Sarradj, M. Pesavento, and M. Kupnik, "Spiral air-coupled ultrasonic phased array for high resolution 3D imaging", in Proc. IEEE International Ultrasonics Symposium (IUS), 2020

• M. Rutsch, G. Allevato, J. Hinrichs, and M. Kupnik, "Protection layer for air-coupled waveguide ultrasonic phased arrays", in Proc. IEEE International Ultrasonics Symposium (IUS), 2020

• R. Müller, D. Schenck, G. Allevato, M. Rutsch, J. Hinrichs, M. Kupnik, and M. Pesavento, "Dictionary based learning for 3D-imaging with air-coupled ultrasonic phased arrays", in Proc. IEEE International Ultrasonics Symposium (IUS), 2020

• J. Hinrichs, Y. Bendel, M. Rutsch, G. Allevato, M. Sachsenweger, A. Jäger, and M. Kupnik, "Schlieren photography of 40 kHz leaky Lamb waves in air", in Proc. IEEE International Ultrasonics Symposium (IUS), 2020

2019

• G. Allevato, J. Hinrichs, D. Grosskurth, M. Rutsch, J. Adler, A. Jäger, M. Pesavento, and M. Kupnik, "3D imaging method for an air-coupled 40 kHz ultrasound phased-array", in Proc. International Congress on Acoustics, 2019

• C. Haugwitz, A. Jäger, G. Allevato, J. Hinrichs, A. Unger, S. Saul, J. Brötz, B. Matyschok, P. Pelz, and M. Kupnik, "Flow metering of gases using ultrasonic phased-arrays at high velocities", in Proc. IEEE International Ultrasonics Symposium (IUS), 2019

• A. Jäger, J. Hinrichs, G. Allevato, M. Sachsenweger, S. Kadel, D. Stasevich, W. Gebhard, G. Hübschen, T. Hahn-Jose, W. M. D. Wright, and M. Kupnik, "Non-contact ultrasound with optimum electronic steering angle to excite Lamb waves in thin metal sheets for mechanical stress measurements", in Proc. IEEE International Ultrasonics Symposium (IUS), 2019

• M. Rutsch, O. Ben Dali, A. Jäger, G. Allevato, K. Beerstecher, J. Cardoletti, A. Radetinac, L. Alff, and M. Kupnik, "Air-coupled ultrasonic bending plate transducer with piezoelectric and electrostatic transduction element combination", in Proc. IEEE International Ultrasonics Symposium (IUS), 2019

2017

• J. Bilz, G. Allevato, J. Butz, N. Schäfer, C. Hatzfeld, S. Matich, and H. F. Schlaak, "Analysis of the measuring uncertainty of a calibration setup for a 6-DOF force/torque sensor", in Proc. IEEE Sensors Conference, 2017

Contents

1 Introduction
1.1 Challenges and objectives of sonar imaging in air
1.2 Original work
1.3 Thesis structure

2 Fundamentals of beamforming and imaging
2.1 Coordinate system and array terms
2.2 Transmit beamforming
2.2.1 Unfocused beam steering in the far-field
2.2.2 Impact of the array aperture size on the beam pattern, near-field and far-field
2.2.3 Focused beam steering in the near-field
2.2.4 Impact of the inter-element spacing - grating lobe formation
2.2.5 Impact of the element aperture size - pattern multiplication
2.3 Receive beamforming
2.3.1 Unfocused receive beam steering
2.3.2 Focused receive beam steering
2.3.3 Two-dimensional spatial filter response in the uv-domain
2.4 Pulse-echo image formation
2.4.1 Discretization of the region-of-interest
2.4.2 Image formation process
2.4.3 Range resolution
2.4.4 Angular resolution
2.4.5 Contrast ratio
2.4.6 Point spread function and convolution
2.4.7 One-way vs. two-way beamforming - single line acquisition

3 Uniform dense waveguided transceiver PUT arrays
3.1 Related work on air-coupled ultrasonic phased arrays
3.2 Waveguided uniform rectangular array geometry
3.3 Transceiver electronics and imaging system architecture
3.4 Implementation details for beamforming and imaging
3.4.1 Pulse-echo sequence
3.4.2 Frequency-domain array signal processing
3.4.3 Visualization
3.5 Experiments, results and discussion
3.5.1 Measurement setups
3.5.2 Transmit and receive characteristics
3.5.3 Range and angular resolution
3.5.4 Range of view and field of view
3.5.5 Frame rates
3.6 Chapter summary and conclusions

4 Non-uniform sparse spiral transceiver PUT arrays
4.1 Related work on non-uniform sparse spiral array geometries
4.2 Sparse spiral sunflower array geometry
4.3 Operation modes and numerical simulation model
4.4 Experiments, results and discussion
4.4.1 Measurement setups
4.4.2 Two-dimensional directivity patterns
4.4.3 Sectional directivity patterns for varying focal angles
4.4.4 Radial on-axis pattern for varying focal distances
4.4.5 Angular resolution
4.4.6 Multi-reflector scenes in the far- and near-field
4.4.7 Relative amplitude and phase errors
4.5 Calibration of relative amplitude and phase errors
4.5.1 Monte-Carlo simulation, results and discussion
4.5.2 Experimental calibration, results and discussion
4.6 Chapter summary and conclusions

5 Embedded low-cost sonar system concepts
5.1 Related work on air-coupled ultrasonic sonar systems
5.2 Embedded GPU-accelerated 3D sonar system with hexagonal aperture
5.2.1 Electronics and system design
5.2.2 Experiments, results and discussion
5.3 Single-microcontroller waveguided sonar system based on a T-aperture
5.3.1 Electronics and system design
5.3.2 Experiments, results and discussion
5.4 Chapter summary and conclusions

6 Two-scale multi-frequency sparse spiral arrays
6.1 Related work on sparse spiral array modifications
6.2 Two-scale spiral array design
6.2.1 Numerical beam pattern and point spread function model
6.2.2 Analysis of the classic sunflower spiral array
6.2.3 Geometry of the two-scale spiral array
6.2.4 Characteristics of the two-scale spiral array
6.2.5 Benchmark, results and discussion
6.3 Two-scale sonar system concept with multi-frequency excitation
6.3.1 Multi-frequency excitation and image compounding
6.3.2 Two-scale multi-frequency electronics and system design
6.3.3 Experiments, results and discussion
6.4 Chapter summary and conclusions

7 Deep-learned sonar image enhancement
7.1 Neural auto-encoder network architecture
7.2 Data synthesis, training and testing
7.3 Test results and discussion
7.4 Chapter summary and conclusions

8 Conclusion and outlook

Bibliography

List of acronyms

List of symbols

List of figures

List of tables
1 Introduction

Automation is becoming increasingly prevalent due to advancements in sensor technologies, artificial intelligence, and robotics, making it possible to complete tasks that were previously performed by humans with high precision and consistency. This trend has been ongoing for several decades and has led to significant changes, particularly in industries where automation technologies reduce labor costs, increase productivity and improve the quality of work. One fascinating example of this development is a highly automated warehouse that employs a swarm of mobile robots to optimize the transport, inspection and packaging of materials and goods. In order not to impair efficiency and to ensure safety in this environment, it is essential to divide the tasks and responsibilities of machines and humans. Therefore, the work spaces of robots and humans are in many cases still separated by physical barriers, and interaction takes place only via strictly defined interfaces. For example, only the automated robots are responsible for gathering the goods of an order, which are handed over at designated transfer zones to human workers, who then take care of loading and delivering the package. However, as technology continues to advance, automation is increasingly being integrated not only into industries, but also into our everyday lives, such that these work spaces gradually begin to merge.
In particular, mobile autonomous robots will significantly change the way humans and machines interact with each other in the near future. For instance, the experimental roll-out of delivery robots and drones has begun in multiple countries; they are used for the last-mile delivery of packages, groceries, take-away meals and other goods, and therefore navigate sidewalks and streets directly to people's homes. Here, autonomous aerial drones enable particularly short delivery times, as they can fly over buildings and obstacles as well as deliver to remote and hard-to-reach locations, such as rural and mountainous areas.
Apart from that, rescue robots and drones are gaining relevance in emergency situations, such as earth-
quakes and other natural catastrophes, industrial accidents, burning buildings or collapsed tunnels. In these
environments, which are particularly difficult to access and dangerous for emergency responders, these robots
support the search and rescue operation of injured persons and provide information for damage and risk
assessment.
Furthermore, patient care and nursing robots are deployed in hospitals, nursing and rehabilitation centers,
or directly in patients' homes. Here, the robots fulfill healthcare tasks, such as administering medication,
monitoring vital signs and enabling remote visits via telepresence robots. In addition, even closer human-
machine interactions are provided, such as physiotherapy or assisting with basic care and hygiene, particularly
valuable for impaired, elderly and injured people. Apart from physical support, robots are being developed that are
specifically intended for social interaction, e.g., to help patients with cognitive disorders or dementia with
socialization, as well as to provide companionship and emotional support.
All these examples show that the boundaries between machines and humans will blur, as these robots
are designed to operate in public spaces and interact with people in a human-like way. Therefore, different
from the automated warehouse example, segregated work spaces, well-defined interfaces, and non-contact
interactions cannot be maintained consistently. Additionally, it is evident that these autonomous robots
must be able to reliably operate and navigate in a wide variety of indoor and outdoor environments, and,
under a diverse set of conditions. As a consequence, the risk of malfunction and inflicting damage increases
significantly, including for example, damage to the autonomous system itself as a result of false navigation and
collisions, financial damages due to frequent maintenance and production downtime, damage in reputation

and the loss of trust in automation technology, and most importantly, damages in terms of injuries to humans
and animals. In conclusion, next-gen automation technologies are facing new serious challenges to ensure
reliability and safety for navigation, operation and interaction.


Figure 1.1: Next-gen mobile robots must navigate and interact safely in a wide variety of environments, e.g.
for delivery (Amazon Scout [1]), for rescue operations (ANYbotics ANYmal [2]), or for patient care
(Fraunhofer IPA Care-O-Bot [3]).

In order to overcome these challenges, perception sensors play a crucial role as they gather and provide
information about the environment, enabling the robot to interpret the situation and to make reasoned
decisions for further actions. However, since there is no ideal perception sensor type that performs flawlessly
in all environments and conditions, autonomous systems must be equipped with various different sensing
modalities, such as radar, ultrasound, camera, and lidar sensors. This way, the redundancy mitigates the impact
of sensor failures, decisions can be made when sensors provide ambiguous information and the strengths and
limitations of each sensor type complement each other.
For example, the optical sensing principles, i.e. camera and lidar, provide a rich and high-resolution visual
image of the environment, which is suitable for tasks such as mapping and localization, as well as object
detection and classification even in complex scenes. However, cameras are affected by lighting conditions,
such as low-angle direct sunlight at dawn and dusk or complete darkness. Additionally, as with lidar systems,
they are sensitive to weather conditions, such as snow fall, fog, heavy rain and dirt, limiting their usability in
outdoor environments. Not only that, but indoor environments pose further challenges for optical sensors,
such as dust, which can accumulate and cause frequent maintenance, as well as transparent and reflective
surfaces, commonly found in modern buildings, which can complicate and falsify detections.
Therefore, ultrasonic sensors are particularly effective as a complementary sensor modality, as they are
unaffected by transparency, specularity, or other optical properties of materials due to the mechanical wave
characteristics of sound. For the same reason, they operate reliably in a wide range of light conditions,
whether in complete darkness or in bright sunlight. In addition, the relatively large wavelength allows them to see
through fine particles such as rain, snow, fog and dust, making them robust to weather and outdoor conditions.
Furthermore, the rather low signal frequencies facilitate the excitation and readout of the ultrasonic sensors,
eliminating the requirement for analog high-frequency electronic components, providing a low-power, low-cost
and readily-available solution, particularly in times of supply shortages due to global crises.
However, while radar and optical sensing modalities have steadily advanced up to achieving high-frame-rate
3D capability, the full potential of air-coupled ultrasonic perception sensors has not been exploited yet, as they
are typically used only as 1D range finders. Clearly, these ultrasonic 1D sensors can be sufficient for simple

tasks, such as providing parking assistance for passenger cars, as these vehicles are rather large, have limited
maneuverability, and the environment they are navigating in is relatively simple and structured. In contrast,
compact mobile robots and drones are typically highly maneuverable with multiple degrees of freedom. They
must operate in difficult terrain and share common paths with humans, requiring more detailed distance and
particularly directional information for reliable and safe navigation. Therefore, it is time for an upgrade of 1D
ultrasonic range finders.

1.1 Challenges and objectives of sonar imaging in air


While ultrasonic 1D perception sensors have not significantly changed over the years, other potentials of
air-coupled ultrasound have been increasingly explored, resulting in the emergence of new and innovative
applications in non-contact material testing [4]–[6], power transfer [7]–[9], agriculture [10], [11] and food
analysis [12], [13]. Here, the use of phased array technology for air-coupled ultrasound transmission becomes
increasingly popular due to the high achievable sound intensity and the ability to steer and focus the sound
beam. Bringing phased arrays to air has led to further application developments such as tactile displays [14]–
[16], particle levitation for non-contact processing [17]–[19] and parametric loudspeakers [20]–[22].
Motivated by these developments, the goal of this work is to utilize air-coupled ultrasonic phased arrays for
3D perception in order to advance the rather simple 1D range finders. These 3D sonar systems enable the
formation of more detailed environmental images in air due to the ability to distinguish multiple reflectors
within the field of view and differentiate their distance, height and direction, allowing to meet the requirements
for next-gen autonomous robotic applications.
While 3D sonar systems in air are a rather new field of research, the concept of using ultrasonic phased
arrays for image formation is certainly not, as it has been intensively studied for medical imaging for several
decades, for which commercial systems have reached maturity and are already available. Although there are
many commonalities shared between the in-air sonar application and medical imaging, such as the general
scene illumination, data acquisition and image formation principles, there are also fundamental differences
for air-coupled imaging.
First, the speed of sound is more than four times slower in air than in fluids or tissue [23]. Second, the
region of interest of in-air applications spans over several meters and ideally over the entire hemisphere,
i.e. a field of view of ±90°, in contrast to the medical use case, where it is limited to a few centimeters and
typically to ±30°. Third, the poor coupling of ultrasound into air and the frequency dependency of the sound
attenuation [24] motivate using high-power but relatively low-frequency transducers for excitation. Fourth, as
a consequence, the utilized wavelength in air is typically 10 times larger compared to medical applications, so
that a correspondingly increased aperture size is required to obtain comparable angular resolutions in air.
These conditions highly constrain the selection of suitable air-coupled transducer technologies. One of the
applicable types are bending plate piezoelectric ultrasonic transducers, which have been widely used for many
air-coupled ultrasonic applications and have been proven to be effective [8], [17], [25]–[30]. The reason
for their popularity is that these transducers have a high output sound pressure level and a low resonant
frequency of typically 40 kHz, while they are reasonably sized and readily available. Due to their relatively
small aperture size, they are capable of illuminating a wide field of view. The main drawback is their narrow
bandwidth of approximately 3 kHz (7%) resulting in three major implications. First, the limited bandwidth
causes ringing and a rather long temporal pulse length, which degrades the achievable range resolution.
Second, the transmission of broad-band chirps or coded excitations are not applicable. Third, even minor
manufacturing tolerances can cause significant resonant frequency alterations, leading to relative amplitude
and phase response deviations between multiple transducers in an array configuration.
As a consequence, all of these conditions and constraints result in the major challenges and optimization
goals of sonar imaging in air, which include providing

1. a long range and wide field of view for large-volume imaging,

2. high frame-rates to perceive dynamically changing environments,

3. precise angular resolution and high contrast to obtain detailed images,

4. while maintaining a reasonable size, cost, computational and system complexity.

Since these optimization goals are to some extent interdependent, improving multiple quality metrics in a
holistic approach and making appropriate trade-offs are tasks of high complexity. On top of that, depending
on the specific application, certain optimization goals have higher priority than others. For example, in
hardware and space-constrained applications, size and computational complexity are particularly important,
in addition to providing high frame rates, if the application is mobile. In contrast, stationary industrial imaging
applications, such as those performing object classification in harsh environments, place more emphasis on
high resolution, contrast, and range. Hence, the requirements for a compact size and low costs are relaxed in
order to successfully complete the task.
Therefore, the objective of this thesis is the conception and realization, as well as the numerical and
experimental evaluation of different real-world air-coupled 3D sonar imaging systems, which are designed to
pursue different optimization goals, in order to explore the resulting real-world capabilities, trade-offs and
limitations.

1.2 Original work


The main contributions, to the best of the author’s knowledge, are as follows:

1. The realization of an air-coupled ultrasonic phased array imaging system, including the implementation
details, experimental characterization and comparison of high-frame-rate and large-volume imaging
techniques based on a dense waveguided PUT transceiver array, which has been published in [31] and
[32].

2. The utilization and characterization of a sparse spiral PUT array geometry for high-precision and long-
range ultrasound beamforming in the far- and near-field, enabling grating-lobe-free image formation
with high resolution in air, which has been published in [33] and [34].

3. The analysis of the impact of PUT-typical relative amplitude and phase errors on the beamforming
characteristics in array configurations based on numerical Monte Carlo simulations, as well as the
experimental evaluation and recommendations regarding the effectiveness of error compensation using
transducer pre-selection and calibration, as published in [35].

4. The design, realization and characterization of two minimalistic embedded low-cost ultrasonic imaging
systems, which are particularly tailored for mobile hardware- and size-constrained applications, which
have been published in [36] and [37]. These systems are based on heterogeneous transducer technologies,
i.e. PUTs for transmission and MEMS microphones for reception, so that they require a minimum
number of components.

5. Two-scale array design strategies based on sparse spiral geometries with the objective to optimize the
side lobe level and main lobe width for generic one-way beamforming, including numerical evaluations
and comparisons to existing density-tapering strategies. This work is under review for publication
in [38].

6. The design, realization and evaluation of an ultrasonic imaging system combining a two-scale sparse
spiral MEMS receiver array with a waveguided PUT-based multi-frequency excitation technique to
achieve high-frame-rate imaging with improved contrast and angular resolution.

7. The utilization and analysis of a neural autoencoder network for ultrasonic image enhancement and
object localization, realizing a deep-learned image deconvolution in order to mitigate pulse shape
and point spread function characteristics, resulting in contrast and resolution improvements by post-
processing, which has been published in [39].

1.3 Thesis structure


Chapter 2 covers the basics of beamforming and wave-based imaging, which are regularly required throughout
this thesis. Here, the general principles of unfocused and focused transmit and receive beamforming are
explained including the basic terms and conventions, as well as the various influences of the array geometry on
the beamforming metrics. Furthermore, different image formation methods including their discretization and
representation strategies are introduced, and the relationships between the generic beamforming parameters
to the imaging quality metrics are highlighted.

Chapter 3 contains the system design of an air-coupled ultrasonic phased array prototype consisting of 64
PUT elements, which in combination with a waveguide structure, forms a uniform dense transceiver array for
grating-lobe-free unfocused far-field beamforming. In addition to the description of the electronics and system
architecture, insight into the practical implementation of the control system, as well as the GPU-accelerated
signal processing and visualization, are provided. Subsequently, an experimental characterization of the
developed prototype system is performed using two different imaging techniques.

Chapter 4 diverges from the conventional regular array geometries and instead, focuses on a sparse and
spiral 64-element PUT array geometry with the objective to enable high-precision beamforming, and, thus,
high-resolution imaging. Due to the larger aperture, the capability of using focused beams in the near-field is
covered as well in addition to the unfocused far-field case. After introducing the sparse spiral array geometry,
an in-depth numerical and experimental characterization is conducted, where the generic beamforming
parameters as well as the specific imaging quality metrics are analyzed, including the demonstration of the
high-resolution image formation. Furthermore, special attention is turned to the PUT-typical relative amplitude
and phase errors in array configurations, whose adverse effects are assessed based on Monte Carlo simulation
and whose compensation is investigated by experimental calibration.

Chapter 5 addresses the optimization of size, cost, system and computational complexity, and presents
two different embedded sonar system concepts for this purpose. Both systems are based on heterogeneous
transducer technologies, namely narrow-band PUTs for transmission and small, wide-band MEMS microphone
arrays for reception. The first system concept uses a hexagonal dense 2D array in conjunction with an
embedded single board computer equipped with a GPU for high-frame-rate 3D image formation. The second
concept requires only a single microcontroller for signal processing and is based on a waveguided transmit
PUT line array combined with a MEMS microphone receive line array in a T-configuration. Both systems are
experimentally characterized in terms of imaging performance and achievable frame rates.

Chapter 6 combines two design concepts based on sparse spiral arrays and one-way beamforming using
MEMS microphones into one overall concept, with the objective of finding a favorable trade-off between low
system complexity and cost, as well as high frame rates, resolution and contrast. Both strategies exploit the

contrast enhancement by introducing variations in the array-specific point spread function, but in different
ways. The first design strategy modifies the spiral array geometry by combining two heterogeneous sub-arrays
featuring different element densities and aperture sizes. The second design strategy is based on transmitting
multiple different frequencies, all of which can be received by the same array elements, enabled by the use of
wide-band MEMS microphones. Finally, a prototype system is built based on the two-scale and multi-frequency
design strategies, including its experimental evaluation of the transmit, receive, pulse-echo, and imaging
characteristics.

Chapter 7 covers the post-processing of the ultrasound images acquired to improve the resolution and
contrast by using a neural autoencoder network originating from optical image processing. Furthermore, the
architecture is used to directly extract the coordinates of detected objects. The neural network is trained
with randomized labeled ultrasound images, so that it learns to remove the pulse shape, side and main lobe
characteristics, and therefore corresponds to deep-learned image deconvolution. In addition to the training,
the architecture is tested and improvements are highlighted.

2 Fundamentals of beamforming and imaging

The main objective of this chapter is to provide a comprehensive and illustrative overview of the methodology,
characteristics and challenges of wave-based imaging using phased arrays, including the tools required for
this task, i.e. transmit and receive beamforming, which are frequently referenced throughout this thesis. The
individual sections are focused on narrow-band and coherent transmit and receive signals, but apart from
that, they are generally formulated to be application-neutral, i.e. independent of the medium and frequency
used, for both the far and near field. Therefore, this chapter references a number of books and articles from
multiple disciplines, covering physical principles of (sound-)waves, antenna arrays, medical ultrasound, radar
and communications, to name a few. The structure of the chapter is as follows: After introducing the basic
terms and the coordinate system, transmit beamforming is explained for the unfocused and then for the
focused case. Subsequently, the two beamforming variants for the receive mode are considered. With this
toolset, pulse-echo imaging based on two different methods and their specific characteristics are examined
afterwards.

2.1 Coordinate system and array terms


Throughout this thesis, a right-handed coordinate system is used, where the planar array geometry is positioned
on the vertical xy-plane, the xz-plane is the horizontal plane, the array center is located at the origin and the
array surface normal points to the positive z-axis [Fig. 2.1(a)]. The Cartesian coordinate vector of a point
r = (x, y, z) in the forward hemisphere is expressed by an azimuth-over-elevation coordinate system [40]
given by the transformation

$$\mathbf{r} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} R \sin(\theta)\cos(\varphi) \\ R \sin(\varphi) \\ R \cos(\theta)\cos(\varphi) \end{pmatrix}, \qquad (2.1)$$

where R is the Euclidean distance from the array center to the point in space, and θ and ϕ are the azimuth
and elevation angles, respectively, such that the direction (θ, ϕ) = (0°, 0°) points along the positive z-axis. The
corresponding back-transformation is given by

$$\begin{pmatrix} R \\ \theta \\ \varphi \end{pmatrix} = \begin{pmatrix} \sqrt{x^2 + y^2 + z^2} \\ \arctan\left(x/z\right) \\ \arcsin\!\left(y \,/\, \sqrt{x^2 + y^2 + z^2}\right) \end{pmatrix}. \qquad (2.2)$$

An important detail for the forthcoming θϕ-beampattern diagrams is that θ controls only the horizontal
direction, whereas ϕ is the dominant angle controlling the vertical and horizontal directions. For example, if
ϕ = 90°, the direction vector points in the positive y-direction regardless of θ. With the coordinate system
introduced, the array-specific terms and conventions are defined as follows.
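As a quick numerical check of the transformation pair in Eqs. (2.1) and (2.2), the following minimal NumPy sketch can be used (function names are illustrative; `arctan2` is chosen for quadrant-robust azimuth recovery):

```python
import numpy as np

def azel_to_cartesian(R, theta, phi):
    """Forward transform, Eq. (2.1): azimuth-over-elevation to Cartesian."""
    return np.array([R * np.sin(theta) * np.cos(phi),
                     R * np.sin(phi),
                     R * np.cos(theta) * np.cos(phi)])

def cartesian_to_azel(r):
    """Back-transform, Eq. (2.2): Cartesian to (R, theta, phi)."""
    x, y, z = r
    R = np.sqrt(x**2 + y**2 + z**2)
    theta = np.arctan2(x, z)   # azimuth controls only the horizontal direction
    phi = np.arcsin(y / R)     # elevation is the dominant vertical angle
    return R, theta, phi

# (theta, phi) = (0 deg, 0 deg) points along the positive z-axis
r_axis = azel_to_cartesian(2.0, 0.0, 0.0)

# round trip for an arbitrary direction in the forward hemisphere
R, th, ph = cartesian_to_azel(azel_to_cartesian(1.5, np.deg2rad(-30), np.deg2rad(20)))
```

For ϕ = 90°, the forward transform returns the positive y-direction regardless of θ, matching the convention stated above.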
In general, a phased array describes a group of multiple wave-transmitting, -receiving or -transceiving
array elements. In the context of electromagnetic waves, e.g. arrays for radar or communications, these
elements usually correspond to antennas, whereas for ultrasonic waves, e.g. for sonar and medical imaging,

7
Dap,x
y 2 (x63 , y63 )

1
x

y (6)
R 0

Dap,y
ϕ
d
θ -1
z y
(x0 , y0 )
-2 z x
(a) (b) -2 -1 0 1 2
x (6) Element aperture

Figure 2.1: Right-handed azimuth θ over elevation ϕ coordinate system and front view of the planar uniform
(8 × 8)-(λ/2) rectangular array geometry.

they consist of ultrasonic transducers and microphones. Typically, each array element can be controlled or
read-out individually to generate steerable interference patterns, also referred to as beamforming [41]–[44].
Beamforming is realized either physically when transmitting waves or computationally when receiving waves
using array signal processing. In both cases, the beamforming capability is highly dependent on the array
geometry, i.e. the positions of the array elements [Fig. 2.1(b)]. The position of the m-th array element is
specified by
$$\mathbf{r}_m = (x_m, y_m, z_m)^{\top}, \quad \text{for } m \in [0, 1, \dots, M - 1], \qquad (2.3)$$
where M is the total number of elements and m is the element index. In this thesis, a number of different,
but exclusively planar array geometries are used, such that zm = 0 is valid for all geometries. Further
important features of an array geometry are the array aperture, i.e. the area spanned by all elements, and the
inter-element spacing (IES) d, i.e. the distance between two adjacent elements. Apart from that, the array
elements themselves can feature a specific extended geometry, referred to as element aperture [45].
Within this first chapter, the focus is on a simple array geometry, where the elements are positioned on a
periodic rectangular grid, such that the IES d is uniform, referred to as uniform rectangular array (URA) [43],
[45]. The position of the m-th element of a URA is therefore completely defined by specifying the IES d, the
number of rows My and columns Mx using the notation (My × Mx )-(d), that is
⎛ ⎞ ⎛ (︁ )︁⎞
xm d· m mod M x ) − 0.5(M x − 1)
rURA,m = ⎝ ym ⎠ = ⎝ d · ⌊m/Mx ⌋ − 0.5(My − 1) ⎠ , for m ∈ [0, 1, . . . , (Mx · My ) − 1], (2.4)
⎜ ⎟ ⎜ (︁ )︁ ⎟

zm 0

resulting in an aperture width/height of Dap,x/y = d · (Mx/y − 1). In the case of My = 1 or Mx = 1, the


geometry is also referred to as uniform line array (ULA).
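Eq. (2.4) maps directly to code. As a minimal sketch (NumPy; function and variable names are illustrative), the element positions of the (8 × 8)-(λ/2) URA used here and its aperture width follow as:

```python
import numpy as np

def ura_positions(My, Mx, d):
    """Element positions of a (My x Mx)-(d) uniform rectangular array, Eq. (2.4)."""
    m = np.arange(Mx * My)
    x = d * ((m % Mx) - 0.5 * (Mx - 1))
    y = d * (np.floor(m / Mx) - 0.5 * (My - 1))
    z = np.zeros_like(x)
    return np.stack([x, y, z], axis=1)     # shape (M, 3), centered at the origin

lam = 1.0                                  # work in units of the wavelength
r_ura = ura_positions(8, 8, 0.5 * lam)     # (8 x 8)-(lambda/2) URA
D_ap_x = r_ura[:, 0].max() - r_ura[:, 0].min()   # aperture width, d * (Mx - 1)
```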
In the following, the basic principles of transmit and receive beamforming are elaborated using a 1 × 2,
1 × 8, and 8 × 8 URA, most commonly with half-wavelength IES (d = 0.5 λ). In further chapters, periodic array
geometries are introduced, based on an equi-lateral triangular grid forming a hexagonal-shaped aperture
(Chapter 5), as well as non-uniform sparse geometries (Chapters 4 and 6).

2.2 Transmit beamforming
This section covers the concept of beamforming in order to realize a spatial filter effect in transmission. For this,
the models for generating a specific location-dependent wave superposition and the numerical computation of
the resulting wavefield are clarified, first, for unfocused far-field and then for focused near-field beamforming.
In this regard, the various effects of array design, including the aperture size, inter-element spacings, and
element size on the beamforming capability and its quality metrics for quantification are addressed.

2.2.1 Unfocused beam steering in the far-field


An ideal 1 × 2 ULA of transmitting point source elements is considered, i.e. the element apertures are infinitely
small, with an IES of d = 0.5 λ. When both elements are excited with an identical continuous-wave (CW)
excitation signal, each element creates a radial-propagating hemispherical wave. Both transmitted waves
interfere with each other, which results in a wave field composed of the superposition of the two individual
waves [46].

Figure 2.2: Basic principle of wave superposition for beam steering with a (1 × 2) array. A relative time or
phase shift between the excitation signals results in a redirection of the intersecting wavefront
maxima.

For clarity, first, the focus is only on the equal-phase maximum-level wave fronts of each wave (Fig. 2.2). At
the intersection points, where the respective wave maxima overlap, the resulting maximum level doubles. As
the two waves continue to propagate radially over time, the intersection points of the maxima move forward
as well, but in a fixed direction. Therefore, the direction in which the doubled maxima occur is independent
of time, but can be steered by introducing a relative phase shift between the excitation signals [45], [47].
In order to obtain the relative phase shifts required to steer the doubled maximum levels in a particular
direction, the far-field assumption is used. The far-field assumption exploits that the wavefronts of both waves
are approximately planar at distances large compared to the spacing between their respective source elements
[42], [48], [49], so that the following geometric relation is valid. The path difference required between the
wavefronts is given by
$$\Delta R = d \cdot \sin(\theta), \qquad (2.5)$$
where θ is the defined steering direction and d is the IES. The path difference is converted to a relative time
∆T and phase difference ∆φ, that is

$$\Delta T = \frac{\Delta R}{c} = \frac{d}{c}\,\sin(\theta), \quad \text{and} \quad \Delta\varphi = \frac{2\pi f_0}{c}\,\Delta R = \frac{2\pi}{\lambda}\, d \sin(\theta), \qquad (2.6)$$

where c is the wave propagation speed, f0 is the wave frequency, and λ is the wavelength. The respective time
and phase delays extended to two-dimensional θφ-steering and specified for each particular element are given
by
$$\Delta T_{\text{tx},m} = \frac{1}{c}\,\big(x_m \sin(\theta)\cos(\varphi) + y_m \sin(\varphi)\big), \quad \text{and} \quad \Delta\varphi_{\text{tx},m} = \frac{2\pi}{\lambda}\,\big(x_m \sin(\theta)\cos(\varphi) + y_m \sin(\varphi)\big), \qquad (2.7)$$
where (xm, ym) is the position of the m-th element. A single-frequency CW signal can be delayed equally well by
using either the true-time delay or the complex phasor, such that the excitation signal of the m-th element
stx,m (t) for steering to direction (θ, ϕ) is required to be

$$s_{\text{tx},m}(t) = s_0(t - \Delta T_{\text{tx},m}) = e^{-j\Delta\varphi_{\text{tx},m}} \cdot s_0(t), \qquad (2.8)$$

where $s_0(t) = A_0\, e^{j 2\pi f_0 t}$ is the base excitation signal with amplitude $A_0$. By defining a
so-called far-field beamforming vector w∗far (θ, ϕ) ∈ CM ×1 [48], whose m-th entry is given by

$$w^{*}_{\text{far},m}(\theta_0, \varphi_0) = e^{-j\Delta\varphi_{\text{tx},m}(\theta_0, \varphi_0)} = e^{-j\frac{2\pi}{\lambda}\left(x_m \sin(\theta_0)\cos(\varphi_0) + y_m \sin(\varphi_0)\right)}, \qquad (2.9)$$

the vector of excitation signals stx ∈ CM ×1 can be expressed by

$$\mathbf{s}_{\text{tx}}(t) = \mathbf{w}^{*}_{\text{far}}(\theta_0, \varphi_0)\, s_0(t). \qquad (2.10)$$
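The delay laws of Eq. (2.7) and the resulting beamforming vector of Eqs. (2.9) and (2.10) can be sketched as follows (a minimal NumPy example for a 1 × 8-(λ/2) ULA; names are illustrative):

```python
import numpy as np

def w_far(xy, theta0, phi0, lam):
    """Far-field beamforming vector w*_far of Eq. (2.9) for element positions xy (M x 2)."""
    x, y = xy[:, 0], xy[:, 1]
    dphi = (2 * np.pi / lam) * (x * np.sin(theta0) * np.cos(phi0) + y * np.sin(phi0))
    return np.exp(-1j * dphi)               # unit-magnitude phase weights

lam = 1.0
x = 0.5 * lam * (np.arange(8) - 3.5)        # 1 x 8 ULA with d = lambda/2
xy = np.stack([x, np.zeros(8)], axis=1)

w = w_far(xy, np.deg2rad(-30), 0.0, lam)    # steer to (theta0, phi0) = (-30 deg, 0 deg)
s0 = 1.0                                    # CW base signal sampled at t = 0
s_tx = w * s0                               # element excitation signals, Eq. (2.10)
```

Note that the weights are pure phase terms, and that the phase step between adjacent elements reduces to Eq. (2.6) for a line array.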

In summary, it is possible to steer a higher-amplitude wave front compared to a single-element wave to a
certain direction using time- or phase-delayed excitation signals, which is the basic principle of unfocused
transmit beamforming. Next, not only the interference of the maximum-level wavefront is considered, but the
interference of the entire wavefield of an (8 × 8)-(λ/2)-array of point sources.
The wave field is composed of the superposition of the individual hemispherical waves generated by each
point source element. In the following, the assumption that the elements are embedded in an infinitely wide
rigid baffle and radiate into an infinite space is considered. Therefore, a model based on the discretized
Rayleigh integral [44], [46], [49], [50] is used to numerically generate the wave field, which is at a specific
point rP given by
$$p(\mathbf{r}_P, \theta_0, \varphi_0) = \sum_{m=0}^{M-1} s_{\text{tx},m}(\theta_0, \varphi_0) \cdot \frac{e^{-j\frac{2\pi}{\lambda} R_{P,m}}}{2\pi R_{P,m}}, \qquad (2.11)$$

where RP,m = ∥rP − rm∥2 is the Euclidean distance from the m-th element to rP. By defining a steering vector
a ∈ CM ×1 [48], whose m-th entry is given by

$$a_m(\mathbf{r}_P) = \frac{e^{-j\frac{2\pi}{\lambda} R_{P,m}}}{2\pi R_{P,m}}, \qquad (2.12)$$

the equation for p(rP , θ0 , ϕ0 ) is compactly expressed as

$$p(\mathbf{r}_P, \theta_0, \varphi_0) = \mathbf{s}_{\text{tx}}^{\top}(\theta_0, \varphi_0) \cdot \mathbf{a}(\mathbf{r}_P) = s_0(t) \cdot \mathbf{w}_{\text{far}}^{H}(\theta_0, \varphi_0) \cdot \mathbf{a}(\mathbf{r}_P). \qquad (2.13)$$

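As a numerical sanity check of Eqs. (2.11)–(2.13), the sketch below (NumPy; names are illustrative) evaluates the field of a broadside-steered 1 × 8-(λ/2) ULA at an on-axis far-field point, confirming that the explicit sum and the compact vector form agree and that the interference is nearly fully constructive on axis:

```python
import numpy as np

def steering_vector(r_elems, r_P, lam):
    """Steering vector a(r_P) of Eq. (2.12) for element positions r_elems (M x 3)."""
    R = np.linalg.norm(r_P - r_elems, axis=1)          # R_P,m = ||r_P - r_m||_2
    return np.exp(-1j * 2 * np.pi / lam * R) / (2 * np.pi * R)

lam = 1.0
x = 0.5 * lam * (np.arange(8) - 3.5)                   # 1 x 8 ULA with d = lambda/2
r_elems = np.stack([x, np.zeros(8), np.zeros(8)], axis=1)

theta0 = 0.0                                           # broadside steering
wstar = np.exp(-1j * 2 * np.pi / lam * x * np.sin(theta0))   # w*_far, Eq. (2.9)
s0 = 1.0                                               # CW base signal at t = 0

r_P = np.array([0.0, 0.0, 100.0 * lam])                # far-field point on the z-axis
a = steering_vector(r_elems, r_P, lam)
p_sum = np.sum(wstar * s0 * a)                         # explicit superposition, Eq. (2.11)
p_vec = s0 * (wstar @ a)                               # compact vector form, Eq. (2.13)
```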
By steering the (8 × 8)-(λ/2) array in the direction (θ0, ϕ0) = (0°, 0°), a beam-shaped region is formed
in the resulting wave field along the specified direction, in which the individual waves of all elements
interfere constructively, thus significantly increasing the amplitude compared to the surrounding regions,
as expected (Fig. 2.3). This high-amplitude beam along the steering direction is referred to as main lobe.
Adjacent to the main lobe, weaker beams in other directions are formed, called side lobes. Side lobes occur
due to partially constructive interference, where the individual waves are not completely in-phase. Main and
side lobes are separated from each other by minima due to total destructive interference [43], [45].

Figure 2.3: Normalized wavefield Re{p} (a), magnitude field |p| (b), and far-field θ-sectional view (c), i.e. beam
pattern, of the magnitude field for unfocused beam steering to θ = 0° and θ = −30°.
The normalized angular sectional view at a specific distance in the far-field, referred to as the beam
pattern, highlights important quality metrics of the beamforming capability of the array, i.e. the main lobe
width (MLW) and the maximum side lobe level (MSLL) [45]. First, the MLW indicates the angular width
where the magnitude of the main lobe is above a certain threshold. Typical MLW threshold levels are −3 dB
and −6 dB for which the corresponding acronyms are MLW3 and MLW6 , respectively. Second, the MSLL
specifies the relative level of the highest side lobe with respect to the level of the main lobe. As a general
preference for beamforming, a narrow main lobe width and a low MSLL are favored, to enable a precise
steering in the direction intended without causing significant stray waves in other directions [43], [44], [51].
In fact, the beampattern can also be interpreted as a spatial filter response [41], [45], with the main lobe
representing the passband and the side lobe level corresponding to the stopband level [47].
Next, the array is steered in the direction (−30°, 0°). In addition to the main lobe being positioned to
the set direction, the adjacent side lobes are likewise displaced. However, clearly the main lobe as well as
the side lobes widen as their angular positions are moved towards the peripheral region. Consequently, the
beamforming precision degrades with absolutely increasing steering direction, such that the spatial filtering is
less direction-selective at the periphery [42], [43].
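The beam pattern metrics introduced above can be reproduced numerically. The sketch below (NumPy; variable names are illustrative) computes the θ-cut of a broadside-steered 1 × 8-(λ/2) ULA of point sources and extracts the MSLL and the MLW6 from it:

```python
import numpy as np

lam = 1.0
Mx, d = 8, 0.5 * lam                         # 1 x 8 ULA with d = lambda/2
x = d * (np.arange(Mx) - (Mx - 1) / 2)       # element positions, centered
theta = np.linspace(-np.pi / 2, np.pi / 2, 18001)

# far-field beam pattern: coherent sum of unit phasors per observation angle
af = np.abs(np.exp(1j * 2 * np.pi / lam * np.outer(np.sin(theta), x)).sum(axis=1))
bp_db = 20 * np.log10(af / af.max())

# MSLL: highest local maximum outside the main lobe (main lobe peak is at 0 dB)
inner = bp_db[1:-1]
is_side_peak = (inner > bp_db[:-2]) & (inner > bp_db[2:]) & (inner < -0.5)
msll = inner[is_side_peak].max()             # around -12.8 dB for 8 uniform elements

# MLW6: angular width of the main lobe above the -6 dB threshold
above = theta[bp_db >= -6.0]
mlw6 = above.max() - above.min()             # around 1.2 * lam / (Mx * d) = 0.3 rad
```

For the uniform 8-element case, the extracted MSLL lands near −12.8 dB, which is consistent with the stopband interpretation of the side lobe level given above.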

2.2.2 Impact of the array aperture size on the beam pattern, near-field and far-field
The resulting wavefield and beampattern depend, apart from the element excitation signals, mainly on the
array geometry. Therefore, a (10 × 10) and (5 × 5) geometry are considered and compared to the (8 × 8)
array, where each array features an IES of d = 0.5 λ (Fig. 2.4). The normalized beampattern of the (10 × 10)
array has a narrower main lobe and an increased number of side lobes at differing positions with a slightly
decreased MSLL compared to the (8 × 8) case. The opposite changes are evident for the (5 × 5) array: the
main lobe and the smaller number of side lobes are widened, and the MSLL increases slightly in comparison
to the (8 × 8) array.


Figure 2.4: Normalized magnitude field |p| (a), far-field beam pattern (b) and on-axis pattern along the z-axis
(c) for a (10 × 10) and (5 × 5) URA compared to a (8 × 8), each with d = λ/2 IES. The MLW is
narrowed and the natural focus forms at a larger distance with increasing aperture size, thus the
near-field is extended.

The number of side lobes for uniform λ/2-arrays depends on the number of elements in the corresponding
dimension. For Mx elements, there are Mx − 1 minima and therefore Mx − 2 side lobes in θ-direction,
considering that there is a circular continuation at the angular boundaries at ±90◦ [45], [52]. The resulting
MLW of a uniform array is mainly dependent on the aperture size and can be estimated by the approximations
$$\mathrm{MLW}_3 = 2\arcsin\!\left(0.44\,\frac{\lambda}{M_x d}\right) \approx 0.89\,\frac{\lambda}{M_x d}, \qquad \mathrm{MLW}_6 = 2\arcsin\!\left(0.6\,\frac{\lambda}{M_x d}\right) \approx 1.2\,\frac{\lambda}{M_x d}, \quad\text{and}\tag{2.14}$$
$$\mathrm{MLW}_\mathrm{null} = 2\arcsin\!\left(1\,\frac{\lambda}{M_x d}\right) \approx 2\,\frac{\lambda}{M_x d},\tag{2.15}$$
where MLWnull is the MLW between the first order beam pattern minima. The approximations are derived
from the analytic solution of a uniform line array beam pattern [45], [52], that is
$$p(\theta) = \frac{\sin\!\left(\frac{2\pi}{\lambda}\frac{M_x d}{2}\sin(\theta)\right)}{M_x \sin\!\left(\frac{2\pi}{\lambda}\frac{d}{2}\sin(\theta)\right)} \approx \frac{\sin\!\left(\frac{2\pi}{\lambda}\frac{M_x d}{2}\sin(\theta)\right)}{\frac{2\pi}{\lambda}\frac{M_x d}{2}\sin(\theta)}\tag{2.16}$$
by numerically solving $\frac{\sin(x)}{x} = \left[1/\sqrt{2},\ 0.5,\ 0\right]$ for the argument $x$, followed by solving $x = \frac{2\pi}{\lambda}\frac{M_x d}{2}\sin(\theta)$ for
$\theta$ to obtain half of the MLW [45], [51].
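This derivation can be cross-checked numerically. The following sketch (NumPy assumed, illustrative function names) evaluates the beam pattern (2.16) for a (1 × 8)-(λ/2) line array and compares the numerically found −3 dB width with the approximation (2.14):

```python
import numpy as np

def ula_beampattern(theta, M, d_lambda):
    """Normalized far-field beam pattern (2.16) of a uniform line array.

    theta    : angles in radians
    M        : number of elements
    d_lambda : inter-element spacing in wavelengths (d / lambda)
    """
    num = np.sin(np.pi * M * d_lambda * np.sin(theta))
    den = M * np.sin(np.pi * d_lambda * np.sin(theta))
    # handle the 0/0 limit at the main lobe peak (value 1 there)
    return np.abs(np.divide(num, den, out=np.ones_like(num),
                            where=np.abs(den) > 1e-12))

# MLW3 of a (1 x 8) lambda/2 array: numeric width vs. approximation (2.14)
theta = np.linspace(-np.pi / 2, np.pi / 2, 200001)
p = ula_beampattern(theta, M=8, d_lambda=0.5)
above = theta[p >= 1 / np.sqrt(2)]     # -3 dB region (main lobe only here)
mlw3_numeric = above[-1] - above[0]
mlw3_approx = 2 * np.arcsin(0.44 / (8 * 0.5))
```

For this aperture, both values are about 0.22 rad (≈ 12.7°); the side lobes stay below −3 dB, so the thresholded region isolates the main lobe.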


In addition to the change of the beam pattern, the distance at which the main lobe starts to form increases
when the aperture is enlarged, which is highlighted in the on-axis magnitude patterns along the z-axis. The

on-axis patterns are normalized to their respective farthest occurring maximum, where the corresponding
distance is called the natural focal distance [44], [52]. Here, the phase differences between the individual
waves of the elements are small, such that the far-field approximation becomes valid. The regions before and
after the natural focal distance are termed near-field and far-field [44], [52]. Within the near-field, there
are significant variations in magnitude, so that high-intensity radiation along the intended direction is not
ensured in this region when using unfocused beamforming. Beyond the natural focal distance, i.e. in the
far-field, the main lobe is completely formed and its magnitude monotonically decreases. An approximation
of the natural focal distance Rnat for circular apertures is given by [52]
$$R_\mathrm{nat} = \frac{(D_\mathrm{ap}/2)^2}{\lambda},\tag{2.17}$$
where $D_\mathrm{ap}$ is the aperture diameter. In order to obtain close approximations for a URA, whose aperture is not
radially symmetric, the average of the minimum and maximum aperture sizes can be used to approximate
$D_\mathrm{ap}$, e.g. $D_\mathrm{ap} = D_\mathrm{ap,x}\left(1 + \sqrt{2}\right)/2$ for $D_\mathrm{ap,x} = D_\mathrm{ap,y}$.
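As a quick numeric sketch of (2.17) (NumPy assumed; treating Mx·d as the effective square aperture side length is an assumption for illustration, not a definition from the text):

```python
import numpy as np

def natural_focus_lambda(M, d_lambda):
    """Approximate natural focal distance (2.17), in wavelengths, of a
    square (M x M) URA. Assumes an effective side length of M*d and uses
    the radially averaged aperture D_ap = D_side*(1 + sqrt(2))/2."""
    D_side = M * d_lambda                  # square aperture side (in lambda)
    D_ap = D_side * (1 + np.sqrt(2)) / 2   # mean of side and diagonal
    return (D_ap / 2) ** 2                 # (D_ap/2)^2 / lambda, lambda = 1

# (8 x 8) lambda/2 URA: natural focus at roughly 5.8 wavelengths
R_nat = natural_focus_lambda(8, 0.5)
```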
In summary, an increased aperture leads to a narrowed main lobe and thus also to an improved beamforming
precision. However, since the near-field is also expanded, the main lobe only completely forms at a larger
distance in the far-field when using unfocused beamforming. Therefore, in order to reliably generate high-
intensity waves within the near field as well, focused beamforming is required, which is described next.

2.2.3 Focused beam steering in the near-field


In contrast to unfocused far-field beamforming, where the wavefronts of the single elements are assumed to be
approximately planar, beamforming in the near-field must consider the true spherical shape of the individual
waves [47]. Thus, in order to generate a high-intensity superimposed wave at a particular point in the near
field, the time or phase shifts required for the excitation signals must be based on the true individual path
differences from each element to that specific point. Therefore, not only the beamforming direction but also a
beamforming distance must be specified, which jointly define the focus point r0 = (R0 , θ0 , ϕ0 ). The specific


Figure 2.5: Basic principle of focused beamforming with a (1 × 8)-(λ/2)-ULA. The time or phase shifts
required for the excitation signals are based on the true Euclidean distances from each element
to the focus point.

time and phase shift required for the excitation signal of the m-th element in order to focus to $\mathbf{r}_0 = (R_0, \theta_0, \phi_0)$
is given by [44], [47], [49]
$$\Delta T_{\mathrm{tx},m} = \frac{1}{c}\left(R_0 - R_{0,m}\right) \quad\text{and}\quad \Delta\varphi_{\mathrm{tx},m} = \frac{2\pi}{\lambda}\left(R_0 - R_{0,m}\right),\tag{2.18}$$

where R0,m is the Euclidean distance from the m-th element to the focal point, that is
$$R_{0,m} = \|\mathbf{r}_{0,m}\|_2 = \|\mathbf{r}_0 - \mathbf{r}_m\|_2 = \sqrt{\big(R_0\sin(\theta_0)\cos(\phi_0) - x_m\big)^2 + \big(R_0\sin(\phi_0) - y_m\big)^2 + \big(R_0\cos(\theta_0)\cos(\phi_0)\big)^2},\tag{2.19}$$
such that the resulting mono-frequency CW excitation signal of the m-th element is
$$s_{\mathrm{tx},m}(t) = s_0(t - \Delta T_{\mathrm{tx},m}) = e^{-j\Delta\varphi_{\mathrm{tx},m}} \cdot s_0(t),\tag{2.20}$$
where $s_0(t) = A_0 e^{j2\pi f_0 t}$. Therefore, the m-th entry of the beamforming vector for focused beamforming
$\mathbf{w}^* \in \mathbb{C}^{M\times 1}$ is [48]
$$w^*_m(\mathbf{r}_0) = e^{-j\Delta\varphi_{\mathrm{tx},m}(\mathbf{r}_0)} = e^{-j\frac{2\pi}{\lambda}\left(R_0 - R_{0,m}\right)},\tag{2.21}$$
such that the excitation signals for focused beamforming are expressed by
$$\mathbf{s}_\mathrm{tx}(t) = \mathbf{w}^*(\mathbf{r}_0)\,s_0(t)\tag{2.22}$$
and the model for the wavefield based on (2.11) is given by
$$p(\mathbf{r}_\mathrm{P}, \mathbf{r}_0) = s_0(t)\cdot\mathbf{w}^H(\mathbf{r}_0)\cdot\mathbf{a}(\mathbf{r}_\mathrm{P}).\tag{2.23}$$
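The focusing rule (2.21) and the summation in (2.23) can be sketched as follows for a (1 × 8)-(λ/2) ULA with λ = 1 (NumPy assumed; helper names are illustrative, and the entrywise product w*_m·a_m is summed directly):

```python
import numpy as np

# Element positions of a (1 x 8) lambda/2 ULA on the x-axis (lambda = 1)
M = 8
xm = (np.arange(M) - (M - 1) / 2) * 0.5
rm = np.stack([xm, np.zeros(M), np.zeros(M)], axis=1)

def steering(rP):
    """Near-field steering vector entries e^{-j 2 pi R} / (2 pi R)."""
    R = np.linalg.norm(rP - rm, axis=1)
    return np.exp(-2j * np.pi * R) / (2 * np.pi * R)

def w_focus(r0):
    """Focused beamforming vector (2.21): w*_m = e^{-j 2 pi (R0 - R0m)}."""
    R0 = np.linalg.norm(r0)
    R0m = np.linalg.norm(r0 - rm, axis=1)
    return np.exp(-2j * np.pi * (R0 - R0m))

r0 = np.array([0.0, 0.0, 3.0])        # focus point at 3 lambda on-axis
w = w_focus(r0)
p_focus = np.abs(np.sum(w * steering(r0)))                      # at the focus
p_off = np.abs(np.sum(w * steering(np.array([1.5, 0.0, 3.0]))))  # off-axis
```

At the focus, all summed phasors align and the magnitude is maximized, whereas at a laterally displaced point at the same depth the sum partially cancels.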


Figure 2.6: Normalized wavefield Re{p} (a), magnitude field |p| (b) and on-axis pattern (c) for selected focal
distances R0 = [1, 3, 9] λ. The path difference between the selected focal distance R0 and the
distance of the focal peak Rpeak , as well as the DOF, increase with increasing focal distance.

Next, the resulting wavefields and on-axis patterns are considered, based on the model in (2.23) for three
focus points at various distances, i.e. R0 = [1, 3, 9] λ (Fig. 2.6). Clearly, focused beamforming enables
high-intensity waves to be superimposed in the near field, but with a crucial difference compared to unfocused
beamforming. The spatial filter characteristic of focused beamforming is not only direction- but also
highly distance-selective, so that the magnitude for distances beyond the focus point decreases significantly
more steeply than the magnitude within the beam-shaped main lobe of unfocused beamforming [44], [47],
[49].
The radial length around the focal point in which the magnitude is above −3 and −6 dB with respect to the focal
peak is called depth of field (DOF), i.e. DOF3 and DOF6 , respectively [44]. The DOF is extended with increasing
distance relative to the array aperture, so that the spatial filter gradually becomes less distance-selective and
transitions to the beam-shaped main lobe as in unfocused beamforming [44].
Additionally, the difference between the true distance of the focal peak and the defined focal distance R0
increases as the latter is increased. The reason is the extending DOF in conjunction with the decrease of the
magnitude proportional to 1/R. Therefore, the true focal peak distance is typically closer to the array origin
compared to the defined focal distance. In fact, the focal peak cannot occur beyond the natural focus distance,
such that the focal peak position is only controllable within the near-field [44], [49].

2.2.4 Impact of the inter-element spacing - grating lobe formation


The previous sections show that increasing the aperture results in a near-field expansion and a narrowing of
the main lobe with only a minor change in MSLL, which are beneficial characteristics for most applications.
However, in order to increase the aperture, elements have been added to the array, such that the IES remained
fixed at d = 0.5 λ in all cases considered. Typically, a higher number of elements leads to an increased system
complexity and consequently to higher costs. Therefore, the effects of increasing the aperture of a ULA by
increasing the IES only, i.e. without adding more elements, are considered.


Figure 2.7: If the IES exceeds d > λ in the non-steered case, the respective maxima wavefronts intersect
and superimpose in multiple directions instead of one, referred to as grating lobe directions.
Consequently, high-intensity superimposed waves are formed in unintended directions, resulting
in ambiguous spatial filtering.

Compared to the (1 × 2) array with d = 0.5 λ, for the d = 1.5 λ and d = 1.2 λ array geometries, intersection
points of the maximum wave fronts arise not only in the steering direction (0◦ , 0◦ ) but in two additional
directions (Fig. 2.7). These extra intersection points occur, e.g., where the second maximum wave front
originating from one element superimposes with the first maximum wave front of the other element. Therefore,
wave fronts with the same intensity as the intended steered wave propagate to unintended directions.

In the case of in-phase excitation, i.e. steering to (0◦ , 0◦ ), the additional far-field intersection points occur
in directions θGL , where the path difference between the respective wave fronts ∆R corresponds to an integer
multiple N of the wavelength λ, that is [43], [45]

$$\Delta R = \sin(\theta_\mathrm{GL})\,d = N\lambda, \quad\text{such that}\tag{2.24}$$
$$\theta_\mathrm{GL} = \arcsin\!\left(\frac{N\lambda}{d}\right), \quad\text{for } N \in \mathbb{Z} \text{ and } \left|\frac{N\lambda}{d}\right| \le 1.\tag{2.25}$$

The latter constraint shows that for ULAs steered to direction (0◦ , 0◦ ), additional high-intensity directions
are formed if the IES is d ≥ λ. If the IES is further increased, these directions are shifted closer to the
steering direction. An additional pair of extra directions is formed for every multiple N of the wavelength λ
contained in the IES d, e.g. two pairs form for N = 2, which requires d ≥ 2 λ.


Figure 2.8: Normalized wavefield Re{p} (a), magnitude field |p| (b) and far-field beam pattern (c) of a (8 × 8)-
(0.6 λ)-URA steered to (0◦ , 0◦ ) and (−40◦ , 0◦ ), where only in the latter case, a grating lobe is formed
in direction (90◦ , 0◦ ).

The wave field of an 8 × 8 URA with 0.6λ IES demonstrates that in the (0◦ , 0◦ )-steered case, the aperture
can be enlarged for narrowing the main lobe without causing further high-intensity waves in unintended
directions (Fig. 2.8). However, if the main lobe is steered to (−40◦ , 0◦ ), an additional high-intensity lobe is
formed in (90◦ , 0◦ ). These extra lobes featuring the same magnitude as the main lobe are referred to as grating
lobes, which are generally a disadvantage for most applications due to the ambiguity they introduce [42].
Based on (2.25), the expression for the directions of the resulting grating lobes θGL when steered to a specific
direction θ is extended to [43], [45]
$$\theta_\mathrm{GL} = \arcsin\!\left(\frac{N\lambda}{d} + \sin(\theta)\right), \quad\text{for } N \in \mathbb{Z} \text{ and } \left|\frac{N\lambda}{d} + \sin(\theta)\right| \le 1.\tag{2.26}$$
Consequently, in order to avoid ambiguities in the case of steered beamforming, the IES of a uniformly spaced
array must be [43]
$$d \le \frac{\lambda}{1 + |\sin(\theta)|}.\tag{2.27}$$
In the case of a fully-steered array, i.e for steering angles up to θ = ±90◦ , a maximum IES of d = 0.5 λ is
required to avoid ambiguous grating lobes with the same peak magnitude as the main lobe.
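The grating lobe conditions (2.26) and the spacing limit (2.27) can be sketched in plain Python (illustrative function names):

```python
import math

def grating_lobes(d_lambda, theta0_deg=0.0, n_max=5):
    """Main (N = 0) and grating lobe directions per (2.26), in degrees,
    for a uniform array with IES d (in wavelengths) steered to theta0."""
    s0 = math.sin(math.radians(theta0_deg))
    lobes = {}
    for N in range(-n_max, n_max + 1):
        arg = N / d_lambda + s0
        if abs(arg) <= 1:              # constraint in (2.26)
            lobes[N] = math.degrees(math.asin(arg))
    return lobes

def max_ies(theta_deg):
    """Maximum ambiguity-free IES (2.27), in wavelengths."""
    return 1 / (1 + abs(math.sin(math.radians(theta_deg))))

grating_lobes(1.5)      # grating lobes near +-41.8 deg beside the main lobe
grating_lobes(0.5, 60)  # d = lambda/2: no grating lobe even when steered
max_ies(90)             # fully steered array: 0.5 lambda
```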


Figure 2.9: Even if the maximum IES of d ≤ 0.5 λ of the example (8 × 8)-URA is maintained and thus an
ambiguous grating lobe is not formed, the grating lobe can be partially contained within the
beampattern for high steering angles (shown for d = 0.5 λ with θ0 = 60◦ and θ0 = 53◦ , and
d = 0.45 λ with θ0 = 90◦ ), resulting in a degraded MSLL.

However, in practice, the width of the grating lobe must also be considered, as the grating lobe can be
visible at least partially in the beampattern (Fig. 2.9). Although the partial grating lobe does not result in
ambiguities due to its lower magnitude compared to the main lobe, it can still lead to a significant increase
in MSLL at high steering angles. Therefore, a recommendation is to either further constrain the steering
directions or to choose a smaller IES than the maximum ambiguity-free IES.

2.2.5 Impact of the element aperture size - pattern multiplication


For all previous considerations, the array elements were assumed to be point sources, which have an infinitely
small element aperture size and thus radiate uniformly in all spatial directions. However, real-world elements
consist of a specific aperture shape and size. Therefore, a more realistic model is obtained by assuming,
e.g., circular element apertures with a non-zero aperture diameter Dap , which are also referred to as piston
elements [49], [52].
According to Huygens' principle, the oscillating continuous element aperture can be approximated numerically
by a set of point sources at discrete positions within the given aperture (Fig. 2.10) [44], [49]. In order
to sample the aperture with L discrete positions, Distmesh [53] is used, which generates an approximately
equilateral triangular grid, also known as meshing, using a nonlinear iterative optimization procedure and
correctly covers the boundary region of the circular disk. In [49], the distances between the discrete positions
are recommended to be within ∆d ≤ λ/10. Based on the L discrete point sources, the resulting wave field of
the single element aperture can thus be generated as in (2.11), that is for a specific point in space rP
$$p(\mathbf{r}_\mathrm{P}) = s_0(t)\cdot\frac{1}{L}\sum_{l=0}^{L-1}\frac{e^{-j2\pi R_{\mathrm{P},l}/\lambda}}{2\pi R_{\mathrm{P},l}}, \quad\text{and}\quad R_{\mathrm{P},l} = \|\mathbf{r}_\mathrm{P} - \mathbf{r}_l\|_2,\tag{2.28}$$
where s0 (t) is the excitation signal, L is the number of point sources within the meshed aperture, rl is the
discrete position of the l-th point source and RP,l is its corresponding Euclidean distance to rP .
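A simplified sketch of this sampling approach, substituting a plain square grid for the triangular Distmesh grid and assuming NumPy (function names are illustrative):

```python
import numpy as np

def sample_disk(D, dd):
    """Point sources on a square grid of pitch dd inside a circular
    aperture of diameter D (a crude stand-in for the Distmesh grid)."""
    g = np.arange(-D / 2, D / 2 + dd / 2, dd)
    X, Y = np.meshgrid(g, g)
    inside = X**2 + Y**2 <= (D / 2)**2
    return X[inside], Y[inside]

def element_pattern(theta, D, dd=0.02):
    """Far-field pattern of the sampled piston: (2.28) with the 1/R
    spreading dropped and far-field phases 2*pi*x*sin(theta)."""
    x, _ = sample_disk(D, dd)
    ph = 2 * np.pi * np.outer(np.sin(theta), x)
    return np.abs(np.exp(1j * ph).sum(axis=1)) / len(x)

theta = np.radians([0.0, 37.6])     # 37.6 deg: first null for D = 2 lambda
p = element_pattern(theta, D=2.0)   # p[0] = 1, p[1] close to zero
```

For Dap = 2 λ, the sampled pattern reproduces the first null of (2.30) at arcsin(0.61) ≈ 37.6°.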


Figure 2.10: If the array (a) consists of elements with an extended aperture, e.g. circular piston elements, a
set of point sources is used to model the element aperture forming a mesh (b). Therefore, the
individual array elements themselves feature a beam pattern (c), which is increasingly directional
with increasing aperture diameter Dap .

As the aperture diameter Dap increases, the beam pattern becomes more directional, in contrast to the
single point source beam pattern, which is uniform, i.e. independent of direction. An approximation for the
far-field MLW of a single element with an extended circular aperture is given by
$$\mathrm{MLW}_3 = 2\arcsin\!\left(0.51\,\frac{\lambda}{D_\mathrm{ap}}\right), \qquad \mathrm{MLW}_6 = 2\arcsin\!\left(0.71\,\frac{\lambda}{D_\mathrm{ap}}\right), \quad\text{and}\tag{2.29}$$
$$\mathrm{MLW}_\mathrm{null} = 2\arcsin\!\left(1.22\,\frac{\lambda}{D_\mathrm{ap}}\right).\tag{2.30}$$

The approximations are obtained using a similar approach as for the ULA (2.14) and (2.15), whereas here,
the analytic far-field beampattern of the circular aperture is given by [45], [52]
$$p(\theta) = \frac{2 J_1\!\left(\frac{2\pi}{\lambda}\frac{D_\mathrm{ap}}{2}\sin(\theta)\right)}{\frac{2\pi}{\lambda}\frac{D_\mathrm{ap}}{2}\sin(\theta)},\tag{2.31}$$
such that the first order Bessel function $J_1$ is used to solve $\frac{2 J_1(x)}{x} = \left[1/\sqrt{2},\ 0.5,\ 0\right]$ for the argument $x$, followed
by solving $x = \frac{2\pi}{\lambda}\frac{D_\mathrm{ap}}{2}\sin(\theta)$ for $\theta$ to obtain the respective MLWs [45], [51]. In addition to the narrowed
main lobe, for large element aperture diameters, i.e. $D_\mathrm{ap} \ge 1.22\,\lambda$ (2.30), side lobes are formed in a single
element aperture beampattern as well.
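The 1.22 factor in (2.30) can be recovered numerically without special-function libraries, by combining the integral representation of J1 with a bisection for its first positive zero (a sketch, NumPy assumed):

```python
import numpy as np

def J1(x):
    """First-order Bessel function via its integral representation,
    J1(x) = (1/pi) * integral_0^pi cos(tau - x sin(tau)) dtau."""
    tau = np.linspace(0.0, np.pi, 2001)
    y = np.cos(tau - x * np.sin(tau))
    dt = tau[1] - tau[0]
    return (y.sum() - 0.5 * (y[0] + y[-1])) * dt / np.pi  # trapezoid rule

# Bisection for the first positive zero of J1: x_null = 3.8317, and
# x_null / pi = 1.2197, i.e. the 1.22 in (2.30).
lo, hi = 1.0, 5.0                # J1(1) > 0, J1(5) < 0 bracket the zero
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if J1(lo) * J1(mid) <= 0:
        hi = mid
    else:
        lo = mid
null_factor = 0.5 * (lo + hi) / np.pi
```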
In order to model an array of elements with extended apertures, at each of the M array element positions,
the former single point source is replaced by a group of L sampled element aperture point sources, whereas
the respective aperture center is placed at the array element position. The total number of point sources in the
array therefore increases from M to M · L. The evaluation of the resulting wave field of an array of non-zero
aperture elements is extended based on the model from (2.11), that is
$$p(\mathbf{r}_\mathrm{P}, \mathbf{r}_0) = \sum_{m=0}^{M-1} s_{\mathrm{tx},m}(\mathbf{r}_0)\,\frac{1}{L}\sum_{l=0}^{L-1}\frac{e^{-j2\pi R_{\mathrm{P},(m,l)}/\lambda}}{2\pi R_{\mathrm{P},(m,l)}} = \sum_{m=0}^{M-1} s_0(t)\,w^*_m(\mathbf{r}_0)\,\frac{1}{L}\sum_{l=0}^{L-1}\frac{e^{-j2\pi R_{\mathrm{P},(m,l)}/\lambda}}{2\pi R_{\mathrm{P},(m,l)}},\tag{2.32}$$
where M is the number of array elements, L is the number of point sources modeling a single element aperture,
$s_{\mathrm{tx},m}$ is the excitation signal of the m-th element, $\mathbf{r}_{(m,l)}$ is the position of the l-th point source of the m-th array
element, and $R_{\mathrm{P},(m,l)} = \|\mathbf{r}_\mathrm{P} - \mathbf{r}_{(m,l)}\|_2$ is the Euclidean distance from $\mathbf{r}_{(m,l)}$ to the evaluation point $\mathbf{r}_\mathrm{P}$.


Figure 2.11: In the steered case, i.e. θ = −50◦ in the example, the element directivity, resulting from its
aperture size (Dap = 1 λ), causes an increase in MSLL and constrains the maximum steering
angle.

The wavefield of an (8 × 8)-(λ/2)-array geometry demonstrates the effects of the extended element aperture
with Dap,el = 1 λ on the resulting beam pattern compared to the same array geometry consisting of single
point sources only (Fig. 2.11). In the (0◦ , 0◦ )-steered case, the side lobes in the far-field beam pattern are
reduced according to the directivity of the single element aperture, which is a favorable effect since the MSLL
is thereby lowered. However, if the main lobe is steered towards the periphery, it is attenuated due to the
single element directivity, while side lobes pointing towards the center are attenuated less or not at all.
As a result, there is a reduction in the radiated intensity of the main lobe in addition to the increase in MSLL,
so that the spatial filter is less direction-selective overall. Furthermore, the peak of the main lobe at high
steering angles deviates increasingly from the predefined steering direction, which consequently limits the
effective maximum steering angle. In summary, the resulting far-field beampattern consists of a multiplication
of the respective complex-valued normalized far-field beampatterns of the steered point-source array geometry
and that of a single non-steered extended element centered at the origin. This relation is referred to as pattern
multiplication [42], [43].
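Pattern multiplication can be verified numerically. The following sketch (NumPy assumed) uses simplified 1D line elements sampled by point sources, for which the far-field factorization into array factor and element factor holds exactly:

```python
import numpy as np

# Pattern multiplication check for a (1 x 8) lambda/2 ULA whose elements
# are short line apertures sampled by point sources (lambda = 1).
u = np.sin(np.linspace(-np.pi / 2, np.pi / 2, 721))   # direction sines
k = 2 * np.pi                                         # wavenumber

xm = (np.arange(8) - 3.5) * 0.5        # element center positions (lambda)
xl = np.linspace(-0.35, 0.35, 8)       # point sources of a single element

def far_pattern(xs):
    """Normalized far-field pattern of point sources at positions xs."""
    return np.exp(1j * k * np.outer(u, xs)).sum(axis=1) / len(xs)

full = far_pattern((xm[:, None] + xl[None, :]).ravel())   # all 64 sources
product = far_pattern(xm) * far_pattern(xl)               # AF x element
# full and product agree to machine precision
```

The factorization is exact here because every element is an identically shaped, identically oriented copy shifted to its center position.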

2.3 Receive beamforming


In the previous sections, the characteristics of the steered transmission of high-intensity waves in a specific
direction or to a specific spatial position, i.e. transmit beamforming, have been examined. The effects of
the array aperture and element size as well as the inter-element spacing on the spatial filter response have

been clarified. The following section covers spatial filtering when receiving, i.e. the selective reception of
waves from a specific direction or spatial position, which is referred to as receive beamforming. Due to the
reciprocity of transmission and reception [46], [51], [54], the effects on the spatial filter characteristics caused
by the aperture and element size, as well as the inter-element spacing, also apply here. The main difference
from transmission is that in reception, beamforming is not realized physically by the superposition of waves
in space, but computationally by the superposition of waves using array signal processing. The principles
of basic receive beamforming using array signal processing are explained hereafter, first for unfocused and
subsequently for focused receive beamforming.

2.3.1 Unfocused receive beam steering


A (1 × 8)-(λ/2)-array geometry of point receivers with infinitely small aperture size is considered (Fig. 2.12). A
planar CW wave s0 impinges on the array geometry from the direction (θP , ϕP ), e.g., created by a point source
in the far-field. The signals received by the elements differ in time and phase due to the path differences ∆Rm
depending on the direction of arrival. The time and phase delays of the signal of the m-th element are given
by
1 2π
∆Trx,m = − · ∆Rm (θP , ϕP ), and ∆φrx,m = − ∆Rm (θP , ϕP ), (2.33)
c λ
where the path difference to the m-th element is

∆Rm (θP , ϕP ) = xm sin(θP ) cos(ϕP ) + ym sin(ϕP ). (2.34)

Thus, the signal received by the m-th element equals to

$$s_{\mathrm{rx},m}(t) = s_0(t - \Delta T_{\mathrm{rx},m}) = e^{-j\Delta\varphi_{\mathrm{rx},m}}\cdot s_0(t),\tag{2.35}$$

where s0 (t) = A0 ej2πf0 t is the excitation signal of the point source. The steering vector for far-field point
sources afar ∈ CM ×1 [48] is defined, whose m-th entry is given by

$$a_{\mathrm{far},m}(\theta_\mathrm{P}, \phi_\mathrm{P}) = e^{-j\Delta\varphi_{\mathrm{rx},m}} = e^{j\frac{2\pi}{\lambda}\left(x_m\sin(\theta_\mathrm{P})\cos(\phi_\mathrm{P}) + y_m\sin(\phi_\mathrm{P})\right)},\tag{2.36}$$

so that the received signal vector srx ∈ CM ×1 is expressed by

srx (t) = afar (θP , ϕP ) · s0 (t). (2.37)

If the signals received by all elements are simply summed up, the resulting signal is given by
$$p_\mathrm{rx}(t) = \sum_{m=0}^{M-1} s_{\mathrm{rx},m}(t) = \sum_{m=0}^{M-1} e^{-j\Delta\varphi_{\mathrm{rx},m}(\theta,\phi)}\, s_0(t) = \sum_{m=0}^{M-1} e^{j\frac{2\pi}{\lambda}\left(x_m\sin(\theta_\mathrm{P})\cos(\phi_\mathrm{P}) + y_m\sin(\phi_\mathrm{P})\right)}\, s_0(t).\tag{2.38}$$

Clearly, the amplitude of the summed signal is maximized to A = M A0 if the signals of all elements are
in-phase, which is the case for a wave impinging from direction (θP , ϕP ) = (0◦ , 0◦ ). Furthermore, the summed
amplitude is reduced if the signals have different phases, i.e. the impinging wave originates from other
directions. In conclusion, by simply summing all element signals, waves arriving from (0◦ , 0◦ ) are amplified
and waves from other directions are attenuated. Thus, a direction-selective spatial filter for receiving waves is
realized.
If a wave sequentially impinges on the array from different directions (θP , ϕP ) from the far-field and the
amplitude of the summed signal A is plotted over the corresponding direction for each case, the spatial


Figure 2.12: By simply summing all array signals received, frontally impinging waves are amplified, whereas
for other directions, they are attenuated in the summed signal, i.e. spatially filtered for direc-
tion θ0 = 0◦ .

filter response, i.e. the far-field receive beam pattern, is obtained. This far-field receive beam pattern is
equivalent to the previously considered far-field transmit beam pattern when transmitting in
direction (θ0 , ϕ0 ) = (0◦ , 0◦ ).
As in transmit beamforming, the direction of the spatial filter can be altered as well, in order to filter for
waves arriving from other directions. For this, the array signals received must be time- or phase-delayed so
that they are in-phase for an impinging wave from the defined filter direction before they are subsequently
summed (Fig. 2.13). In order to reverse the phase shifts for a specific filter direction (θ0 , ϕ0 ), the additional
time or phase delays required for the m-th element signal are given by

∆Tshift,m (θ0 , ϕ0 ) = −∆Trx,m (θ0 , ϕ0 ) and ∆φshift,m (θ0 , ϕ0 ) = −∆φrx,m (θ0 , ϕ0 ). (2.39)

Thus, the vector for the additional compensating phase shifts, i.e. the far-field beamforming vector [48],
w∗far ∈ CM ×1 is defined to be
w∗far (θ0 , ϕ0 ) = a∗far (θ0 , ϕ0 ), (2.40)
whose m-th entry is therefore

xm sin(θ0 ) cos(ϕ0 )+ym sin(ϕ0 )
(︁ )︁
w∗far,m (θ0 , ϕ0 ) = e−j λ , (2.41)

so that the resulting phase-shifted signals $\mathbf{s}_\mathrm{rx,shift}$ for the filter direction $(\theta_0, \phi_0)$ and a far-field plane wave
impinging from $(\theta_\mathrm{P}, \phi_\mathrm{P})$ are given by
$$\mathbf{s}_\mathrm{rx,shift}(t) = \mathbf{w}^*_\mathrm{far}(\theta_0, \phi_0) \odot \mathbf{s}_\mathrm{far}(\theta_\mathrm{P}, \phi_\mathrm{P}, t) = \mathbf{w}^*_\mathrm{far}(\theta_0, \phi_0) \odot \mathbf{a}_\mathrm{far}(\theta_\mathrm{P}, \phi_\mathrm{P})\cdot s_0(t).\tag{2.42}$$
Based on this, the subsequently summed signal results in
$$p_\mathrm{rx}(\theta_\mathrm{P}, \phi_\mathrm{P}, \theta_0, \phi_0) = \sum_{m=0}^{M-1} w^*_{\mathrm{far},m}(\theta_0, \phi_0)\, a_{\mathrm{far},m}(\theta_\mathrm{P}, \phi_\mathrm{P})\cdot s_0(t) = \mathbf{w}_\mathrm{far}^H(\theta_0, \phi_0)\cdot\mathbf{a}_\mathrm{far}(\theta_\mathrm{P}, \phi_\mathrm{P})\cdot s_0(t).\tag{2.43}$$


Figure 2.13: The direction of the spatial filtering is varied by time- or phase-shifting the array signals before the
summation, so that their phases are equalized for an impinging wave, e.g. from θP = θ0 = 20◦ .

In conclusion, due to the additional phase shifts, the amplitude of the beamformed signal is maximized if
the filter direction of the beamforming vector w∗far (θ0 , ϕ0 ) matches the impinging direction of the steering
vector afar (θP , ϕP ). Therefore, the resulting main lobe of the far-field beam pattern is shifted to (θ0 , ϕ0 ) [47],
[48].
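The delay-and-sum principle of (2.36)-(2.43) can be sketched for the horizontal plane (NumPy assumed; λ = 1, function names illustrative):

```python
import numpy as np

# Far-field delay-and-sum receive beamforming for a (1 x 8) lambda/2 ULA
# in the horizontal plane (phi = 0).
M = 8
xm = (np.arange(M) - (M - 1) / 2) * 0.5

def a_far(theta_deg):
    """Far-field steering vector (2.36) for phi = 0."""
    return np.exp(2j * np.pi * xm * np.sin(np.radians(theta_deg)))

def summed_amplitude(theta_p, theta_0):
    """|w_far^H a_far| / M: normalized summed output for a plane wave
    from theta_p with the spatial filter steered to theta_0."""
    return np.abs(np.conj(a_far(theta_0)) @ a_far(theta_p)) / M

matched = summed_amplitude(20, 20)      # filter matches wave direction -> 1
mismatched = summed_amplitude(-50, 20)  # attenuated by more than 10 dB
```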

2.3.2 Focused receive beam steering
If the wave-emitting point source is located in the near field, the resulting phase shifts between the array
signals srx (t) depend on the specific individual distances from each element rm to the position of the point
source rP (Fig. 2.14).


Figure 2.14: In order to ideally spatially filter a spherical wave originating from the near field, the focal point r0
is required to match not only the direction but also the distance of the source position (a). If the
focus point is positioned in the near field, waves from the far field are suppressed even if they
match in direction (b). Thus, the spatial filter response is direction- and distance-selective (c),(d).

Therefore, the array signals received are given by

srx (t) = a(rP )s0 (t), (2.44)

where the m-th entry of the steering vector $\mathbf{a} \in \mathbb{C}^{M\times 1}$ is given by
$$a_m(\mathbf{r}_\mathrm{P}) = \frac{e^{-j\frac{2\pi}{\lambda}R_{\mathrm{P},m}}}{2\pi R_{\mathrm{P},m}},\tag{2.45}$$

and RP,m is the Euclidean distance from the m-th element to the point source, that is
$$R_{\mathrm{P},m} = \|\mathbf{r}_{\mathrm{P},m}\|_2 = \|\mathbf{r}_\mathrm{P} - \mathbf{r}_m\|_2 =\tag{2.46}$$
$$\sqrt{\big(R_\mathrm{P}\sin(\theta_\mathrm{P})\cos(\phi_\mathrm{P}) - x_m\big)^2 + \big(R_\mathrm{P}\sin(\phi_\mathrm{P}) - y_m\big)^2 + \big(R_\mathrm{P}\cos(\theta_\mathrm{P})\cos(\phi_\mathrm{P})\big)^2}.\tag{2.47}$$

Consequently, in order to compensate the phases of the array signals, not only a filter direction must be
defined, but also a filter distance, i.e. a focal point r0 , that coincides with the position of the point source rP .
The m-th entry of the focused beamforming vector $\mathbf{w}^* \in \mathbb{C}^{M\times 1}$ is given by [44], [49]
$$w^*_m(\mathbf{r}_0) = e^{-j\frac{2\pi}{\lambda}\left(R_0 - R_{0,m}\right)},\tag{2.48}$$
where $R_0 = \|\mathbf{r}_0\|_2$ and $R_{0,m} = \|\mathbf{r}_0 - \mathbf{r}_m\|_2$ are the distances from the focus point to the origin and to the
m-th element, respectively. The summed signal for the point source position rP and the focus point position r0
results in
$$p_\mathrm{rx}(\mathbf{r}_\mathrm{P}, \mathbf{r}_0) = \sum_{m=0}^{M-1} w^*_m(\mathbf{r}_0)\, a_m(\mathbf{r}_\mathrm{P})\, s_0(t) = \mathbf{w}^H(\mathbf{r}_0)\,\mathbf{s}_\mathrm{rx}(\mathbf{r}_\mathrm{P}) = \mathbf{w}^H(\mathbf{r}_0)\cdot\mathbf{a}(\mathbf{r}_\mathrm{P})\cdot s_0(t).\tag{2.49}$$
By comparing the model for the resulting summed signal of receive beamforming (2.49) with the model for
the wavefield in transmit beamforming (2.23) the reciprocity [43], [49], [54] is highlighted. In fact, both
the transmit steering and beamforming vectors are identical to their respective receive counterparts and are
only utilized differently. For transmit beamforming, the beamforming vector w∗ (r0 ) is used to generate the
phase-shifted excitation signals, whereas in reception, it is used to additionally phase-shift the signals received.
The transmit steering vector a(rP ) is used to obtain the wavefield for a specific point in space rP , whereas
in reception, it is used to model the received signals for a point source located at that defined point. Next,
the spatial filter responses outside the horizontal plane and the corresponding representation in the θϕ- and
uv-domain are considered.

2.3.3 Two-dimensional spatial filter response in the uv-domain


In the previous sections, the transmit and receive spatial filter responses have been visualized either as a
far-field beam pattern over θ or in the horizontal xz-plane, so that the elevation direction has only been
considered for ϕ = 0◦ . However, the spatial filter responses are indeed volumetric, i.e. for each point in space
or for each point source position rP a resulting beam pattern value can be determined.
In the far field or, in general, for a defined distance R, the beam pattern is commonly illustrated for both
spatial angles θ and ϕ, and is therefore spanned on a hemisphere (Fig. 2.15). In the θϕ-diagram, the
hemisphere is stretched onto a 2D plane, which shows the corresponding values of the beam pattern at the
correct angles. For the coordinate system used, i.e. azimuth over elevation, the latter is the dominant
angle (Section 2.1), so that for high ϕ angles, there is a widened representation of the side lobes along the
θ-axis, comparable to the representation of Antarctica on a world map.
A further option for visualizing the beam pattern hemisphere on a 2D plane is the use of uv-coordinates [40],
also referred to as direction cosine space or sine space [55], that is

$$u = \sin(\theta)\cos(\phi)\tag{2.50}$$
$$v = \sin(\phi),\tag{2.51}$$

where u and v correspond to valid directions for $u, v \in \mathbb{R}$ with $u^2 + v^2 \le 1$. The uv-diagram representation
is an orthographic projection of the hemisphere onto a flat plane, which corresponds to a frontal view of
the inner hemisphere in the positive z-direction [40]. The advantage of the uv-diagram is that the beam
widths of the main and side lobes do not widen in dependence on the beamforming direction, providing
easy comparability [45], [55]. Furthermore, the uv-domain representations can be used for deconvolution
approaches, as described in more detail in Section 2.4.6 and Chapter 7. One disadvantage is that for a given
position in the beam pattern in the uv-domain, the corresponding steering direction (θ, ϕ) is not directly
readable. Throughout this thesis, both the uv-representation and the θϕ-representation are used, the latter
in particular for more application-focused content.
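The mapping (2.50)-(2.51) and its inverse on the front hemisphere can be sketched in plain Python (function names are illustrative):

```python
import math

def to_uv(theta, phi):
    """Direction cosines (2.50)-(2.51); theta = azimuth, phi = elevation
    in radians, elevation-dominant convention as in the thesis."""
    return math.sin(theta) * math.cos(phi), math.sin(phi)

def from_uv(u, v):
    """Inverse mapping back to (theta, phi) on the front hemisphere;
    valid for u^2 + v^2 <= 1 and undefined at |phi| = 90 deg, where
    cos(phi) = 0."""
    phi = math.asin(v)
    theta = math.asin(u / math.cos(phi))
    return theta, phi

u, v = to_uv(math.radians(30), math.radians(45))   # (0.354, 0.707)
theta, phi = from_uv(u, v)                         # back to 30 and 45 deg
```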


Figure 2.15: Various common sectional views of the volumetric spatial filter response and their three-
dimensional location including the orthographic projection of the θϕ-hemisphere for a specific
fixed distance R in the far-field onto the uv-plane. The uv-representation features the character-
istic, that a shift of the beamforming direction does not affect the main lobe and side lobe width,
preserving their relative positions and shapes, in contrast to the θϕ-view.

2.4 Pulse-echo image formation


In this section, the pulse-echo image formation is outlined. Here, a predefined region of interest is illuminated
with one or more time-limited pulses, followed by the induced echoes being received by the array elements.
Based on the above introduced transmit and receive beamforming techniques in conjunction with the time-of-
flight method, the received echo signals are mapped to the corresponding points in space. After the details of

the imaging process are described, the specific characteristics of the resulting wave-based images and their
quality metrics, i.e. contrast, range and angular resolution, are elaborated. Subsequently, the estimation
of these metrics using the point spread function and its impact on the image composition are emphasized.
Finally, the benefits and limitations of two different imaging methods are compared, which are based on one-
and two-way beamforming, respectively.

2.4.1 Discretization of the region-of-interest


Before starting the image formation process, first, the region-of-interest (ROI) [44], [56] must be de-
fined (Fig. 2.16), consisting of the field-of-view (FoV) D and the range-of-view (RoV) R [31], [57].

[Figure: (a) uniform θϕ-sampling of an example object, with higher sampling density towards the center of the front view; (b) uniform uv-sampling with scan lines up to Rmax and a uniformly sampled front view]
Figure 2.16: Discretization of a 2D and 3D ROI based on uniform θϕ-sampling (a) and uniform uv-sampling
(b). The sampling density of uniform uv-sampling is reduced near the periphery, such that the
front view, i.e. the orthographic projection is uniformly sampled.

The RoV is specified by the minimum and maximum radial distances to be imaged (Rmin and Rmax ), which
depend on the defined signal acquisition time window (Tmin and Tmax ), where the pulse transmission defines
the reference starting time. The signal time T and the radial distance R are proportional, that is
c·T
R= , (2.52)
2
where c is the wave propagation time. The discrete RoV set is obtained by sampling the received signal within
the signal acquisition time window into N samples with the sample rate fs , such that the n-th entry of R is
given by
Rn = n · c(Tmax − Tmin)/(2(N − 1)) + c·Tmin/2 = n · (Rmax − Rmin)/(N − 1) + Rmin, for n ∈ {0, 1, . . . , N − 1}, (2.53)

where N is the total number of samples acquired, that is
N = ⌊(Tmax − Tmin) · fs⌋ = ⌊(2/c) · (Rmax − Rmin) · fs⌋, (2.54)
c

where ⌊.⌋ denotes rounding down to the nearest integer value [57].
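As a sketch of (2.52)–(2.54), the following snippet maps the defined RoV to discrete radial distances per sample (the helper name and the example parameter values are assumptions, not taken from the text):

```python
import numpy as np

def rov_samples(r_min, r_max, fs, c=343.0):
    """Discretize the range-of-view per (2.52)-(2.54).

    Returns the sample count N and the radial distance R_n assigned to
    each sample of the acquisition window. The default c is the speed
    of sound in air in m/s (assumption).
    """
    # (2.54): N = floor((T_max - T_min) * fs) = floor(2/c * (R_max - R_min) * fs)
    n_samples = int(np.floor(2.0 / c * (r_max - r_min) * fs))
    n = np.arange(n_samples)
    # (2.53): linear mapping of the sample index n to the radial distance R_n
    r = n * (r_max - r_min) / (n_samples - 1) + r_min
    return n_samples, r

# Example: image from 0.1 m to 1 m at fs = 500 kHz in air
N, R = rov_samples(0.1, 1.0, 500e3)
```

The first and last entries of R recover Rmin and Rmax, as required by (2.53).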
The FoV consists of a discrete set of K directions (θk , ϕk ) considered for echolocation. In order to obtain a
spatial sampling with uniform angular distances, an upper and lower boundary including an angular step size
per spatial direction is specified. Therefore, the k-th entry of the uniformly sampled FoV D in the θϕ-domain
is given by [31], [57]
Dk = (θk, ϕk)^⊺ = (θmin + (k mod Kθ) · θstep, ϕmin + ⌊k/Kθ⌋ · ϕstep)^⊺, for k ∈ {0, 1, . . . , K − 1}, (2.55)

where Kθ and Kϕ are the numbers of unique directions in the θ- and ϕ-direction, respectively, i.e.

Kθ = (θmax − θmin)/θstep + 1, and Kϕ = (ϕmax − ϕmin)/ϕstep + 1, (2.56)

such that the total number of directions is K = Kϕ Kθ . In conjunction with the defined RoV, the discrete
directions of the FoV result in so-called scan lines which pass through the volume of the ROI [44], [56].
Apart from uniform sampling in the θϕ-domain, an alternative discretization of the FoV is obtained by
sampling with uniform increments within the uv-domain in the same way as in (2.55) and (2.56) using
umin , ustep , umax and Ku , as well as the equivalent v counterparts. However, the discrete uv-directions
correspond to valid spatial directions only for √(u² + v²) ≤ 1, with u, v ∈ R [45].
The uniform θϕ-sampling provides a homogeneous sampling of the ROI in Cartesian space, whereas for the
uv alternative, the sampling density is decreased towards the periphery. However, in the frontal view of the
spanned hemisphere in positive z-direction, i.e. its orthographic projection, the uv-sampling is homogeneous,
whereas the θϕ-sampling density increases with increasing ϕ-direction.
Considering that the main lobes in transmit and receive become wider as the steering angle increases, which
results in a degraded effective peripheral angular resolution (Section 2.4.4), the reduction in sampling density
of the uniform uv-sampling is favorable [45]. Another advantage is that the images generated are directly
provided in uv-space, and thus, can be used for deconvolution approaches without additional processing
(Chapter 7). For simplicity, uniform θϕ-sampling for describing the imaging process and characteristics is
used in the remainder of this chapter.
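The two FoV discretizations of (2.55) and (2.56), including the uv-validity constraint, can be sketched as follows (function names and example step sizes are assumptions):

```python
import numpy as np

def fov_theta_phi(th_min, th_max, th_step, ph_min, ph_max, ph_step):
    """Uniform theta-phi sampling of the FoV per (2.55) and (2.56).

    Returns a (K, 2) array of directions in degrees, ordered such that
    theta varies fastest (k mod K_theta), as in (2.55).
    """
    k_th = int(round((th_max - th_min) / th_step)) + 1   # (2.56)
    k_ph = int(round((ph_max - ph_min) / ph_step)) + 1
    k = np.arange(k_th * k_ph)
    theta = th_min + (k % k_th) * th_step
    phi = ph_min + (k // k_th) * ph_step
    return np.stack([theta, phi], axis=1)

def fov_uv(u_step):
    """Uniform uv-sampling; directions with u^2 + v^2 > 1 are invalid."""
    u = np.arange(-1.0, 1.0 + 1e-12, u_step)
    uu, vv = np.meshgrid(u, u)
    valid = uu**2 + vv**2 <= 1.0
    return uu, vv, valid

D = fov_theta_phi(-45, 45, 5, -45, 45, 5)   # K = 19 * 19 = 361 directions
uu, vv, valid = fov_uv(0.25)                # corners of the uv grid are invalid
```

The validity mask implements the unit-radius constraint of the uv-domain; only these entries correspond to directions on the θϕ-hemisphere.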

2.4.2 Image formation process


After the discretization of the ROI, the image formation process is initiated. Here, the objective is to assign
a value to each discretized spatial point within the ROI, that ideally corresponds to the magnitude of an
echo originating from that position. As a result, a 2D or 3D image is obtained in which passive reflective
surfaces are spatially mapped. In order to form the complete image, each point in space must be illuminated
to induce echoes of possible scatterers. Afterwards, the received array signals containing the echo signals
must be converted into signals that provide the echo information in dependence of their origin distance R,
azimuth θ and elevation ϕ direction.
The techniques to spatially filter outgoing and incoming waves and thereby identify a corresponding origin
direction have already been described in the previous sections, i.e. transmit and receive beamforming. In
order to obtain the origin distance of an echo, the time-of-flight (ToF) is used, i.e. the transit time from the
transmission of the signal to the reception of its echo, according to the relation in (2.52). However, so far,

only mono-frequency CW signals have been considered, whose frequency and envelope are time invariant
and therefore do not provide unambiguous ToF information. Therefore, a time-dependent modulation of the
transmitted signal is required to obtain distance information, e.g., by varying the frequency, also referred to as
chirping [51], or by varying the amplitude. Throughout this work, amplitude modulation is used, specifically
by sending a time-limited pulse, since only narrow-band elements for wave generation are utilized. The basic
transmit signal thus results in
s0 (t) = A(t)ej2πf0 t , (2.57)
where A(t) is the time-dependent amplitude envelope.
Previous considerations demonstrated that beamforming can be achieved by time shifts as well as by
equivalent phase shifts, whereas both methods provide the same result for mono-frequency CW signals.
However, if the time-dependent amplitude envelope is used, the true-time-shifted and phase-shifted signals
differ, since the latter shifts only the carrier signal phase of each element and not the envelope position
itself [48].
Assuming that A(t) varies only slowly, i.e. the pulse is broad in time and its frequency response is narrow-
band, such that the element signals differ only slightly in the envelope but mainly in the phase, the following
approximation for the signal received by the m-th element srx,m (t) holds
srx,m(t) = s0(t − τ − ∆Tm) = A(t − τ − ∆Tm) · e^(j2πf0(t−τ−∆Tm)) (2.58)
≈ s0(t − τ) · e^(−j∆φm) = A(t − τ) · e^(j(2πf0(t−τ)−∆φm)), (2.59)
where τ is the time delay due to the ToF referenced to the array center, ∆Tm is the m-th element specific time
delay and ∆φm = 2πf0 ∆Tm is the corresponding specific phase delay. This assumption is referred to as a
narrow-band condition, which is specifically given by [45]
Bpulse · ∆Tmax ≪ 1, (2.60)
where Bpulse is the −3 dB bandwidth of the pulse, which affects its minimum temporal length, and ∆Tmax is
the absolute maximum time shift between two array element signals, thus, depending on the overall aperture
size. If the narrow-band condition is satisfied, the time shifts can be substituted by equivalent complex phasors
for beamforming [45]. Therefore, arbitrary phase shifting is possible without being constrained by the discrete
time samples and without requiring additional signal interpolation. Otherwise, if the narrow-band condition
is not fulfilled, e.g. when using broad-band chirps or very large aperture sizes, a broad-band beamforming
method is required, such as true-time-delay beamforming in the time domain, i.e. delay-and-sum (DAS), or
its frequency domain counterpart, where each frequency f is phase shifted accordingly, that is
srx,m(t) = s0(t − τ − ∆Tm) ∘−• ŝ0(f) · e^(−j2πf(τ+∆Tm)), (2.61)
instead of requiring only the carrier frequency f0 for the complex phasors [47].
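A quick numeric check of the narrow-band condition (2.60) illustrates when phase-shift beamforming suffices; the aperture and bandwidth values below are illustrative assumptions, not taken from the text:

```python
def max_time_shift(aperture_m, c=343.0):
    """Worst-case inter-element travel-time difference dT_max across the
    aperture (endfire incidence), as used in the narrow-band condition (2.60)."""
    return aperture_m / c

# Assumed example: 8 cm aperture in air, 4 kHz pulse bandwidth.
# B_pulse * dT_max ~ 0.93 is NOT << 1, so phase-shift beamforming would be
# inaccurate and true-time-delay (DAS) beamforming is required.
product = 4e3 * max_time_shift(0.08)

# A 400 Hz pulse bandwidth over the same aperture satisfies (2.60).
product_nb = 400 * max_time_shift(0.08)
```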
In the following, the specific steps of the image formation process are described. The first step is to illuminate
the discrete points in space along the scan lines in the ROI defined. Various imaging methods exist, that differ
in the number of simultaneously illuminated and evaluated scan lines, as well as in the number of transmitting
and receiving elements, where each method pursues specific optimization goals, such as frame rate, viewing
range, angular resolution or system complexity [44], [49], [56].
In order to clarify the basic concept and characteristics of the image formation, first, a straightforward
method called multi line acquisition (MLA) is considered, where a single element transmits a uniform
hemispherical pulse, illuminating all scan lines with a single firing event [56] (Fig. 2.17). Subsequently, all
elements receive the induced echo signals within the defined acquisition time window, where the element
signal matrix Srx ∈ CM ×N for an example reflector at rP is given by
Srx = a(rP) · s0(n, rP) = a(rP) · (A(nTs − τ) ⊙ e^(j2πf0(nTs−τ))), (2.62)

where n = [0, 1, . . . , N − 1] is the sample index vector, s0 (n, rP ) ∈ C^(1×N) is the time signal vector including
the echo of the example reflector, Ts = 1/fs is the sampling period, A(t) is the basic transmit signal envelope,
τ = 2∥rP ∥2 /c is the ToF referenced to the array center and a(rP ) is the steering vector (2.45). In a multi-reflector
scene containing G reflectors, the model of the received array signals is extended to
Srx = A(rP,0, . . . , rP,G−1) · S0(n, rP,0, . . . , rP,G−1) = Σ_{g=0}^{G−1} a(rP,g) · s0(n, rP,g), (2.63)

where A = [a(rP,0), . . . , a(rP,G−1)] ∈ C^(M×G) is the steering matrix and S0 = [s0(n, rP,0); . . . ; s0(n, rP,G−1)] ∈ C^(G×N) is the time domain signal matrix containing the echo signals of all reflectors.

[Figure: processing chain — hemispherical pulse transmission, spatial filtering of all scan lines using the received signals, positioning of the spatially filtered signals on the corresponding scan lines, envelope formation, linear interpolation with more scan lines, complete color-coded 2D scan (top view)]

Figure 2.17: Procedure of the pulse-echo image formation using the multi line acquisition method, requiring
only a single firing event to generate volumetric 3D or a horizontal 2D image, as in the example.
The latter is also referred to as brightness-scan (B-scan) in the ultrasound imaging context.

Next, for imaging in the far field, the received signal matrix Srx is spatially filtered for all K defined spatial directions of the FoV D with the beamforming matrix W^H_far ∈ C^(K×M), that is

P = W^H_far · Srx, (2.64)

where P ∈ C^(K×N) contains the spatially filtered signals and the k-th row of W^H_far is given by W^H_(k,:),far = w^H_far(θk, ϕk) (2.41). Subsequently, the envelopes Penv of each spatially filtered signal are obtained by forming the absolute value of each entry, that is

Penv,(k,n) = |P(k,n)|. (2.65)
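The spatial filtering and envelope formation steps of (2.62)–(2.65) can be sketched for a far-field MLA setup as follows; this is a minimal sketch in which the helper name, array geometry and signal parameters are assumptions:

```python
import numpy as np

def mla_image(elem_xy, srx, f0, fov_deg, c=343.0):
    """Far-field MLA image formation sketch per (2.64) and (2.65).

    elem_xy : (M, 2) element positions in meters
    srx     : (M, N) complex received signals (analytic representation)
    fov_deg : (K, 2) scan directions (theta, phi) in degrees
    Returns the (K, N) envelope image P_env.
    """
    lam = c / f0
    th, ph = np.deg2rad(fov_deg).T
    u = np.sin(th) * np.cos(ph)          # direction cosines
    v = np.sin(ph)
    # rows of W^H: conjugated steering phasors for each scan direction
    wh = np.exp(2j * np.pi / lam * (np.outer(u, elem_xy[:, 0])
                                    + np.outer(v, elem_xy[:, 1]))) / len(elem_xy)
    p = wh @ srx                          # (2.64): spatial filtering
    return np.abs(p)                      # (2.65): envelope formation

# Example: 8-element half-wavelength ULA, plane-wave echo from theta = 20 deg
f0, c = 40e3, 343.0
lam = c / f0
x = (np.arange(8) - 3.5) * lam / 2
elem_xy = np.stack([x, np.zeros(8)], axis=1)
a = np.exp(-2j * np.pi / lam * x * np.sin(np.deg2rad(20)))
srx = np.outer(a, np.exp(2j * np.pi * f0 * np.arange(64) / 500e3))
fov = np.stack([np.arange(-60, 61, 5.0), np.zeros(25)], axis=1)
env = mla_image(elem_xy, srx, f0, fov)
```

The sign convention of the steering phasors follows (2.69); the envelope peaks at the scan line pointing towards the reflector.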

Each entry of the spatially filtered envelopes Penv,(k,n) is then color coded and positioned according to the
corresponding filtered spatial direction D k and radial distance Rn [31]. The non-evaluated gaps between

the scan lines are linearly interpolated [44]. In order to visualize a 3D image, the magnitude values are
additionally alpha-coded, so that echo-free regions become transparent for revealing echoes located behind
them [31]. All together, this procedure generates a 2D or 3D image in which the originating positions of
scatterers are indicated by color and optionally by transparency. However, the wave-based image formation
results in typical undesirable irregularities, which are considered next.
First, contrary to intuition, the true distance of a reflecting object is not located at the radial distance of the
maximum magnitude in the image but at the onset slope of the pulse, if narrow-band transmit pulses are
used. Therefore, without additional pre-processing of the signals for compensation, there is an offset between
the detection maximum and the true origin position of the reflection, which is located closer to the array.
Second, an infinitely small point reflector is still rendered in the image with a certain radial depth and
angular width. The radial depth of the representation depends on the temporal length of the transmitted and
reflected pulse, while the angular width depends primarily on the main lobe of the spatial filter response.
Therefore, the depth and width of the detection does not provide clear information on the true size of the
reflecting object [44].
Third, non-zero magnitude regions adjacent to the main detection are formed in the image, although there
are no reflectors within these areas. These errors are caused by the side lobes of the spatial filter response.
For example, if the spatially filtered signal is generated for a specific reflector-free direction, the main lobe of
this particular spatial filter response points in the direction to be scanned and its side lobes point in other
directions. If a side lobe points in the direction of an impinging wave, its magnitude is not fully suppressed,
which is therefore incorrectly assigned to the scan direction. The resulting non-zero magnitude is relative to
the magnitude of the impinging wave attenuated by the corresponding side lobe level. Therefore, these regions
are referred to as side lobe artifacts [44], [56]. All in all, these effects complicate the reflector detection,
particularly when there are multiple reflectors present in the ROI. The metrics for quantifying these image
characteristics are described in the following.

2.4.3 Range resolution


If two equal-strength point reflectors are located radially one behind the other within the ROI, both generate
echoes which are time-delayed according to their respective radial distances, which accumulate to a composite
echo signal. Assuming that the radial distance between the reflectors is sufficiently small, the two echo pulses
overlap so that they are not imaged as two separate pulses at all. The condition for the separation of the two
object is satisfied if each pulse creates a distinct local magnitude maximum in the image [51]. The minimum
distance required for separability is defined as the range resolution ∆Rmin , also referred to as axial resolution
in a medical imaging context [44], [58].
Neglecting the distance-dependent attenuation and shadowing effects of the closer reflector, which attenuates
the illumination and the echo of the reflector behind it, the range resolution is primarily determined by the
transmit pulse length and shape. Therefore, when using narrow-band elements, the transmit pulse is ideally
as short as possible, but sufficiently broad, so that the initial transient is not canceled early for reaching the
peak transmit amplitude [44], [51].
If broad-band elements are utilized and an additional frequency modulation of the transmit pulse is applied,
e.g. chirping, the range resolution can be further improved by so-called pulse compression [51]. For this,
the received signals are pre-processed by a matched filter based on the chirped transmit pulse. As a result,
the length of the filtered receive pulse is compressed, where a higher chirping bandwidth leads to a more
effective compression [54].
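The separability criterion of this section can be illustrated with a simplified simulation in which Gaussian pulse envelopes are superposed non-coherently (a simplification of the true coherent superposition; all names and values are assumptions):

```python
import numpy as np

def echo_envelope(distances, r_axis, pulse_fwhm_m):
    """Composite envelope of superposed echo pulses (simplified sketch:
    Gaussian envelopes of the given spatial FWHM, superposed without
    carrier interference)."""
    sigma = pulse_fwhm_m / 2.355          # FWHM to Gaussian sigma
    env = np.zeros_like(r_axis)
    for r in distances:
        env += np.exp(-0.5 * ((r_axis - r) / sigma) ** 2)
    return env

def count_maxima(env):
    """Separability criterion of Section 2.4.3: each reflector must form
    a distinct local magnitude maximum."""
    return int(np.sum((env[1:-1] >= env[:-2]) & (env[1:-1] > env[2:])))

r = np.linspace(0.0, 1.0, 2000)
separable = count_maxima(echo_envelope([0.40, 0.60], r, 0.05))   # two maxima
merged = count_maxima(echo_envelope([0.48, 0.52], r, 0.10))      # one maximum
```

With the reflector spacing below the pulse width, the two envelopes merge into a single maximum and the reflectors are no longer separable.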

[Figure: 3D renderings and radial envelope sections over R (λ); (a) two distinct maxima, separable; (b) overlapping pulses, not separable]

Figure 2.18: The pulse width determines the range resolution, i.e. the minimum radial distance required to
separate two radially adjacent reflectors. If two distinct maxima are formed in the image (a), the
reflectors are separable. In (b), the reflectors are too close to each other for separation.

2.4.4 Angular resolution


Similar to the range resolution, an angular resolution can be defined, which is often referred to as lateral
resolution in the medical imaging domain. For this, a scene where two equal-strength point reflectors are
positioned adjacent to each other at the same distance is considered. If the reflectors are too close to each
other, their echo detections overlap so that they can not be distinguished as two separate detections. Here
again, the condition for the separation is that each echo must create a distinct local magnitude maximum in
the image, i.e. there must be a local minimum between the echo maxima. The minimum angular distance
required for separation is defined as the angular resolution ∆θmin [44], [51]. Therefore, the true minimum
spatial distance between the two reflectors Dmin increases with increasing radial distance R according to the
geometric relationship
Dmin = 2R sin (0.5 · ∆θmin) . (2.66)
The angular resolution is primarily determined by the MLW of the spatial filter response [44], [51], [56]. For
clarification, the spatial filtering of the scan line exactly positioned between two reflectors is considered, such
that the main lobe of the receive beamformer points in this specific direction. If the MLW is sufficiently narrow,
the directions of arrival of the echoes from the two reflectors are not within the main lobe and therefore do
not significantly contribute to the magnitude of this scan line, thus, resulting in a minimum. However, if the
main lobe is wide, it includes the directions of arrival of the echoes, such that they are only suppressed by the
corresponding relative main lobe levels. As a result, the magnitude of the scan line between the reflectors is
the superposition of the only partly attenuated impinging echo levels. Therefore, the magnitude of the center
scan line can even be higher than the magnitude of the scan line pointing directly to a reflector. Consequently,
a minimum between the reflectors does not form in the image. Since the main lobe width widens for high
steering directions, peripherally positioned reflectors require a wider angular spacing than centrally located
ones (Section 2.2).

In order to improve the angular resolution, the main lobe width can be narrowed, which can be achieved
by increasing the array aperture and increasing the transmit frequency (2.14). However, the maximum IES
must still be maintained for URAs, so both of these alterations require an increase in the number of elements.
Alternatively to main lobe reduction, there are a number of high resolution array signal processing techniques,
such as Capon [59], MUSIC [60], ESPRIT [61], which provide higher angular resolution in receive mode
compared to conventional beamforming (CBF) considered here. In addition to the array signal processing
approach, there are methods for post-processing the image generated, e.g. deconvolution techniques [62]–
[64], which can improve the angular and range resolution to a certain extent.

[Figure: 3D renderings and angular envelope sections over θ (°); (a) two distinct maxima, separable; (b) not separable]

Figure 2.19: Two adjacent reflectors at the same distance require a minimum angular distance from each
other, so that they are separable in the image. The minimum angular separation distance
determines the angular resolution, which depends on the main lobe width of the spatial filter
response.

2.4.5 Contrast ratio


In addition to range and angular resolution, image contrast is a key factor for the detection of multiple
reflectors of varying strengths. The peak-to-peak contrast ratio CRpp is defined as the ratio between the highest
magnitude of a detection p̂0 and the highest background magnitude p̂b , that is [56]
CRpp = 20 log10 (p̂0 / p̂b). (2.67)
Side lobe artifacts affect the CRpp as they occur at the same distance and adjacent to an echo detection, with
magnitudes relative to the true echo level. By neglecting other interfering effects, such as noise or additional
strong reflections, the contrast ratio is primarily limited due to the resulting side lobe artifacts, that is
CRpp = 20 log10 (p̂0 / p̂SLL) = −MSLLdB, (2.68)

where p̂SLL is the level of the highest side lobe, thus, the CRpp corresponds to the negative MSLL in dB. If
there are several strong reflectors present in the ROI, the side lobe artifact levels can accumulate and further
reduce the CRpp [44].
For clarity, a scene is considered in which there are two reflectors at equal distances and with sufficiently
large angular spacing so that they can be separated in terms of angular resolution. However, unlike the
previous scenes, the reflectors have different reflectivities, so their echoes have different magnitudes. If the
ratio between strong and weak echo level is larger than the CRpp , the weak echo is masked by the side lobes
of the strong echo and is therefore not directly detectable, e.g. by using a threshold for segmentation. If the
threshold is set too high, the weak echo is considered to be background and is not detected. If the threshold
is set too low, side lobes are falsely classified as detections. Overall, the contrast ratio thus limits the relative
dynamic range, i.e. the ratio between the highest and lowest detectable echo level [44], [51].
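A minimal sketch of the peak-to-peak contrast ratio (2.67), with an assumed detection mask and synthetic scan-line magnitudes:

```python
import numpy as np

def contrast_ratio_db(image, detection_mask):
    """Peak-to-peak contrast ratio (2.67): ratio of the highest detection
    magnitude to the highest background magnitude, in dB."""
    p0 = image[detection_mask].max()
    pb = image[~detection_mask].max()
    return 20 * np.log10(p0 / pb)

# Assumed example: one strong echo with a -20 dB side lobe artifact; the
# artifact dominates the background and limits the CR per (2.68).
scan = np.full(64, 1e-3)
scan[30] = 1.0                      # detection maximum
scan[40] = 0.1                      # highest side lobe artifact
mask = np.zeros(64, dtype=bool)
mask[28:33] = True                  # region treated as the detection
cr = contrast_ratio_db(scan, mask)  # 20 dB, i.e. -MSLL_dB
```

Any weaker reflector whose echo level lies more than 20 dB below the strong echo would be masked by the artifact in this example.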
Similar as for angular resolution, the aforementioned signal processing and image processing techniques
can be used to improve the contrast ratio (CR) as well (Section 2.4.4) [65]. In addition, the side lobes of the
spatial filter response can be reduced by so-called apodization [44]. Here, the transmit amplitudes or receive
sensitivities of the array elements are weighted based on their spatial position using a well-known 2D window
function, e.g. Hamming, Blackman, Dolph-Chebyshev [44], [45]. However, the reduction of the side lobes
comes at the expense of the main lobe width, which widens and thus degrades the angular resolution. Apart
from that, apodization reduces the overall intensity when used in transmission.
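The trade-off between side lobe level and main lobe width under apodization can be sketched for a one-way far-field ULA pattern (element count and the Hamming window are example assumptions):

```python
import numpy as np

def psf_ula(weights, d_lam=0.5, n=4001):
    """One-way far-field PSF of a ULA over u = sin(theta), with optional
    apodization weights; d_lam is the element pitch in wavelengths."""
    m = np.arange(len(weights)) - (len(weights) - 1) / 2
    u = np.linspace(-1, 1, n)
    p = np.abs((weights * np.exp(-2j * np.pi * d_lam * np.outer(u, m))).sum(axis=1))
    return u, p / p.max()

def msll_db(p):
    """Maximum side lobe level: highest magnitude outside the main lobe,
    found by walking down both slopes of the global maximum."""
    i = int(np.argmax(p))
    lo, hi = i, i
    while lo > 0 and p[lo - 1] < p[lo]:
        lo -= 1
    while hi < len(p) - 1 and p[hi + 1] < p[hi]:
        hi += 1
    return 20 * np.log10(np.r_[p[:lo], p[hi + 1:]].max())

u, p_uni = psf_ula(np.ones(16))       # rectangular weighting: MSLL ~ -13 dB
u, p_ham = psf_ula(np.hamming(16))    # Hamming apodization: much lower MSLL
```

The Hamming-weighted pattern trades a substantially lower MSLL (and thus a higher achievable CRpp) for a wider main lobe, i.e. a degraded angular resolution.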

[Figure: 3D renderings and angular envelope sections over θ (°); (a) both reflectors detected; (b) weak reflector below the CRpp, not detected]

Figure 2.20: The contrast ratio limits the detectability in scenes with a high relative dynamic range, i.e. when
the echo levels of reflectors significantly differ. Side lobe artifacts degrade the contrast ratio,
particularly if accumulated from multiple sources.

2.4.6 Point spread function and convolution


The previous two sections demonstrated that the characteristics of the spatial filter response, in particular the
MLW and the MSLL, have a significant impact on the angular resolution and the contrast ratio, respectively.

[Figure: PSF in the xz-plane and its magnitude (dB) over θ (°), annotated with MLW3, MLW6 and MSLL]

Figure 2.21: The array-geometry-dependent PSF and its characteristic MLW and MSLL provide quality metrics
for the approximation of the resulting angular resolution and contrast ratio, respectively.

Therefore, the MLW and MSLL of the spatial filter response of an ideal scene with a single centered far-
field point reflector, also referred to as point-spread-function (PSF), are commonly used to approximate the
achievable angular resolution and contrast ratio [44], [51], [56]. The PSF is therefore given by
ppsf(θ, ϕ) = w^H_far · afar(0°, 0°) = w^H_far · 1 = Σ_{m=0}^{M−1} e^(−j(2π/λ)(xm sin(θ) cos(ϕ) + ym sin(ϕ))) = Σ_{m=0}^{M−1} e^(−j(2π/λ)(xm u + ym v)), (2.69)

where 1 ∈ R^(M×1) denotes a vector of ones.


The PSF depends primarily on the array geometry, i.e. the position of each array element xm , ym , if there is
no additional apodization applied. Therefore, the general side lobe distribution and imaging performance of
different array geometries can be easily compared and estimated based on the MLW and MSLL quality metrics
of their respective PSFs, e.g., to optimize the geometry.
In order to estimate the angular resolution, the full width of the main lobe is evaluated at a threshold
corresponding to half of the main lobe peak value, which is also referred to as full-width-half-maximum
(FWHM) [44]. The FWHM threshold is based on the fact that two reflectors of equal strength can be separated
if the accumulated magnitude of the scan line between the two reflectors is smaller than the magnitudes of
the scan lines pointing directly to the respective reflectors. Therefore, when spatially filtering the scan line
between the reflectors, the relative main lobe level pointing to the respective reflectors must not exceed a
maximum relative level of 0.5.
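As a sketch of the PSF-based quality metrics, the following snippet evaluates (2.69) on a uv grid for an assumed 8×8 half-wavelength URA and extracts the −6 dB MLW (function names and grid size are assumptions):

```python
import numpy as np

def psf_uv(xy_lam, n=201):
    """Far-field PSF (2.69) of a planar array, evaluated on a uv grid;
    xy_lam holds the element positions in wavelengths."""
    u = np.linspace(-1, 1, n)
    uu, vv = np.meshgrid(u, u)
    p = np.abs(np.exp(-2j * np.pi * (uu[..., None] * xy_lam[:, 0]
                                     + vv[..., None] * xy_lam[:, 1])).sum(-1))
    return u, p / p.max()

def mlw6(u, p):
    """-6 dB main lobe width (FWHM of the amplitude) along the v = 0 cut."""
    cut = p[len(u) // 2]
    above = u[cut >= 0.5]
    return above.max() - above.min()

# 8 x 8 URA with half-wavelength pitch, aperture D = M * d = 4 lambda
mx, my = np.meshgrid(np.arange(8) - 3.5, np.arange(8) - 3.5)
xy = 0.5 * np.stack([mx.ravel(), my.ravel()], axis=1)
u, p = psf_uv(xy)
width = mlw6(u, p)   # ~ 1.2 * lambda / D = 0.3 (one-way column of Table 2.1)
```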

Table 2.1: Far-field angular resolution approximations [66] for one- and two-way beamforming using
ULAs (2.14) and circular apertures (2.29).

              One-way            Two-way beamforming
              Single PSF         Combined PSF        Single PSF         Rayleigh
  ULA         MLW6 ≈ 1.2 λ/D     MLW6 ≈ 0.89 λ/D     MLW3 ≈ 0.89 λ/D    1 λ/D
  Circular    MLW6 ≈ 1.41 λ/D    MLW6 ≈ 1.02 λ/D     MLW3 ≈ 1.02 λ/D    1.22 λ/D

Unfortunately, in literature, the threshold for the FWHM approximation of the angular resolution is frequently
chosen inconsistently or incorrectly, so that the −3 dB MLW or the −6 dB MLW are typically found. One reason
for the inconsistency might be that the −3 dB MLW, also referred to as half-power-beam-width (HPBW), is
used as a metric for the general characterization of a main lobe without intending to estimate the angular
resolution for imaging. Therefore, the threshold for the angular resolution approximation is clarified in the
following (Table 2.1).

If one-way beamforming of coherent signals is considered for imaging, i.e. only transmit or only receive
beamforming [56], the −6 dB MLW of the PSF is used for the angular resolution estimation, i.e. the full
width of the main lobe at half maximum of the amplitude [44]. In fact, forming the squared magnitude of a
one-way beampattern, i.e. the power beampattern, does not improve the angular resolution, such that the
−3 dB HPBW is an inaccurate estimator in this case.
For imaging using coherent two-way beamforming, i.e. in transmit and receive, there are two options.
The −6 dB MLW of the combined PSF can be used to approximate the angular resolution. Alternatively, the
identical result is obtained by using the −3 dB MLW of the transmit-only or receive-only PSF if the individual
PSFs are identical [51]. Apart from that, some literature refers to the Rayleigh resolution for the approximation
of the two-way angular resolution [66]. The Rayleigh resolution limit states that two reflectors are separable
if they are positioned at least in the direction of the first order minimum of the PSF of the other reflector [66].
The Rayleigh resolution limit results in a more pronounced and larger margin from the reflector maxima to
the minimum in between compared to the MLW3/6 approach. Two-way beamforming is described in more
detail in Section 2.4.7 based on the single line acquisition imaging technique.

[Figure: PSF in the uv-domain ppsf(u, v), circularly convolved (⊛) with the true reflector positions q(u, v), yields the obtained image p(u, v); shown as uv-maps and u-sections in dB]
Figure 2.22: In the uv-domain, the image obtained is composed of the ideal image, only containing the true
locations of the reflectors, which is convolved with the PSF, resulting in a degradation of angular
resolution and contrast. This convolution is circular for uniformly spaced array geometries.

In addition to determining the MLW and MSLL approximation metrics and the general side lobe distribution,
the far-field PSF and its shift invariance in the uv-domain [63] can be exploited to alternatively express the
model of the resulting far-field beampattern of a scene with one or more reflectors. The beam pattern resulting
from the superposition of G adjacent equal-strength reflectors positioned at the same distance but different

directions (uP , vP ) is given by
p(u, v) = w^H_far · Afar · 1 = Σ_{m=0}^{M−1} Σ_{g=0}^{G−1} e^(−j(2π/λ)(xm u + ym v)) · e^(j(2π/λ)(xm uP,g + ym vP,g)) (2.70)

= Σ_{g=0}^{G−1} Σ_{m=0}^{M−1} e^(−j(2π/λ)(xm(u − uP,g) + ym(v − vP,g))) = (Σ_{g=0}^{G−1} δ(u − uP,g, v − vP,g)) ∗ (Σ_{m=0}^{M−1} e^(−j(2π/λ)(xm u + ym v))) (2.71)

= q(u, v, uP, vP) ∗ ppsf(u, v), (2.72)

where q corresponds to a 2D map of Dirac impulses at the directions of the reflectors, ppsf is the shift-invariant PSF in the uv-domain, and Afar = [afar(u0, v0), . . . , afar(uG−1, vG−1)] ∈ C^(M×G) is the far-field steering matrix.

[Figure: circular convolution over two uv-periods, showing the period boundary, the valid direction boundary (unit radius) and the circularly wrapped overlap; uv-maps and u-sections in dB]
Figure 2.23: The periodicity of the PSF of uniformly spaced array geometries results in the circular character-
istic of the convolution, where the overlap outside the period boundary is circularly wrapped
to the opposite side. In general, only the uv-region within the unit radius corresponds to valid
θϕ-directions on the hemisphere.

Therefore, the beampattern image p obtained can be represented by an ideal image, containing the plain

directional information of the reflectors q, i.e. the reflector source distribution, which is convolved with the
PSF ppsf , resulting in a contrast- and resolution-reduced image [62], [63]. Similar to optical systems, the PSF
can thus be considered as a distortion impulse-response of the imaging system [62].
In practice, a regular convolution of two size-limited images results in an image with enlarged boundaries [67]. However, in our case, the valid region of the resulting image, which corresponds to true spatial directions, is still limited to √(u² + v²) ≤ 1. Since the PSF of URAs is periodic, that is
ppsf(u, v) = Σ_{m=0}^{M−1} e^(−j(2π/λ)(xm u + ym v)) = Σ_{mx=0}^{Mx−1} Σ_{my=0}^{My−1} e^(−j(2π/λ)d(mx u + my v)) (2.73)

= ppsf(u + U, v + V) = Σ_{mx=0}^{Mx−1} Σ_{my=0}^{My−1} e^(−j(2π/λ)d(mx u + my v)) · e^(−j(2π/λ)d mx U) · e^(−j(2π/λ)d my V) (the two rightmost factors equal 1 for U = λ/d and V = λ/d), (2.74)

where the period in the u and v direction is U = V = λ/d, the convolution in (2.72) corresponds to a
circular convolution [68], denoted by the symbol ⊛. Considering a URA with an IES d = λ/2 as an example,
the convolution overlap of the resulting image outside the period boundaries, i.e. for |u| > 1 = U /2 and
|v| > 1 = V /2, is wrapped to the opposite side, that is
p(u, v) = p(u − ⌊u/(U/2)⌋ · U, v − ⌊v/(V/2)⌋ · V). (2.75)

The composition of the resulting image based on a circular convolution of the ideal image with the PSF is
particularly advantageous for deconvolution approaches intended to reverse the image degradation effects of
the PSF [62], [67], as used in Chapter 7.
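The circular convolution model (2.72)–(2.75) can be verified numerically via 2D DFTs, which implement circular convolution exactly; the grid size and array geometry below are example assumptions:

```python
import numpy as np

# For a half-wavelength-pitch URA the obtained far-field image is the
# circular convolution of the reflector map q(u, v) with the periodic PSF.
n = 128
u = np.linspace(-1, 1, n, endpoint=False)        # one full period U = V = 2
uu, vv = np.meshgrid(u, u)

mx, my = np.meshgrid(np.arange(8) - 3.5, np.arange(8) - 3.5)
xy = 0.5 * np.stack([mx.ravel(), my.ravel()], axis=1)   # 8 x 8 URA, d = lambda/2
psf = np.exp(-2j * np.pi * (uu[..., None] * xy[:, 0]
                            + vv[..., None] * xy[:, 1])).sum(-1)

q = np.zeros((n, n))        # ideal image: Dirac impulses at the true directions
q[n // 2, n // 2] = 1.0     # reflector at (u, v) = (0, 0)
q[n // 2, 96] = 1.0         # reflector at (u, v) = (0.5, 0)

# circular convolution q (*) ppsf; ifftshift centers the PSF kernel at index 0
img = np.abs(np.fft.ifft2(np.fft.fft2(q) * np.fft.fft2(np.fft.ifftshift(psf))))
```

Both detections reach the full array gain M = 64, and any overlap beyond the period boundary would wrap circularly to the opposite side, as in (2.75).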

2.4.7 One-way vs. two-way beamforming - single line acquisition


The previous sections focus on the MLA imaging method, in which a hemispheric pulse is transmitted to
simultaneously illuminate all scan lines to obtain a single set of M array signals. Based on this set, all scan
lines are subsequently evaluated, i.e. spatially filtered, to form the complete image. Therefore,
the ideal image formation time TPRF, MLA or its inverted quantity, i.e. the pulse repetition frequency (PRF)
fPRF, MLA , depends on the maximum range of view Rmax of the ROI which determines the signal acquisition
time Tmax , that is
TPRF, MLA = 1/fPRF, MLA = Tmax = 2 · Rmax/c, (2.76)
where c is the wave propagation speed. Since only receive beamforming is applied, i.e. one-way beamforming,
the resulting PSF for ideal point source array elements depends primarily on the receive spatial filter response.
If transceiver array elements are available, then spatial filtering is supported for both pulse transmission and
echo reception, i.e. two-way beamforming. A straightforward imaging technique using two-way beamforming
is called single line acquisition (SLA) [56]. Here, a single scan line direction of the ROI is selected one after
the other in which a beamformed pulse is transmitted, so that only this specific selected scan line (θ0 , ϕ0 ) is
primarily illuminated by the main lobe. Therefore, an ideal far-field example point reflector in the direction
(θP , ϕP ) generates the echo signal given by

$$p_{\mathrm{tx}}(\theta_0, \phi_0) = \mathbf{w}^H_{\mathrm{far,\,tx}}(\theta_0, \phi_0) \cdot \mathbf{a}_{\mathrm{far,\,tx}}(\theta_P, \phi_P)\, s_0(t), \quad (2.77)$$

which is received by all array elements, that is

$$\mathbf{s}_{\mathrm{rx}} = \mathbf{a}_{\mathrm{far,\,rx}}(\theta_P, \phi_P) \cdot p_{\mathrm{tx}}(\theta_0, \phi_0). \quad (2.78)$$

The array signals received are then spatially filtered only for the selected scan line direction (θ0 , ϕ0 ), so that
the evaluated scan line signal is given by

$$p_{\mathrm{rx}}(\theta_0, \phi_0) = \mathbf{w}^H_{\mathrm{far,\,rx}}(\theta_0, \phi_0) \cdot \mathbf{s}_{\mathrm{rx}} = \mathbf{w}^H_{\mathrm{far,\,rx}}(\theta_0, \phi_0) \cdot \mathbf{a}_{\mathrm{far,\,rx}}(\theta_P, \phi_P) \cdot \mathbf{w}^H_{\mathrm{far,\,tx}}(\theta_0, \phi_0) \cdot \mathbf{a}_{\mathrm{far,\,tx}}(\theta_P, \phi_P)\, s_0(t), \quad (2.79)$$

such that, for $\mathbf{a}_{\mathrm{far,\,tx}} = \mathbf{a}_{\mathrm{far,\,rx}}$ and $\mathbf{w}_{\mathrm{far,\,tx}} = \mathbf{w}_{\mathrm{far,\,rx}}$, valid for identical transmit and receive array geometries and ideal transceiver elements, $p_{\mathrm{rx}}$ can be expressed by

$$p_{\mathrm{rx}}(\theta_0, \phi_0) = \left( \mathbf{w}^H_{\mathrm{far}}(\theta_0, \phi_0) \cdot \mathbf{a}_{\mathrm{far}}(\theta_P, \phi_P) \right)^2 s_0(t). \quad (2.80)$$

[Figure 2.24: simulated beam patterns of the multi line acquisition (MLA) and single line acquisition (SLA) methods (spatial axes in λ), with a horizontal sectional view comparing the MLA and SLA magnitude (dB) over the steering angle.]

Figure 2.24: The SLA imaging method evaluates the individual scan lines sequentially. For each selected
direction, a beamformed pulse is transmitted and the received array signals are spatially filtered,
i.e. two-way beamforming. Therefore, at the expense of an increased image formation time, the
SLA main lobe is narrower and the side lobe level in decibel is two times lower compared to
MLA, improving the angular resolution, contrast and range of view.

In order to generate a complete image, this pulse-echo sequence must be repeated for all scan lines in the
ROI, so that for K scan lines, a total of K pulse firing events are required and a total of K sets each containing
M array signals are acquired. A major disadvantage of SLA is that for each firing event, the signal acquisition
time must pass, which increases the theoretically achievable image formation time, and, thus, decreases the
PRF compared to MLA [56], that is
$$T_{\mathrm{PRF,\,SLA}} = \frac{1}{f_{\mathrm{PRF,\,SLA}}} = K \cdot \frac{2 R_{\mathrm{max}}}{c} = K \cdot T_{\mathrm{PRF,\,MLA}}. \quad (2.81)$$

The main advantages are the clearly improved MSLL and a narrowed MLW of the two-way PSF, so that
the contrast ratio and the angular resolution of the resulting images are improved compared to MLA [56].
The improvements are due to the two-fold spatial filtering. For example, if a reflector is positioned out of the
selected scan line, it is only illuminated with a reduced amplitude due to transmit beamforming, e.g. with
the level of a transmit side lobe. Therefore, the echo of this reflector is weaker than the echo of a reflector
positioned within the selected scan line. In addition, this weaker echo is spatially filtered again with receive

beamforming, i.e. attenuated by the corresponding receive side lobe level. Hence, the resulting relative side
lobe level of the two-way beampattern is squared, or in decibels, two times lower if the spatial filter response
in transmit and receive are identical (2.80). Apart from the improvement in image quality, a longer range of
view is achievable due to the higher transmit amplitude provided by transmit beamforming.
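The dB-halving of the side lobe level per (2.80) and the K-fold image-formation-time penalty per (2.81) can be illustrated numerically. The sketch below uses hypothetical values for c, Rmax and K, and a single 8-element row with d = λ/2, steered broadside, as a stand-in for the full array:

```python
import numpy as np

M, c, Rmax, K = 8, 343.0, 2.0, 1000     # elements, m/s, range (m), scan lines

u = np.linspace(-1.0, 1.0, 4001)        # u = sin(theta), horizontal cut
one_way = np.abs(np.exp(-1j * np.pi * np.outer(u, np.arange(M))).sum(axis=1)) / M
two_way = one_way ** 2                  # identical tx/rx filters, Eq. (2.80)

side = np.abs(u) > 2 / M                # exclude main lobe (nulls at u = +/-2/M)
msll_one = 20 * np.log10(one_way[side].max())   # approx. -12.8 dB
msll_two = 20 * np.log10(two_way[side].max())   # exactly doubled in dB

T_mla = 2 * Rmax / c                    # Eq. (2.76)
T_sla = K * T_mla                       # Eq. (2.81)
```

Since squaring a magnitude doubles its decibel value, the two-way side lobe level is exactly twice that of the one-way pattern, while the SLA image formation time grows linearly with the number of scan lines K.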

3 Uniform dense waveguided transceiver PUT arrays

Parts of this chapter have been published in


[31] ”3D imaging method for an air-coupled 40 kHz ultrasound phased-array”,
in Proc. International Congress on Acoustics, 2019, and
[32] ”Real-time 3D imaging using an air-coupled ultrasonic phased-array”,
IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2020.

After introducing the general medium- and frequency-independent basics of beamforming and imaging, in
this chapter, ultrasound imaging in air is specifically addressed, including the evaluation of its capabilities and
limitations based on a real-world phased array prototype using piezoelectric ultrasonic transducers (PUTs).
First, a review of related work is given regarding other in-air ultrasonic array applications, highlighting the challenges of air-coupled ultrasound and the available transducer technology. In particular, a major concern of the latter is bringing together a high transmit intensity and a sufficiently small size to avoid grating lobes in a uniform array configuration for unambiguous imaging (Section 2.2.4).
In the following, a solution is proposed for modifying the array geometry using a waveguide structure
into which high-intensity PUT-based transceiver elements are inserted in order to realize an (8 × 8)-URA
with a half-wavelength IES. Furthermore, the remaining in-air phased array system is described, including the customized transceiver electronics, and insight is given into the FPGA-based hardware control and the practical implementation of GPU-accelerated frequency-domain array signal processing and visualization. Another
essential part of the chapter is the experimental evaluation of the imaging capabilities using the MLA and SLA
imaging methods (Section 2.4.2, 2.4.7), which are then compared and discussed.
In advance, acknowledgements are given to the contributors of this system and the corresponding publi-
cations. Specifically, the waveguide geometry utilized has been designed by my colleague Axel Jäger and
intensively investigated by Matthias Rutsch in the course of their doctoral theses and related publications [69],
[70]. Furthermore, my colleague Jan Hinrichs developed the transceiver electronics in his master thesis [71],
which is one of the key components of the system.

3.1 Related work on air-coupled ultrasonic phased arrays


The idea of combining ultrasonic transducers in a phased array has been applied in a variety of in-air
applications, including haptic hologram generation [72], [73], levitation of small objects [74], [75], highly
directional parametric loudspeakers [26], [76], surface wave elastography [77], vortex generation [78] and
power transfer [7], [8]. In these studies, efficient piezoelectric ultrasonic transducers (Murata MA40S4S)
with a resonant frequency of 40 kHz have been chosen as array elements.
However, arranging these transducers in an array is problematic: The diameter of this transducer type is
too large to meet the λ/2 criterion for the maximum inter-element spacing, resulting in high intensity sound
emission in unintended directions, also referred to as grating lobes [79]. Although grating lobes do not hinder
the operation of the in-air applications mentioned, they may cause harm to nearby users [80]. In contrast, for
acoustic imaging the avoidance or suppression of grating lobes is mandatory in order to locate objects without
ambiguities.

Different techniques have been applied for unambiguous imaging with air-coupled ultrasonic phased arrays.
In [81] and [82], a movable transducer array mounted on a 2D positioning system is used to form a synthetic
aperture [83]. In [84], the array layouts for pulse transmission and echo reception differ, resulting in grating
lobes being cancelled out in the point spread function. Kumar et al. have followed a similar approach by using
a transmitter array with Murata MA40S4S transducers and a receiver array consisting of small micro-electro-
mechanical system (MEMS) microphones [85]. Steckel et al. use a single broad-band transducer to transmit a
chirp pulse and a random MEMS microphone array for echo reception [86], [87]. Since only the direction of
the main lobe of the received directional pattern is frequency independent, grating lobes are suppressed in
the directional energy spectrum after matched filtering. A more conservative approach is to meet the λ/2
criterion. This has been accomplished by developing small-size ultrasonic transducers based on PMUTs [88], [89], CMUTs [20], [90], [91], PVDF materials [92]–[94] and ferroelectrics [95]. However, lowering the
operation frequency while maintaining efficiency regarding the transmitted sound pressure level has been
challenging. Instead of using smaller transducers, Takahashi et al. have attached shrinking tubes functioning
as waveguides to the efficient low-cost Murata MA40S4S transducers, reducing the effective inter-element
spacing [96].

3.2 Waveguided uniform rectangular array geometry


Based on the idea in [96], a 3D printed waveguide has been created containing M = 64 equal-length acoustic
channels into which MA40S4S transducers with a resonant frequency of f0 = 40 kHz are inserted [97]. This
structure forms an 8×8 uniform rectangular array (URA), reducing the aperture size to 35 mm×35 mm and
the IES to d = λ/2 = 4.3 mm (Fig. 3.1), enabling grating lobe free transmit and receive beamforming, which
has also been utilized for flow measurement [98] and non-destructive testing (NDT) based on Lamb waves [99].
The position of the m-th channel output port is given by
$$\mathbf{r}_m = (x_m, y_m, z_m) = \big( d\,(m \bmod 8),\; d \lfloor m/8 \rfloor,\; 0 \big), \quad (3.1)$$
where mod denotes the modulo operator, ⌊·⌋ denotes the floor function, m ∈ [0, . . . , M − 1] is the channel
index and d is the inter-element spacing.
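A minimal sketch of (3.1), assuming the nominal values λ ≈ 8.6 mm and hence d = λ/2 ≈ 4.3 mm at 40 kHz in air:

```python
# Channel-port positions of the 8x8 URA per Eq. (3.1); d = 4.3 mm is the
# nominal half-wavelength spacing at 40 kHz (c approx. 343 m/s).
d = 4.3e-3
M = 64

positions = [(d * (m % 8), d * (m // 8), 0.0) for m in range(M)]
```

The outermost port centers then span 7·d ≈ 30.1 mm in x and y, consistent with the 35 mm × 35 mm printed aperture including its rim.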
[Figure 3.1: photograph of the phased array (overall size 100 mm, aperture 35 mm × 35 mm) with the transducers inserted into the 3D-printed waveguide, reducing the inter-element spacing to λ/2; the coordinate system (x, y, z, θ, ϕ) is indicated.]

Figure 3.1: Uniform rectangular 8×8 phased array with 3D-printed waveguide and electronics attached.
The ultrasonic transducers are inserted into the waveguide reducing the effective inter-element
spacing to λ/2 for grating lobe free beamforming [32], [69], [70].

The URA layout allows the adaptation of multiple transmit and receive methods, commonly used in medical
imaging, e.g. conventional single-line-acquisition for achieving high SNR and resolution. Other techniques

can be realized which are aimed at increasing the frame rate by widening the transmit beam, thus reducing
the number of required pulses for 3D image generation, e.g. fan-beam scanning [100] and diverging wave
transmission [101], [102]. Therefore, by utilizing different methods, this air-coupled system is dynamically
configurable to meet varying requirements.
In order to investigate the capabilities of the system, two different imaging methods are adapted, each
addressing one limitation of air-coupled ultrasound, i.e. the limited range of view and frame rate. First,
the single-line-acquisition (SLA) method is used, where a pulse-echo detection is performed sequentially for
all directions of the discretized region of interest using all transducers for beamformed pulse transmission
and echo reception. Although this technique enables long range localization, requiring multiple pulse-echo
detections results in slow image generation (Section 2.4.7).
Second, the real-time imaging capability of the system is evaluated by choosing a method that maximizes
the frame rate. Since fan-beam scanning and the diverging wave technique still require multiple pulses per
image, only one transducer is used to transmit a single hemispherical pulse to irradiate the surroundings
simultaneously, i.e. multi-line-acquisition (MLA) (Chapter 2.4.2), at the expense of range of view and resolution.
The specific characteristics of the system when using the SLA and MLA method are then compared.

3.3 Transceiver electronics and imaging system architecture


The phased array system includes four major components, which are the phased array itself, the custom
transceiver electronics, an FPGA board based on the Xilinx Zynq System-on-Chip (SoC) and a conventional
PC (Fig. 3.2).

[Figure 3.2: block diagram of the system components: phased array (MA40S4S transducers with waveguide); 64-channel transceiver electronics with pulsers (HV7355), T/R switches (TX810) and ADCs (AD7761); Avnet MicroZed board with Xilinx Zynq 7010 SoC (FPGA, microprocessor, SDRAM); PC with Intel Core i5-2400 CPU and Nvidia Geforce GTX 1050TI GPU.]

Figure 3.2: Phased array system components. The transceiver electronics includes eight pulser ICs each
capable of driving up to eight transducers. However, only one transducer is driven in the MLA
method. Sixteen TR-switches toggle between transmit and receive mode. The FPGA of the Xilinx
Zynq SoC generates the transmit signal, controls the pulse-echo sequence and receives the
digitized data of the eight multi-channel ADCs. The microprocessor receives control commands
from the PC via Gigabit Ethernet and, likewise, returns the digitized data. Array signal processing
and visualization are performed by the GPU [57].

The transceiver electronics is capable of driving all ultrasonic transducers with individually phase-shifted

transmit signals. For this, eight multi-channel MOSFET pulser ICs (Microchip HV7355) are used, each receiving
one digital serial signal and generating eight unipolar square-wave signals with the driving amplitude of
20 Vpp . This excitation method achieves a minimum relative time delay of 200 ns, constrained by the serial
signal clock, which corresponds to a minimum angular beamforming step size of approximately 0.9◦ . In the
MLA imaging method, only one ultrasonic transducer is driven, whereas all of them are used for receiving. In
order to toggle from transmit to receive mode, 16 four-channel T/R switches (Texas Instruments TX810) are
activated and the outputs of the pulser ICs are set to high impedance. In receive mode, the incoming signals
of all transducers are digitized in parallel by eight multi-channel 16 bit ADCs (Analog Devices AD7761) with
a sampling rate of fs = 200 kSa/s per channel. The ADCs can be synchronized and have a serial interface
reducing FPGA pin usage for the data reception.
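The quoted 0.9° step follows directly from the delay quantization: with half-wavelength pitch, the smallest non-zero inter-element delay of 200 ns corresponds to the steering angle sketched below (c = 343 m/s is an assumed nominal speed of sound):

```python
import math

c = 343.0      # assumed speed of sound (m/s)
d = 4.3e-3     # inter-element spacing (m)
dt = 200e-9    # minimum relative delay, set by the serial signal clock (s)

# Smallest resolvable steering increment near broadside: d*sin(theta)/c = dt
theta_step_deg = math.degrees(math.asin(c * dt / d))   # approx. 0.9 deg
```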
The FPGA board (Avnet MicroZed) features 1 GB of shared SDRAM and one Xilinx Zynq 7010 SoC containing
an FPGA and a microprocessor in one package. The FPGA is used for generating the transmit signal, sequence
control, as well as reception and storing of all digitized data of the ADCs. The microprocessor receives control
commands from the PC via Ethernet and, likewise, returns the digitized raw data.
On the PC, a custom C++ program is executed for system control via user interface, communication, signal
processing and visualization. The latter two tasks are handled by a GPU (Nvidia Geforce GTX 1050 TI) using
the Nvidia CUDA toolkit and OpenGL. The CUDA toolkit provides ready-to-use batched FFT functions, thus
enabling the implementation of parallel low-latency signal filtering and efficient beamforming in the frequency
domain. In general, the system architecture, which is similar to software-based medical open scanners [103],
offers a number of advantages over hardware-based signal processing using FPGAs. First, GPUs are quickly
reprogrammable without requiring time-consuming logic synthesis, thus accelerating development cycles.
Second, since the processed data is available in the GPU memory, additional parallel image processing, e.g.
object tracking, can be applied directly. Furthermore, the option of exporting raw data to frameworks, such as
Matlab, simplifies prototyping of advanced imaging algorithms.
In summary, the system is easily reconfigurable and expandable, yet still maintaining high processing speed
through parallelization. In the next section, the generation of an acoustic image is described.

3.4 Implementation details for beamforming and imaging


In order to generate an acoustic image showing the positions of objects in front of the phased array, three
processing stages are performed. First, in the pulse-echo sequence, an ultrasonic pulse is transmitted and
the induced echo signals are captured. Second, using array signal processing, the originating positions and
amplitudes of the echoes are determined from the received signals. Third, the obtained position and amplitude
information are visualized in a horizontal sectional image (B-Scan) and a 3D volume image (3D-Scan). These
three steps are repeated to generate multiple sequential images, which are referred to as frames.
Before starting the frame generation, the region of interest is discretized and defined with the field of view
D, containing K directions, and the range of view R, containing N distances, as elaborated in Section 2.4.2.
The region of interest is defined once at the very beginning of the frame generation and remains constant for
the following three processing stages.

3.4.1 Pulse-echo sequence


In the pulse-echo sequence, an ultrasonic pulse is transmitted in order to create and capture echo-signals
from the scatterers in front of the array.
For MLA transmission, a single transducer is driven with a unipolar 40 kHz square wave pulse with a length
of 40 periods (TPulse = 1 ms) and an amplitude of 20 Vpp . Due to the reduced effective transmit aperture size
(⌀ 3.4 mm = 0.4 λ) of the waveguide port compared to the aperture of the bare transducer (⌀ 9.4 mm = 1.1 λ),

a close-to hemispherical wave s̃(t) is transmitted. Therefore, scatterers in all spatial directions in the region
of interest are irradiated with a single pulse. The transmitted complex equivalent pressure signal s̃(t) is
expressed by
$$\tilde{s}(t) = A(t) \cdot e^{j 2\pi f_0 t}, \quad (3.2)$$
where A(t) is the amplitude envelope of the pulse, j is the imaginary unit and f0 = 40 kHz is the signal
frequency.
After pulse transmission, all M = 64 transducers are used to receive the echo signals. For a single echo
signal originating from a point scatterer in the far-field from direction (θP , ϕP ), the analog received signal of
the m-th transducer is given by
$$\tilde{s}_{\mathrm{rx},\,m}(t) = \tilde{s}\big(t - \tau - \Delta t_m(\theta_P, \phi_P)\big), \quad (3.3)$$
where τ denotes the sound propagation time delay depending on the distance of the echo origin r = c · τ /2
and ∆tm (θ, ϕ) is the specific time delay of the m-th transducer depending on the direction of arrival (θ, ϕ),
i.e.

$$\Delta t_m(\theta, \phi) = -\left( \frac{x_m}{c} \sin(\theta)\cos(\phi) + \frac{y_m}{c} \sin(\phi) \right). \quad (3.4)$$
Here, (xm , ym ) are the m-th transducer position coordinates and c is the speed of sound.
Since the relative bandwidth of the signal is small, the narrow-band assumption holds [48] (Chapter 2.4.2).
This implies that the m-th transducer-specific delay $\Delta t_m(\theta, \phi)$ is negligible for the amplitude envelope, i.e. $A\big[t - \tau - \Delta t_m(\theta_P, \phi_P)\big] \approx A(t - \tau)$, such that

$$\tilde{s}_{\mathrm{rx},\,m}(t) \approx \tilde{s}(t - \tau) \cdot e^{-j 2\pi f_0 \Delta t_m(\theta_P, \phi_P)} = \tilde{s}(t - \tau) \cdot a_m(\theta_P, \phi_P), \quad (3.5)$$

is valid. In (3.5), $a_m(\theta, \phi) = e^{-j 2\pi f_0 \Delta t_m(\theta, \phi)}$ denotes the steering factor of the m-th transducer.
The sample acquisition of the received signals starts after the pulse transmission (Tmin = TPulse ) using a
sampling rate of fs = 200 kHz, obtaining N samples per channel (Section 2.4.1). The digitized received signal
of the m-th transducer is given by

$$s_{\mathrm{rx},\,m}(n) = \operatorname{Re}\big\{ \tilde{s}_{\mathrm{rx},\,m}(n T_s + T_{\mathrm{Pulse}}) \big\}, \quad (3.6)$$

where $T_s = 1/f_s$ is the sampling period and $n$ is the sample index. The digitized signals $s_{\mathrm{rx},\,m}(n)$ of all $M$
transducers are transferred to the PC for signal processing.
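The echo model (3.2)–(3.5) can be sketched numerically. The following example uses a hypothetical scene (point reflector at 1.5 m in direction (20°, −10°), Gaussian envelope) and confirms that the narrow-band steering-factor approximation (3.5) deviates only marginally from the exactly delayed signals (3.3):

```python
import numpy as np

# Hypothetical scene: point reflector at 1.5 m in direction (20 deg, -10 deg)
f0, fs, c, d = 40e3, 200e3, 343.0, 4.3e-3
theta_P, phi_P = np.radians(20.0), np.radians(-10.0)
tau = 2 * 1.5 / c                       # round-trip propagation delay

def s(tt):                              # Eq. (3.2) with a Gaussian envelope A(t)
    return np.exp(-0.5 * (tt / 0.5e-3) ** 2) * np.exp(1j * 2 * np.pi * f0 * tt)

xm, ym = np.meshgrid(np.arange(8) * d, np.arange(8) * d, indexing="ij")
dt_m = -(xm / c * np.sin(theta_P) * np.cos(phi_P) + ym / c * np.sin(phi_P))  # Eq. (3.4)

t = np.arange(0.0, 12e-3, 1 / fs)
exact = s(t[None, None, :] - tau - dt_m[:, :, None])      # Eq. (3.3)
a_m = np.exp(-1j * 2 * np.pi * f0 * dt_m)                 # steering factors
approx = s(t - tau)[None, None, :] * a_m[:, :, None]      # Eq. (3.5)

err = np.abs(exact - approx).max() / np.abs(exact).max()  # small for narrow-band
```

The carrier phase of (3.3) factors out exactly; the residual error stems only from the envelope shift, which is small because the largest inter-element delay (tens of microseconds) is short compared to the envelope duration.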

3.4.2 Frequency-domain array signal processing


The signal processing chain consists of analytic signal conversion, matched filtering, conventional beamforming
and envelope extraction. The processing is mainly performed in the frequency domain and is parallelized
using a GPU and the Nvidia CUDA API. Frequency domain processing offers three main advantages. First, it
reduces the computational load of filtering and analytical signal conversion since no convolution is required.
Second, the matrix multiplication for conventional beamforming can be reduced to the narrow frequency band
of the received echo spectrum. Last, compared to time-domain Delay-and-Sum-Beamforming, phase-shifting
is not constrained to multiples of the sampling period. Although conventional beamforming provides poor
resolution compared to other techniques, it enables large volume scanning at high frame rates, as it requires
only a single matrix multiplication. In the following, the implementation of the processing chain is described.
After the data has been received by the PC, all signals srx, m (n) are buffered in the GPU memory and
transformed into the frequency domain using the CUDA batched Fast Fourier Transform cuFFT, i.e.

$$\hat{s}_{\mathrm{rx},\,m}(l) = \mathcal{F}_N \big\{ s_{\mathrm{rx},\,m}(n) \big\}, \quad (3.7)$$

where FN {·} denotes the vector-wise N -point Fourier transform from time to frequency domain and l ∈
{0, . . . , N − 1} is the frequency bin index.
Next, each frequency-domain signal $\hat{s}_{\mathrm{rx},\,m}(l)$ is converted to an analytic signal and multiplied with the
frequency response of the matched filter. The resulting signal for the m-th transducer is given by

ŝbp, m (l) = û(l) · ĥ(l) · ŝrx, m (l). (3.8)

Here, û(l) represents the conversion function for creating the analytic signal in the frequency domain, i.e.

$$\hat{u}(l) = \begin{cases} 1, & \text{if } l = 0 \\ 2, & \text{if } 0 < l \leq \frac{N}{2} \\ 0, & \text{otherwise}, \end{cases} \quad (3.9)$$

and ĥ(l) is the frequency response of the matched filter.
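A minimal sketch of the analytic-signal conversion (3.9), with a hypothetical Hann-windowed 40 kHz burst as the test signal. Note that for even N the weight of the Nyquist bin l = N/2 is often set to 1 instead of 2; the difference is irrelevant here since the signal is band-limited well below fs/2:

```python
import numpy as np

fs, f0, N = 200e3, 40e3, 1024
n = np.arange(N)
envelope = np.hanning(N)
x = envelope * np.cos(2 * np.pi * f0 / fs * n)     # real band-pass burst

X = np.fft.fft(x)
u_hat = np.zeros(N)
u_hat[0] = 1.0
u_hat[1:N // 2 + 1] = 2.0                          # Eq. (3.9)
x_analytic = np.fft.ifft(u_hat * X)

err = np.abs(np.abs(x_analytic) - envelope).max()  # envelope recovery error
```

Suppressing the negative frequencies yields the complex analytic signal, whose magnitude directly recovers the amplitude envelope, exactly as exploited later in (3.15).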


In order to determine the originating directions of the impinging echoes, the pre-processed signals ŝbp, m (l)
are spatially filtered using a conventional beamformer [48], evaluating K different scanning directions
defined by the field of view D (Section 2.4.1). This multi-evaluation is implemented as parallel tiled matrix
multiplication, i.e.
$$\hat{\mathbf{P}} = \mathbf{W}^H \cdot \hat{\mathbf{S}}_{\mathrm{bp}}, \quad (3.10)$$

where $\mathbf{W}^H$ is the $[K \times M]$ beamformer matrix, $\hat{\mathbf{S}}_{\mathrm{bp}}$ is the $[M \times N]$ pre-processed input data matrix and $\hat{\mathbf{P}}$ is the resulting $[K \times N]$ spatially filtered output data matrix. The single elements of the respective matrices are given by

$$\mathbf{W}^H(k, m) = w^*_m(\theta_k, \phi_k), \quad (3.11)$$
$$\hat{\mathbf{S}}_{\mathrm{bp}}(m, l) = \hat{s}_{\mathrm{bp},\,m}(l), \quad (3.12)$$
$$\hat{\mathbf{P}}(k, l) = \hat{p}(\theta_k, \phi_k, l), \quad (3.13)$$

where $w^*_m(\theta, \phi)$ is the beamforming factor of the m-th transducer, corresponding to the complex conjugate of its steering factor, i.e.

$$w^*_m(\theta, \phi) = a^*_m(\theta, \phi) = e^{j 2\pi f_0 \Delta t_m(\theta, \phi)}. \quad (3.14)$$
For further optimization, the matrix multiplication in (3.10) can be evaluated only for the frequency bins,
containing the echo spectrum.
After multiplying the matrices, all K spatially filtered signals in the field of view are obtained and transformed
back into the time domain. Finally, the amplitude envelopes penv (θk , ϕk , n) are determined by calculating the
absolute value of each element, since the analytical signals have already been created in (3.8), i.e.

$$p_{\mathrm{env}}(\theta_k, \phi_k, n) = \left| \mathcal{F}_N^{-1} \big\{ \hat{p}(\theta_k, \phi_k, l) \big\} \right|, \quad (3.15)$$

where $\mathcal{F}_N^{-1}\{\cdot\}$ denotes the vector-wise N-point inverse Fourier transform using the CUDA batched IFFT.
The amplitude envelopes correspond to the directional and distant dependent echo amplitudes, which are
graphically displayed to create an image of the space in front of the array.
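The complete frequency-domain chain (3.8)–(3.15) can be condensed into a few array operations. The sketch below uses a hypothetical scene (a single 1 × 8 row of the array, far-field echo from 25° azimuth, matched filter omitted for brevity) and shows that the envelope maximum appears in the correct scan direction:

```python
import numpy as np

f0, fs, c, d, M, N = 40e3, 200e3, 343.0, 4.3e-3, 8, 512
theta_P = np.radians(25.0)                            # true echo direction
thetas = np.radians(np.arange(-60, 61, 5))            # K scan directions

n = np.arange(N)
pulse = np.exp(-0.5 * ((n - 200) / 20) ** 2) * np.cos(2 * np.pi * f0 / fs * n)

xm = np.arange(M) * d
dt_m = -xm / c * np.sin(theta_P)                      # Eq. (3.4), phi = 0
S = np.fft.fft(pulse)[None, :] * np.exp(-1j * 2 * np.pi * f0 * dt_m)[:, None]

u_hat = np.zeros(N)
u_hat[0] = 1.0
u_hat[1:N // 2 + 1] = 2.0
S_bp = u_hat[None, :] * S                             # analytic conversion, Eq. (3.8)

W_H = np.exp(1j * 2 * np.pi * f0 * (-xm[None, :] / c * np.sin(thetas)[:, None]))
P = W_H @ S_bp                                        # Eq. (3.10), [K x N]
p_env = np.abs(np.fft.ifft(P, axis=1))                # Eq. (3.15)

k_best = np.unravel_index(p_env.argmax(), p_env.shape)[0]
```

The single matrix multiplication evaluates all K scan directions at once, which is exactly what makes the GPU implementation efficient; restricting it to the occupied frequency bins, as described above, further reduces the workload.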

3.4.3 Visualization
In the visualization stage, the horizontal sectional image (B-Scan) and the three-dimensional volume image
(3D-Scan) are generated with OpenGL using the processed amplitude envelope data. First, the corresponding
position of each amplitude value penv (θk , ϕk , n) is given by the direction index k to obtain the field of view

direction Dk = (θk , ϕk ) and the sample index n to obtain the distance (Section 2.4.1). The determined
coordinates Rn , θk , ϕk are transformed to Cartesian coordinates using the azimuth-over-elevation coordinate
system (Section 2.1). Then, each amplitude value is color-coded and rendered at its specified coordinates.
The image generated is linearly interpolated to fill blank voxels. In addition, weak amplitude voxels in the
3D-Scan are made transparent using alpha blending (Section 2.4.2).
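A minimal sketch of the coordinate transformation, assuming the azimuth-over-elevation convention implied by the direction cosines of the delay model (3.4) (x ∝ sin θ cos ϕ, y ∝ sin ϕ); the helper name is illustrative:

```python
import math

def to_cartesian(R, theta, phi):
    """Spherical (azimuth-over-elevation) to Cartesian, matching Eq. (3.4)."""
    x = R * math.sin(theta) * math.cos(phi)
    y = R * math.sin(phi)
    z = R * math.cos(theta) * math.cos(phi)
    return x, y, z
```

Each envelope sample p_env(θk, ϕk, n) is placed at to_cartesian(Rn, θk, ϕk); the radius R is preserved by construction, since the three direction cosines sum squared to one.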

3.5 Experiments, results and discussion


The MLA and SLA method are analyzed in an anechoic chamber with emphasis on the transmit and receive
characteristics, angular and range resolution, range of view and field of view, as well as the achievable frame
rates. Regarding these properties, the MLA method is compared to the SLA method. For both methods the
same phased array and measurement setups are used.

3.5.1 Measurement setups


In the anechoic chamber, the phased array is attached to two rotational axes at the end of a 6 m long rail. On
the rail there is a movable slide on which different targets can be mounted. Depending on the measurement,
a calibrated microphone (B&K Type 4138), an ultrasonic transducer (Murata MA40S4S), one or two hollow
steel spheres (⌀100 mm) are used as targets. The spheres are mounted either radially or angularly next to each other. Using the rotational axes and the slide, the targets can be positioned freely in the coordinate system of
the phased array (Fig. 3.3).

[Figure 3.3: sketch of the measurement setup: the rotatable phased array (optionally with rigid baffle) at the end of the 6 m rail, and a movable slide carrying the targets (calibrated microphone, ultrasonic transducer, or radially/angularly adjacent spheres, ⌀ 100 mm).]

Figure 3.3: Measurement setups in the anechoic chamber. The phased array can be rotated mechanically
with two DOF using the rotational axes. For the transmit characteristic measurement, a calibrated
microphone is used as the target. For measuring the receive characteristic, the target is replaced
with an ultrasonic transducer. Otherwise one or two spheres with a diameter of 100 mm are used,
which are positioned radially or angularly adjacent [57].

3.5.2 Transmit and receive characteristics


The transmit and receive characteristics describe the spatial sound distribution of the transmit pulse and the
spatial filter response of the conventional beamformer, respectively. Important parameters are the −3 dB main
lobe width and the side lobe level.
For the MLA transmit characteristic, a broad, uniform main lobe is desirable. This allows scatterers to be
irradiated equally to generate echoes from all spatial directions with a single transmit pulse. The receive

characteristic requires a narrow main lobe in order to determine the originating directions of the received
echoes. Side lobes cause phantom reflections which degrade the image quality and result in false detections.
Therefore, a low side lobe level is preferred (Section 2.4.5).
For the transmit characteristic measurement, the calibrated microphone is used as target and mounted at a
distance of 2 m on the slide in front of the phased array. The phased array sequentially transmits pulses using
a single transducer. The calibrated microphone records the incoming pulse signals and their maximum ampli-
tudes are stored. After each pulse transmission, the phased array is rotated mechanically in steps of 2◦ using
the rotational axes. This pitch-catch sequence is repeated until all directions shown in Fig. 3.5a have been covered.
The measurement is normalized to the largest received amplitude and is linearly interpolated (Fig. 3.5).
The measured transmit characteristic has a −3 dB main lobe width of ±50◦ . There are several narrow
side lobes due to the pressure-release piston radiator directivity pattern, since the rigid baffle is finite
and additionally interrupted by the non-transmitting neighboring waveguide ports. The minimum relative
amplitude at −90◦ is −12 dB or 25 % of the largest value measured, respectively. The reduction of the effective
transmit aperture by the waveguide leads to a more uniform transmit characteristic compared to the bare
transducer (Fig. 3.5b). However, widening the transmit characteristic also results in a weakened maximum
pulse amplitude due to diffraction loss of −3.3 dB, as expected. In addition, the waveguide channel itself
attenuates the transmitted pulse by −3.8 dB due to internal reflections and friction (Fig. 3.4). Apart from
the attenuation, the shape of the pulse is preserved, since the waveguide has no significant impact on the
bandwidth of the pulse, as the piezoelectric transducer is narrow-band.

[Figure 3.4: measured amplitude envelopes (sound pressure in Pa over time in ms) of the pulses transmitted by a bare transducer and by the same transducer inserted into the waveguide, with the excitation and ringing phases marked.]

Figure 3.4: Measured amplitude envelope of two pulses (TPulse = 1 ms) transmitted by a bare transducer and
by the same transducer inserted into the waveguide, respectively. Both pulses are recorded by a
calibrated microphone at a distance of 1 m, directly in front of the transducer (0◦ ,0◦ ). Comparing
both pulses, the overall amplitude reduction is −7.1 dB consisting of −3.8 dB waveguide losses
due to internal reflections and friction, plus −3.3 dB diffraction loss, since the directivity pattern is
widened (Fig. 3.5). Apart from attenuation, the shape of the pulse is preserved, as the waveguide
has no significant impact on the bandwidth [57].

The receive characteristic is measured by using an ultrasonic transducer as target, which is positioned at
a distance of 2 m on the slide in front of the phased array. The transducer is of the same type as the array
elements (MA40S4S, 40 kHz, ⌀9.6 mm) and can be considered to be a point source at this distance. The
phased array is in receive-only mode and is not rotated mechanically during this measurement. Instead, it
points directly to the transducer. The transducer transmits a single pulse, which is recorded by all phased array
transducers. Using this recorded dataset, the conventional beamformer generates a summed, spatially filtered
signal for each direction shown in (Fig. 3.6a). The receive characteristic shows the normalized maximum

amplitudes of these signals as a function of the direction (Fig. 3.6).

[Figure 3.5: (a) measured MLA transmit characteristic over azimuth θ and elevation ϕ (amplitude in dBmax) and (b) its horizontal sectional view, comparing a single transducer with and without waveguide.]

Figure 3.5: Measured MLA method transmit characteristic (a) and its horizontal sectional view (b) of a single
transmitting transducer, which is inserted into the waveguide. In addition, the sectional view in
(b) is compared to the transmit characteristic of a single transducer without waveguide. For
these measurements, the phased array is mechanically rotated incrementally with the rotational
axes. At each step a pulse is transmitted, which is recorded by the calibrated microphone at a
distance of 2 m. The maximum amplitudes normalized to the corresponding largest value of
these pulses are shown. There are several sidelobes due to the pressure-release piston radiator
directivity pattern, since the rigid baffle is finite and additionally perforated by the non-transmitting
waveguide ports. The transmit characteristic of the transducer with the waveguide attached is
more uniform since the effective aperture is reduced [57].

[Figure 3.6: (a) measured receive characteristic over azimuth θ and elevation ϕ (amplitude in dBmax) and (b) its horizontal sectional view compared to the analytic model, showing a main lobe width of 12° and a side lobe level of −11.5 dB.]

Figure 3.6: Measured receive characteristic (a) and its horizontal sectional view (b). For this measurement, an
ultrasonic transducer transmits from a distance of 2 m directly to the only receiving phased array.
The conventional beamformer generates the reception pattern shown, which corresponds to the
transfer function of the spatial filter for the direction (0◦ , 0◦ ). The asymmetry and irregularities
compared to the analytic model are caused by differences in the sensitivities and phase responses
of the individual transducers. Since both methods use all transducers for receive beamforming,
the corresponding receive characteristics are equal [57].

The measured receive characteristic has a main lobe width of 12◦ and a side lobe level of −11.5 dB. Compared
to the analytic model (12◦ , −12.8 dB), the main lobe width is in good agreement. However, the measured
receive characteristic has a side lobe level 1.3 dB higher and its shape is asymmetrical. This is caused by
differences in the sensitivities and phase responses between the ultrasonic transducers, also affecting the
angular resolution, as shown in the following section.
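The analytic model values cited above can be reproduced with the array factor of the horizontal 8-element cut (d = λ/2, broadside steering, ideal point-source elements); the sketch below yields a −3 dB main lobe width of roughly 12–13° and a side lobe level of about −12.8 dB:

```python
import numpy as np

M = 8
theta = np.radians(np.linspace(-90.0, 90.0, 18001))
u = np.sin(theta)
af = np.abs(np.exp(-1j * np.pi * np.outer(u, np.arange(M))).sum(axis=1)) / M

above = theta[af >= 1 / np.sqrt(2)]          # -3 dB main lobe region
mlw_deg = np.degrees(above.max() - above.min())

sll_db = 20 * np.log10(af[np.abs(u) > 2 / M].max())   # first nulls at u = +/-2/M
```

Deviations of the measured pattern from this ideal model, as noted above, are attributable to the element-to-element spread in sensitivity and phase response.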

3.5.3 Range and angular resolution


The range and angular resolution describe the minimum distance between two objects such that they appear
separated in the image generated. The objects are defined as separable if their echoes create two distinct
local maxima.
In order to measure the range or angular resolution, two steel spheres (⌀100 mm) are placed either radially
or angularly next to each other on the profile mounted on the slide. The slide is positioned at a distance of
R = 2 m in front of the phased array. The distance D between the spheres is gradually increased. After each
increment, the separability is evaluated using B-Scan images (Fig. 3.7). The minimum distance Dmin, ra for
separability of two radially adjacent spheres directly determines the range resolution Dres, ra , i.e.

Dres, ra = Dmin, ra . (3.16)


However, the angular resolution θres, ang is expressed as an angle, since the minimum distance Dmin, ang for
separability of two angularly adjacent objects depends on the distance R to the phased array, i.e.

θres, ang = 2 · arctan( Dmin, ang / (2 · R) ) . (3.17)
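As a cross-check, (3.16) and (3.17) can be evaluated directly. A minimal Python sketch, assuming a speed of sound of 343 m/s; the Dmin,ang value of 0.42 m is purely illustrative, chosen so that the result matches the measured 12◦:

```python
import math

C_AIR = 343.0  # assumed speed of sound in air (m/s)

def range_resolution(t_pulse):
    """Ideal range resolution for a pulse of length t_pulse (s): two echoes
    separate if their round-trip path difference exceeds one pulse length."""
    return C_AIR * t_pulse / 2.0

def angular_resolution(d_min, r):
    """Angular resolution in degrees per Eq. (3.17), given the minimum
    lateral separation d_min (m) of two targets at range r (m)."""
    return math.degrees(2.0 * math.atan(d_min / (2.0 * r)))

print(range_resolution(1e-3))        # 0.1715 m, the ideal value quoted below
print(angular_resolution(0.42, 2.0)) # ~12 degrees for an illustrative d_min
```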
The range resolution is affected by two factors: the pulse length and the sound shadow created, which depends
on the object shape. For the measurement setup described, the range resolution is Dres, ra = 200 mm for
both the MLA and the SLA method, as the excitation time is the same in both methods (TPulse = 1 ms). Due
to the narrow bandwidth of the transducer, which increases the pulse length (Fig. 3.4), the measured
results are higher than the ideal range resolution of a 1 ms pulse with minimum rise and fall times
(171.5 mm). The measurement further shows that a direct line of sight is not necessarily required for object
detection. Since sound waves can diffract around objects, the echo of the rear object still reaches the array,
but its amplitude is weakened due to the sound shadow of the front object.
The measured angular resolution using the SLA method is improved (θres, ang = 12◦ ) compared to the
MLA method (θres, ang = 14◦ ). The reason is that beamformed transmit pulses irradiate the region of interest
direction-selectively. Therefore, they provide another layer of spatial filtering during pulse transmission, in
addition to the conventional beamformer after echo reception. Compared to the achievable angular resolution
using analytic models (SLA 9◦ , MLA 11◦ ), the measured results show a loss of 3◦ in both cases. Again, the
cause is variations in the sensitivities and phase responses between the transducers. In order to correct these
variations, an array calibration is required, which will be part of future work.

3.5.4 Range of view and field of view


In the B-Scan images shown, there is a blind zone up to a distance of 1 m (Fig. 3.7). In this zone, high
amplitudes are detected, which are caused by the transmitted pulse and the ringing of the transducer. These
high amplitudes mask incoming echoes. With a suitable amplitude scaling of the colorbar for close targets,
echoes above a minimum distance of Rmin = 0.5 m are detectable with both transmission methods. Since the
amplitudes in the blind zone change only slightly between several pulse excitations, they can be removed by
subtraction with a pre-recorded image. For characterization, the blind zone removal is not active.
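The subtraction-based blind zone removal described above can be sketched as follows. This minimal Python variant operates on the raw channel data rather than the rendered image (one possible implementation, not necessarily the one used here); the sampling rate and blind zone distance are taken from the system description:

```python
import numpy as np

def remove_blind_zone(raw, background, n_blind):
    """Suppress the transmitted pulse and transducer ringing by subtracting
    a pre-recorded background frame within the first n_blind samples.
    raw, background: (channels x samples) arrays of received signals."""
    cleaned = raw.copy()
    cleaned[:, :n_blind] -= background[:, :n_blind]
    return cleaned

fs, c = 195e3, 343.0             # sampling rate and assumed speed of sound
n_blind = int(2 * 0.5 / c * fs)  # blind zone of 0.5 m round trip -> 568 samples

rng = np.random.default_rng(0)
ringing = rng.normal(size=(64, 1024))  # stand-in for pulse feed-through and ringing
echo = np.zeros((64, 1024))
echo[:, 700] = 1.0                     # synthetic echo beyond the blind zone
raw = ringing + echo
cleaned = remove_blind_zone(raw, ringing, n_blind)
```

This works exactly because, as stated above, the amplitudes in the blind zone change only slightly between pulse excitations.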

[Figure 3.7 plots: (a1), (a2), (b1), (b2) B-Scan images (azimuth ±50◦ , range 0–3 m, normalized amplitude); (a3) radial sections and (b3) angular sections comparing the MLA method (×30) and the SLA method.]

Figure 3.7: (a) B-Scans of the range resolution measurement using the single-line-acquisition (SLA) method
(a1) and the multi-line-acquisition (MLA) method (a2) and their corresponding radial sections at 0◦
(a3). The B-Scans show the echoes of two spheres (⌀100 mm) positioned radially one behind the
other at distances of 1.8 m and 2.2 m, respectively. Although there is no direct line of sight, the
echoes can be localized with both methods. Since the amplitude of the transmitted pulse using
the MLA method is 30 times weaker, the SNR is decreased compared to the SLA method. The
high amplitudes of the transmitted pulse and the ringing transducers mask incoming echoes at
distances of up to 0.5 m, also referred to as the blind zone. In this zone, objects are not detectable
without further processing. (b) B-Scans of the angular resolution measurement using the SLA
method (b1) and the MLA method (b2) and their corresponding angular sections at 2.1 m (b3). The
B-Scans show the echoes of two spheres (⌀100 mm) positioned angularly adjacent in the directions
7◦ and −7◦ , respectively. Both methods generate a distinct local maximum for each sphere, such
that they are separable. For the MLA method, the local minimum between the spheres is higher
and the side lobes are more prominent due to the wider main lobe and higher side lobe level of its
point spread function [57].

Table 3.1: Summary of the measurement results for the multi-line-acquisition (MLA) and single-line-
acquisition (SLA) method [57].

                          MLA method   SLA method   Remark
  TX main lobe width      100◦         21◦
  RX main lobe width      12◦          12◦
  TX side lobe level      -            −7.2 dB
  RX side lobe level      −11.5 dB     −11.5 dB
  Range resolution        200 mm       200 mm       Targets: two steel spheres (⌀100 mm), pulse length 1 ms
  Angular resolution      14◦          12◦          Targets: two steel spheres (⌀100 mm)
  Min. detection distance 0.5 m        0.5 m        Due to blind zone
  Max. detection distance 3 m          > 6 m*       Target: one steel sphere (⌀100 mm)
  Field of view           ±80◦         ±90◦         Target: one steel sphere (⌀100 mm) at 2 m
  B-Scan frame rate       43 FPS       0.76 FPS     Number of directions: 51, max. range of view: 3 m
  3D-Scan frame rate      29 FPS       0.015 FPS    Number of directions: 2601, max. range of view: 3 m

*limited by anechoic chamber size

To determine the maximum detection distance Rmax , the distance R between the phased array and a single
sphere (⌀100 mm) is gradually increased along the rail. After each increment, a B-Scan image is used to
evaluate whether the echo of the sphere is detectable. The echo is defined as detectable if its amplitude is
higher than the noise level. The maximum detection distance using the SLA method is larger (Rmax > 6 m)
than using the MLA method (Rmax = 3 m). Since the SLA method uses all transducers for pulse transmission,
the amplitude of the beamformed pulse is approximately 30 times higher than the amplitude of the single
hemispherical pulse. Therefore, the beamformed pulse and its echoes can propagate further before being
attenuated below the noise level by the air. For the SLA method, the validation of the maximum detection distance of 6 m is
limited by the anechoic chamber size.
The same procedure is used to determine the maximum field of view. However, instead of the distance of the
sphere (R = 2 m), its direction in the array coordinate system is varied. For this purpose, the phased array
is rotated mechanically in 5◦ steps about the horizontal rotational axis. For both methods, the maximum echo
amplitude decreases with increasing steering angle due to the directivity of the single transducers. Using the SLA
method, the echo of the sphere is detectable in the entire field of view of ±90◦ . With the MLA method, the
field of view is limited to ±80◦ due to the overall weaker echo amplitude.

3.5.5 Frame rates


The frame rate indicates the number of completely generated and rendered B-Scans or 3D-Scans within one
second. The unit of the frame rate is frames per second (FPS). It is equivalent to the pulse repetition
frequency (PRF) when using the MLA method. However, it differs from the PRF when using the SLA method since
multiple pulses are required to generate an image.
For comparison of the 3D-Scan frame rates of both methods, the region of interest is defined with
θmin = ϕmin = −50◦ , θstep = ϕstep = 2◦ ,
θmax = ϕmax = 50◦ , R = {0.017 m, . . . , 3 m} , (3.18)
where θmin , θmax , ϕmin , ϕmax are the minimum and maximum azimuth and elevation angles, θstep , ϕstep are
the angular step sizes and R is the range of view.
For the B-Scan frame rate comparison, the region of interest parameters are identical, however the field of
view is limited to the horizontal plane, i.e. ϕmin = ϕmax = 0◦ . Thus, the numbers of directions to be evaluated

for generating a 3D-Scan KV and a B-Scan KB are

KV = 51 · 51 = 2601 and KB = 51. (3.19)

The frame rate fFrame = 1/TFrame is calculated by measuring the time TFrame between the first pulse transmission
and the complete generation and rendering of the image. The frame rates given in Table 3.1 were averaged
over 20 frames. Using the MLA method, B-Scan and 3D-Scan frame rates of 43 FPS and 29 FPS are achievable.
The frame rates are mainly determined by the sound propagation time (B-Scan 75 %, 3D-Scan 50 %). The
remaining part consists of data transfer and processing time. With the SLA method, the frame rates for a
B-Scan and 3D-Scan are 0.76 FPS and 0.015 FPS, respectively. Since this method sends a pulse for each
direction, the achievable frame rates significantly decrease with increasing number of directions. However,
this relation also indicates that the SLA method can achieve higher frame rates if the field of view is reduced
or the angular step size is increased. The system is able to select the region of interest and the transmission
method dynamically.
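The propagation-limited ceiling of these frame rates follows from the round-trip time alone. A minimal Python estimate, assuming a speed of sound of 343 m/s and ignoring the data transfer and processing time that the measurement shows consumes the remainder:

```python
C_AIR = 343.0  # assumed speed of sound (m/s)

def min_frame_time(r_max, pulses_per_frame):
    """Lower bound on the frame time set by sound propagation alone: each
    pulse must travel to the maximum range and back before the next one."""
    return pulses_per_frame * 2.0 * r_max / C_AIR

k_v = 51 * 51                                       # 3D-Scan directions, Eq. (3.19)
t_mla = min_frame_time(3.0, pulses_per_frame=1)     # MLA: one broad pulse per frame
t_sla = min_frame_time(3.0, pulses_per_frame=k_v)   # SLA: one pulse per direction

print(1 / t_mla)  # ~57 FPS ceiling (measured: 29 FPS incl. transfer and processing)
print(1 / t_sla)  # ~0.022 FPS ceiling (measured: 0.015 FPS)
```

Both measured 3D-Scan frame rates lie below these propagation-limited ceilings, consistent with the stated 50 % share of sound propagation time.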

3.6 Chapter summary and conclusions


This chapter introduced and evaluated an air-coupled ultrasonic 3D imaging system using a URA based
on a waveguide and efficient low-cost transducers. In contrast to similar air-coupled ultrasonic imaging
systems (Section 3.1), this approach allows the adaptation of multiple transmit and receive methods with
individual strengths and weaknesses. For instance, the experimental characterization demonstrated that the
system is capable of localizing multiple objects in real-time with the multi-line-acquisition method, although
there are trade-offs compared to the single-line-acquisition method, most notably the reduced range of view.
Since the system supports both methods, the appropriate variant can be selected dynamically depending on
the situation. Considering that a sufficient frame rate is required, the MLA method is more suitable for close
range situations where a wide field of view is mandatory, whereas the SLA method is an appropriate choice for
generating images with a long range of view but only a limited field of view. Given this adaptability, ultrasonic
3D imaging has high potential for improving the robustness of in-air environmental monitoring in conjunction
with lidar, radar and camera systems. The following chapters are aimed at the enhancement of the angular
resolution and image quality by evaluating non-uniform array topologies, calibrating transducer variations
and utilizing image processing algorithms.

4 Non-uniform sparse spiral transceiver PUT arrays

Parts of this chapter have been published in


[33] ”Spiral air-coupled ultrasonic phased array for high resolution 3D imaging”,
in Proc. IEEE International Ultrasonics Symposium (IUS), 2020,
[35] ”Calibration of Air-Coupled Ultrasonic Phased Arrays. Is it worth it?”,
in Proc. IEEE International Ultrasonics Symposium (IUS), 2022, and
[34] ”Air-Coupled Ultrasonic Spiral Phased Array for High-Precision Beamforming and Imaging”,
IEEE Open Journal of Ultrasonics, Ferroelectrics, and Frequency Control, 2022.

The previous chapter addressed the problem of grating lobe formation due to too large transceiver elements in
a uniform dense array configuration by using a waveguide structure containing 64 acoustic ducts into which
the PUTs are inserted, reducing the effective inter-element spacing to λ/2. However, as the effective array
aperture is reduced by the tapered waveguide as well, the main lobe width (MLW) increases, thus lowering
the beamforming precision, and, consequently, the angular imaging resolution.
In order to overcome both design limitations of uniform dense arrays on the allowed transducer diameter
and achievable array aperture size, given a fixed number of elements, sparse array geometries are a viable
option. Due to their non-uniform element positioning, the formation of grating lobes is prevented even at
greater inter-element spacings. As a result, the array aperture can be enlarged for improving the beamforming
precision without requiring the number of transducers to be increased [104]–[106].
This chapter demonstrates that ultrasonic in-air imaging, in particular, benefits from these features, enabling
large-volume, unambiguous and high-resolution image formation. As in the previous chapter, a real-world
phased array system is considered consisting of 64 MA40S4S PUTs and the identical transceiver electronics,
hardware control and signal processing components. However, the elements are non-uniformly and sparsely
arranged, specifically based on the sampled Fermat spiral, such that all inter-element spacings (IES) are greater than λ/2, thus
spanning a significantly larger circular aperture with a diameter of 200 mm compared to the waveguide
array (35 mm×35 mm). The goal here is to improve the beamforming precision without increasing the system
complexity.
For validation, the real-world beamforming behavior is investigated, as well as the resulting in-air imaging
capabilities of the non-uniform sparse spiral array geometry by providing an in-depth experimental and
numerical characterization. Since the improved beamforming precision is valuable not only for imaging, but
generally for many air-coupled ultrasonic array applications, first, an application-independent characterization
of the generic transmit-only (TX), receive-only (RX), and pulse-echo (PE) operation modes is conducted. After
that, the general findings are put into application context by examining the resulting in-air imaging capabilities,
which are then compared to the waveguide array. Before covering the evaluation and the methodologies
involved, the next sections first provide more detail on the specific Fermat spiral array geometry used, which
has been applied in other fields as well.

4.1 Related work on non-uniform sparse spiral array geometries


The advantageous properties of sparse spiral array geometries based on the Fermat spiral have been highlighted
in the works [107]–[109]. First, the simple deterministic design allows a flexible customization to the

requirements of the application. Second, the MLW and maximum side lobe level (MSLL) can additionally be
fine-tuned by density tapering [108]. Third, the suppression of grating lobes is independent of the frequency
used, thus unambiguous beamforming is ensured for narrow- and broad-band signals [109].
Spiral phased arrays have become increasingly popular and have been investigated in various domains, ranging
from satellite communication [108], radar [110], microwave imaging [111] and optical phased arrays [112],
over noise source localization [113], [114] to medical ultrasound [115]–[118]. For air-coupled ultrasonic
applications, they have been examined for the transmit-only generation of haptic feedback [119], for NDT
using receive-only microphone arrays [120] and for the detection of pasture biomass and grape clusters
using two dedicated transmit and receive arrays without beam steering [27], [121]. In the following, a fully
steerable sparse spiral transceiver array is used whose specific composition and array geometry are described
hereafter.

4.2 Sparse spiral sunflower array geometry


The spiral array geometry is composed of M = 64 piezoelectric air-coupled ultrasonic transducers (Murata
MA40S4S, ⌀10 mm) with a resonant frequency of 40 kHz and a narrow bandwidth of 1.2 kHz. The transducers
are arranged on the xy-plane on a single PCB along the Fermat spiral spanning an overall aperture with a
diameter of Dap = 200 mm (Fig. 4.1). The position of the m-th transducer rm is defined by sampling the


Figure 4.1: The spiral phased array consists of 64 ultrasonic 40-kHz narrow-band transducers. Each trans-
ducer can be used for transmit, receive and pulse-echo operation. Although the inter-element
spacings are greater than λ/2, the spiral geometry prevents the formation of grating lobes. The
enlarged aperture enables high-precision beamforming [34].

Fermat spiral, that is


(Rm , φm ) = ( Rap · √( m / (M − 1) ) , 2πm · (1 + √V ) / 2 ) , and (4.1)

rm = (xm , ym , zm ) = Rm · ( cos(φm ), sin(φm ), 0 ) . (4.2)

where M = 64 is the number of transducers, m ∈ [0, . . . , M − 1] is the transducer index, Rm is the distance
to the aperture center and φm is the corresponding angle, Rap = Dap /2 is the maximum aperture radius and
V = 5 is a design parameter that determines the angular distance between two successive elements and
therefore the number and position of the spiral arms. By choosing V = 5, the angular distances correspond
to the Golden Angle, resulting in the so-called sunflower pattern. This pattern features a uniform spatial
density of the transducer positions where all angles φm are unique due to the Golden Angle being an irrational
number [108], [116]. In [109], spiral array configurations with different V values are analyzed with the
conclusion that V = 5 gives Pareto optimal results in terms of MLW and MSLL. Density and amplitude
tapering, as proposed in [108], [109], [116], are not applied to maintain a balanced ratio between MLW and
MSLL and to avoid limiting the radiation power and sensitivity. With the selected diameter of Dap = 200 mm
(23.3 λ), the expected MLW is halved from 5.2◦ to 2.6◦ , compared to the smallest possible aperture diameter
Dap, min = 100 mm (11.7 λ), limited by the size of the transducers. However, a further halving of the MLW
requires a further doubling of the aperture diameter. The selected diameter therefore represents a reasonable
trade-off between the achievable MLW, simple manufacturing, and integrability into the existing measurement setup.
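A short Python sketch of the sampled Fermat spiral, Eqs. (4.1)-(4.2), can verify the sparsity claim numerically; a speed of sound of 343 m/s is assumed for the wavelength, and the physical-fit check against the transducer diameter is our own plausibility test:

```python
import math

M, R_AP, V = 64, 0.100, 5   # elements, aperture radius (m), design parameter
LAM = 343.0 / 40e3          # wavelength at 40 kHz, c = 343 m/s assumed

def sunflower_positions():
    """Element positions on the sampled Fermat spiral, Eqs. (4.1)-(4.2)."""
    pos = []
    for m in range(M):
        r = R_AP * math.sqrt(m / (M - 1))
        phi = 2.0 * math.pi * m * (1.0 + math.sqrt(V)) / 2.0  # Golden-Angle steps
        pos.append((r * math.cos(phi), r * math.sin(phi)))
    return pos

pos = sunflower_positions()
d_min = min(math.dist(p, q) for i, p in enumerate(pos) for q in pos[i + 1:])
print(d_min > LAM / 2)  # True: the array is sparse, yet grating-lobe free by design
```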
The remaining system components, i.e. transceiver electronics, the FPGA board and the PC, are described
in detail in the previous Section 3.3 and are briefly summarized with the key aspects hereafter. The
transceiver electronics provides a dedicated transmit and receive channel for each of the 64 transducers,
allowing individual time-delayed excitation with unipolar square-wave burst signals (1 ms, 20 Vpp , 40 kHz)
and individual sampling of the transducer signals (195 kSa/s, 12 bit). The FPGA SoC (Xilinx Zynq 7010)
is used for sequence control, signal generation and acquisition, and Ethernet communication with the PC,
where the signals are processed and analyzed using Matlab. Altogether, the system described supports fully
steerable TX and RX beamforming, and therefore in combination, also acoustic imaging using PE detection.
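The time-delayed excitation underlying transmit focusing follows directly from the element-to-focus path differences, as introduced in Chapter 2. A minimal sketch with a hypothetical three-element example (not the actual 64-element layout), assuming a speed of sound of 343 m/s:

```python
import math

C_AIR = 343.0  # assumed speed of sound (m/s)

def focus_delays(elements, focus):
    """Per-channel excitation delays for transmit focusing: the element
    farthest from the focus fires first, so that all wavefronts arrive
    at the focal point simultaneously."""
    dists = [math.dist(e, focus) for e in elements]
    d_max = max(dists)
    return [(d_max - d) / C_AIR for d in dists]

# Hypothetical three-element line array, near-field focus 0.2 m on-axis
elems = [(-0.1, 0.0, 0.0), (0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
tau = focus_delays(elems, (0.0, 0.0, 0.2))
print(tau)  # outer elements fire immediately, the center one ~69 us later
```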

4.3 Operation modes and numerical simulation model


In contrast to the previous waveguided array geometry, apart from exclusively using unfocused beams for
far-field beamforming, here, focused beams in the near-field are considered as well, due to the larger aperture
size and the resulting extended near-field up to approximately Rnat = 1.16 m (Section 2.2.2) of the spiral
array geometry. The following experimental characterization therefore investigates the three beamforming
operation modes TX, RX and PE each with the focused near-field and unfocused far-field variant, which are
subsequently compared with each other.
The beamforming methods required have been introduced in Chapter 2 for the URA geometry, but are also
applicable to the non-uniform spiral array geometry by using the corresponding element positions rm . The
relevant sections are

• unfocused and focused transmit beamforming (TX mode) in 2.2.1 and 2.2.3,

• unfocused and focused receive beamforming (RX mode) in 2.3.1 and 2.3.2 ,

whereas unfocused and focused two-way pulse-echo beamforming (PE mode) corresponds to the combination
of the TX and RX mode. The pulse transmission and frequency domain implementation of the receive
beamforming including the pre-processing of the received signals are realized as described in Section 3.4.
In addition to the application-independent evaluation of the individual operating modes, the imaging
capabilities using the SLA method, which utilizes the PE mode (Section 2.4.7), are investigated as well. The
SLA imaging method is applicable in the near-field and far-field and is suitable for long-range imaging due
to the array gains during transmission and reception. In addition, the two-fold spatial filtering lowers the
MSLL, thus improving the suppression of side lobe artifacts. In particular, the latter is a major challenge for
the high-frame rate MLA imaging technique, which is addressed in more detail in Chapter 6.

In order to evaluate the results of the subsequent experiments (Section 4.4) and to identify differences
from the expectations, a numerical simulation model for the real measurements is provided. The model is
based on the discretized Rayleigh integral, introduced in Section 2.2.1 for wave transmission. Apart from
the array geometry itself, the effective aperture of each MA40S4S transducer (Dap,m = 7 mm) is considered
as well by generating a mesh using DistMesh [53], which consists of L = 362 mesh points on a circular disk
with an average spacing of 0.05 λ. In this way, the directivity of the single transducers is included. Due to the
reciprocity, the specific model
p(rP , r0 ) = (1/L) · Σ_{m=0}^{M−1} Σ_{l=0}^{L−1} s0 (t) · wm∗ (r0 ) · e^{−j (2π/λ) ∥rP − rm,l ∥2 } / ( 2π ∥rP − rm,l ∥2 ) , (4.3)

is applicable for transmission and reception, where rP is either the position of a microphone or the position of a
single transmitter, whereas r0 is either the transmit or the receive focus point location, respectively. Therefore,
this model can be used to simulate all real measurement procedures and results.
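A compact numerical sketch of (4.3) illustrates the principle. It is not the DistMesh-based implementation used here, but a simplified monochromatic variant (s0 = 1, coarse 9-point element mesh, assumed c = 343 m/s) that reproduces the expected focusing behavior:

```python
import numpy as np

LAM = 343.0 / 40e3   # wavelength at 40 kHz (c = 343 m/s assumed)
K = 2.0 * np.pi / LAM

def sunflower(M=64, r_ap=0.1, V=5):
    """Element centers per Eqs. (4.1)-(4.2)."""
    m = np.arange(M)
    r = r_ap * np.sqrt(m / (M - 1))
    phi = 2.0 * np.pi * m * (1.0 + np.sqrt(V)) / 2.0
    return np.stack([r * np.cos(phi), r * np.sin(phi), np.zeros(M)], axis=1)

def element_mesh(center, n_ring=8, r_el=0.0023):
    """Crude 9-point stand-in for the DistMesh disk of one transducer."""
    ang = 2.0 * np.pi * np.arange(n_ring) / n_ring
    ring = center + np.stack(
        [r_el * np.cos(ang), r_el * np.sin(ang), np.zeros(n_ring)], axis=1)
    return np.vstack([center, ring])

def pressure(r_p, elems, r_focus):
    """|p(rP, r0)| per Eq. (4.3) for a monochromatic signal (s0 = 1);
    conjugate-phase weights focus the array at r_focus."""
    total = 0.0 + 0.0j
    for e in elems:
        w = np.exp(1j * K * np.linalg.norm(r_focus - e))  # w_m*(r0)
        mesh = element_mesh(e)
        d = np.linalg.norm(r_p - mesh, axis=1)
        total += w * np.mean(np.exp(-1j * K * d) / (2.0 * np.pi * d))
    return abs(total)

elems = sunflower()
focus = np.array([0.0, 0.0, 2.0])
off = np.array([2.0 * np.sin(np.radians(10)), 0.0, 2.0 * np.cos(np.radians(10))])
print(pressure(focus, elems, focus) > pressure(off, elems, focus))  # True
```

The contrast between the on-focus and off-axis magnitudes reflects the main-lobe-to-side-lobe ratio examined in the following experiments.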

4.4 Experiments, results and discussion


The experiments are grouped into two categories. First, general application-independent characterizations
of the TX, RX, and PE operation modes are conducted and compared to the simulation model. For each
mode, the 2D directivity patterns, the sectional directivity patterns for varying focal points in the near- and
far-field, and the radial on-axis patterns for varying focal distances are examined. From these patterns, the
most significant parameters are extracted and discussed, i.e. the magnitude at the focal point, the position of
the focal peak, the MLW or focal length, and the angular or radial MSLL. Second, the more specific use case of
acoustic imaging is addressed. Here, the angular resolution is validated and the imaging of multi-reflector
scenarios in both the far- and near-field is examined.

4.4.1 Measurement setups


All measurements are performed in an anechoic chamber. The spiral array including a rigid baffle is mounted
on two rotational axes (Fig. 4.2).

[Figure 4.2 photos: spiral array embedded in a rigid baffle on two rotational axes; targets (a)-(f) on the linear axis, as listed in the caption.]

Figure 4.2: The measurement setup in the anechoic chamber allows the different targets to be positioned three-dimensionally in the
coordinate system of the array. The positioning of the target and the focal point is fully-automated and independently
controllable. The respective targets used are for TX mode measurements the calibrated microphone (B&K Type
4138) (a), for RX mode the pre-characterized ultrasonic transducer (Murata MA40S4S) (b) and for PE mode the corner
reflector (c) or one of the sphere reflectors (d),(e). The double sphere setup on the transverse profile is used to measure
the angular resolution (f) [34].

The rigid baffle is aligned with the output surfaces of the transducers. Therefore, the setup resembles the
simulation model more accurately, for which an infinite rigid baffle is assumed. In front of the array, there is a
linear axis (length of 6 m) with a movable slide onto which different targets can be mounted. Using the linear
axis and the two rotational axes, the target can be freely positioned in the coordinate system of the array.
The selection of the specific target depends on the array operation mode being examined. In TX mode the
calibrated microphone (B&K Type 4138 and B&K Type 2670 preamplifier), in RX mode a pre-characterized
ultrasonic transducer (Murata MA40S4S) and in PE mode either the corner reflector or one of the sphere
reflectors is used as target (Fig. 4.2). Overall, the setup allows the automated and independent positioning of
the target and focal point.

4.4.2 Two-dimensional directivity patterns


The 2D directivity patterns show the direction-dependent magnitude distribution of the main and side lobes
when beamforming, i.e. the main characteristics of the spatial filtering. The far-field patterns of all three
modes (TX, RX, PE) are measured and compared to the simulated pattern (Sim) (Fig. 4.3).
In TX mode, the calibrated microphone [Fig. 4.2(a)] is used as target positioned at a fixed distance of
5 m. The array sequentially transmits unfocused pulses (40 kHz, 1 ms) steered to the fixed direction (0◦ , 0◦ ).
After each pulse transmission, the array is mechanically rotated in steps of 1◦ using the two rotational axes.
The microphone aperture is not rotated, but always points to the array center. This way, the microphone is
positioned at different directions of the array coordinate system until pulses from all angles of [Fig. 4.3(b)] are
obtained. The corresponding maximum magnitude of the pulses received are plotted and linearly interpolated
in the directivity pattern.
In RX mode, the same procedure is used, however the microphone is replaced with the pre-characterized
ultrasonic transducer [Fig. 4.2(b)], sequentially transmitting pulses (40 kHz, 1 ms) to the array. The array
signals are spatially filtered for direction (0◦ , 0◦ ) using unfocused receive beamforming, while the array is
mechanically rotated.
For measuring the PE directivity pattern, the corner reflector [Fig. 4.2(c)] is used as target and the unfocused
TX and RX modes described are combined, i.e. two-fold spatial filtering is applied. The corner reflector has
been selected as target as it creates a stronger echo compared to the spheres. Therefore, despite the doubled
length of the propagation path (10 m), the echo of the reflector stands out from interfering echoes of the
environment, caused by the sound absorbers at the walls, rail and mountings for example.
In all measured and simulated patterns (Fig. 4.3), there is a single narrow main lobe at the expected center
direction (0◦ , 0◦ ) and grating lobes do not form. The MLWs of the single patterns differ only slightly (3◦ for
Sim, TX, RX and 2.5◦ for PE). There is a side-lobe-reduced zone surrounding the main lobe followed by a
concentric ring of side lobes at approximately √(θ² + ϕ²) = 30◦ . This ring contains the highest side lobes for
all patterns, thus defining the respective MSLL, which also differs only slightly from the expected value, i.e. in
dB for Sim −15, for TX −17, for RX −15.9, and for PE −26.7. The side lobe level of the PE pattern is overall
lower and the MLW is slightly narrower, due to the two-fold spatial filtering. At the periphery of the patterns,
where √(θ² + ϕ²) > 70◦ , there is a major drop in side lobe level caused by the directivity of the individual
transducers themselves. Since the color scale of the PE pattern is adjusted, the noise floor and scattering
reflections of the room become visible in this zone.
In summary, primarily the exact positions and levels of the side lobes are different from the expected
simulated patterns. The differences are caused by interfering room reflections, as well as relative amplitude
and phase deviations between the individual transducers, which will be examined in the follow-up chapter.
Nevertheless, all directivity patterns measured agree well with the prediction of the model and confirm the
grating lobe free beamforming capability with high precision.

[Figure 4.3 plots: 2D directivity patterns (elevation vs. azimuth, magnitude in dBmax) for (a) Sim, (b) TX, (c) RX, (d) PE, and (e) their horizontal sections.]

Figure 4.3: The 2D directivity patterns are measured in the far-field (5 m) and show the magnitude distribution
of the main and side lobes for a steering angle of (0◦ ,0◦ ), i.e. the main characteristics of the
spatial filtering. The targets used are the microphone for TX, the ultrasonic transducer for RX
and the corner reflector for PE [Fig. 4.2(a),(b),(c)]. All directivity patterns measured (b)-(d) agree
well with the prediction of the simulation (a) and confirm the grating lobe free high-precision
beamforming capability. For easy comparison, the color scale of the PE pattern is adjusted by
0.5 due to two-fold spatial filtering [34].

4.4.3 Sectional directivity patterns for varying focal angles


Next, the sectional directivity patterns with varying focal angles are measured to analyze the influence on
the magnitude at the focal angle, the angle of the focal peak, the MLW and MSLL. The parameters examined
indicate the direction-dependent precision and accuracy of the beamforming. All three modes (TX, RX and PE)
are considered in the far-field (5 m) and in the near-field (0.2 m), and are compared to the simulation (Sim).
A similar measurement procedure as in Section 4.4.2 is used with the following differences. First, for each

mechanical rotation the focal angle is varied sequentially between 0◦ and 90◦ in steps of 5◦ . Second, the focal
distance is either at infinity (unfocused) for the far-field measurement or focused at the target distance (0.2 m)
for the near-field measurement. Third, the step size of the mechanical rotation is reduced to 0.5◦ , but limited to
the horizontal plane (azimuth) due to the high amount of data generated and increased measurement duration.
All measurements are repeated five times to identify possible deviations of the parameters examined. The
targets used for the respective modes are the microphone (TX) [Fig. 4.2(a)], the transducer (RX) [Fig. 4.2(b)],
the corner reflector (far-field PE) [Fig. 4.2(c)] and the small sphere (near-field PE) [Fig. 4.2(e)]. In the
near-field measurement, the small sphere is advantageous due to its smaller retroreflective area, so that the
measurement of the MLW is less influenced by the reflector dimension.
The sectional directivity patterns of the TX mode measured in the far- and near-field are shown as an
example for selected focal angles in Fig. 4.4. For both measurements, the −3 dB main lobe width widens
with increasing focal angle, thus decreasing the precision and selectivity of the beamforming. In addition, the
magnitude at the focal angle decreases, mainly due to the directivity of the individual transducers. As a result,
the MSLL degrades as well, since peripheral main lobes are more attenuated by the directivity than centrally
located side lobes. All these effects additionally cause a mismatch between the angle of the focal peak and the
desired focal angle, so that the accuracy of the beamforming decreases with increasing focal angle.
For easy comparison, instead of presenting the directivity patterns of all operation modes and focal angles,
only the key parameters are extracted, i.e. the magnitude at the focal angle, the angle of the focal peak, MLW
and MSLL, as a function of focal angle for the near- and far-field (Fig. 4.5).
The magnitudes at the focal angle decrease with increasing focal angle for all operation modes, in both the
near-field and the far-field [Fig. 4.5(a1),(b1)]. The measured decrease mostly agrees well with the prediction
of the simulation, as it includes the aperture of the single transducers. However, there are differences to
the simulation, primarily for peripheral focal angles |θ| > 80◦ , where the magnitudes measured drop more
severely. The cause is the housing of the transducer elements, which is not taken into account by the simulation
model. The PE magnitudes at the focal angle show a significantly steeper drop, since the pulse is affected twice
by the directivity of the single transducers, when transmitting and when receiving. Overall, the transducer
directivity causes a slightly higher attenuation of the magnitudes at the focal angle in the far-field than in the
near-field.
The angle of the focal peak agrees well with the desired focal angle for all measurements and the simulation
up to |θ| < 75◦ , confirming a high beamforming accuracy [Fig. 4.5(a2),(b2)]. For focal angles higher than
75◦ , the angle of the focal peak is lower than the desired focal angle and is limited to approximately 80◦
and 83◦ for the far-field and near-field measurements, respectively. This limitation is again caused by the
directivity of the single transducers, which strongly attenuates the magnitude for high focal angles. Therefore,
the simulated angle of the focal peak reaches a higher limit (86◦ ), since the increased attenuation by the
transducer housing is not included. In summary, these results imply that the beamforming capability for TX,
RX and PE is accurate but constrained to the respective limits for the near- and far-field.
The MLW widens with increasing focal angle, thus reducing the beamforming precision [Fig. 4.5(a3),(b3)].
All measurements are close to the expected values from the simulation. The MLW is between 2◦ and 5◦ up to
a focal angle of |θ| < 55◦ followed by a more pronounced widening with increasing focal angle. The PE MLW
is slightly narrower for all focal angles due to the two-fold spatial filtering. The difference for the TX and RX
modes to the simulated MLW is within ±1◦ up to a focal angle of |θ| < 80◦ . In the far-field, the MLWs of all
modes and the simulation reach a plateau at approximately |θ| = 80◦ above which the MLW widens only
slightly. The plateau is mainly caused by the directivity from [Fig. 4.5(b1)], which attenuates the main lobe
for high focal angles preventing further widening. A similar characteristic is observed in the near-field, but
the plateau is followed by a narrowing. The narrowing is primarily due to the notch in the directivity at 80◦
[Fig. 4.5(a1)], where the main lobe is selectively attenuated, thus affecting the −3 dB width. However, the
overall differences in MLW between the near- and far-field measurements are only minor.

Figure 4.4: Sectional directivity patterns of the TX mode, as an example (non-averaged), for varying focal
angles in the (a) far-field (5 m) and (b) near-field (0.2 m). The MLW widens, the MSLL rises, and the
magnitude of the focal peak lowers for increasing focal angles. The angle of the focal peak deviates from
desired high focal angles (80◦ , 90◦ ). Thus, the beamforming precision and accuracy in the periphery is
reduced. The parameters labeled are compared in (Fig. 4.5) for all operation modes [34].

The MSLL degrades with increasing focal angle, as a result of the directivity of the single transducers
[Fig. 4.5(a4),(b4)]. The TX and RX MSLL measurements in the near- and far-field show only a slight deviation
from the simulation of approximately ±2 dB. The PE MSLL is in general approximately two times lower in
dB scale than the TX and RX MSLL due to the two-fold spatial filtering, as expected. In order to obtain
accurate MSLL results, it is mandatory to suppress interfering reflections by covering the mountings with
foam absorbers; otherwise, false side lobes are created that impair the measurement. For all extracted parameters,
the standard deviations obtained from five measurements do not show significant irregularities.
In the context of acoustic imaging with the PE method, the parameters measured have the following effects.
The MLW significantly determines the angular resolution, indicating the minimum angular distance between
two reflectors required, such that they can be imaged separately. Therefore, two peripherally positioned
reflectors are required to have a greater angular distance from each other for separability, compared to
two centrally positioned reflectors. Since the angular resolution is approximately constant with distance,
objects can be imaged with improved absolute resolution if they are close to the array, as demonstrated in
Section 4.4.6.
The deviations between the selected focal angle and the resulting angle of the focal peak cause peripherally
located reflectors to be imaged at an incorrect angle. For example, since the focal peak of a beam steered to
±90◦ actually lies near ±80◦ , reflectors located at ±80◦ will additionally be rendered at an angle of ±90◦ in
the image.
Furthermore, the decrease of the magnitude at the focal angle causes two identical reflectors to be imaged
with different magnitudes despite the same reflectivity, if one is placed centrally and one peripherally. In
conjunction with the increasing MSLL, the detected echo of the peripheral reflector stands out less prominently
from the side lobe artifacts caused by the centrally located reflector. Therefore, the relative dynamic range
for detection is reduced in the peripheral region: If a reflector is located in the center of the ROI, weaker
peripheral reflectors are particularly difficult to detect. In addition, when many reflectors are present in the
ROI, their side lobe artifacts will superimpose, thus reducing the relative dynamic range in the entire image,
which can result in false detections. This effect is investigated in Section 4.4.6 with multi-reflector setups.

Figure 4.5: Extracted key parameters from the sectional directivity patterns as shown in (Fig. 4.4) for all operation modes and
the simulation. All values shown are averaged over five measurements. The error bars and shading indicate the
corresponding standard deviation. The microphone is used for characterizing the TX mode, the ultrasonic transducer
for the RX mode, the small sphere (⌀50 mm) for the near-field PE mode, and the corner reflector for the far-field PE
mode [Fig. 4.2(a),(b),(c),(e)]. The directivity of the single transducers affects all parameters with increasing focal
angles, i.e. the magnitude at the focal angle is reduced (a1),(b1), the angle of the focal peak deviates for focal angles
|θ| > 80◦ (a2),(b2), the MLW reaches a plateau for focal angles |θ| > 80◦ (a3),(b3), and the MSLL rises (a4),(b4).
Major differences between measurements and expected values mainly occur for high focal angles > 80◦ due to the
transducer housing which is not included in the simulation. The housing causes an additional attenuation in the
peripheral transducer directivity, evident in (a1),(b1) between 80◦ and 90◦ . Likewise, this causes simulation mismatches
in the other parameter measurements in the same angular range. The measurements and simulations of the MSLL
are in good agreement. In general, the values of the PE measurements are approximately two times lower in dB scale
due to the two-fold spatial filtering. Overall, the highest beamforming accuracy is achieved for focal angles up to 75◦
(a2),(b2) and the highest precision for focal angles up to 55◦ (a3),(b3) [34].

4.4.4 Radial on-axis pattern for varying focal distances
Similar to the sectional directivity patterns for different focal angles, the radial on-axis patterns are examined
for different focal distances. All three operation modes (TX, RX, PE) are considered and compared with
the simulation model (Sim). Instead of mechanically rotating the array, only the distance to the target is
increased sequentially in steps of 1 cm from 3 cm up to 3 m using the linear axis. For each target distance,
the beamforming focal distance is varied from 10 cm to 3 m in steps of 10 cm. In addition, the focal distance
at infinity (unfocused beamforming) is considered. The targets of the respective modes are the microphone
[Fig. 4.2(a)], receiving pulses from the array (TX), the ultrasonic transducer [Fig. 4.2(b)], transmitting pulses
to the array (RX) and the sphere (⌀100 mm) [Fig. 4.2(d)], reflecting the pulses back to the array (PE). The
larger sphere (⌀100 mm) has been chosen for the PE measurement as it provides a higher echo amplitude than
the small sphere (⌀50 mm) and is therefore suitable in the near- and far-field, since it stands out more from
interfering room reflections. Moreover, unlike the corner reflector, the sphere has a well-defined reflection
point.
The maximum magnitudes of the received pulses are evaluated, which all have a duration of 1 ms, a
frequency of 40 kHz and are excited using 20 Vpp . All measurements are repeated five times to identify possible
deviations of the parameters examined. From the radial on-axis patterns obtained, the four key parameters
are analyzed, i.e. magnitude at the focal distance, the distance of the focal peak, the focal length, and the
MSLL out of the focal point. These parameters provide information on the distance-dependent precision and
accuracy of the focused beamforming.
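Analogously to the sectional patterns, the two distance-related parameters can be read from a radial on-axis pattern. The following sketch (a hypothetical helper, assuming magnitudes in dB over target distance) illustrates the extraction of the distance of the focal peak and the −3 dB focal length:

```python
import numpy as np

def focal_parameters(dist_m, mag_db):
    """Distance of the focal peak and -3 dB focal length from a
    radial on-axis pattern (magnitude in dB over target distance)."""
    i = int(np.argmax(mag_db))
    lo = hi = i
    # widen from the peak until the pattern first falls below -3 dB on each side
    while lo > 0 and mag_db[lo - 1] >= mag_db[i] - 3.0:
        lo -= 1
    while hi < len(mag_db) - 1 and mag_db[hi + 1] >= mag_db[i] - 3.0:
        hi += 1
    return dist_m[i], dist_m[hi] - dist_m[lo]
```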
The radial magnitude distributions measured are shown for the TX mode and selected focal distances in
[Fig. 4.6(a)] as an example. With increasing focal distance, the following effects are observed. First, the
maximum magnitude at the respective focal distance decreases due to the attenuation of the medium. As a
result, the MSLL out of the focal point steadily increases. The −3 dB focal length widens, reducing the distance
selectivity of the focal point. Furthermore, the distance of the focal peak to the desired focal distance starts to
diverge for focal distances higher than 0.5 m.
Next, the key parameters extracted from the radial on-axis patterns are analyzed, i.e. the magnitude at the
focal distance, the distance of focal peak, focal length and MSLL. First, the magnitude at the focal distance
is examined [Fig. 4.6(b)]. For easy comparison, the respective measurements were normalized to their
corresponding value at a distance of 1 m. In addition, the dB values of the PE measurement are adjusted by a
factor of 0.5 to account for the doubled propagation path. Due to the measurement setup and the extension
of the sphere, the closest focal distance of the PE measurement is 20 cm. The magnitudes at the focal distance
of all measurement modes and the simulation are in good agreement. For each mode, the global magnitude
maximum occurs at a different distance, i.e. 10 cm (TX, RX), 28 cm (PE) and 5 cm (Sim). The position of the
corresponding maximum is affected by the shape of the respective target. For example, the dimension of
the sphere reflector influences the PE measurement for close distances, whereas the simulation assumes an
infinitely small target.
The decrease in magnitude for focal points closer to the array than the main maximum is primarily due
to the directivity of the individual transducers. With increasing focal distance, the magnitude at the focal
distance decreases as well caused by the attenuation of the medium. The PE measurement is attenuated the
most due to the doubled propagation path. The differences of the measurements to the simulation arise from
atmospheric absorption effects, influenced by temperature, air pressure and humidity [24], which are not
represented in the model. Despite the attenuation by the medium, high sound pressure levels are measured
by the calibrated microphone in TX mode, i.e. max. 152 dBSPL at 10 cm to min. 119 dBSPL at 5 m distance.
The distance of the focal peak in relation to the desired focal distance indicates the accuracy of the focused
beamforming [Fig. 4.6(c)]. The overall characteristics of all modes are very similar to the simulation. The
distance of the focal peak matches the desired focal distance only up to 0.5 m and diverges increasingly for
larger distances. Therefore, in order to position the focal peak at a specific distance, a larger focal distance
must be selected. However, the focal peak cannot be positioned further than approximately 1 m, as expected
for the ⌀200-mm aperture.
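This limit is consistent with the classical near-field/far-field boundary of an aperture. A short sketch for the given array (assuming c = 343 m/s and the common D²/(4λ) convention, which is not stated explicitly in the text):

```python
import math

c, f, D = 343.0, 40e3, 0.2         # speed of sound (m/s, assumed), frequency, aperture diameter
wavelength = c / f                 # ~8.6 mm at 40 kHz
limit = D ** 2 / (4 * wavelength)  # classical near-field/far-field boundary
print(round(limit, 2))             # -> 1.17 (m), consistent with the ~1 m focusing limit observed
```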

Figure 4.6: Radial on-axis patterns (non-averaged) of the TX mode as an example (a) for varying focal
distances in direction (0◦ , 0◦ ). The parameters marked are extracted and compared for all
operation modes in (b)-(e). The values in (b)-(e) are averaged over five measurements. The error
bars and shading indicate the corresponding standard deviation. The microphone is used for
characterizing the TX mode, the ultrasonic transducer for the RX mode, and the sphere (⌀100 mm)
for the PE mode [Fig. 4.2(a),(b),(d)]. Since the close-range magnitudes measured (b) are affected
by the shape of the respective targets, they are normalized to the value at a distance of 1 m for
easy comparison. Additionally, the dB values of the PE measurement are adjusted by a factor of
0.5 to account for the doubled propagation path [34].

The focal length is the range between the −3 dB limits surrounding the focal peak and determines the
precision of the focused beamforming, which degrades with increasing focal distance [Fig. 4.6(d)]. The
values measured are in good agreement with the simulation. The focal length of the PE measurement is
significantly lower than the other measurements for all focal distances primarily due to the two-fold spatial
filtering. Depending on the specific application, the focal length characteristics measured can be used to
determine whether focusing at a certain distance is feasible or generally required.
Finally, the MSLL is analyzed, i.e. the ratio between the magnitude of the focal peak and the corresponding
highest local maximum outside the focal zone [Fig. 4.6(e)]. Therefore, a high MSLL implies a degraded
selectivity of the focused beamforming as well. Compared to the simulation, the qualitative characteristics
of all measurements are similar. The MSLL of the TX and RX measurements are approximately 3 dB higher
than the simulation values, whereas the PE MSLL is generally lower due to the two-fold spatial filtering. The
general differences between simulation and measurement are additionally caused by amplitude and phase
deviations between the single transducers. For all extracted parameters, the standard deviations obtained
from five measurements do not show significant irregularities.
Next, the parameters examined are interpreted for an imaging application using the PE mode. First, it is
possible to detect the test sphere (⌀100 mm) [Fig. 4.2(d)] in the entire measurement range between 20 cm to
5 m with an appropriate focal distance [Fig. 4.6(b)]. For imaging, it is often preferred to scan the complete
ROI using as few transmission events as possible in order to reduce the measurement time. Therefore, a long
and uniform focal zone is advantageous, since objects in a larger range can be detected by a single pulse. Using
unfocused beamforming, the longest focal length is achieved and thus the largest range is covered. However,
objects with a distance of less than 1 m cannot be detected with this method, as its radial on-axis pattern
shows several notches in this range [Fig. 4.6(a)]. Therefore, a reliable detection in the near-field can only be
ensured using additional focused pulses with appropriately close focal distances. Due to the decreasing focal
length with decreasing focal distance, multiple sequential pulse transmissions with different focal distances
are required. For example, in order to cover a viewing range from 20 cm to 5 m, pulses must be focused at
least to the distances 0.3 m and 0.5 m in addition to unfocused beamforming. In summary, although the large
aperture of the spiral array is advantageous for the achievable angular resolution, additional transmission
events are required for detection in the extended near-field.

4.4.5 Angular resolution


The achievable angular resolution using SLA with the PE method is experimentally validated. The angular
resolution specifies the minimum angular distance between two objects required so that they appear separated
in the image generated. For separation, the echoes in the image must create two distinct local maxima
(Section 2.4.4). In this experiment, the relative level of the local minimum between the echo maxima is
analyzed as a function of angular distance between two objects. Two small spheres (⌀50 mm) [Fig. 4.2(e)]
are used as targets. Their reflection points are positioned at a distance of 2 m in front of the array center. The
spheres are mounted on a transverse profile to allow the spacing between the sphere reflection points to be
increased symmetrically relative to the z-axis in steps of 2 cm from 6 to 20 cm. For each sphere spacing, a
2D scan is performed from −10◦ to +10◦ with an angular step size of 0.2◦ and a focal distance of 2 m. Each
measurement is repeated 20 times to determine the mean level and standard deviation of the local minimum.
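The angular distance corresponding to a given sphere spacing follows from simple geometry. A small illustrative helper, assuming both reflection points lie symmetrically about the z-axis:

```python
import math

def angular_distance(spacing_m, range_m):
    # Angle subtended at the array center by two reflection points
    # placed symmetrically about the z-axis at the given range
    return math.degrees(2 * math.atan(spacing_m / (2 * range_m)))

print(round(angular_distance(0.06, 2.0), 1))  # -> 1.7 (deg)
```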
With a spacing of 6 cm, corresponding to an angular distance of 1.7◦ , the spheres are not yet separable, as
they appear as a joint echo without a local minimum (Fig. 4.7). Starting from a spacing of 10 cm (2.3◦ ), a local
minimum with a relative level of −0.78 dBmax (90%) is formed. Therefore, the determined angular resolution
is in good agreement with the −3 dB main lobe width measured [Fig. 4.5(b3)]. In practice, however, a higher
margin between the local minimum and the maxima is often required for a reliable separation, depending on
the separation algorithm and additional image processing used. Overall, the results are in good agreement with
the expected values from the simulation and the previous MLW measurements.

Figure 4.7: Results of the angular resolution measurement using the two spheres [Fig. 4.2(f)] whose angular
spacing is gradually increased at a distance of 2 m. In the B-scans (top), the echoes of the spheres
are imaged as two distinct local maxima starting from an angular distance of 2.3◦ . The lower the
minimum magnitude between the echo maxima (bottom), the more reliably they are distinguished
by separation algorithms. The values shown are averaged over 20 measurements. The error bars
indicate the corresponding standard deviation, which rises with decreasing magnitude starting
from −30 dB due to the noise floor [34].

4.4.6 Multi-reflector scenes in the far- and near-field


In order to provide a more vivid interpretation of the measurement data obtained and the resulting imaging
capabilities, two experiments with multiple reflectors in the far- and near-field are conducted. In addition to
analyzing the separability, the creation of artifacts due to the accumulation of side lobes is analyzed.
In the far-field experiment, a test pattern consisting of 28 corner reflectors positioned at a distance of 2 m
in front of the array (Fig. 4.8) is used. The entire pattern covers a width of 850 mm and is thus located within
an angular range of ±12◦ of the ROI. The corner reflectors are identical trihedrals with a front edge length
of 140 mm, arranged with a minimum spacing of 120 mm. The spacing corresponds to a minimum angular
distance of 3.4◦ and is thus slightly above the measured angular resolution of approximately 2.3◦ .
The scene is scanned line by line, i.e. SLA (Section 2.4.7), using unfocused beamforming in PE mode with an
angular step size of 0.5◦ and visualized in a 3D scan. The total duration for the image generation including
data acquisition, processing and rendering is 2 min.
In the image generated, the relative echo magnitudes of the individual reflectors vary between −4 and
0 dBmax . The highest minimum between two reflectors is −5 dBmax . Due to the accumulation of the side
lobes of the individual reflectors, the MSLL is −7.3 dBmax . Therefore, by using a simple separation threshold
between −4 and −5 dBmax , all reflector echoes in this example scene can be separated correctly and no false
detections due to side lobe artifacts occur. The experiment illustrates that the accumulated MSLL can increase
significantly compared to a single reflector scenario (approx. −28 dBmax ), making a clear separation more
difficult. Therefore, in the presence of multiple reflectors, the relative dynamic range is limited, which can lead
to either false detections or weaker reflections not being detectable, depending on the separation algorithm
and additional image processing used.
The improved angular resolution due to the larger array aperture enables especially close objects to be

Figure 4.8: Multi-reflector setup consisting of 28 corner reflectors (spacing 120 mm, range 2 m) for analyzing
the separability and the accumulation of side lobes in the far-field (top). The angular resolution,
the accumulated MSLL and the echo intensities of all reflectors are sufficient to correctly separate
and visualize the reflector echoes without side lobe artifacts in the OpenGL-rendered 3D-Scan by
using a threshold for separation (bottom) [34].

visualized with high detail. This is demonstrated in another experiment, where a hand is scanned using
focused beamforming at a distance of 20 cm with an angular step size of 0.5◦ and displayed as a 3D scan
(Fig. 4.9). Compared to the previous measurement setup, the number of retroreflective surfaces cannot
be exactly defined. The stretched hand has a maximum extension of 19 cm in both dimensions and is thus
located in an angular range of ±25◦ . The length of the fingers is 6 to 8.5 cm with a spacing between them of
min. 1 cm to max. 5 cm, corresponding to 2.9◦ to 14◦ angular distance. The total time required for generating
the image is 2.5 min.

Figure 4.9: Experimental setup for examining the imaging of object shapes in the near-field at 20 cm using
focused beamforming. The angular resolution is sufficient to recognize the shape of the hand in
the 3D scan. However, the magnitudes of several accumulated side lobes are to some extent
higher than the desired echo levels, causing minor artifacts between the fingers. Therefore, if
a threshold is used for separation, a trade-off must be found between the level of detail and
tolerable artifacts [34].

Figure 4.10: 2D sectional image of the hand for different threshold levels. The top left graphic shows the
unprocessed raw image of the hand using a dB scale normalized to the highest amplitude
received. Only a threshold is used for separating the hand reflections from side lobe artifacts.
Amplitudes below the threshold are removed from the image (black color). The higher the
threshold is set, the more side lobe artifacts are removed. However, since there are strong and
weak hand reflections, some parts of the hand will be removed as well if the threshold is set too
high. Therefore, in scenes with a high dynamic range, a trade-off between the level of detail and
the accepted artifacts must be found [34].

Although the shape of the hand is recognizable in the 3D scan, especially the ring finger and the lower
palm are weak reflectors, since their surface normals partially point away from the array, thus deflecting
a large portion of the sound. As a result, the magnitudes of several accumulated side lobes are to some
extent higher (MSLL −5.7 dBmax ) than the desired reflections (e.g. −7.2 dBmax at the ring finger center).
Therefore, in scenes with a high relative dynamic range, a trade-off between the level of detail and the
accepted artifacts must be found. In the 3D scan shown, a simple separation approach with an example
threshold level of −7.5 dBmax is used, so that few artifacts are visible, but the hand is displayed in more detail.
The image processed with different threshold levels is included in the supplemental material. Overall, the two
multi-reflector experiments demonstrated that the array is capable of imaging object shapes and patterns in
both the near-field and far-field.
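The threshold-based separation used in these experiments can be sketched as a simple masking operation (an illustrative helper; the actual rendering pipeline is not specified here):

```python
import numpy as np

def threshold_image(image_db, threshold_db):
    """Simple separation by thresholding: normalize to 0 dBmax and
    blank all pixels below the threshold (rendered black/removed)."""
    img = np.asarray(image_db, dtype=float) - np.max(image_db)
    out = img.copy()
    out[out < threshold_db] = np.nan  # removed from the image
    return out
```

Raising the threshold removes more side lobe artifacts, but also removes weak echoes, which is exactly the trade-off illustrated in Fig. 4.10.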
Finally, the main characteristics of the sparse spiral array are put in relation to the previous dense uniform
rectangular waveguide array which also consists of 64 MA40S4S transducers and the identical electronic
hardware (Table 4.1). All values, except the field of view, are given for a steering angle of (0◦ , 0◦ ). At the cost of
a larger overall aperture, the MLW, angular resolution and MSLL of the spiral array are significantly improved
compared to the uniform rectangular approach. On the other hand, the waveguide array offers additional
protection of the transducers and a larger field of view due to the smaller effective element apertures.

Table 4.1: Main characteristics of the spiral and waveguide array [34].

                 Spiral array   Waveguide array   Remark
Aperture         ⌀200 mm        35 mm × 35 mm     [32]
TX SPL           148 dB         145 dB            [97] at 0.3 m
TX MLW           3◦             21◦               [32] far-field
RX MLW           3◦             12◦               [32] ”
TX MSLL          −16.1 dB       −7.2 dB           [32] ”
RX MSLL          −14.8 dB       −11.5 dB          [32] ”
Angular res.     2.3◦           12◦               [32] ”
Field of view    ±80◦           ±90◦              [32] ”

4.4.7 Relative amplitude and phase errors


In the previous sections, the differences between the measurements and the numerical simulation were
highlighted. One cause for the differences in the position and height of the sidelobes are the relative amplitude
and phase errors between the individual transducers. These errors are quantified in the transmit and receive
cases by conducting two measurements. For the transmit mode, the calibrated microphone [Fig. 4.2(a)]
is placed at a distance of 2 m in front of the array (0◦ , 0◦ ). One after the other, each of the 64 transducers
transmits a pulse (40 kHz, 10 ms), received by the microphone. For each pulse, the amplitude and phase angle
is evaluated. The phase angle is adjusted depending on the transducer position in order to compensate phase
shifts due to different distances between the microphone and the individual transducers. The measurement
is repeated 20 times for each transducer to determine the mean and standard deviation for the respective
amplitude and phase angle. Subsequently, the averaged amplitudes and phase angles of the individual
transducers are normalized to the respective average values of all transducers. For the receive mode, the same
procedure is used but the microphone is replaced with the ultrasonic transducer [Fig. 4.2(b)] sending pulses
(40 kHz, 10 ms) to the array in receive-only mode. Here, all 64 transducers receive the signal simultaneously
instead of one after the other.
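The normalization described above can be sketched as follows (an illustrative helper; the sign convention of the phase correction and the speed of sound are assumptions):

```python
import numpy as np

C_AIR = 343.0  # speed of sound in m/s (assumed)

def relative_errors(amplitudes, phases_rad, positions, target, f=40e3):
    """Per-element amplitude/phase normalized to the array mean.
    Phases are first corrected for the path-length differences
    between the target and the individual transducers."""
    d = np.linalg.norm(positions - target, axis=1)
    k = 2 * np.pi * f / C_AIR
    corrected = phases_rad + k * d  # undo the distance-dependent phase lag
    rel_amp = amplitudes / np.mean(amplitudes)
    mean_phase = np.angle(np.mean(np.exp(1j * corrected)))
    rel_phase = np.angle(np.exp(1j * (corrected - mean_phase)))  # wrapped deviation
    return rel_amp, rel_phase
```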
The relative amplitudes are scattered in a range of approximately ±50% to the mean value, both for the
TX and RX mode (Fig. 4.11). The small differences in distance from the target to the respective transducers

Figure 4.11: Relative amplitude (a) and phase errors (b) of the sparse spiral PUT array in TX (left) and RX
mode (right), respectively, normalized to the absolute mean amplitude and phase of all elements.
The errors are due to manufacturing tolerances of the transducers and, thus, are independent of
the element positions within the array geometry. Strong transmitters are not necessarily strong
receivers [35].

do not significantly influence the relative amplitudes, as confirmed by the random values independent of
the element position. The relative amplitudes of the respective transducers in the TX and RX cases are only
partially correlated with a correlation coefficient of rxy = 0.4. In other words, only a few strong transmitters
are also strong receivers.
The relative phases are spread in a range of approximately ±60◦ (TX) and ±50◦ (RX). After correcting the
phase depending on the distance differences to the individual transducers, the relative phases have no position
dependency. There is only a partial correlation between the relative phases of the TX and RX mode with a
correlation coefficient of rxy = 0.49. Therefore, only a small number of transducers show a highly deviating
phase in both TX and RX mode.
The cause of these partially strong errors of the relative amplitudes and phases are manufacturing tolerances,
influencing the individual spring-mass damper systems of the transducers and thus the frequency and phase
response (Fig. 4.12). Due to the narrow band characteristics of the piezoelectric transducers, a small deviation
in resonant frequency can result in large amplitude deviations when excited at their nominal frequency of
40 kHz. A shift of the resonant frequency can additionally cause large phase deviations, since the phase
response is particularly steep near the resonant frequency. The differences in errors between the transmit and
receive modes, despite using the same transducers, are due to their two different resonant frequencies, i.e. the
series resonance (TX) and parallel resonance (RX), the latter being approximately 3 kHz higher.
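Using the equivalent-circuit values given in Fig. 4.12, the two resonances can be estimated with the standard lossless Butterworth-van-Dyke relations (a sketch; component values as stated for the MA40S4S, Rm omitted since it does not enter the lossless resonance formulas):

```python
import math

# Equivalent-circuit values as stated for the MA40S4S (Fig. 4.12)
Lm, Cm, Cel = 48e-3, 330e-12, 2150e-12

f_series = 1 / (2 * math.pi * math.sqrt(Lm * Cm))  # TX resonance, ~40 kHz
f_parallel = 1 / (2 * math.pi * math.sqrt(Lm * Cm * Cel / (Cm + Cel)))  # RX resonance, ~43 kHz
print(round((f_parallel - f_series) / 1e3, 1))  # -> 3.0 (kHz), matching the stated offset
```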

Figure 4.12: PUTs are modeled with the Butterworth-van-Dyke equivalent circuit (a) (MA40S4S: Rm = 340 Ω,
Lm = 48 mH, Cm = 330 pF, Cel = 2150 pF) to obtain the narrow-band amplitude and phase system
response (b), (c). Due to the narrow bandwidth, even small resonant frequency alterations will
cause high amplitude and phase errors in an array configuration, as measured in Fig. 4.11 [35].

While these errors are not a major concern in a single-transducer application, they are quite significant
in a phased array configuration, where ideally all elements have the same phase and amplitude response
at their nominal frequency. Consequently, the beam pattern of real-world PUT arrays is degraded, i.e. the
maximum (MSLL) and mean (SLL) side lobe levels increase, the main lobe width (MLW) widens and the array
gain (A) is reduced (Fig. 4.13).
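This degradation can be reproduced qualitatively with a small Monte-Carlo-style sketch, using a half-wavelength uniform line array as a stand-in geometry (not the spiral array) and the deviation limits of approximately ±50 % and ±60° observed above:

```python
import numpy as np

rng = np.random.default_rng(7)
c, f = 343.0, 40e3
lam = c / f
N = 64
# Stand-in geometry: half-wavelength uniform line array (not the spiral array)
x = (np.arange(N) - (N - 1) / 2) * lam / 2
u = np.linspace(-1.0, 1.0, 4001)  # u = sin(theta), broadside steering

def pattern_db(amp, phase_rad):
    # Far-field array factor, normalized to its maximum
    w = amp * np.exp(1j * phase_rad)
    p = np.abs(np.exp(1j * (2 * np.pi / lam) * np.outer(u, x)) @ w)
    return 20 * np.log10(np.maximum(p, 1e-12 * p.max()) / p.max())

ideal = pattern_db(np.ones(N), np.zeros(N))
noisy = pattern_db(1 + rng.uniform(-0.5, 0.5, N),        # +/-50 % amplitude errors
                   np.deg2rad(rng.uniform(-60, 60, N)))  # +/-60 deg phase errors

side = np.abs(u) > 0.04  # crude exclusion of the main lobe for this geometry
msll_ideal = ideal[side].max()
sll_ideal = 10 * np.log10(np.mean(10 ** (ideal[side] / 10)))  # mean side lobe level
sll_noisy = 10 * np.log10(np.mean(10 ** (noisy[side] / 10)))
```

With the random errors applied, the mean side lobe level rises noticeably above that of the error-free pattern, mirroring the trend shown in Fig. 4.13.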

[Figure 4.13(a)–(c): simulated one-way beam patterns in uv-coordinates (magnitude in dBmax) with the quality parameters A, MLW, MSLL and SLL annotated in the horizontal sectional view.]
Figure 4.13: The ideal beam pattern without errors (a). The degraded beam pattern with amplitude and phase
errors within deviation limits of ±0.5 and ±60°, respectively (b). Horizontal sectional view of
both beam patterns (c) [35].

Previous studies observed similar errors in antenna arrays for automotive radar systems and proposed a
self-calibration method for compensation [122]. Other related works investigated low-power micromachined
PUT arrays and counteracted resonant frequency mismatches by DC-bias tuning and calibration [123]–[125].
Furthermore, ultrasonic arrays for NDT imaging were examined concerning the effects of element failure,
sensitivity and phase errors, as well as their calibration using the reflection of a back wall [126]–[129].
The next section focuses on analyzing the impact of the amplitude and phase errors on the beamforming
performance specifically for the sparse spiral PUT array, although the analysis is also applicable to arbitrary
array geometries using other technologies. Furthermore, an experimental error calibration approach is
presented and the resulting improvements are highlighted.

4.5 Calibration of relative amplitude and phase errors


First, the expected adverse effects of the amplitude and phase errors for different deviation limits on the
transmit beam pattern are numerically simulated and quantified using Monte-Carlo (MC) simulations. Second,
a free-field method to measure and counteract the deviations by calibration and pre-selection of elements
is presented. Acknowledgements are given to my student Tobias Frey, who supported the simulations and
experiments as part of his bachelor thesis [130].

4.5.1 Monte-Carlo simulation, results and discussion


Based on MC simulations, the impact of amplitude or phase errors is quantified to clarify how the most
important parameters of the beam patterns are affected. For this, sets of different amplitude and phase
deviation limits are defined within which the errors are allowed to vary. The amplitude deviation limits range
from ±0 % to ±100 % in steps of ±10 %. The phase deviation limits are between ±0° and ±180° in steps of
±10°. First, the amplitude and phase errors are examined separately. For each deviation limit, N = 10⁵ arrays
are generated, whose elements are assigned uniformly distributed randomized phase or amplitude errors
within the respective limits. The element positions are fixed according to the sparse spiral geometry
[Fig. 4.12(d)] [34]. The far-field one-way broadside beam pattern of each array is formed, assuming
point-source elements. The corresponding model is described in detail in Section 2.2.1. From each beam
pattern, the quality parameters SLL, MSLL, array gain A and MLW are automatically extracted and the
differences to the ideal array without errors are highlighted (Fig. 4.14). For the previously measured
real-world values of the phase and amplitude deviation limits of PUTs (±60° and ±0.5), a combined simulation
is performed in which both errors occur simultaneously (Table 4.2).
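The core of one such MC run can be sketched in a few lines of Python. The random element layout below is only a stand-in for the spiral geometry, and only the broadside array-gain loss for a single phase deviation limit of ±60° is evaluated; the full simulation additionally extracts SLL, MSLL and MLW from every trial pattern.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in sparse layout (the thesis uses the spiral geometry [34]):
# 64 random element positions, in wavelengths, within a 5-lambda aperture.
n_elem = 64
pos = rng.uniform(-2.5, 2.5, size=(n_elem, 2))

u = np.linspace(-1.0, 1.0, 201)                  # uv direction grid
U, V = np.meshgrid(u, u)
steer = np.exp(2j * np.pi * (pos[:, 0, None, None] * U
                             + pos[:, 1, None, None] * V))

def pattern(weights):
    """One-way far-field beam pattern, normalized to the ideal maximum."""
    return np.abs(np.tensordot(weights, steer, axes=1)) / n_elem

# Monte-Carlo: uniform phase errors within +/-60 deg, record the loss of
# the broadside response (pattern value at u = v = 0, grid index 100).
limit = np.deg2rad(60)
gains = [pattern(np.exp(1j * rng.uniform(-limit, limit, n_elem)))[100, 100]
         for _ in range(100)]
loss_db = 20 * np.log10(np.mean(gains))
print(f"mean broadside gain loss: {loss_db:.2f} dB")  # approx. -1.6 dB
```

The loss of roughly −1.6 dB matches the expectation E[e^(jφ)] = sin(π/3)/(π/3) ≈ 0.83 for uniform phases within ±60° and is of the same order as the ΔA values in Table 4.2.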
The results show that the phase errors have a more significant impact on the beam pattern than the
amplitude errors. The latter do not affect the MLW significantly and can cause the array gain to either
decrease or increase, as expected. The highest possible array gain is 6 dB above that of the ideal array,
whereas the worst-case array gain is zero if all elements are non-operational. However, these best and worst
cases were not included in the set of randomized arrays. Particularly interesting is that the MSLL can be
lowered by amplitude errors. In these cases, the amplitude errors are likely to reduce the amplitude of
peripheral elements, similar to array apodization, resulting in an MSLL reduction. The worst-case SLL doubles
at a phase deviation limit of approx. ±85°, whereas the array gain A halves. The MLW is only slightly affected
up to more extreme phase deviation limits of approx. ±100°. Both error types have the most negative impact
on the MSLL.

[Figure 4.14(a1)–(b4): minimum, mean and maximum ΔSLL, ΔMSLL, ΔA and ΔMLW plotted over the amplitude deviation limits (left column) and the phase deviation limits (right column).]

Figure 4.14: Results of the MC simulations showing the change of the beam pattern parameters compared
to the ideal array for either multiple amplitude or phase deviation limits [35].

Table 4.2: Worst-case MC simulation results for the PUT-typical combined amplitude (p̂) and phase (φ)
deviation limits (±0.5, ±60°) [35].

           p̂ dev. only          φ dev. only          φ and p̂ dev.
  ΔSLL     0.47 dB (+5.6 %)     1.65 dB (+20.9 %)    2.37 dB (+31.4 %)
  ΔMLW     0.2°                 0.2°                 0.3°
  ΔA      −1.64 dB (−17.2 %)   −1.72 dB (−18 %)     −2.71 dB (−26.8 %)
  ΔMSLL    2.2 dB (+28.8 %)     4.36 dB (+65.2 %)    4.9 dB (+75.8 %)

4.5.2 Experimental calibration, results and discussion
Next, the relative amplitude and phase errors are counteracted by three techniques. First, before assembling
the array, the individual transducers can be characterized and pre-selected for similar amplitude and phase.
Second, a post-correction, i.e. calibration, of the characterized phase and amplitude errors can be applied,
using channel-dependent phase shifts and gain factors, respectively. Last, an individual tuning network of
additional electronic components can be added for each transducer channel, although this is time-consuming,
expensive and difficult to realize due to the high number of channels. Therefore, the focus is on the first two
methods, i.e. the pre-selection of elements and calibration. The goal is to compare the beam pattern before
and after the correction techniques and highlight the improvements.
For the calibration, the relative amplitudes and phases of the elements must be measured and compensated
afterwards by individually modifying the signals of the elements. While the compensation gains and phases
can easily be implemented in digital signal processing for the receive mode, the gain compensation for the
transmit mode in particular is challenging. The reason is that the excitation signals are most commonly
created as digital square-wave signals, which are amplified by driving circuits, e.g. H-bridges, to a common
higher voltage level. As a result, compensating phase errors by individually delaying each element's signal is
simple, but compensating amplitudes by providing individual voltage levels is typically not supported by most
array electronics.
Therefore, individual pulse width modulation (PWM) is applied, which enables the reduction of the
amplitude of the output pulse but additionally affects the phase of the signal, which must be considered
for calibration (Fig. 4.15). Since the amplitude can only be corrected downwards, the weakest element
determines the target value. Hence, a pre-selection of elements is performed to filter out weak and defective
transducers, as well as to reduce the amplitude deviation before calibration. For this, a modified array PCB
is used, in which the elements can easily be plugged in and out, enabling the amplitudes of multiple batches
of transducers to be measured sequentially before the final selection.
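For an ideal rectangular pulse of duty cycle d, the Fourier series predicts both trends of Fig. 4.15: the 40-kHz fundamental scales with sin(πd) and is advanced by 90°·(1 − 2d) relative to the 50 %-duty-cycle reference. The Python sketch below inverts this first-order model to select a duty cycle and the phase to compensate; in practice, the measured per-element curves are used instead of the model.

```python
import numpy as np

def pwm_correction(target_amp, full_amp=1.0):
    """Select duty cycle and phase compensation from the ideal PWM model.

    The fundamental of a rectangular pulse with duty cycle d has an
    amplitude proportional to sin(pi * d) and an additional phase of
    pi * (0.5 - d) relative to the 50 % reference. First-order model
    only; real transducers deviate from it (see Fig. 4.15).
    """
    ratio = np.clip(target_amp / full_amp, 0.0, 1.0)
    duty = np.arcsin(ratio) / np.pi                 # inverts sin(pi*d), d <= 0.5
    phase_shift = np.degrees(np.pi * (0.5 - duty))  # to remove via time delay
    return duty, phase_shift

# Example: reduce one element to 80 % of its full fundamental amplitude.
d, dphi = pwm_correction(0.8)
print(f"duty cycle {d:.3f}, phase to compensate {dphi:.1f} deg")
```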

[Figure 4.15(a), (b): normalized amplitude and normalized phase of five example transducers over duty cycles from 0 to 0.5.]

Figure 4.15: Normalized duty-cycle-dependent amplitude and phase of five example transducers. The PWM
enables the reduction of the transmit amplitude without requiring the array electronics to provide
different voltage levels for each transducer (a). However, PWM results in an additional phase
alteration which must be considered for calibration (b) [35].

In general, the relative amplitudes and phases are measured in the free field to avoid feedback effects on the
system response of the elements. Each element sequentially transmits a pulse excited by a 40-kHz square-wave
burst with a length of 1 ms to avoid significant self-heating. The small path differences between the elements
and the microphone do not have a major influence on the received amplitude (Section 2.2.2). In contrast, the
phases of the received signals must be corrected accordingly to obtain only the phase errors caused by the
elements themselves.
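The geometric part of the measured phase is 2πf·r_k/c for path length r_k, and removing it isolates the element-induced phase error. A minimal sketch with hypothetical element and microphone positions and example raw phases:

```python
import numpy as np

c, f = 343.0, 40e3                              # speed of sound, frequency
mic = np.array([0.0, 0.0, 0.5])                 # microphone 0.5 m on-axis
elems = np.array([[0.00, 0.00, 0.0],            # hypothetical element
                  [0.02, 0.00, 0.0],            # positions in the
                  [0.00, 0.03, 0.0]])           # array plane (m)

r = np.linalg.norm(elems - mic, axis=1)         # individual path lengths
geo_phase = np.degrees(2 * np.pi * f * r / c) % 360

measured = np.array([210.0, 215.0, 230.0])      # example raw phases (deg)
element_phase = (measured - geo_phase) % 360    # element errors only
print(np.round(element_phase, 1))
```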
The calibration is performed using the pre-selected array and consists of three steps. First, the relative
amplitudes and phases are measured for multiple duty cycles (Fig. 4.15). The weakest element excited with a
duty cycle of 50 % determines the target amplitude. The PWM signals required to match the target amplitude
are selected accordingly for each element. Second, the relative phase measurement is repeated, but each
element is excited with its corresponding phase-altering PWM signal. This way, the phase errors are
obtained when using the amplitude-corrected pulses.

[Figure 4.16(a), (b): relative amplitude and relative phase over the transducer index, before and after calibration.]

Figure 4.16: The relative amplitude (a) and phase (b) errors before and after the calibration highlight the
effective reduction of the deviations [35].

[Figure 4.17(a)–(d): measured beam patterns in uv-coordinates (magnitude in dBmax) and their horizontal sections for the ideal, non-calibrated and calibrated cases.]

Figure 4.17: Comparison of the ideal beam pattern (a) to the non-calibrated case (b) and the beam pattern
of the calibrated array (c), as well as their corresponding horizontal sections (d). Overall, the
calibration leads to an improved MSLL and SLL [35].

Third, the final measurement of the jointly corrected relative amplitude and phase is conducted using
individually time-delayed PWM excitation signals. In order to assess the improvements, the beam pattern is
measured before and after the calibration by using all elements simultaneously for steering the beam within
a field of view of ±90° (±1 in uv-coordinates) (Fig. 4.17).
After calibration, the deviations of the relative amplitude and phase errors are reduced from ±15 % and
±62° down to ±5 % and ±8°, respectively (Fig. 4.16), proving the effectiveness of the compensated excitation.
The beam pattern of the fully calibrated array is in excellent agreement with the ideal pattern and shows only
minor increases in SLL (+10.6 %) and MSLL (+14 %) (Fig. 4.17, Table 4.3). However, due to the downward
correction of the amplitude, the array gain A is not increased compared to the uncalibrated array, but
decreased. Overall, the relative improvements over the uncalibrated case are rather small for the example
array presented, since the original beam pattern of the pre-selected array showed decent results from the
beginning. Nevertheless, based on the simulation results, the calibration is more significant if a less beneficial
combination of elements is selected, e.g. as in Fig. 4.13.

Table 4.3: Experimental calibration results of the example PUT array (64 pre-selected transducers) [35].

             no cal.             φ cal. only         p̂ cal. only          φ and p̂ cal.
  ΔA*        0 dB (0 %)          0.62 dB (+7.4 %)   −1.64 dB (−17.1 %)   −0.92 dB (−10 %)
  ΔSLL**     1.72 dB (+21.9 %)   1.4 dB (+17.4 %)    1.57 dB (+14.8 %)    0.88 dB (+10.6 %)
  ΔMSLL**    2.15 dB (+28 %)     1.16 dB (+14.3 %)   1.75 dB (+22.2 %)    1.15 dB (+14 %)

  *referenced to the non-calibrated array, **referenced to the ideal array

4.6 Chapter summary and conclusions


This chapter covered the design, evaluation and calibration of an air-coupled spiral PUT phased array system
capable of transmit, receive, and pulse-echo operation, as well as 3D imaging. The large-aperture spiral
geometry provides grating-lobe-free, high-accuracy and high-precision beamforming in all operation modes
and is thus valuable for many ultrasonic applications. In particular, in-air imaging benefits from these features
as they allow unambiguous, high-resolution images to be generated without requiring an increased system
complexity or computational load. The PUTs in conjunction with the SLA imaging technique and the resulting
array gains enable a high range of view, so that a large volume can be imaged, spanning the near- and
far-field. Even object shapes and patterns of multiple reflectors are recognizable in the images generated,
opening up further possibilities, e.g. ultrasonic object classification in harsh environments using deep learning
techniques. A more uniform transducer directivity can improve the beamforming and imaging performance,
as it increases the field of view, the accuracy and relative dynamic range for high steering angles. Furthermore,
a compensation of the typical phase and amplitude errors of PUT arrays by calibration and pre-selection of the
elements can result in significantly improved beam pattern characteristics, e.g. the MSLL. In particular, a
phase error calibration is recommended due to the high impact on the beam pattern, whereas the calibration
of the amplitude error is advisable if the application can sacrifice array gain and benefits from a close-to-ideal
beam pattern, e.g. for super resolution imaging techniques. If a pre-selection of array elements is conducted,
the transmit amplitude is suggested to be the selection criterion. There are open questions for future work
concerning the variation of error values due to steering direction, ambient temperature, self-heating and aging.
A self-calibration procedure is likely to play a major role to ensure a long-term error correction. Chapter 6
covers the improvement of the frame rate by using the MLA imaging technique, for which reducing the MSLL
is mandatory to prevent severe side lobe artifacts in scenes with multiple reflectors of varying strengths.

5 Embedded low-cost sonar system concepts

Parts of this chapter have been published in


[36] ”Embedded Air-coupled Ultrasonic 3D Sonar System with GPU Acceleration”,
in Proc. IEEE Sensors Conference, 2020, and
[131] ”Single Microcontroller Air-coupled Waveguided Ultrasonic Sonar System”,
in Proc. IEEE Sensors Conference, 2021.

The previous two chapters focused on PUT arrays, in which the transceiver elements can be used for
high-intensity pulse transmission as well as for signal reception, which brings advantages but also a number
of drawbacks. The array design involving transceivers is primarily required for the SLA method, which utilizes
two-way beamforming, enabling favorable long-range, high-contrast, and high-resolution imaging, but
providing only slow frame rates. The transceiver electronics required for this method lead to high costs,
considerable space requirements and a high power consumption due to the large number of components
for the transmit and receive channels, including the TR switches. Other limitations of PUT elements are
the aforementioned amplitude and phase errors (Section 4.5) due to the narrow bandwidth, resulting in
a significant impact of manufacturing tolerances on the imaging quality. In addition, the mismatch of the
series and parallel resonance frequencies in conjunction with the narrow bandwidth causes PUTs to be either
particularly effective transmitters or receivers; as transceivers, they severely attenuate the signals in at least
one of the two cases.
In order to effectively utilize ultrasonic phased arrays as a perception sensor, e.g. for mobile autonomous
robots, providing sufficient frame rates is crucial to enable quick reactions to dynamically changing environ-
ments, such as emerging obstacles. Therefore, these use cases prioritize the MLA imaging method, in which
transmit beamforming is discarded and, instead, only a single PUT illuminates the scene, whereas all array
elements receive the echoes. As a result, the high number of transmit channels and TR switches becomes
obsolete, making it possible to fulfill further key optimization goals of viable perception sensors, namely the
reduction of cost, size, and power consumption. Further downscaling is enabled by advances in ultrasonic
MEMS technology, specifically the development of low-cost, low-profile and low-power microphones, which
are optimized for receiving. This chapter therefore focuses on viable sonar system concepts that utilize hybrid
transducer technologies in order to bring together the advantages of transmission with PUTs and reception
with MEMS microphone arrays, while requiring only a minimum number of extra electronic components.
In addition, the implementation of embedded 3D imaging is addressed, which enables the ultrasonic sonar
system to be used for hardware-constrained mobile applications as well, first, by using a GPU-accelerated
single-board computer and, second, by using only a single microcontroller. Before covering the specific
system designs and experimental evaluation, the proposed concepts are motivated by highlighting related
work on ultrasound-based perception systems and their limitations.

5.1 Related work on air-coupled ultrasonic sonar systems


The key to ensure robust environmental perception for a wide range of conditions is to combine the strengths
and weaknesses of different sensor modalities [132]. Therefore, studies investigating autonomous ground
and air vehicles do not rely solely on optical sensors, which operate poorly in certain visibility conditions
[133] and in the presence of optically transparent and reflective objects [134], as typically found in modern
buildings. As a complement, ultrasonic sensor systems are added, which are unaffected by these conditions,
e.g. in [135]–[138].
In these studies, multiple 1D ultrasonic range finders are used, which are mounted all around the autonomous
vehicle, each covering a specific directional region; this arrangement is also referred to as a sonar ring
[139]–[141]. While a sonar ring is a straightforward approach for realizing basic obstacle avoidance, there
are several limitations. First, a single 1D range finder is unable to extract the direction of an impinging echo,
but only its distance. Therefore, a differentiation of multiple scatterers is impossible if they are positioned at
a similar distance but in different directions within the field of view of the transducer, such that azimuth and
elevation information is lost and there is no angular resolution capability. However, particularly for agile
vehicles, such as mobile robots, successfully finding a valid path to the target location depends heavily on the
angular resolution available. Consequently, in order to provide precise direction information, the sonar ring
must ideally consist of a high number of ultrasonic transducers, each with a narrow beam width, i.e. a large
aperture, requiring a lot of mounting space for covering a wide field of view. Furthermore, this technique
suffers from mutual interference between the single transducers, which can result in false detections.
A second type of sonar system relies on triangulation using one ultrasonic transmitter and three to five
receivers [142]–[146]. Here, the direction of objects is calculated directly from the relative phase shifts of
the echo signals based on the corresponding geometric relations, thus requiring low computational effort
and few components. Based on this technique, a commercial sensor has been released by the start-up
company Toposens [147]. While phase triangulation in conjunction with the time-of-flight principle can
provide azimuth, elevation and distance information, so that the corresponding sensors are attributed with 3D
capability, there is a drastic limitation. As for 1D range finders, the localization fails if two or more scatterers
are positioned at the same or a similar distance but in different directions, i.e. when the echo signals overlap.
In this case, the phase shifts of the received signals are the result of multiple overlapping phases, so the direct
geometric calculations do not represent the individual originating directions. This way, the calculated direction
always lies between the two true reflector directions. Therefore, in many real-world situations, e.g. gateways,
door frames, corridors, parking spots, etc., the directional information obtained is not reliable, since there is
no angular resolution capability provided with this technique either.
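This failure mode can be illustrated with two phasors: for two equally strong overlapping echoes, the phase of the superposition lies exactly halfway between the individual phases, so the triangulated direction points between the reflectors instead of at either one. The phase-shift values below are arbitrary examples.

```python
import numpy as np

# Two equally strong echoes whose inter-receiver phase shifts are
# +40 deg and -20 deg; the receiver only observes their sum.
p1, p2 = np.deg2rad(40.0), np.deg2rad(-20.0)
combined = np.angle(np.exp(1j * p1) + np.exp(1j * p2))
print(np.degrees(combined))  # approx. 10 deg: halfway, matching neither echo
```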
The third technique, utilized in related work and also covered in this thesis, is based on beamforming
using phased arrays of microphones and one or more transmitters as investigated in [85]–[87], [148]–[152].
While most of these studies maintain a reasonable size and provide angular resolution capability for 2D or 3D
localization of multiple objects with a wide field of view, the high number of components and computational
effort involved make a cost-effective and viable integration challenging. Therefore, two minimalistic embedded
sonar system concepts are introduced hereafter.

5.2 Embedded GPU-accelerated 3D sonar system with hexagonal aperture


The first system considered focuses on the formation of 3D images and therefore uses a 2D aperture, this
time with a hexagonal outline. Furthermore, the goal is to generate the images at a high frame rate, so the
MLA technique with a single-pulse illumination of the complete hemisphere is used. In order to quickly
evaluate the high number of scan lines of 3D images and still provide viability for hardware- and
size-constrained mobile applications, a small-scale single-board computer with an integrated GPU is used.
Another distinctive feature is the optional wireless communication between the sonar head and the signal
processor. The details of the electronics, control and system are described in the following.

[Figure 5.1(a)–(e): sonar head (⌀ 42 mm, height 37 mm) with the MA40S4S transducer and the 36 SPH0641LU4H-1 microphones on a λ/2 grid.]

Figure 5.1: The modular sonar head includes a stack of four PCBs containing 36 microphones (a), the driving
electronics of the ultrasonic transducer (b), an FPGA for signal acquisition and conversion (c),
and communication via WiFi or alternatively via Hi-Speed USB (d). This design simplifies the
evaluation of new modules, but increases the overall size [36].

5.2.1 Electronics and system design


The two primary components of the system consist of the sonar head for pulse transmission and echo reception,
and the Nvidia Jetson Nano as the main signal processor (Fig. 5.2). The sonar head includes a low-cost
narrow-band ultrasonic transducer (Murata MA40S4S) with an aperture diameter of 10 mm, driven with a
bipolar square-wave signal (±10 V, 40 kHz). The driving signal is generated using a single H-bridge MOSFET
driver (Analog Devices ADP3630), which is controlled by an FPGA (Intel MAX10) and supplied by a 5 V-to-10 V
step up converter (Analog Devices ADP1613). The bipolar excitation signal of the H-bridge ensures that a 10 V
supply voltage is sufficient to reach the nominal maximum peak-to-peak voltage of the transducer of 20 Vpp .

[Figure 5.2: block diagram comprising the Jetson Nano, the MAX10 FPGA, the ADP1613 step-up converter, the ADP3630 driver with the MA40S4S transducer, the FT232H USB FIFO, the 36 SPH0641LU4H-1 microphones (4-MHz PDM), and the optional ESP32 Wrover WiFi variant (UART, PSRAM).]

Figure 5.2: Block diagram of the complete system. Since the microphone signals are digital and the trans-
ducer is driven with a simple square-wave signal (±10 V, 40 kHz), the sonar head requires a
minimum number of components [36].

For echo reception, an array of 36 MEMS microphones (Knowles SPH0641LU4H-1) is used, arranged on an
equilateral triangular λ/2 grid, forming a hexagonal aperture with a maximum dimension of 25.7 mm (Fig. 5.1).
Compared to a 36-element uniform rectangular array (URA), this arrangement has reduced side lobes in
the receive pattern of the conventional beamformer, whereas the main lobe width is increased by only 1°.
The microphones are small, low-power, and broad-band (10 Hz to 80 kHz). The integrated delta-sigma ADCs
directly provide digital pulse density modulated (PDM) signals (1 bit, 4 MSa/s), so additional components
are not required. In addition, each pair of microphones shares one data line by outputting the interleaved
data on the rising and falling clock edges, reducing FPGA pin usage.
The FPGA contains 36 sinc-3 cascaded integrator-comb (CIC) filters [153], required for converting and
downsampling the 1-bit 4-MHz PDM signals to 16-bit pulse code modulated (PCM) signals sampled at a rate
of 125 kSa/s. Apart from that, the FPGA includes the excitation signal synthesis and communication modules,
but no further signal conditioning.
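The arithmetic of such a sinc-3 CIC stage can be sketched in Python: three integrators at the PDM rate, decimation by R = 32 (4 MHz → 125 kHz), and three comb stages at the output rate. The DC gain of R³ = 32768 fits the 16-bit output word. This models the filter arithmetic only, not the FPGA implementation.

```python
import numpy as np

def cic3_decimate(pdm_bits, R=32):
    """Sinc-3 CIC decimator for a 1-bit PDM stream (values 0/1)."""
    x = np.asarray(pdm_bits, dtype=np.int64) * 2 - 1   # map bits to +/-1
    for _ in range(3):                                  # integrators (fast rate)
        x = np.cumsum(x)
    y = x[R - 1::R]                                     # decimate by R
    for _ in range(3):                                  # combs (slow rate)
        y = np.diff(y, prepend=0)
    return y                                            # DC gain: R**3

# A constant 75 % ones density settles to half of positive full scale:
pcm = cic3_decimate(np.tile([1, 1, 1, 0], 1024))
print(pcm[-1] / 32**3)  # -> 0.5
```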
There are two options for communication with the Jetson Nano. First, the FPGA is connected to a USB
FIFO IC (FTDI FT232H) via an 8-bit parallel bus, providing a transfer rate of 480 Mbit/s for bidirectional
communication. Since the array data rate (72 Mbit/s) is lower than the transfer rate, intermediate buffering is
not required, further reducing the component count. The second option features a microcontroller (MCU)
module (Espressif ESP32 Wrover-B) containing a WiFi-ready MCU and 8 MB of PSRAM. The FPGA is directly
connected to the memory-mapped PSRAM, transferring data via QSPI at a rate of 160 Mbit/s, whereas a
UART interface is used to receive control commands. Here, the bottleneck is the WiFi transfer rate of
14 Mbit/s on average using TCP/IP. Therefore, at the expense of frame rate, the sonar head is detachable
from the main processor, which is useful for space-constrained applications or for centralized processing of
multiple sensors.
The Nvidia Jetson Nano is a small-size single-board computer featuring a 1.4-GHz quad-core CPU and a
Maxwell GPU with 128 CUDA cores, enabling parallel signal processing. Here, a cross-platform Qt application
(C++) is executed for system control and for the signal processing required for image generation. The
optimized frequency-domain signal pre-processing and image formation, including the parallel tiled matrix
multiplications for receive beamforming with reduced processing times, are described in detail in Chapter 3.4.
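The reason the GPU pays off is that frequency-domain delay-and-sum over all scan lines collapses into a single matrix multiplication, which maps directly onto GEMM kernels. The sketch below uses a narrow-band approximation around the 40-kHz carrier and random stand-in microphone positions, directions and spectra; the actual implementation (Chapter 3.4) applies per-bin phase weights and tiled multiplications.

```python
import numpy as np

rng = np.random.default_rng(1)
n_mic, n_dir, n_bin = 36, 256, 1024
c, f = 343.0, 40e3                                 # speed of sound, carrier

mic_xy = rng.uniform(-0.013, 0.013, (n_mic, 2))    # stand-in aperture (m)
uv = rng.uniform(-0.7, 0.7, (n_dir, 2))            # uv steering directions

# Narrow-band steering matrix: one complex weight per (direction, mic).
delays = mic_xy @ uv.T / c                         # (n_mic, n_dir), seconds
W = np.exp(-2j * np.pi * f * delays).T / n_mic     # (n_dir, n_mic)

# One GEMM produces all scan-line spectra at once.
X = rng.standard_normal((n_mic, n_bin)) + 1j * rng.standard_normal((n_mic, n_bin))
B = W @ X                                          # (n_dir, n_bin)
print(B.shape)
```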

[Figure 5.3: measurement setup with the sonar head and ⌀100 mm spheres, the rendered 3D scan, the horizontal B-scan, and the normalized amplitude over the azimuth θ.]

Figure 5.3: Setup for measuring the angular resolution (top left). The spacing between the spheres at a
range of 2 m is gradually increased until they are separately detectable. Rendered 3D-Scan of the
scene (bottom left). Corresponding horizontal sectional view (B-Scan) (middle) and the sectional
view at 2.1 m (right). The echoes of the spheres create two distinct local maxima, thus they are
separately detectable [36].

5.2.2 Experiments, results and discussion


The system’s range of view, field of view, angular resolution, and the achievable frame rates are measured in
an anechoic chamber. One or two hollow steel spheres (⌀100 mm) are used as targets, mounted on a movable
slide. The sonar head is attached to two rotational axes at the end of the rail, so the targets can be positioned
freely in the coordinate system (Fig. 5.3).
In order to measure the range of view, one sphere is positioned directly in front of the sonar head (0°, 0°),
and the distance is gradually increased. The distances at which the echo signal falls below the noise level,
i.e. is not detectable, determine the minimum and maximum detection distances, which are 0.17 m and 5.5 m,
respectively. The field of view is determined using the same procedure, except that the distance is fixed at
2 m and the target's azimuth angle is gradually increased. The sphere is detectable within a field of view of
±80°, mainly limited by the directivity of the transducer.
The angular resolution is determined by the minimum angle between two objects required to detect them
separately. For separation, the corresponding echoes must create two distinct local maxima in the image.
For measuring the angular resolution, two spheres are positioned at a range of 2 m directly in front of the
sonar head, and the spacing between them is increased gradually. With a spacing greater than 440 mm, the
spheres are separately detectable, corresponding to an angular resolution of 14°. The measured value is in
good agreement with the value simulated using an analytical model (13°).
The frame rate is determined by measuring and averaging (32×) the time between two completely generated
frames. The total time is mainly composed of the sound travel, data transfer and processing times. Using a
fixed viewing range of 3 m (2048 samples), the frame rates over the number of scanning directions are
compared for the WiFi and USB variants (Fig. 5.4). The USB variant is clearly superior due to the higher data
transfer speed, thus allowing faster movements.
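An upper-bound estimate of the achievable frame rates follows directly from the numbers stated above: the acquisition window of 2048 samples at 125 kSa/s plus the serialized transfer of 36 16-bit channels. Processing time and pipelining are neglected, so the measured rates in Fig. 5.4 are necessarily lower.

```python
n_mic, bits, fs = 36, 16, 125e3       # PCM channels, word length, sample rate
n_samples = 2048                      # fixed 3-m viewing range

acq_time = n_samples / fs                     # echo acquisition window (s)
frame_bits = n_mic * bits * n_samples         # raw data per frame

fps = {}
for name, rate in {"USB": 480e6, "WiFi": 14e6}.items():
    transfer = frame_bits / rate              # serialized transfer time (s)
    fps[name] = 1.0 / (acq_time + transfer)   # upper bound, no processing
    print(f"{name}: transfer {transfer * 1e3:5.1f} ms, <= {fps[name]:.1f} Hz")
```

Under this model, the WiFi variant caps near 10 Hz, consistent with its plateau in Fig. 5.4, whereas the USB bound is far higher, so processing dominates there.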

[Figure 5.4: measured frame rate over the number of scanning directions (10¹ to 10³) for the USB and WiFi variants.]

Figure 5.4: The USB variant is limited by the data transfer rate up to 240 directions. The frame rate drops
if the processing time is higher than the data transfer time, as the parallelization of the data
acquisition and signal processing becomes out of sync. With increasing scan directions the
achievable frame rate decreases for both variants. The WiFi variant is limited by the data transfer
rate for all numbers of scan directions [36].

5.3 Single-microcontroller waveguided sonar system based on a T-aperture


Motivated by the fact that in particular low-height autonomous mobile ground vehicles do not necessarily
benefit from the high-frame-rate 3D perception capability, the goal of the second sonar concept is to further
minimize the cost, size and power consumption of the overall system, specifically by eliminating the necessity
for an additional single-board computer. Instead, the signal processing is directly handled by a single
off-the-shelf microcontroller, which also controls the remaining system components. In order to realize the
sonar system under these severely limited hardware resources, a key factor is the use of two perpendicularly
arranged ULAs, one for pulse transmission using waveguided PUTs and one for echo reception using MEMS
microphones, which together form a T-array, also referred to as a row-column array. The design presented
thereby unifies the previous concepts of the waveguided array (Chapter 3) and the hybrid hexagonal
array (Section 5.2). The T-arrangement enables the formation of 2D scans with an increased range of view
and high frame rates, whereas the number of microphones required, and thus the number of signals to be
processed, is significantly reduced. Nevertheless, 3D images can still be generated, although they require
multiple pulses and are therefore only provided at lower frame rates. Acknowledgements are given to my
student Tim Maier, who contributed a large part to the realization of the concept in the course of his bachelor
thesis [37] and presented the project at the IEEE Sensors conference [131]. Additional thanks go to my
colleague Matthias Rutsch, who realized and analyzed the waveguide, which is an essential core component
of the system [154]. The specific operation, electronics and system design details are described hereafter.

[Figure 5.5: the main PCB (62 mm × 62 mm), the transducer PCB with the MA40S4S elements, the 3D-printed waveguide (20 mm) and the λ/2-spaced microphones.]

Figure 5.5: The sonar system consists of three custom PCBs and a 3D-printed waveguide. The transducer
PCB is equipped with eight Murata MA40S4S ultrasonic 40-kHz transducers. In order to reduce
the pitch to λ/2, a waveguide is attached to the PCB, allowing grating-lobe-free beamforming. The
waveguide shown has been printed with an exposed section so that the internal ducts are visible.
Twelve digital MEMS microphones with an inter-element spacing of λ/2 are used to receive
the resulting echoes. The transducer and receiver arrays are arranged in a T-configuration,
which allows a 2D scan to be performed by transmitting only one pulse. The main board contains
an STM32 MCU, which not only controls the electronics but is also responsible for all the signal
processing [131].

5.3.1 Electronics and system design


The system consists of three custom PCBs and a 3D-printed waveguide (Fig. 5.6). The transducer PCB contains eight Murata MA40S4S transducers, driven with a unipolar rectangular signal (40 kHz, 20 Vpp) created by an eight-channel pulser IC (HV7355). The voltages required are provided by a step-up converter (LTC1871, 20 V) and a charge pump (LTC1983, -5 V). Since the transducers have a diameter of 10 mm, a 3D-printed waveguide is placed on top of the transducers to reduce the effective inter-element spacing to λ/2, thus enabling grating-lobe-free beamforming. The transmit characteristic of this ULA features a fan-shaped propagation of the sound waves (Fig. 5.7). Due to the array gain of the transducers, the viewing range is increased. Since all sound channels of the waveguide have the same length, the signals of all transducers are equally delayed, so no propagation time compensation between the individual transducers is required. Furthermore, the acoustic apertures of the output ports are reduced compared to the acoustic aperture of a single transducer, resulting in a larger field of view, as the directivity of a single element is widened [32].
The reflections caused by the emitted pulses are recorded by a ULA consisting of twelve digital broad-band
MEMS microphones (Knowles SPH0641LU4H-1) with an inter-element spacing of λ/2. The microphones have
an integrated sigma-delta ADC providing 1-bit pulse density modulated (PDM) signals with a sampling rate
of 4.76 MSa/s.

The receiver array is rotated by 90° relative to the transmitter array in a T-configuration, allowing a 2D scan of the environment to be performed with a single pulse transmission and a minimum number of components.
Each 1-bit PDM signal is downsampled and converted to a pulse code modulated (PCM) signal by a sinc-3 CIC filter integrated in the MCU on the main PCB [155]. Since the MCU only contains six of these filters, two microphones per filter are multiplexed, at the expense of a lower resolution (±343 levels ≙ 9.42 bit) and sampling rate (85.5 kSa/s). The MCU used is an STM32F413 with a maximum clock frequency of 100 MHz and 320 kB SRAM. Besides the conversion of the microphone signals and the control of the pulser, the MCU handles the entire signal processing.
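The sinc-3 CIC conversion can be illustrated with a short numerical sketch. This is not the actual DFSDM configuration of the STM32F413; the decimation ratio R is a free parameter here, and R = 7 is chosen only because the filter's DC gain of R³ then reproduces the ±343-count (≈ 9.42 bit) full scale mentioned above.

```python
import numpy as np

def cic_sinc3_decimate(pdm_bits, R):
    """Convert a +/-1 PDM stream to PCM with a 3rd-order CIC filter:
    three integrators at the input rate, decimation by R, then three
    combs (first differences) at the output rate. The DC gain is R**3."""
    x = np.asarray(pdm_bits, dtype=np.int64)
    for _ in range(3):          # integrator stages
        x = np.cumsum(x)
    y = x[R - 1::R]             # decimate by R
    for _ in range(3):          # comb stages
        y = np.diff(y, prepend=0)
    return y

# A constant +1 input settles to the DC gain R**3 = 343 counts.
out = cic_sinc3_decimate(np.ones(700, dtype=int), R=7)
```

In the real system, six such hardware filters run in parallel, each time-multiplexed between two microphones.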

[Figure 5.6: block diagram — STM32F413 MCU (data acquisition, signal processing, image formation, USB communication) controls the HV7355 8-channel pulser via SPI, driving the 8×1 MA40S4S PUT array with 20 Vpp; an LTC1983 charge pump (-5 V) and an LTC1871 step-up converter (20 V) supply the voltages; the 1×12 SPH0641LU4H-1 microphone array delivers 4.76 MHz 1-bit PDM data to the MCU]

Figure 5.6: Block diagram of the system. The MCU controls the pulser IC, samples the microphones and
handles the signal processing. Two converters are used to provide the voltages required for the
driving signal of the ultrasonic transducers [131].

5.3.2 Experiments, results and discussion


In the following, characteristic parameters of the sonar system are examined, including the cost, the processing time for a 2D scan, and the power consumption. Furthermore, the results of experiments conducted in an anechoic chamber are presented.
The price of one sonar system (as of 2021) is only between 70 € and 80 € for a production quantity of 100 units, making the system interesting for cost-sensitive applications. Another important aspect for these applications is low power consumption. The power consumed is approximately 250 mW while scanning continuously, although the electronics and software have not yet been optimized for power.
Despite the low processing power of the MCU, frame rates of up to 10.2 Hz are achieved. The total processing time is 98.6 ms, of which approximately 25% is required for pulsing and waiting for the echoes to arrive at the microphones. Due to the small amount of RAM, parallelizing the pulsing and signal processing is not possible with the current version. The FFT and IFFT calculations require a total processing time of 60 ms, whereas beamforming and filtering take up only 14 ms. If the output data is sent to a computer and plotted in a 2D scan, the frame rate drops to 3.8 Hz, mainly limited by the USB transmission speed of 12 Mbit/s.
Next, initial experiments with the system are conducted in an anechoic chamber, including measuring the
transmit characteristics, the field of view and the angular resolution. For this purpose, the sonar system is
attached to two rotational axes, such that the measurement objects can be positioned in arbitrary relative
directions to the sonar system (Fig. 5.7).
A field of view of ±70° has been measured by placing a 3D-printed corner reflector at a distance of 2 m on the linear axis and increasing the azimuth angle from -90° to 90°. Above 70°, no distinct maximum could be observed in the 2D scan.
In order to determine the transmit characteristics, a microphone is placed on the carriage of the linear axis
at a distance of 2 m in front of the array (0°, 0°). The vertical transmit characteristic is measured by varying
the elevation angle from −90° to 90°, with an azimuth angle of 0°. In order to measure the horizontal transmit characteristic, the azimuth angle is varied from −90° to 90°, with an elevation angle of 0°. The measurement
shows a half-power beamwidth (HPBW) of 14° (12.8° in the simulation) and a maximum sidelobe level
(MSL) of -8 dB (-12 dB in the simulation) (Fig. 5.7a). The deviations can be explained by varying amplitude
and phase responses of the individual transducers. Further simulations and measurements with different
transducer PCBs have shown that the transmit characteristic is strongly dependent on these variations due to
the small number of transducer elements used.

Figure 5.7: Measurement of the vertical and horizontal transmit characteristic (a) and the corresponding measurement setup (c). The fan-shaped transmit characteristic is visible. No grating lobes appear due to the waveguide, featuring an effective inter-element spacing of λ/2. The angular resolution is measured using two corner reflectors. The distance between the reflectors is gradually decreased until they can no longer be separated in the 2D scan (d) [131].

In order to measure the angular resolution, two corner reflectors are placed at a distance of 1.5 m in front of the sonar system. The distance between the objects is decreased until the echoes no longer create two distinct local maxima in the image, i.e., they cannot be separated anymore. With a minimum distance of 40 cm, the angular resolution is 15.2° (Fig. 5.7d). This can be further improved by increasing the number of directions considered during conventional beamforming, which requires more RAM.
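The reported value follows from elementary geometry: two reflectors separated laterally by a distance d at range r subtend an angle of 2·arctan(d/(2r)), which for the measured minimum distance reproduces the quoted resolution.

```python
import math

d = 0.40  # minimum separable reflector spacing (m)
r = 1.5   # range to the reflectors (m)

# Angle subtended by the two reflectors as seen from the array
theta = 2 * math.degrees(math.atan((d / 2) / r))
print(round(theta, 1))  # 15.2 (degrees)
```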

5.4 Chapter summary and conclusions


This chapter covered two embedded ultrasonic sonar system concepts, capable of localizing multiple objects in
3D with a wide field of view. The first system utilizes a detachable hexagonal 36-element 2D array geometry,
GPU acceleration and optimized frequency-domain signal processing to ensure sufficient 3D frame rates for
monitoring dynamically changing environments, e.g. due to motion. The high-frame-rate 3D capability comes
at the expense of requiring an additional single-board computer. Therefore, the second sonar system provides a solution aimed at applications where high-frame-rate 2D image formation is sufficient. By using waveguided PUTs and fan-beam scanning in transmit and receive mode, the number of required components and the computational effort are significantly reduced. Here, the signal processing, including the image formation, is
directly performed using an off-the-shelf microcontroller, enabling an easy integration into mobile autonomous
systems. Although the low-power microcontroller achieves 2D frame rates of up to 10 Hz, there are two
major weaknesses, i.e. the limited number of scanning directions due to the small amount of RAM and the
low sampling resolution caused by the mandatory signal multiplexing. These difficulties can be addressed
by a microcontroller upgrade (STM32H7A3) or an FPGA-based solution. Both concepts highly profit from
the MEMS microphone technology, as the microphones are compact, broad-band, almost unidirectional, low-cost, and provide already-digitized data. In conjunction with the narrow-band transducers, which are driven
by single-frequency square-wave signals, both sonar head electronics are almost purely digital, such that
only a minimum number of low-cost components are required, enabling a compact and low-power overall design. Therefore, these self-contained sonar systems can be viably integrated into, e.g., mobile autonomous ground and air vehicles for improving obstacle avoidance, mapping and navigation. Although the limited angular resolution is a downside of the compact system and its consequently small aperture, it is sufficient within the primary operating range of up to 3 m, e.g. to separately detect doorways for navigation. Nevertheless,
Chapter 7 investigates how the resulting images can be further improved in terms of angular resolution and
contrast despite the small aperture size by post-processing.

6 Two-scale multi-frequency sparse spiral arrays

Parts of this chapter are under review for publication in


“Two-Scale Sparse Spiral Array Design”,
IEEE Open Journal of Ultrasonics, Ferroelectrics, and Frequency Control, 2023.

The previous two chapters cover array geometry and system design concepts that pursue different optimiza-
tion goals, which are partly conflicting. The sparse spiral array design of Chapter 4 enables high-resolution
and long-range imaging. However, in this case, the single line acquisition imaging method (Section 2.4.7)
based on two-way beamforming is mandatory to achieve sufficient image contrast, i.e. low MSLLs. Otherwise, with one-way beamforming, strong side lobe artifacts arise, which cause false detections and make the image unrecognizable, so that the sparse spiral array presented is incompatible with the multi-line acquisition imaging method (Section 2.4.2). As a consequence, the system introduced in Chapter 4 achieves high resolution, but at the cost of slow frame rates and expensive transceiver electronics.
In contrast, Chapter 5 presents two system variants that provide high frame rates due to one-way beamforming and multi-line acquisition, in addition to cost-efficient array electronics by using digital MEMS microphones. In both system variants, dense array geometries are used, resulting in compact overall apertures that provide only poor resolution.
Clearly, the challenge is to combine the two design approaches based on sparse spiral arrays and one-way
beamforming with MEMS microphones into one overall concept, with the objective of finding a favorable
trade-off between low system complexity and cost, as well as high frame rates, resolution and contrast. In
particular, the latter is a key factor to ensure the compatibility of the two design approaches by preventing
serious false detections in the image.
This chapter addresses this challenge based on two design strategies that pursue the same objective, namely improving the contrast by varying and combining one-way PSFs, but in different ways. The first design strategy is based on modifying the spiral array geometry by using two heterogeneous sub-arrays that exploit different element densities and aperture sizes to vary the distribution of the respective main, side, and grating lobes of the one-way PSF, so that they balance advantageously when they are combined. The second design strategy achieves additional PSF variation by transmitting multiple different frequencies, all of which can be received by the same array elements, enabled by the use of wide-band MEMS microphones. Here, the focus is on the variation of the side lobe distributions, which also complement each other advantageously in
the multi-frequency compound image. In the final part of the chapter, a prototype system is built based on
the two-scale and multi-frequency design strategies, including its experimental evaluation of the transmit,
receive, pulse-echo, and imaging characteristics. Before going into more detail on the two design strategies,
in the following section, first, similar design modifications based on spiral arrays from other related studies
are addressed and the differences to the proposed approaches are highlighted thereafter.

6.1 Related work on sparse spiral array modifications


In [117], the authors presented a spiral sunflower design modification based on two-way line-by-line beamforming, where the transmit and receive array geometries differ. By excluding specific adjacent spiral arms in the transmit and receive array pairs, the positions of the secondary lobe maxima and minima in the respective
PSFs can be manipulated such that they cancel out. Although this method can effectively reduce the overall
side lobe level, relying on two-way line-by-line beamforming is impractical for volumetric imaging applications
that require a minimum number of firing events to achieve high frame rates. Therefore, the focus is on
concepts for improving the one-way PSF characteristics, which are also valuable for transmit- or receive-only
applications apart from imaging.
A method for improving the general one-way PSF by modifying the spiral sunflower array geometry, referred
to as density tapering, has been proposed for antenna arrays in [108]. Density tapering adapts the idea of
weighting the element sensitivities based on a spatial window function to reduce the side lobe level, also
known as apodization [44]. However, instead of weighting the sensitivities themselves, the spatial density of
the element distribution is altered depending on the window function. This way, the MLW and MSLL can be
fine-tuned without sacrificing overall sensitivity. In [116], density tapered spiral arrays have been examined
for ultrasound imaging including a comparison of multiple window functions. Additionally, the authors
have created two 256-element CMUT and PZT array prototypes [156]–[158] based on a Blackman window
taper and experimentally evaluated various medical ultrasound imaging applications [118], [159]–[162].
In [109], Sarradj investigated another density tapering approach based on a parametric window function and
a non-linear least squares method for element positioning, enabling a flexible one-parameter control of the
taper.

6.2 Two-scale spiral array design


This section introduces the two-scale sparse spiral array design concept, which exploits the specific PSF
structure of sunflower arrays, instead of relying on window functions for density tapering, as used in related
works. The modification proposed combines two nested sunflower sub-arrays featuring two different spatial
element densities such that the locations of their respective main lobe, side lobes and grating lobes, referred
to as PSF zones, differ and combine favorably. As a result, the composite array geometry has a balanced and
improved one-way PSF in terms of MSLL and MLW compared to the previous approaches.
First, an analysis of the PSF characteristics of the unmodified classic sunflower array for different aperture
diameters and number of elements is provided in order to introduce a concept for estimating the PSF zone
locations requiring only the basic array design parameters. Second, the two-scale array design is elaborated
and the PSF zone estimation is extended for its sub-arrays. In addition, a specific well-matching sub-array
combination is investigated for highlighting its advantages and limitations. Third, the respective 64-element
and 256-element arrays of the classic sunflower and density tapering strategies are benchmarked with the
two-scale method and multiple optimum sub-array combinations are examined in more detail. After that, the
second part of the chapter covers multi-frequency image compounding and the experimental evaluation of the
real-world prototype.

6.2.1 Numerical beam pattern and point spread function model


The key factors for the following evaluations are the far-field beam pattern and one-way PSF, i.e. the beam
pattern with a single centered point source. The model used for their generation is based on the normalized
and discretized well-known Rayleigh integral, where point elements and point sources are assumed, as
well as far-field and narrow-band conditions. Therefore, the model allows an application-neutral analysis
focused on the array element positions, as the effects of a particular element size, focal distance, or specific
bandwidth are not included. The superimposed magnitude p for the beam pattern scanning direction $(u, v) = \left( \sin(\theta) \cos(\phi),\ \sin(\phi) \right)$ is given by

$$p(u, v) = \left| \mathbf{a}^{\mathrm{H}}(u, v) \left( \sum_{l=0}^{L-1} \mathbf{a}(u_l, v_l)\, s_l \right) \right| , \qquad (6.1)$$

where θ and ϕ are the azimuth and elevation angles, L is the number of point sources, l is the corresponding source index, $(u_l, v_l)$ and $s_l$ are the l-th source direction and magnitude, respectively, and $\mathbf{a}(u, v) \in \mathbb{C}^{M \times 1}$ is the array-specific steering vector, whose m-th entry is given by

$$a_m(u, v) = \exp\!\left( j \frac{2\pi}{\lambda} \left( x_m u + y_m v \right) \right) . \qquad (6.2)$$

Here, M is the number of elements and $(x_m, y_m)$ is the position of the m-th array element. The PSF is generated by evaluating (6.1) for the directions in the complete hemisphere, i.e. $\sqrt{u^2 + v^2} \leq 1$ (if not stated otherwise), and for a single centered point source, such that L = 1, $s_l = 1$ and $(u_l, v_l) = (0, 0)$. The beam patterns and PSFs are generally normalized to their maximum value.
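A minimal implementation of the model in (6.1) and (6.2) might look as follows; the function names, the u-v grid resolution, and the two-element example array are illustrative choices, not part of the thesis software.

```python
import numpy as np

def steering_vector(positions, u, v, wavelength):
    """m-th entry exp(j*2*pi/lambda*(x_m*u + y_m*v)) as in (6.2);
    positions is an (M, 2) array of element coordinates."""
    return np.exp(2j * np.pi / wavelength
                  * (positions[:, 0] * u + positions[:, 1] * v))

def beam_pattern(positions, wavelength, sources=(((0.0, 0.0), 1.0),), n=201):
    """Normalized |a^H(u,v) * sum_l a(u_l,v_l) s_l| per (6.1) on an
    n x n u-v grid; zero outside the visible region u^2 + v^2 > 1.
    With the default single centered unit source this is the one-way PSF."""
    grid = np.linspace(-1.0, 1.0, n)
    src = sum(s * steering_vector(positions, ul, vl, wavelength)
              for (ul, vl), s in sources)
    p = np.zeros((n, n))
    for i, v in enumerate(grid):        # rows: v
        for k, u in enumerate(grid):    # columns: u
            if u * u + v * v <= 1.0:
                a = steering_vector(positions, u, v, wavelength)
                p[i, k] = abs(np.vdot(a, src))  # vdot conjugates a -> a^H src
    return p / p.max()

# Example: two elements lambda/2 apart along x (coordinates in wavelengths)
psf = beam_pattern(np.array([[-0.25, 0.0], [0.25, 0.0]]), wavelength=1.0)
```

For this two-element example, the pattern follows |2 cos(πu/2)|: unity at broadside and a null at u = ±1, as expected for a λ/2 pair.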

6.2.2 Analysis of the classic sunflower spiral array


The position of the m-th element of the planar classic sunflower array rm is determined by discretely sampling
the Fermat spiral based on the models in [109], [163] with the design parameter V = 5, that is
$$(R_m, \varphi_m) = \left( R_\mathrm{ap} \sqrt{\frac{m}{M}} ,\ 2\pi (m-1)\, \frac{1+\sqrt{V}}{2} \right) , \quad \text{and} \qquad (6.3)$$
$$\mathbf{r}_m = (x_m, y_m) = \left( R_m \cos(\varphi_m) ,\ R_m \sin(\varphi_m) \right) , \qquad (6.4)$$

where Rm is the corresponding radius of the m-th element to the aperture center, φm is the corresponding
angle and Rap = Dap /2 is the maximum aperture radius. The model defines the element radii to the center
Rm such that the area of the ring spanned by two successive radii is constant and equivalent to the M -th part
of the total aperture area, that is
$$\Delta A_m = \pi \left( R_m^2 - R_{m-1}^2 \right) = \frac{\pi R_\mathrm{ap}^2}{M} . \qquad (6.5)$$
The design parameter V controls the number and positions of the spiral arms created by altering the angular distance between two successive elements. With the parameter V = 5, this angular distance corresponds to the Golden Angle, which, in conjunction with the constant ring area ΔAm, results in an approximately uniform spatial element density, a main characteristic of the classic sunflower array [108]. Therefore, the sunflower geometry depends only on two parameters, i.e. the total number of elements M and the aperture diameter Dap.
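The two design equations translate directly into a generator for the element positions; the function name and the wavelength-normalized units are arbitrary choices for this sketch.

```python
import numpy as np

def sunflower_array(M, D_ap, V=5):
    """Classic sunflower element positions per (6.3)-(6.4): constant
    ring areas (6.5) set the radii, and V = 5 yields the Golden-Angle
    increment between successive elements."""
    m = np.arange(1, M + 1)
    R = (D_ap / 2) * np.sqrt(m / M)                    # radii from (6.3)
    phi = 2 * np.pi * (m - 1) * (1 + np.sqrt(V)) / 2   # Golden Angle for V = 5
    return np.column_stack((R * np.cos(phi), R * np.sin(phi)))

# 64 elements on a 15-lambda aperture, as in the example of Fig. 6.2(a1)
xy = sunflower_array(M=64, D_ap=15.0)
```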
In the following, the resulting one-way PSFs for different aperture diameters and three typical fixed total numbers of elements are used to investigate the corresponding MSLLs and the MLW at −6 dB (MLW6), both being widely used metrics for contrast and angular resolution (Fig. 6.1). The MLWs monotonously decrease with increasing aperture diameter, while being independent of the observed number of elements. In contrast, the MSLLs transition from a lower to a higher plateau, where there is no further significant increase. The level of the plateaus, as well as the diameter at which the plateau transition occurs, are both dependent on the number of elements. Therefore, if a low MSLL is desired, there is an optimal aperture diameter for each M, where the MLW is narrowest, which is just before the transition to the higher plateau.


Figure 6.1: MSLL (a) and MLW6 (b) extracted from the normalized PSFs of classic sunflower arrays for various
aperture diameters Dap and three fixed numbers of elements M . While the MLW6 steadily narrows
with increasing aperture diameter independent of the number of elements, the MSLL features
a plateau-like increase, where the transition from a low to a high plateau depends on both, the
aperture diameter and the number of elements.


Figure 6.2: Two examples of 64-element classic sunflower arrays with a large (Dap = 15 λ) (a1) and a small aperture diameter (Dap = 5 λ) (b1) and their corresponding normalized PSFs (a2, b2) showing their different PSF zones. The radial view (c) shows the respective highest side lobe level for each concentric ring with radius $R_{uv} = \sqrt{u^2 + v^2}$ centered around the PSF origin. With a sufficiently high element density, e.g. as in (b), a GLZ is not formed.

In order to clarify the plateau-like increase in MSLL, the PSF of a 64-element array is examined for two
different aperture diameters (Fig. 6.2). The typical one-way PSF of the classic sunflower array can be
categorized into three basic zones, i.e.

1. the main lobe zone (MLZ),

2. the side lobe zone (SLZ), where low secondary lobes are formed,

3. and the grating lobe zone (GLZ), where the side lobe level rises significantly.

The three zones are clearly evident in the radial view [Fig. 6.2(c)], showing the respective maximum side lobe level for each concentric ring of radius $R_{uv} = \sqrt{u^2 + v^2}$ centered on the PSF origin (u, v) = (0, 0). The
transition radius from MLZ to SLZ (RMLZ ) is defined at the first minimum of the main lobe. The transition
radius from SLZ to GLZ (RGLZ ) is specified at the side lobe level that exceeds the first, and typically highest,
secondary lobe in the SLZ. The MLZ transition RMLZ is found to be mainly dependent on the aperture diameter,
whereas the GLZ transition (RGLZ ) depends on the inter-element spacings. Therefore, enlarging or reducing
the aperture diameter with a fixed number of elements will narrow or widen the PSF zones, respectively. By
maintaining a sufficiently small aperture, the GLZ can be forced out of the PSF, so that the MSLL decreases
significantly [Fig. 6.2(b)]. This way, the lower MSLL plateau in Fig. 6.1 is reached, although with the drawback
of main lobe widening.
The estimation of the positions of the three zones is of major interest for the array design. For the classic sunflower array, the transition from MLZ to SLZ $R_\mathrm{MLZ}$ can be estimated with the well-known first-minimum approximation for circular apertures [164], that is

$$R_\mathrm{MLZ} \approx 1.22\, \lambda / D_\mathrm{ap} . \qquad (6.6)$$

The estimation of the SLZ to GLZ transition (RGLZ ) requires analyzing the inter-element spacings. Since
most of the inter-element spacings of the classic sunflower array are different to each other, Delaunay
triangulation [165], [166] is used to obtain the specific inter-element spacing between each neighboring
element [Fig. 6.3(a)].
In [117], the authors use the most prominent inter-element spacing d̂ in the corresponding histogram
[Fig. 6.3(b)], i.e. with the highest number of occurrences, to estimate the position of the radius, where the
highest grating lobes are located. However, the transition from SLZ to GLZ is desired, for which the mean
inter-element spacing d̄ is found to give a good estimate, that is

$$R_\mathrm{GLZ} \approx \lambda / \bar{d} . \qquad (6.7)$$

Based on this relation, the classic sunflower array geometry allows to directly estimate RGLZ with the basic
design parameters (Dap , M ) as follows. The area associated to each element Acell,m is analyzed using Voronoi
tessellation, which is found to be in good agreement with the M -th part of the total area just as with the
circular ring area ∆Am between two successive elements in (6.5). The Voronoi cell area of each element
can be approximated by a circular disk with a diameter corresponding to the mean inter-element spacing,
resulting in the relation
$$A_{\mathrm{cell},m} \approx \pi \left( \frac{\bar{d}}{2} \right)^{\!2} \approx \frac{\pi R_\mathrm{ap}^2}{M} = \Delta A_m . \qquad (6.8)$$

Therefore, the SLZ to GLZ transition $R_\mathrm{GLZ}$ can be estimated by combining (6.7) and (6.8), that is

$$R_\mathrm{GLZ} \approx \lambda\, \frac{\sqrt{M}}{D_\mathrm{ap}} . \qquad (6.9)$$

The approximations are validated by determining the true respective transition radii of the classic sunflower arrays for different aperture diameters and numbers of elements [Fig. 6.3(c),(d)]. While the $R_\mathrm{MLZ}$ estimate is in good agreement with the true radii, minor correction factors $h_M$, depending on the number of array elements, are required for a more precise estimation $R_\mathrm{GLZ} \approx h_M \lambda \sqrt{M} / D_\mathrm{ap}$, i.e.

$$(h_{32}, h_{64}, h_{96}, h_{128}, h_{256}) = (0.8,\ 0.85,\ 0.94,\ 0.99,\ 1.05) . \qquad (6.10)$$

In order to prevent the formation of the GLZ in the PSF, as well as in beam patterns, where a source can be located off-center in a specific field of view within $\sqrt{u^2 + v^2} \leq R_\mathrm{fov}$, the transition to the GLZ must be chosen to be $R_\mathrm{GLZ} \geq 1 + R_\mathrm{fov}$. For example, if a source can be located in a field of view spanning the full hemisphere (±90°), i.e. $R_\mathrm{fov} = 1$, the GLZ transition must be at least $R_\mathrm{GLZ} \geq 2$ for GLZ prevention. Therefore, the aperture diameter is required to be $D_\mathrm{ap} \leq 0.5 \sqrt{M}\, \lambda$, so that the mean inter-element spacing is $\bar{d} \leq 0.5\, \lambda$, just as for the half-wavelength criterion of periodic dense arrays.
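These design rules can be collected into a small helper; the function names are illustrative, and the correction factor defaults to 1 for element counts not listed in (6.10).

```python
import math

# Correction factors h_M from (6.10)
H_M = {32: 0.80, 64: 0.85, 96: 0.94, 128: 0.99, 256: 1.05}

def psf_zone_estimates(M, D_ap, wavelength=1.0):
    """Transition radii of the classic sunflower PSF zones:
    MLZ -> SLZ per (6.6) and SLZ -> GLZ per (6.9) with (6.10)."""
    R_mlz = 1.22 * wavelength / D_ap
    R_glz = H_M.get(M, 1.0) * wavelength * math.sqrt(M) / D_ap
    return R_mlz, R_glz

def max_glz_free_aperture(M, R_fov=1.0, wavelength=1.0):
    """Largest aperture diameter keeping R_GLZ >= 1 + R_fov (h_M = 1);
    for the full hemisphere (R_fov = 1) this is 0.5*sqrt(M)*lambda."""
    return wavelength * math.sqrt(M) / (1.0 + R_fov)

R_mlz, R_glz = psf_zone_estimates(M=64, D_ap=15.0)
```

For the 64-element, 15 λ example of Fig. 6.2(a), the GLZ onset estimate lands well inside the visible region, consistent with its high MSLL plateau.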


Figure 6.3: Voronoi tessellation and Delaunay triangulation applied to a 64-element classic sunflower array to find the various element cells and inter-element spacings (a) and the corresponding histogram of inter-element spacing values (b). The transition from MLZ to SLZ $R_{uv,\mathrm{MLZ}}$ is effectively estimated with the overall aperture diameter $D_\mathrm{ap}$ (c), approximately independent of the number of elements. The estimation of the transition radius from SLZ to GLZ $R_{uv,\mathrm{GLZ}}$ requires minor corrections dependent on the number of elements.

In summary, the MLW and the plateau-like MSLL behavior of the classic sunflower array have been analyzed, which result from its three PSF zones, whose locations can be estimated prior to field simulation with the aperture diameter and number of elements.

6.2.3 Geometry of the two-scale spiral array


The key idea leading to the proposed two-scale array design is to exploit the specific PSF zone structure of the
classic sunflower array by combining two sub-arrays featuring two different spatial element densities and
aperture sizes, resulting in different PSF zone locations (Fig. 6.4). The combination of PSF zones makes it possible to improve and flexibly balance the MLW and MSLL, similar to density tapering, without being constrained by almost-discrete MSLL plateaus. Since the focus is on the one-way beam patterns, both combined sub-arrays are used only for transmitting or only for receiving, depending on the use case. Therefore, a combination of PSF zones corresponds to a complex addition, instead of a multiplication as in the two-way case. Although two dedicated classic sunflower arrays can be used to combine their specific far-field PSFs, nesting an inner and outer sub-array allows for a more compact design. Assigning the sub-arrays to an inner area and an outer ring ensures that their element densities can be separately defined for one-way beamforming, which is more complicated for two nested sub-arrays that both start from the center and cover a shared area, since the respective element distances are influenced by each other. Therefore, the two-scale spiral array is fully defined
with four design parameters, that is
$$R_m = \begin{cases} R_\mathrm{in} \sqrt{\dfrac{m}{M_\mathrm{in}}} , & \text{for } 1 \leq m \leq M_\mathrm{in} , \\[2ex] \sqrt{\dfrac{\left( R_\mathrm{ap}^2 - R_\mathrm{in}^2 \right) (m - M_\mathrm{in})}{M - M_\mathrm{in}} + R_\mathrm{in}^2} , & \text{for } M_\mathrm{in} + 1 \leq m \leq M , \end{cases} \qquad (6.11)$$

where Rin = Din /2 and Rap = Dap /2 are the inner and total aperture radius, Min and M are the inner
and total number of elements. The corresponding element position angles φm and the transformation into
Cartesian coordinates are equivalent to (6.3) and (6.4).
In order to estimate RMLZ and RGLZ for both sub-arrays of the two-scale geometry, their corresponding
design models are analyzed. First, the radii of the inner sub-array are consistent with the classic sunflower
design as in (6.3) using the respective inner aperture diameter Din and inner number of elements Min . Second,
the model for the outer sub-array radii is based on the similar design rule as in(︁(6.5) of the classic sunflower
geometry, i.e. the area of the ring spanned by two successive radii ∆Am,out = π Rm 2 − R2 is constant and
)︁
m−1
equivalent to the area of the outer sub-array divided by the number of outer elements. Therefore, the inner
and outer ring areas are given by
2
πRin
∆Am,in = , (6.12)
Min
π(Rap2 − R2 )
in
∆Am,out = . (6.13)
M − Min
As a result, a composite array is obtained with only two different element densities, which are constant within
each sub-array. Therefore, the two-scale array design allows the transition from SLZ to GLZ to be estimated for the inner ($R_\mathrm{GLZ,in}$) and outer sub-array ($R_\mathrm{GLZ,out}$) in the same way as shown in Section 6.2.2, that is

$$R_\mathrm{GLZ,in} \approx \lambda\, \frac{\sqrt{M_\mathrm{in}}}{D_\mathrm{in}} \quad \text{and} \quad R_\mathrm{GLZ,out} \approx \lambda \sqrt{\frac{M - M_\mathrm{in}}{D_\mathrm{ap}^2 - D_\mathrm{in}^2}} . \qquad (6.14)$$

The estimation of RMLZ, in and RMLZ, out of both sub-arrays is equivalent to (6.6) with the respective inner and
total aperture diameters.

In summary, the ability to estimate the PSF zone locations using the basic design parameters is a major
advantage of the two-scale design, since a desired PSF zone combination can be specified prior to field
simulation.
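Equations (6.11) and (6.14) also translate directly into code; the following sketch (illustrative names, wavelength-normalized units) uses the example parameters of Fig. 6.4.

```python
import numpy as np

def two_scale_array(M, D_ap, M_in, D_in, V=5):
    """Element positions of the two-scale spiral array: radii per
    (6.11), angles with the Golden-Angle rule as in (6.3)."""
    R_ap, R_in = D_ap / 2, D_in / 2
    m = np.arange(1, M + 1)
    R = np.empty(M)
    inner = m <= M_in
    R[inner] = R_in * np.sqrt(m[inner] / M_in)                 # inner sub-array
    R[~inner] = np.sqrt((R_ap**2 - R_in**2) * (m[~inner] - M_in)
                        / (M - M_in) + R_in**2)                # outer sub-array
    phi = 2 * np.pi * (m - 1) * (1 + np.sqrt(V)) / 2
    return np.column_stack((R * np.cos(phi), R * np.sin(phi)))

def glz_transitions(M, D_ap, M_in, D_in, wavelength=1.0):
    """SLZ -> GLZ transition estimates of both sub-arrays per (6.14)."""
    return (wavelength * np.sqrt(M_in) / D_in,
            wavelength * np.sqrt((M - M_in) / (D_ap**2 - D_in**2)))

# Example of Fig. 6.4: D_ap = 15, D_in = 5 (in wavelengths), M_in = 40
xy = two_scale_array(M=64, D_ap=15.0, M_in=40, D_in=5.0)
```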

6.2.4 Characteristics of the two-scale spiral array


Next, a 64-element two-scale array is examined to demonstrate the MLW and MSLL balancing and improvement.
The focus is on a geometry that exploits a specific PSF zone combination, although other advantageous
combinations are possible, as shown in Section 6.2.5. In particular, a GLZ-free inner sub-array with a wide
MLZ is positioned within an outer sparser sub-array (Fig. 6.4), such that

a) the high GLZ of the outer sparse sub-array is combined with the low SLZ of the inner sub-array, and

b) the low SLZ of the outer sparse sub-array is paired with the MLZ and first secondary lobe of the inner
sub-array,

resulting in two effects. First, the main lobes of both sub-arrays accumulate to an overall higher level and
form a combined main lobe with a narrow peak and broad base. Second, the overall side lobe level is more
balanced, such that there are no pronounced differences between the SLZ and GLZ, which otherwise typically
consist of relatively low and high side lobe levels. Both characteristics lead to an effective reduction of the
MSLL, while a narrow peak of the combined MLW is preserved. As a result, a higher imaging resolution
with reduced artifact formation compared to the classic sunflower array is expected. Nevertheless, there is
another characteristic to consider, namely the main lobe base level (approx. −11 dB in the example), where
the narrow peak transitions into the broad base, whose effects are pointed out next.

[Figure 6.4 legend: inner sub-array, outer sub-array, combined array, main lobe base; example parameters Dap = 15 λ, Din = 5 λ, Min = 40]

Figure 6.4: Example two-scale 64-element array nesting a denser inner sub-array with a sparser outer sub-
array (a), resulting in an advantageous PSF zone combination demonstrated in the corresponding
normalized PSF (b) and its radial view (c). The combined main lobe consists of a narrow peak
and broad base, whereas the overall side lobe level is more balanced without distinct transitions
from SLZ to GLZ.
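As a minimal sketch, the nesting described above can be reproduced with the golden-angle sunflower construction by applying the equal-area radius mapping separately to each sub-array. The function names and the azimuth seeding are illustrative assumptions, not the exact construction of the thesis; the default parameters follow the Fig. 6.4 example (Dap = 15λ, Din = 5λ, Min = 40).

```python
import numpy as np

GOLDEN_ANGLE = np.pi * (3.0 - np.sqrt(5.0))  # approx. 137.5 degrees

def sunflower(M, R0, R1, phi0=0.0):
    """Sunflower spiral with constant ring areas between radii R0 and R1."""
    m = np.arange(1, M + 1)
    r = np.sqrt(R0**2 + (R1**2 - R0**2) * m / M)  # equal-area radius mapping
    phi = phi0 + m * GOLDEN_ANGLE                 # golden-angle azimuths
    return r * np.cos(phi), r * np.sin(phi)

def two_scale_array(M=64, M_in=40, D_in=5.0, D_ap=15.0):
    """Dense inner sunflower sub-array nested in a sparse outer one.
    Diameters in wavelengths; returns element coordinates (x, y)."""
    x_in, y_in = sunflower(M_in, 0.0, D_in / 2)             # dense inner disk
    x_out, y_out = sunflower(M - M_in, D_in / 2, D_ap / 2)  # sparse outer ring
    return np.concatenate([x_in, x_out]), np.concatenate([y_in, y_out])
```

The equal-area mapping keeps the spatial density constant within each sub-array, while the two different ring-area scales produce the intended density step at Din.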

In order to emphasize the advantages and drawbacks of this particular two-scale spiral array, the resulting
beam patterns of two adjacent horizontally positioned point-sources at varying angular spacings are examined.
The beam patterns of the two-scale array and classic sunflower array are then compared, where both
geometries are selected to have an equal −6 dB MLW (MLW6 ) in their respective PSFs (Fig. 6.5). The angular
spacing θs between the two adjacent positioned equal-strength point-sources is gradually increased from 0◦ to
20◦ in steps of 0.5◦ . For each spacing θs , the resulting beam patterns of the two-scale and classic sunflower array

are generated using the multi-source model in (6.1) with L = 2, where the corresponding source locations
are u0 = +sin(θs /2) and u1 = −sin(θs /2). Based on these beam patterns, the angular resolution and
the minimum level between the two sources are evaluated. If the minimum inter-source level [Fig. 6.5(c)]
drops below 0 dB, two distinct maxima are formed, such that both sources are separable in the beam pattern
and the corresponding spacing defines the angular resolution.

[Figure 6.5 panels: (a1, a2) classic sunflower and (b1, b2) two-scale beam patterns in (u, v) for 10° and 14° source spacings; (c) minimum level between sources (dB) versus angular spacing (0° to 15°), annotated with high SLL, not separable, early separation, and flat roll-off regions.]

Figure 6.5: Comparison of the normalized one-way beam patterns of the classic sunflower (a) and two-scale
array (b), both featuring the same MLW6 = 9◦ , for two adjacent point sources with varying angular
spacings, i.e. 10◦ (row 1) and 14◦ (row 2). Due to the accumulation of the GLZs of both sources,
the resulting MSLL of (a) is significantly higher compared to (b), increasing the risk of artifact
formation. The minimum side lobe level between the two sources for various spacings (c) shows
that the two-scale array can separate closer-spaced sources, but requires larger spacings for a
higher separation contrast due to the broad main lobe base.
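The separability evaluation can be sketched for a one-dimensional cut as follows. A uniform linear aperture is assumed here purely for illustration, whereas the thesis applies the multi-source model (6.1) to the planar spiral geometries; the function names are likewise assumptions.

```python
import numpy as np

def two_source_pattern(x_lambda, theta_s_deg, u):
    """One-way CBF beam pattern of two equal-strength far-field sources at
    +/- theta_s/2; element x positions in wavelengths, u = sin(theta)."""
    u0 = np.sin(np.radians(theta_s_deg) / 2)
    # superposition of the two plane-wave arrivals at each element
    rx = np.exp(2j * np.pi * x_lambda * u0) + np.exp(-2j * np.pi * x_lambda * u0)
    steer = np.exp(-2j * np.pi * np.outer(u, x_lambda))  # scan directions
    return np.abs(steer @ rx)

def inter_source_level_db(pattern, u):
    """Level at the midpoint between the symmetric sources relative to the
    pattern peak; below 0 dB, two distinct maxima have formed."""
    mid = pattern[np.argmin(np.abs(u))]
    return 20 * np.log10(mid / pattern.max())
```

For a 15λ aperture, a 20° spacing yields a clearly negative level (separable), while a 0.5° spacing yields approximately 0 dB (merged sources).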

Although both MLW6 are equal, the two-scale array can separate the two sources at a smaller angular
spacing (6◦ ) compared to the classic sunflower array (8.5◦ ), thus providing a higher effective angular resolution.
In addition, the MSLL of the two-scale array is significantly lower (−9.86 dB vs. −4.12 dB), since the high
GLZs of both sources accumulate for the classic sunflower array, increasing the risk of artifact formations. In
contrast, the side lobes in the beam patterns of the two-scale approach are more evenly distributed, as in its
PSF, without particularly high or low levels, so that the MSLL is kept low. However, the higher resolution and
lower MSLL of the two-scale array come at a cost. Due to the wide main lobe base of the two-scale array, the
minimum inter-source-level has a flatter, plateau-like roll-off with increasing source spacing compared to the
classic sunflower array. Consequently, although even closely spaced sources are separable, a high separation
contrast requires a larger source spacing compared to the classic sunflower array. The roll-off of the minimum
inter-source level is mainly determined by the main lobe base level of the corresponding two-scale array PSF.
Therefore, the main lobe base level must be considered in the two-scale array design. The re-increase of the
minimum inter-source level for the classic sunflower array at 15° and above 20° angular spacing arises due to
the accumulation of side lobes, which is expected behavior.
In summary, compared to the classic sunflower array with the same MLW6 , the two-scale array provides a
higher angular resolution and a more effective artifact suppression due to the lower MSLL at the expense of
reduced contrast between closely spaced but separable sources.

6.2.5 Benchmark, results and discussion


In this section, the two-scale array geometries are benchmarked with the classic sunflower arrays and with
density tapering approaches for modifying the sunflower spiral geometry. Two methods of density tapering
are implemented for reference. One is based on a fixed window function introduced in [108], [116], where
the element positions are determined iteratively. Here, a Blackman window is considered [Fig. 6.6(a2)],
which has been utilized in multiple previous works, e.g. [156], [157], [161]. The other method is described
by Sarradj in [109] and relies on a configurable single-parameter (H) window, based on the modified zero-th
order Bessel function, where the radii of the elements are obtained by solving a non-linear least squares
problem [Fig. 6.6(b)]. Depending on the chosen H-parameter, the Sarradj method (SAR) enables to taper the
element density to the center (H > 0) [Fig. 6.6(b2)] or to the periphery of the aperture (H < 0) [Fig. 6.6(b3)].
The main difference of the density tapering methods to the two-scale array approach [Fig. 6.6(a3)] is that
the latter does not rely on window functions, originating from amplitude apodization, but rather aims for an
advantageous PSF zone combination.
In order to highlight the similarities and differences of the design methods, their density window functions
are compared with respect to the non-tapered classic sunflower case. Therefore, the equivalent density window
function is derived for the two-scale array, enabling the geometry introduced in this chapter to be recreated
using the density tapering method described in [108], [116], based on the following considerations. The
tapering method defines the radii Rm such that the ring area spanned by two successive radii weighted by the
density window function f (R) is constant (K) [108], [116], that is
\[
\int_{R_{m-1}}^{R_m} 2\pi f(R)\, R \,\mathrm{d}R = K, \qquad (6.15)
\]

where K is defined as the M-th part of the effective total aperture area, K = (2π/M) ∫₀^{R_ap} f(R) R dR. For
example, the equivalent density window function of the classic sunflower array is f(R) = 1, since the ring
areas ΔA_m are already defined to be constant (6.5), such that

\[
\int_{R_{m-1}}^{R_m} 2\pi f(R)\, R \,\mathrm{d}R = \pi\left(R_m^2 - R_{m-1}^2\right) = \frac{\pi R_\mathrm{ap}^2}{M}, \qquad (6.16)
\]

where πR_ap²/M = ΔA_m = K. Since the ring areas of the two-scale geometry (ΔA_{m,in}, ΔA_{m,out}) are constant
within each sub-array as shown in (6.12) and (6.13), its equivalent density window function relative to the
non-tapered classic sunflower case with equal aperture diameter is given by

\[
f(R) =
\begin{cases}
\dfrac{\Delta A_m}{\Delta A_{m,\mathrm{in}}} = \dfrac{R_\mathrm{ap}^2\, M_\mathrm{in}}{R_\mathrm{in}^2\, M}, & \text{for } 0 \le R \le R_\mathrm{in}, \\[2ex]
\dfrac{\Delta A_m}{\Delta A_{m,\mathrm{out}}} = \dfrac{R_\mathrm{ap}^2\,(M - M_\mathrm{in})}{\left(R_\mathrm{ap}^2 - R_\mathrm{in}^2\right) M}, & \text{for } R_\mathrm{in} < R \le R_\mathrm{ap}.
\end{cases} \qquad (6.17)
\]

Therefore, the equivalent density window function of the two-scale array consists of two discrete density levels
in contrast to the Blackman and SAR tapering methods, which feature a smooth characteristic [Fig. 6.6(c)].
Besides that, the similarities of the Blackman and SAR tapering methods become evident in the comparison.
The SAR window functions with H = 1 and H = 4 lead to respectively weaker and stronger inner tapering
compared to the Blackman window, which is approximately identical to SAR H = 2.5. In fact, many spiral
array geometries are contained in the set of solutions created by the SAR method for varying H parameters,
including the classic sunflower array, which is obtained for H = 0.
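A minimal sketch of the iterative radius determination in (6.15): the cumulative f-weighted aperture area is computed on a fine grid and inverted numerically, so that each ring encloses an equal share K. The function name and grid resolution are assumptions; for f(R) = 1 the routine recovers the classic sunflower radii R_m = R_ap √(m/M).

```python
import numpy as np

def density_tapered_radii(f, M, R_ap, n_grid=10_000):
    """Ring radii R_1..R_M from a density window f(R), following (6.15):
    every f-weighted ring area equals K = F(R_ap)/M, with F the cumulative
    weighted area integral."""
    r = np.linspace(0.0, R_ap, n_grid)
    w = f(r) * r                                  # integrand f(R) * R
    # trapezoidal cumulative integral F(r) = int_0^r f(R) R dR
    F = np.concatenate(([0.0], np.cumsum(0.5 * (w[1:] + w[:-1]) * np.diff(r))))
    targets = np.arange(1, M + 1) * F[-1] / M     # equal area shares
    return np.interp(targets, F, r)               # invert F numerically
```

The same routine reproduces the Blackman or SAR tapered geometries by passing the respective window as f, and the piecewise-constant window (6.17) by passing a two-level step function.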

[Figure 6.6 panels: (a1) classic, (a2) Blackman, (a3) two-scale, (b1) SAR H = 1, (b2) SAR H = 4, (b3) SAR H = −2 geometries (x, y in λ); (c) density window functions over the normalized aperture radius.]

Figure 6.6: Comparison of example geometries of the design methods used for the benchmark, i.e. the
classic sunflower array (a1), different spatial density tapering modifications based on a fixed
Blackman window (a2) [108], [116] and the approach of Sarradj [109] using a parametric (H)
density window function (b), as well as the two-scale method proposed in this chapter (a3). The
SAR method allows a flexible modification by varying the H parameter, e.g. for peripheral tapering
(b3). The density window functions (c) show the density taper relative to the non-tapered classic
sunflower case and emphasize the differences and similarities of the methods.

Benchmark of the design methods
For benchmarking, different array geometries are generated with a fixed number of elements (M = 64 and
M = 256) based on the four design methods presented, each with varying design parameters. The aperture
diameter, which is the only variable parameter for the classic sunflower and the Blackman density tapered
array, is varied in the interval Dap ∈ {2, 3, . . . , 60}λ. For each aperture diameter, the H-parameter of the
SAR arrays are additionally varied within H ∈ {−5, −4.8, . . . , 5}. The two-scale array geometries are created
using any combination of Dap with the remaining two variable parameters, i.e. the inner diameter Din and the
inner number of elements Min , in the intervals Min ∈ {1, 2, . . . , M − 1} and Din ∈ {1, 2, . . . , 59}λ with Din < Dap . The
interval boundaries for the parameter sweep are chosen to be comparable to existing literature, e.g. as in
[109].
Subsequently, the corresponding PSF for each array geometry is formed from which the performance
metrics, i.e. MLW6 and MSLL, are automatically extracted. The MSLL is used as it reflects the worst-case
metric in the formation of side lobe artifacts. In order to consider the side lobes that are created by an
off-center point-source located within the full-hemisphere field-of-view as well, the PSF is evaluated within
Ruv = √(u² + v²) ≤ 2 instead of ≤ 1. For clear comparison, only the optimum arrays with the lowest MSLL per
MLW6 of each approach are selected and illustrated. Here, two-scale arrays with a main lobe base level
above −9 dB are not considered. Otherwise, the comparison would be clearly in favor of two-scale arrays with
a narrow MLW6 but poor separation contrast for closely spaced sources (Section 6.2.4). The MLW6 and MSLL
of the selected arrays of all approaches are then compared (Fig. 6.7). First, the array geometries using 64
elements are considered.
The classic sunflower arrays provide the reference baseline for the comparison. Their characteristic MSLL
plateaus at approximately −8.8 dB and −16.5 dB, including their steep transition, are clearly noticeable, similar
to Fig. 6.1.
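The automatic metric extraction can be sketched for a radial PSF cut as follows. The function name, the null detection, and the threshold handling are illustrative assumptions rather than the exact benchmark implementation.

```python
import numpy as np

def psf_metrics(p_db, r):
    """MLW6 (deg) and MSLL (dB) from a radial PSF cut p_db over
    r = R_uv (direction-cosine radius), normalized to 0 dB at r = 0."""
    # -6 dB width: first crossing below -6 dB, refined by linear interpolation
    k = int(np.argmax(p_db < -6.0))
    r6 = np.interp(-6.0, [p_db[k], p_db[k - 1]], [r[k], r[k - 1]])
    mlw6 = 2.0 * np.degrees(np.arcsin(min(r6, 1.0)))
    # MSLL: highest level beyond the first null bounding the main lobe
    null = int(np.argmax(np.diff(p_db) > 0))  # first index where p_db rises
    msll = p_db[null + 1:].max()
    return mlw6, msll
```

For the sinc-shaped pattern of a uniform 8λ line aperture, this yields an MLW6 of about 8.7° and the familiar first-side-lobe MSLL of about −13.3 dB.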

[Figure 6.7 panels: MLW6 (°) versus MSLL (dB) for the classic, Blackman, SAR and two-scale methods; (a) M = 64 with the two-scale types I to V annotated, (b) M = 256.]

Figure 6.7: Benchmark of the four design methods showing the resulting one-way PSF quality metrics of the
optimum arrays of each approach with a fixed number of elements (64 (a) and 256 (b)) and varying
design parameters. Sources located off-center in the entire hemisphere (field-of-view of ±90◦ ) are
considered for MSLL evaluation. The optimum 64-element arrays of the two-scale method include
five different types of PSF zone combinations (I to V), all of which feature improved or similar
MLW and MSLL compared to the other approaches. Similar results are obtained for 256-element
arrays, although the two-scale approach provides no further MSLL improvement for main lobe
widths wider than 15◦ .

The Blackman window density tapering approach has, to some extent, a lower MSLL for the same MLW6 ,
particularly between 14° and 21°, where an MSLL improvement from −8.8 dB down to −11.5 dB is observed.
In addition, the MSLL can be further reduced below the lower plateau of the classic sunflower approach for
MLWs wider than 32◦ .
The SAR arrays generally outperform the classic sunflower and Blackman window method as expected,
since they are a subset of the SAR geometries. The variation of the H parameter enables a more flexible
balancing between MLW and MSLL, e.g. with peripheral density tapering. This way, the performance gaps of
the Blackman approach are overcome.
Finally, the two-scale array design provides the lowest MSLLs for most MLWs compared to the previous
approaches, although the MSLLs for rather narrow or wide MLWs (e.g. < 7◦ and > 31◦ ) are similar to the
SAR density tapering method. In general, similar results are obtained for the 256-element array geometries,
where the classic sunflower method shows two distinct MSLL plateaus, the SAR technique outperforms the
Blackman approach, and the two-scale geometries provide lower or similar MSLL per MLW compared to the
SAR method. However, for wide MLWs above 15°, the two-scale approach reaches a plateau and stops
improving the MSLL at approximately −27 dB, whereas it can be further lowered by the Blackman and SAR
techniques.
Clearly, there are specific PSF zone combinations that lead to particularly effective improvements, while
others perform only similarly or worse compared to previous density tapering approaches. For example,
one of the greatest improvements for the 64-element geometries occurs for an MLW of 14◦ , where the MSLL
of the two-scale approach (−13.4 dB) is significantly lower than for the classic sunflower (−8.8 dB) or SAR
(−11.1 dB) method. In fact, this particular two-scale array utilizes the PSF zone combination (Type III)
analyzed in Section 6.2.4. Overall, five different types (I to V) of advantageous PSF zone combinations are
observed in the set of optimum 64-element two-scale arrays, which are examined next.

Optimum two-scale array combination types


The first type [Fig. 6.8(I)] consists of a large-aperture outer sub-array, which includes most of the available
elements, and, therefore, predominantly determines the PSF. In contrast, the contribution of the inner sparser
sub-array is only supportive by adding to the relative main lobe level and by positioning its first secondary
lobe (SL) minimum close to the first high side lobe of the outer sub-array.
Type two [Fig. 6.8(II)] features a more balanced distribution of the inner and outer number of elements,
although the latter is still dominant. Most importantly, the inner and outer aperture diameters Din and Dap
differ only slightly. This way, a denser outer ring sub-array is formed, filled by a sparser inner sub-array. As a
result, both MLWs are similar to each other and accumulate without significant widening. The low SLZ of the
sparse inner sub-array is positioned near the highest side lobes of the outer array, equalizing the overall side
lobe level. In general, this type is similar to the peripheral density tapering of the SAR approach, e.g. for
H = −2 [Fig. 6.6(b3)].
The third type [Fig. 6.8(III)] is the basis of the original idea for the two-scale array design, outlined in
the previous sections. Here, the distribution of the number of elements is balanced as well, with the inner
sub-array being denser and more populated. Basically, the densities of the inner and outer sub-arrays are
inverted compared to type II, resulting in a different PSF zone combination, as depicted in Section 6.2.4. The
MSLL is significantly reduced compared to the other approaches given the same MLW6 . Still, there is the
drawback of reduced contrast between closely spaced sources. The optimum 256-element two-scale array
geometries, which provide improved results compared to the previous approaches, consist exclusively of this
type III.
In type four [Fig. 6.8(IV)], the PSF is primarily determined by the inner denser sub-array, while few outer
sparse elements contribute only as support for balancing out the side lobe level, forming the counterpart to
type I.

[Figure 6.8 panels: geometries (x, y in λ), normalized PSFs in (u, v) and radial views over Ruv for types I to V; secondary lobes beyond Ruv = 1 are only visible for off-center sources.]

Figure 6.8: Five different types of advantageous sub-array combinations observed in the set of optimum two-scale arrays (Fig. 6.7).
Type I and IV feature either a dominant outer or inner sub-array, whereas the opposite sub-array is only supportive.
In II and III, the ratio between inner and outer elements is more balanced and the sub-arrays are of different spatial
densities for exploiting the PSF zone combinations. In particular, type III shows significant improvements compared
to other design methods as examined in the previous sections. The sub-arrays of type V have only minor differences
in spatial densities, which nevertheless result in an effective positioning of the side lobe minima and maxima for a
more balanced overall side lobe level.

Finally, in type five [Fig. 6.8(V)], the sub-arrays differ only slightly in spatial density and therefore resemble
the classic sunflower array more closely compared to the previous types. Nevertheless, even the small
differences cause the side lobe minima and maxima to balance each other out, resulting in a significantly
lower MSLL compared to the classic sunflower approach. However, compared to the previous density tapering
approaches, type V provides only similar or worse results, particularly for the 256-element arrays.
In summary, considering the progression from wide to narrow MLWs, the optimum two-scale arrays feature
an increasing overall aperture, as well as a shift from type I to V (M = 64) or type III to V (M = 256),
respectively. For all optimum combinations, the corresponding sub-array main lobes accumulate to a higher
level, whereas the overall side lobe level becomes more balanced compared to the classic sunflower approach.
The most effective improvements over the existing density tapering approaches are achieved where a dense
inner sub-array is combined with a sparse outer sub-array and the number of inner and outer elements is balanced,
as in type III. In contrast, for rather small apertures, where both sub-arrays are dense and grating lobe zones
do not form (type V), the two-scale PSF zone combination is less effective. In this case, although significant
improvements over the classic sunflower array are achieved, the two-scale results can be similar or even
inferior compared to the Blackman and SAR method.

Scope and limitations of the study


Finally, the scope and limitations of the analysis and results are highlighted. As explained in Section 6.2.1,
an application-neutral model is utilized that focuses on the element positioning and is therefore based on
unidirectional point elements with infinitesimal size. As a result, the minimum spacing for element positioning
is not constrained, such that the real-world manufacturing of the identified optimum array geometries is
not ensured, as it depends on the application-specific transducer technology available. Therefore, certain
geometries may not be feasible, particularly for applications using relatively small wavelengths. Furthermore,
the effects of the application-dependent element size and shape are not included in the evaluation. While
these effects are easily predictable for far-field applications based on the pattern multiplication theorem [43],
near-field applications require a model extension, where the specific element geometry and directivity is
considered. In the latter near-field case, the results obtained with focused beams may additionally differ
from the far-field results provided, depending on the application-specific focal distance and region of interest.
Similar deviations can be caused if high-bandwidth transducers and temporally short pulses are involved,
which require a broad-band beamforming model for reliable analysis. Consequently, the results presented are
not directly valid for all applications, particularly if the base conditions in terms of element size and shape,
bandwidth, and region of interest differ significantly from the model assumptions used.
Apart from the limitations due to the generic model, further modifications and optimization procedures can
certainly lead to improved solutions compared to those investigated within the parameter space of the study.
For example, adding more than two sub-arrays with different element densities can potentially provide further
improvements at the expense of a more complicated array design including more parameters. Moreover, the
well-performing deterministic two-scale solutions identified can be used as initial seeds for further stochastic
optimization methods. Nevertheless, these two improvement strategies are beyond the scope of this thesis.

6.3 Two-scale sonar system concept with multi-frequency excitation


After introducing the two-scale spiral array geometries, in this section, the combination of PSF variations to
reduce the MSLL is further exploited, similar to the two-scale array design approach. However, this time, the
PSF variations are not only achieved by changing the aperture size or element density, but additionally by
combining multiple different discrete transmit frequencies.

First, the different frequencies and transducer technologies involved are described, as well as the firing
scheme used for their excitation. The signal processing of the received multi-frequency pulse packet and the
compound image formation are elaborated. Subsequently, the evaluation of the optimum two-scale spiral
array geometries is extended to the multi-frequency case. Here, one of these optimum array geometries is
selected for which a real-world multi-frequency two-scale prototype is developed.
The corresponding system architecture of this prototype is highlighted in detail. The architecture combines
several previously presented design strategies, such as the utilization of heterogeneous transducer technologies,
waveguides for directivity broadening and equalization, two-scale spiral geometries in combination with the
transmission of multiple discrete frequencies. The electronics required consists of several interchangeable
modules, so that not only multiple dedicated mono-frequency transducers but also a single broad-band
high-voltage transducer, e.g. a novel ferroelectric ultrasonic transducer, can be applied.
After the introduction of the system, the prototype is experimentally investigated with respect to the
transmit and receive characteristics, focusing on the effects of the waveguides on the transducers as well as the
performance of the MEMS microphones. Finally, the system is characterized in pulse-echo mode, including
the evaluation of multi-frequency high-frame-rate imaging with respect to the angular resolution and image
quality in multi-reflector scenes.

6.3.1 Multi-frequency excitation and image compounding


The basic principle exploited in this section is based on the variation of the PSF of a fixed array geometry by
using different frequencies. Since the beam pattern of an array geometry depends on the element positions
relative to the wavelength, altering the utilized frequency results in a virtual scaling of these relative
positions. For example, a large array geometry at a low frequency can generate the same beam pattern as
a small array geometry at a high frequency. Even small frequency changes, given a fixed array geometry, can
therefore lead to an expansion or shrinking of the PSF, so that the positions of the side lobes are altered as well,
whereas the main lobe remains positioned at the center. By combining different PSFs of multiple frequencies,
also referred to as compounding, the individual main lobes can thus accumulate, while the respective side
lobes and minima recombine and balance each other. As a result, in the composite PSF, the general relative
side lobe level is reduced and their positions are more evenly distributed.
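A sketch of this compounding effect under the narrow-band point-element model: the per-frequency PSFs of one fixed geometry are normalized and averaged, so the main lobes add at the center while the side lobes of the individual frequencies fill each other's minima. The function name and grid size are illustrative assumptions.

```python
import numpy as np

def compound_psf(x, y, freqs, c=343.0, n=201):
    """Compound one-way PSF of a fixed geometry (x, y in meters) over
    several discrete frequencies; each single-frequency PSF is normalized
    to 1 at (u, v) = (0, 0) before averaging."""
    u = np.linspace(-1.0, 1.0, n)
    uu, vv = np.meshgrid(u, u)
    psf = np.zeros((n, n))
    for f in freqs:
        k = 2 * np.pi * f / c
        phase = np.outer(uu.ravel(), x) + np.outer(vv.ravel(), y)
        af = np.exp(1j * k * phase).sum(axis=1)       # array factor per (u, v)
        psf += np.abs(af).reshape(n, n) / len(x)      # normalized per frequency
    return psf / len(freqs)
```

Since the compound is the mean of individually normalized PSFs, its level outside the main lobe is bounded by the worst single-frequency level at every point, and in practice lies well below it wherever the side lobe positions of the frequencies differ.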
In order to transmit multiple ultrasound frequencies in air, two fundamentally different strategies are
possible. First, broad-band capacitive transducer technologies are available, such as the Senscomp 600
series [167], whose bandwidth is specified from 20 kHz to 100 kHz. This type of transducer requires a
high driving and bias voltage of 200 V each, resulting in a sound pressure level of 110 dB at a distance of 1 m
using the nominal frequency of 50 kHz. For a 3D imaging application, the main problem of this transducer
type is its relatively large aperture with a diameter of 40 mm. Therefore, this transducer is highly directive
featuring an opening angle of only 15◦ at its nominal center frequency of 50 kHz, considerably limiting the
field of view. In addition, if multiple frequencies are used, the directivity characteristics are altered due to the
common fixed aperture size. As a consequence, certain frequency components do not equally illuminate the
surroundings, so that the positive effects of image compounding are weakened.
The second strategy is based on multiple dedicated narrow-band PUTs with different resonant frequencies.
These transducers feature a compact design, enabling a wide radiation directivity. Despite the compactness,
due to their resonance operation, they provide high transmit intensities even at low voltages, which are
distributed over multiple narrow frequency bands. Furthermore, an additional high bias voltage is not
required, allowing an easy excitation with cost-effective electronic components. Since a separate oscillating
aperture is available for each frequency, the directivity characteristics can be matched by selecting suitable
aperture diameters or by additional waveguide attachments. This way, an equalized spatial illumination of
the frequency components can be ensured, so that the relative side lobe reduction can be fully exploited by
image compounding.
On the basis of these advantages, the multi-frequency PUT strategy is chosen. Due to the operation in
air, only low resonant frequencies are considered in the market research of available transducer types. The
three transducers selected are, first, the previously used Murata MA40S4S at 40 kHz, second, the ProWave
328ST160 at 32.8 kHz, and third, the CUI T8012-2600-TH at 25 kHz (Fig. 6.9). All three transducers are
based on the PUT bending plate principle. The package dimensions and the SPLs provided, given the same
driving voltage, are similar to each other, although the aperture diameters of the ProWave and CUI transducers
are identical (16 mm) and differ from the Murata type (10 mm). The effects of the aperture sizes on the directivity
characteristics are investigated experimentally in Section 6.3.3. In the frequency response measurement, the
−3 dB bandwidths of the transducers are 1.3 kHz, 1.4 kHz, and 1 kHz for the 25, 32.8, and 40 kHz transducers,
respectively. The true resonant frequencies differ up to a maximum of 500 Hz.

[Figure 6.9(a) datasheet summary:
  CUI T8012-2600-TH  – 25 kHz,   ⌀16 mm aperture, 115 dB (30 cm, 10 Vrms), max. 160 Vpp
  ProWave 328ST160   – 32.8 kHz, ⌀16 mm aperture, 115 dB (30 cm, 10 Vrms), max. 20 Vpp
  Murata MA40S4S     – 40 kHz,   ⌀10 mm aperture, 120 dB (30 cm, 10 Vrms), max. 20 Vpp
(b) measured SPL frequency response (dB) at 30 cm and 20 Vpp over 20 kHz to 45 kHz.]

Figure 6.9: Three available PUT transducers with different resonant frequencies in the low ultrasonic range,
including their datasheet information [168]–[170] (a) and their corresponding frequency response,
measured in an anechoic chamber (b). Since the transducers are based on the same piezoelectric
bending-plate principle, their provided SPLs and narrow bandwidths are in good agreement.

Next, the firing scheme used to excite the transducers is addressed. One option is to drive all transducers
simultaneously, which results in a superimposed multi-frequency pulse. However, since these pulses incoherently
interfere with each other, an inhomogeneous location-dependent sound field distribution is created while
the environment is illuminated. As a consequence, within hot-spot areas, objects can reflect considerably
stronger, while in other areas, the echoes are undetectable. Another disadvantage of simultaneous excitation
is that the envelope of the incoherently superimposed pulse creates audible frequency components and is
therefore not suitable for an operation in the presence of individuals.
For these reasons, a sequential firing scheme is used, in which the individual pulses are excited separately in
time (Fig. 6.10). The transducers are excited for 1 ms each, followed by a 1 ms pause, allowing the transducer
to settle and decay before the next transducer excitation. The excitation and pause times can be adjusted
variably. This way, the incoherent interference of the individual pulses is prevented, so that homogeneous
illumination is ensured.
After the pulse packet is transmitted and the echo signal is received, the individual pulses must be temporally
superimposed. Otherwise, a reflecting object appears at three different distances in the resulting image and
the advantages of the image compounding are lost. Therefore, a matched filter is used for signal processing,
consisting of the identical pulse sequence but inverted in time. By convolving the matched filter with the echo
signal received, the time delays between the pulses are thus reversed, so that the individual pulses overlap
and their maximum positions correspond to the object distance. As a result, the rather long pulse packet is
temporally compressed to the length of a single pulse. A further compression, as in the case of chirps, is not
achieved due to the narrow bandwidth. In addition to the time correction, the matched filter suppresses noise
and interfering frequencies due to the bandpass characteristics at the corresponding transmit frequencies.
Furthermore, the matched filter can be used to split the spectral components of the received signals, so that
the individual signal frequencies can be processed separately, which is described in detail hereafter.
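The pulse compression can be sketched as follows; the sample rate and the rectangular burst envelope are illustrative assumptions. Convolving the packet with its time reverse aligns the three tones, so the compressed peak marks the round-trip delay.

```python
import numpy as np

def pulse_packet(freqs=(25e3, 32.8e3, 40e3), t_pulse=1e-3, t_pause=1e-3, fs=500_000):
    """Sequential multi-frequency packet: tone bursts separated by pauses."""
    n_burst, n_gap = int(t_pulse * fs), int(t_pause * fs)
    t = np.arange(n_burst) / fs
    parts = []
    for f in freqs:
        parts.append(np.sin(2 * np.pi * f * t))  # one narrow-band burst
        parts.append(np.zeros(n_gap))            # settling and decay pause
    return np.concatenate(parts)

s = pulse_packet()
h = s[::-1]                       # matched filter: time-reversed packet
y = np.convolve(s, h)             # compression of an ideal, noise-free echo
peak = int(np.argmax(np.abs(y)))  # zero-lag peak encodes the echo delay
```

For a real echo delayed by n samples, the compressed peak shifts to index len(s) − 1 + n, so the object distance follows directly from the peak position.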

[Figure 6.10 panels: (a) received echo signal containing the 25, 32.8 and 40 kHz pulses, (b) matched filter, (c) compressed output; amplitude over time (0 ms to 20 ms).]

Figure 6.10: The echo signal received (a) is convolved with a matched filter (b), consisting of the temporally
inverted pulse sequence. This way, the pulse packet is compressed to the length of a single
pulse by shifting the individual pulses accordingly, such that they superimpose in time (c). In
addition, interfering frequencies and noise are filtered.

Next, the implementation of the multi-frequency signal processing and image compounding is described,
which enables parallelized, efficient execution and does not require a general broad-band beamforming
technique, such as time-domain delay-and-sum. First, the real-valued signals of all elements are transformed
from the time domain to the frequency domain using real-to-complex FFT. This way, only the non-redundant
half of the spectrum corresponding to the analytical signal spectrum is generated, which is exploited in the
subsequent envelope generation. Each transformed frequency spectrum is multiplied by the matched filter in
the frequency domain. The individual frequency bands are then processed separately (Fig. 6.11).

[Figure 6.11 block diagram: real-to-complex FFT → matched filter → per band (25, 32.8, 40 kHz): narrowband CBF → complex-to-complex IFFT → envelope → compound image.]

Figure 6.11: The signal processing and image formation is primarily performed in the frequency domain,
where all microphone signal spectra are split into three frequency bands after matched filtering.
Each frequency band is spatially filtered using conventional narrow-band beamforming with the
respective resonant frequency. After envelope formation in the time-domain, three volumetric
images are obtained, which are then compounded.

Due to the narrow bandwidth of the individual signal components, each frequency band can be processed
by narrow-band conventional beamforming using the respective resonant frequency, which is efficiently
implemented by parallel-tiled matrix multiplication. The resulting spatially filtered frequency spectra are
transformed back into the time domain for each frequency band using complex-to-complex IFFT, and the
envelope is generated by forming the absolute values. After this step, three volumetric images obtained by
different frequencies are available, which, in the last step, are superimposed to form a compound image.
Overall, the processing architecture can be easily extended with additional frequency bands.
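The pipeline described above can be sketched as follows. The element positions, sampling rate and band limits are placeholder assumptions, and the narrow-band steering is reduced to a single azimuth cut for brevity:

```python
import numpy as np

c, fs = 343.0, 500_000.0     # speed of sound and sample rate (hypothetical)
M, N = 64, 4096              # number of elements and samples per channel
bands = {25_000: 2_000, 32_800: 2_000, 40_000: 2_000}  # center -> half bandwidth (Hz), assumed

rng = np.random.default_rng(1)
pos = rng.uniform(-0.095, 0.095, (M, 2))   # element x/y positions (m), placeholder geometry

def steering(az_deg, f0):
    """Narrow-band steering matrix (directions x elements) for an azimuth cut."""
    az = np.deg2rad(np.asarray(az_deg, dtype=float))
    tau = np.outer(np.sin(az), pos[:, 0]) / c          # far-field delays (D, M)
    return np.exp(2j * np.pi * f0 * tau)

def image(signals, matched_filter, az_deg):
    X = np.fft.rfft(signals, axis=1)                   # real-to-complex FFT per element
    X = X * np.fft.rfft(matched_filter, n=N)           # matched filtering in frequency domain
    f = np.fft.rfftfreq(N, 1.0 / fs)
    out = 0.0
    for f0, hb in bands.items():
        Xb = np.where(np.abs(f - f0) <= hb, X, 0.0)    # select the band around f0
        Yb = steering(az_deg, f0) @ Xb                 # narrow-band CBF as matrix product
        spec = np.zeros((Yb.shape[0], N), complex)
        spec[:, :Yb.shape[1]] = 2.0 * Yb               # one-sided spectrum -> analytic signal
        yb = np.fft.ifft(spec, axis=1)                 # complex-to-complex IFFT per direction
        out = out + np.abs(yb)                         # envelope; compounding by summation
    return out                                         # (directions, time) image
```

Zero-filling the negative frequencies before the complex IFFT yields the analytic signal, so the envelope follows directly from the absolute values, as described above.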
In the following, the multi-frequency excitation and signal processing are utilized to evaluate the optimum two-scale array geometries by a parameter search. Afterwards, an optimum geometry for the prototype is selected.
For the parameter search, the total aperture diameter is swept within a reasonable and manufacturable range
from Dap = 5 cm to 25 cm, whereas the inner diameter is swept from Din = 1 cm to 24 cm. The total
number of elements is M = 64 and the number of inner elements is varied between Min = 1 and 63. For each
array geometry, the multi-frequency PSF is evaluated in terms of MSLL and MLW. The optimum geometries
with the lowest MSLL per MLW are depicted in Fig. 6.12.
Compared to the single-frequency two-scale array geometries, the multi-frequency excitation and image
compounding reduces the MSLL by an additional 2.5 dB on average. For example, for an MLW of 5◦ , an
MSLL of −16.3 dB is achieved with the multi-frequency two-scale strategy, −13.9 dB with the single-frequency
two-scale approach, and −8.8 dB with the classic sunflower method. The prototype created is based on this specific multi-frequency two-scale array geometry, which consists of a total aperture diameter of 19 cm, an inner aperture diameter of 5.5 cm, and inner and outer element counts of Min = 40 and Mout = 24.
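The parameter search can be outlined as follows. The sunflower-based element placement and the single-frequency, single-cut MSLL metric used here are simplified stand-ins for the actual multi-frequency PSF evaluation described above:

```python
import numpy as np

def sunflower(n, r_in, r_out):
    """Fermat-spiral (sunflower) layout of n elements filling the annulus [r_in, r_out]."""
    k = np.arange(1, n + 1)
    r = np.sqrt(r_in**2 + (r_out**2 - r_in**2) * k / n)   # equal-area radial spacing
    phi = k * np.pi * (3.0 - np.sqrt(5.0))                # golden angle increments
    return np.stack([r * np.cos(phi), r * np.sin(phi)], axis=1)

def two_scale(M, M_in, D_ap, D_in):
    """Two-scale geometry: M_in inner elements within D_in, the rest out to D_ap."""
    inner = sunflower(M_in, 0.0, D_in / 2)
    outer = sunflower(M - M_in, D_in / 2, D_ap / 2)
    return np.vstack([inner, outer])

def msll_db(pos, f0, c=343.0):
    """Single-frequency azimuth-cut MSLL (dB) as a simplified PSF quality proxy."""
    az = np.deg2rad(np.arange(-90.0, 90.25, 0.25))
    bp = np.abs(np.exp(2j * np.pi * f0 / c * np.outer(np.sin(az), pos[:, 0])).sum(axis=1))
    bp_db = 20.0 * np.log10(bp / bp.max() + 1e-12)
    i0 = int(np.argmax(bp_db))                 # main lobe (broadside)
    l = r = i0                                 # walk outward to the first nulls
    while r + 1 < len(bp_db) and bp_db[r + 1] < bp_db[r]:
        r += 1
    while l - 1 >= 0 and bp_db[l - 1] < bp_db[l]:
        l -= 1
    return float(np.concatenate([bp_db[:l], bp_db[r + 1:]]).max())

# Coarse sweep over the geometry parameters (illustrative grid, single frequency).
best = min(
    (msll_db(two_scale(64, m_in, d_ap, d_in), 40e3), d_ap, d_in, m_in)
    for d_ap in (0.15, 0.19, 0.25)
    for d_in in (0.04, 0.055, 0.08)
    for m_in in (24, 40, 48)
)
```

The actual search additionally evaluates the MLW and the multi-frequency compound PSF; the sketch only conveys the structure of the sweep.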

[(a) Selected two-scale array layout (x, y in m); (b) MLW (°) vs. MSLL (dB) for the Classic, Blackman, SAR, Two-scale and Two-scale MF geometries]

Figure 6.12: The two-scale array selected for the prototype (a) is based on the quality metrics comparison
of the optimum geometries, found with a parameter search (b). The results of the two-scale
approach using single and multi-frequency excitation are compared.

6.3.2 Two-scale multi-frequency electronics and system design


Based on the selected optimum two-scale geometry, the prototype system utilizing multi-frequency excitation
is created (Fig. 6.13). The receiving array contains 64 digital MEMS microphones of the type Knowles
SPH0641LU4H-1, which are all mounted on a common PCB by reflow soldering. The microphones have an
acoustic port diameter of 0.325 mm, which is located on the bottom side of the package, where the latter has
the dimensions of 3.5 × 2.6 × 1 mm. Due to the bottom port, vias through the PCB are required, which have
a larger diameter of 0.6 mm to compensate for positioning tolerances during reflow soldering. The microphones
feature a high bandwidth of 10 Hz to 80 kHz and a particularly uniform directivity, which is examined in more detail in the experimental section 6.3.3. Furthermore, they provide integrated analog-to-digital conversion
and output a digital 1-bit 4 MHz pulse-density-modulated signal. Two microphones can share a common data
line by sending out data interleaved on the rising and falling clock edges, respectively. This way, only a total
of 32 input pins are required when receiving the data.
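The deinterleaving of one shared data line can be sketched as follows (the assignment of even and odd sample indices to the rising- and falling-edge microphone is an assumed capture order):

```python
import numpy as np

def deinterleave(line_samples):
    """Split one shared PDM data line into its two microphone bit streams.

    The line carries alternating bits captured on rising and falling clock
    edges; here even indices are assigned to the first microphone and odd
    indices to the second (an assumed capture order).
    """
    x = np.asarray(line_samples)
    return x[0::2], x[1::2]
```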
The PCB itself has a thickness of 1.6 mm and a diameter of 24 cm, whereas the actively used area is within
the diameter of 19 cm. The outer passive part enables an easy integration into the measurement setup and
further functions as a rigid baffle.
The three transducers are attached flush to the PCB with a 3D-printed mounting for which a cutout with a
diameter of 4 cm is provided. No elements of the selected array geometry are positioned within the cutout area,
so no microphones must be removed or repositioned. Two versions of the 3D-printed transducer mounting are
designed, one for direct outcoupling, where the transducers are aligned flush with the printed circuit board,
and one, which incorporates additional individual waveguides for each frequency into which the transducers
are inserted. The waveguides are used to equalize and broaden the directivity characteristics of the individual
transducers. Therefore, depending on the frequency, the individual channels are tapered to different-sized
output ports. The exact dimensioning and the directivity characteristics of the waveguides in comparison to
the direct coupling will be described in the next experimental section 6.3.3.

[Array front view: 190 mm active aperture, 55 mm inner aperture, 0.6 mm vias; CUI 25 kHz, ProWave 32.8 kHz and Murata 40 kHz transducers; waveguide and direct mounting versions]

Figure 6.13: The two-scale multi-frequency system design consists of the three PUTs and 64 bottom-port
MEMS microphones soldered on the back of the PCB. The transducers are attached using a
3D-printed mounting, for which there are two versions, either for direct coupling or for coupling
into a waveguide structure. The latter includes three tapered ducts with different sized output
ports for equalizing and broadening of the radiation directivity.

On the back of the microphone PCB, the remaining system components are mounted on modular inter-
changeable PCBs via plug-in connectors (Fig. 6.14). The first module contains the FPGA (Intel MAX10), which
is used to acquire the digital microphone signals and convert them from 1-bit 4 MHz PDM to 16-bit 125 kHz
PCM signals using 64 SINC-3 CIC filters, resulting in a total data rate of 128 Mbit/s during the acquisition
window. The data obtained is not buffered, but streamed directly to a processing computer during signal

104
acquisition via high-speed USB at 480 Mbit/s transfer rate. For this, an FTDI FT232H USB communications
IC is used, which is connected to the FPGA via an 8-bit parallel bus. In addition to the microphone signal
transmissions, the bidirectional USB interface is used to transmit start and setting commands from the control
computer to the FPGA, such as the transducer excitation and pause times, as well as the utilized transmit
frequencies and their pulse sequence. Based on these settings, the FPGA generates three separate digital
square-wave burst signals with different frequencies. However, these burst signals must be amplified to drive
the transducers, which is the purpose of the next two modules (2,3).
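An offline model illustrates the PDM-to-PCM conversion performed by the SINC-3 CIC filters: a third-order CIC decimator with ratio R is equivalent to three cascaded boxcar filters of length R followed by downsampling (the 16-bit scaling below is a modeling choice, not the exact FPGA implementation):

```python
import numpy as np

R, ORDER = 32, 3        # decimation ratio (4 MHz -> 125 kHz) and CIC order (SINC-3)

def cic_decimate(pdm_bits):
    """Offline model of a SINC-3 CIC decimator: 1-bit PDM in, 16-bit PCM out."""
    x = 2.0 * np.asarray(pdm_bits) - 1.0          # map bits {0,1} to {-1,+1}
    h = np.ones(R)
    for _ in range(ORDER - 1):                    # the CIC impulse response equals
        h = np.convolve(h, np.ones(R))            # ORDER cascaded boxcars of length R
    y = np.convolve(x, h)[::R] / R**ORDER         # filter, downsample, remove gain R^ORDER
    return np.clip(np.round(y * 32767), -32768, 32767).astype(np.int16)
```

The hardware version uses integrators and combs with modular arithmetic instead of an explicit FIR, but the transfer function is identical.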
Module two is used to generate a higher DC voltage level by using a boost converter circuit, whose main
components include an LT3757 converter controller IC, a 22 µH inductor (Coilcraft SER1512-223MED) and
a MOSFET (ON semi FDMC86184). The 5 V supply voltage, which is provided via USB, is thus converted
to 35 V. Module three includes three separate H-bridges ICs (Texas Instruments DRV8870), each connected
to one of the transducers, the 35 V supply voltage and one of the three digital control signals, as well as its
inverted signal. This way, a square-wave voltage signal between +35 V and -35 V is generated at the output of
the H-bridge, so that the transducers are each excited with 70 Vpp .
The 70 Vpp excitation voltage is higher than the maximum voltage for continuous excitation of 20 Vpp
specified in the transducer data sheets. Nevertheless, this voltage is approved for burst operation, which
has been investigated experimentally in a stress-test measurement series. Here, all transducers performed reliably up to 120 Vpp using 200-cycle bursts with a 10 ms pause each. By using 70 Vpp , a safety margin to this
limit is maintained. The transmit sound pressure is increased by 8 dB for each transducer compared to the
20 Vpp excitation. As a result, higher ranges can be achieved and the losses due to the waveguide can be
compensated, which will be investigated in the experiments section 6.3.3.

[Modules: (1) FPGA (Intel MAX10) and USB interface (FTDI FT232H, 480 Mbit/s); (2) step-up converter (LT3757, 5 to 35 V) or flyback converter (LT3757, 1:10, 5 to 350 V); (3) 3x H-bridges (TI DRV8870) or HV H-bridge (STM PWD13F60); (4) multi-frequency transducers or broadband HV transducer (waveguided); (5) 64x Knowles SPH0641LU4H-1 microphones]

Figure 6.14: The remaining system components and electronic modules are attached on the PCB backside
with plug-in connectors. The modules are interchangeable to drive either three transducers at
70 Vpp or one high-voltage transducer, e.g. based on ferroelectrets [171], at 700 Vpp .

Alternative options are also available for modules two and three, which are designed to be used with
broad-band high-voltage transducers for multi-frequency transmission, such as novel ferroelectret transducers.
The alternative module two is based on the identical converter controller and MOSFET IC featuring a similar
overall layout, but uses a flyback topology based on a 1:10 transformer (Coilcraft DA2034-ALD). This circuit
converts the 5 V supply voltage to 350 V. The alternative module three consists primarily of a single H-bridge
IC (STM PWD13F60), which is capable of switching voltages up to 600 V. The outputs of the H-bridge are
switched between ±350 V, so that the broad-band transducer is excited by 700 Vpp . If the transducer power
requirement is higher than the available USB interface power of 2.5 W, an additional external 5 V source can
be connected via the DC barrel jack. The three PUT transducers have an RMS power consumption at 70 Vpp
of 0.4, 0.37, and 0.3 W for the ProWave 32.8 kHz, CUI 25 kHz, and Murata 40 kHz transducer, respectively,
which is sufficiently covered by the USB source.

6.3.3 Experiments, results and discussion


In this section, the two-scale multi-frequency array system developed is evaluated experimentally. First, the
transmit and receive characteristics are examined independently and then investigated in the combined
pulse-echo mode. In the transmit characterization, the focus is on the impact of the waveguides on the
directivity, intensity losses and pulse shape alterations for the individual PUT transducers with different
transmit frequencies. In the receive mode, apart from the basic attributes of the MEMS microphones, such
as directivity and frequency response, their characteristics in an array configuration are also analyzed. This
includes the evaluation of relative amplitude and phase errors, which are expected to be reduced due to their
broad-band response compared to PUT receivers. Furthermore, the beam pattern is measured and analyzed
for the different utilized frequencies separately and in combination. Similarly, for the pulse-echo mode, the
directivity, i.e. the field of view, as well as the beam pattern, i.e. the PSF, are measured and compared to the
numerical simulation. In addition, the one-way imaging characteristics are experimentally assessed, such as
the achievable angular resolution using a two-reflector setup. Furthermore, the image quality of a multi-target
scene is analyzed and compared with the previous array prototype geometry, i.e. the classic sunflower array.

Transmit experiments
First, the effects of the individual waveguides on the directivity, intensity loss and on the pulse shapes of the
individual transducers are analyzed using the frequencies 25 kHz, 32.8 kHz and 40 kHz. The measurement
setup used for this purpose consists of a calibrated microphone (B&K Type 4138) mounted on a movable slide
along a linear axis, as well as a fixture attached on two rotational axes (Fig. 6.15). The fixture includes a
rigid baffle into which various 3D-printed transducer adapters can be embedded. A total of six transducer
adapters are utilized, i.e. two versions for each of the three transducers, one with direct out-coupling and
one with an individual waveguide structure. This way, the respective transducer under test can be centered
on the receiving microphone. By using the 2-DOF rotation axes and the linear axis, the microphone can be
positioned three-dimensionally in the coordinate system of the transducer.
For the directivity characterization, the microphone is positioned at a distance of 1 m, i.e. in the far field,
and its direction is sequentially varied along the horizontal axis between −90◦ and 90◦ in 1◦ steps. At each
direction step, the transducer sequentially transmits 50 pulses, each with a temporal pulse length of 200
cycles. All transducers are driven with a unipolar square-wave signal of 20 Vpp . The sound pressure level is
determined in the steady state, i.e. after the transient of the pulse, and the mean and standard deviation over the 50 pulses are formed. This procedure is repeated for all transducers with and without waveguide attached. The
directivity patterns show the sound pressure level normalized to the respective maximum value over the
corresponding set of directions.

[Transducers and waveguide output ports: CUI T8012-2600TH (25 kHz, Ø16 mm aperture, Ø5.4 mm port), ProWave 328ST160 (32.8 kHz, Ø16 mm aperture, Ø4.1 mm port), Murata MA40S4S (40 kHz, Ø10 mm aperture, Ø3.4 mm port)]

Figure 6.15: Measurement setup in the anechoic chamber including a calibrated microphone and the trans-
ducer mounting (with or without waveguide) embedded in a rigid baffle fixed on two rotational
axes.

First, the directivity patterns without waveguides attached are considered (Fig. 6.16). As expected, the
individual transducer patterns vary due to the different frequencies and aperture diameters. Despite the small
aperture diameter of 10 mm, the Murata transducer does not provide the widest directivity, since it operates at the highest frequency of 40 kHz. Instead, the CUI 25 kHz transducer has a slightly wider but still similar directivity, as it uses a lower frequency but also has a larger aperture diameter of 16 mm. The ratio between the wavelength and nominal aperture diameter for both transducers is identical with λ/Dap = 0.86, which explains the similar behavior. The remaining differences are explained by the influences of the horn shape and
the housing, which both alter the effective aperture diameter. The narrowest directivity is measured for the
ProWave 32.8 kHz transducer, which also features an aperture diameter of 16 mm, corresponding to a λ/Dap
ratio of only 0.65. Thus, the relative amplitude in peripheral directions is approximately 5 to 10 dB weaker
compared to the directivity of the other transducers.
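These ratios follow directly from the nominal values (assuming c = 343 m/s for the speed of sound in air):

```python
c = 343.0   # speed of sound in air (m/s), assumed
transducers = {           # nominal frequency (Hz) and aperture diameter (m)
    "CUI 25 kHz": (25_000, 0.016),
    "ProWave 32.8 kHz": (32_800, 0.016),
    "Murata 40 kHz": (40_000, 0.010),
}
ratios = {name: (c / f) / d for name, (f, d) in transducers.items()}
# roughly 0.86, 0.65 and 0.86, matching the quoted values
```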

[Magnitude (dB) vs. direction (°) for the CUI 25 kHz, ProWave 32.8 kHz and Murata 40 kHz transducers, panels (a) to (c)]

Figure 6.16: Directivity of the single PUTs without waveguide (a) and with waveguide attached (b),(c). The
tapered ducts and different-sized output ports equalize and broaden the directivity of the single
transducers.

As a consequence, the compounding of the single-frequency images becomes unbalanced, particularly for reflectors located in the peripheral region of the ROI, so that the desired improvement of contrast is impaired.
Therefore, the transducers are next measured with waveguides attached, which are designed to equalize
and broaden the directivities despite different aperture sizes and frequencies to allow a wider field of view.
The waveguides taper the input aperture diameters to a smaller effective output aperture depending on the
respective frequency, so that the ratio λ/Dap is identical for all transducers. The common ratio (λ/Dap = 2.52)
is selected, so that the smallest waveguide output diameter is a minimum of 3.4 mm, which is manufacturable
using 3D printing and has been extensively studied with the 8 × 8 waveguide (Chapter 3). The directivity
measurement results confirm that the equalization of the directivities of the individual transducers is feasible.
The difference between the relative amplitudes is at most 2 dB for all directions. Furthermore, the directivity
is broadened by the waveguides, so that the attenuation is above −3 dB on average up to approximately ±85◦
compared to the direct coupling approach, where the directivity patterns fall below −3 dB at ±30◦ and ±45◦ ,
respectively.
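The output port diameters follow directly from the common ratio (assuming c = 343 m/s):

```python
c, ratio = 343.0, 2.52    # speed of sound (m/s, assumed) and common lambda/D_ap ratio
d_out = {f: (c / f) / ratio for f in (25_000, 32_800, 40_000)}
# approximately 5.4 mm, 4.1 mm and 3.4 mm, in line with the port diameters above
```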
However, the broadened directivity and equalization between the transducers come at the expense of a reduced intensity of the transmitted pulse. In order to analyze the SPL loss, the absolute directivity patterns
of the individual transducers with and without waveguide attached are considered (Fig. 6.17). The SPL in
peripheral directions is barely affected for all transducers, whereas the SPL in central directions is severely attenuated, resulting in the relative broadening of the directivity. There are two reasons for the central attenuation. First, there is an expected diffraction loss: a constant acoustic power is spread over a larger illumination surface, which raises the level in peripheral directions and attenuates the central ones. Second, there are internal losses within the waveguide duct,
e.g. due to friction and back reflections caused by the impedance mismatch at the output ports, leading to a
general attenuation, so that the SPL increase of the peripheral directions is not directly evident. The internal
waveguide losses, excluding the expected diffraction losses, are −5.8 dB (−50%) for CUI 25 kHz, −8.6 dB
(−62%) for ProWave 32.8 kHz, −4.6 dB (−42%) for Murata 40 kHz. In order to reduce the internal losses,
generally larger output port and a consequently narrower directivity must be considered if the application
requirements are not impaired. Furthermore, an additional optimization of the waveguide shape for SPL
improvements can be conducted, which is beyond the scope of this work. Therefore, despite the SPL loss,
the waveguides are utilized due to their positive impact on the imaging quality in terms of field-of-view and
contrast by equalizing and broadening of the directivities. The resulting losses are compensated by a higher
driving voltage.

[SPL (dB) vs. direction (°) at 25 kHz (a), 32.8 kHz (b) and 40 kHz (c), each with and without waveguide]

Figure 6.17: Comparison of the SPL directivity of the transducers with and without waveguide attached. The
waveguides cause an SPL loss due to the broadening and the resulting diffraction loss, but also due to friction and internal reflections caused by the impedance mismatch at the output port.

In addition to the losses due to the waveguide and the broadening of the directivity, the change in the
temporal shape of the respective pulses is considered, which are normalized to the amplitude after the transient
overshoot, i.e. in the steady state (Fig. 6.18). For all three pulse shapes, the respective waveguides result in
an increased overshoot and a minor ringing extension. A likely explanation for these effects is the internal
reflections due to the impedance mismatch between the duct and free field. However, the overall rise and fall
times are approximately identical, leading to the conclusion that the general bandwidth is not significantly
affected.
In summary, the equalization and broadening of directivities by inserting the transducers into individual
waveguides has been successful, but also involves drawbacks in terms of transmit intensity losses and a minor
increase of ringing. Therefore, the choice of using waveguides has to be carefully evaluated and compared
to utilizing or creating already well-matched transducer geometries in terms of frequencies and aperture
sizes. The remainder of this chapter continues to use the waveguide approach for the pulse-echo and imaging
measurements.
[Normalized amplitude vs. time (ms) at 25 kHz (a), 32.8 kHz (b) and 40 kHz (c), each with and without waveguide]

Figure 6.18: Comparison of pulse shape alterations of the transducers with and without waveguide attached.
The waveguides cause a minor increase in overshoot and ringing due to internal reflections.

Receive experiments
Next, the receive mode is considered, specifically the basic characteristics of the MEMS microphones, i.e. the
directivity and frequency response. Afterwards, the characteristics in an array configuration are measured,
including the relative amplitudes and phase errors, as well as the beam patterns for the different frequencies
and their combination.
The experimental setup is similar to the previous one for the transmit characterization with the following
differences. This time, the transmit transducers and mountings are positioned on the slide and the movable
rail. The waveguided mountings are not used for the transmitters in these experiments. The microphone
array PCB is attached to the fixture on the two rotational axes, which also functions as a rigid baffle.
In the following measurements of the directivity of the individual microphones over the different frequencies,
the slide and transducer are positioned at a distance of 2 m to be in the far field of the array. The transducer
sequentially transmits 50 pulses (200 cycles, unipolar square-wave 20 Vpp ), which are received by the
microphones. This process is repeated for each orientation of the microphone PCB, altered in 1◦ increments
between −90◦ and 90◦ in the horizontal plane. The received level of the pulse envelope after the transient
overshoot in steady-state is averaged over the 50 pulses for each orientation and the standard deviation
is determined. Subsequently, the average per orientation is formed over all microphones. The series of measurements is repeated for all three transducer types and their respective resonant frequencies. The pulse
levels received are normalized to the respective maximum value per frequency and presented in the directivity
pattern (Fig. 6.19).

[(a) Setup; (b),(c) normalized receive level vs. direction for 25 kHz, 32.8 kHz, 40 kHz and combined]

Figure 6.19: Measurement setup for the receive mode characteristics. The microphone PCB is attached to the two rotational axes and receives pulses of different frequencies from one of the three transducers (a). The directivity of the individual microphones is approximately omnidirectional over all frequencies due to the relatively small port opening of 0.325 mm (b),(c).

As expected, the directivity of the microphones is particularly broad and almost uniform due to the
relatively small aperture opening with a diameter of only 0.6 mm, which corresponds to Dap = 0.044 λ at 25 kHz, 0.057 λ at 32.8 kHz, and 0.07 λ at 40 kHz in relation to the respective wavelength. Thus, the directivity is above −2 dB for all
frequencies up to approximately ±85◦ , which is followed by a stronger attenuation down to −6.2 dB at ±90◦ .
Particularly prominent and unexpected are the increases of +1 dB at approximately ±75◦ , which occur at all
frequencies and are followed by a steeper attenuation. The cause for this effect is not exactly clear. However,
a likely explanation for this characteristic is the bottom-port design of the microphone and the therefore mandatory via through the PCB with a thickness of 1.6 mm, which itself acts as a short waveguide. Nevertheless,
the differences over the directions are marginal and also negligible over the three frequencies with an average
amplitude variation of only 0.5 dB. In summary, this MEMS microphone type provides an already-equalized
and almost omnidirectional acquisition of the three ultrasound frequencies, enabling a particularly wide field of
view and effective contrast enhancement through image compounding.
Before examining the relative amplitude and phase errors in the array configuration, first, the frequency
response of the MEMS microphone type is investigated (Fig. 6.20). The frequency response is particularly
flat in the audible range and then transitions to a low-Q resonance peak at 25 kHz, followed by a higher
flat plateau in the ultrasonic range starting at 40 kHz. The three frequencies utilized are thus located at the
resonance peak (25 kHz), at its falling slope (32.8 kHz)and in the flat region (40 kHz), respectively. As a
result, the sensitivity between the different frequencies differs by up to 6 dB. However, this variation can be
easily adjusted by suitable a equalization, e.g. by accordingly scaling the matched filter components. Due to
the low-Q resonance and the generally flat response in the ultrasonic range compared to the narrow-band
frequency response of the PUT transducers, manufacturing tolerances are expected to have a weaker impact
on the relative differences in sensitivity and phase responses between different microphones. These relative
amplitudes (sensitivity) and phase errors are examined hereafter.
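Such an equalization can be sketched by scaling the per-band matched filters; the offset values below are hypothetical placeholders, not calibrated measurements:

```python
import numpy as np

# Per-band sensitivity offsets of the microphones relative to the flat region (dB);
# hypothetical placeholder values, not calibrated measurements.
offsets_db = {25_000: 6.0, 32_800: 3.0, 40_000: 0.0}

def equalized_filters(filters):
    """Scale each band's matched filter so that the band sensitivities are equalized."""
    return {f: h * 10 ** (-offsets_db[f] / 20) for f, h in filters.items()}
```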

[Sensitivity (dB re 1 kHz) vs. frequency, 100 Hz to 80 kHz]

Figure 6.20: Broad-band frequency response of the MEMS microphone sensitivity as given in the corre-
sponding datasheet [172]. The 25 kHz excitation frequency is located within a low-Q resonance,
resulting in a higher relative signal amplitude compared to the other frequencies. Relative
amplitude differences are compensated by signal processing.

[(a)-(c) Relative amplitude error vs. microphone position at 25 kHz, 32.8 kHz and 40 kHz; (d) comparison over microphone index]

Figure 6.21: Analysis of relative receive amplitude errors between the different microphone elements of the
array for different frequencies. The relative amplitude errors for the 25 kHz excitation are higher
than for the other frequencies. Due to the resonance peak at 25 kHz, manufacturing tolerances
have a higher impact on the amplitude deviation. A systematic error in dependence of the
microphone position does not occur.

For the measurement of relative amplitude and phase errors, the setup and procedure previously described are used. However, the microphone array is not rotated, but points directly at the transmitter. The amplitude
in steady-state after overshoot is averaged over 50 pulses for each microphone and the standard deviation
is formed. The relative phase of the pulse signals within the expectation time window is determined in the
frequency domain at the corresponding transmit frequencies and is also averaged over the 50 pulses, and the
standard deviation is determined. Due to the individual microphone positions in the array configuration, there
are small relative path differences to the transmit transducer. These path differences do not significantly affect
the relative amplitudes, but must be compensated for the relative phase analysis. After the path difference
compensation, ideally, all relative amplitudes and phases are identical across the microphones of the array.
Remaining differences are due to the manufacturing or mounting tolerances of the MEMS microphones
themselves.
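The path-difference compensation can be sketched as follows; the element positions and tolerance errors are synthetic placeholders:

```python
import numpy as np

c, f0 = 343.0, 40_000.0
rng = np.random.default_rng(2)
pos = rng.uniform(-0.095, 0.095, (64, 2))     # microphone positions (m), placeholder
src = np.array([0.0, 0.0, 2.0])               # transmitter 2 m in front of the array

# Expected geometric phase from each microphone's distance to the transmitter.
r = np.linalg.norm(np.c_[pos, np.zeros(64)] - src, axis=1)
expected = 2.0 * np.pi * f0 * r / c

# Simulated measured phases: geometric part plus per-element tolerance errors.
tol = np.deg2rad(rng.uniform(-25.0, 25.0, 64))
measured = expected + tol

# Compensation: subtract the expected phase and wrap the result to (-pi, pi].
residual = np.angle(np.exp(1j * (measured - expected)))
```

After compensation, only the tolerance-induced phase errors remain, which is what the analysis below evaluates.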
In the following, the amplitude errors over the different frequencies are considered, first, in dependence
of the microphone position, in order to exclude possible systematic relative amplitude errors, e.g. by the
above-mentioned path difference, and, second, over the microphone index for a clearer quantitative compari-
son (Fig. 6.21).

[(a)-(c) Compensated relative phase error vs. microphone position at 25 kHz, 32.8 kHz and 40 kHz; (d) comparison over microphone index]
Figure 6.22: Analysis of the compensated relative phase errors between the different microphone elements
of the array for different frequencies. The relative phases are shifted in order to compensate the
small path differences between the transmitting transducer and the individual microphones.
After the compensation, the systematic error dependent on the microphone position is removed.
The remaining phase errors are due to manufacturing and soldering tolerances.

As expected, a systematic error depending on the microphone positions does not arise due to the minor path
differences to the transducer [Fig. 6.21(a),(b),(c)]. The relative amplitude errors for the frequencies 32.8 kHz
and 40 kHz are within the range of ±20%. Only for 25 kHz do larger deviations of up to +70% arise occasionally, although they predominantly remain within ±25%. The reason for these larger errors is
the resonance peak at 25 kHz, which increases the impact of manufacturing tolerances. Other amplitude
error sources are, e.g., the manufacturing tolerances of the PCB vias or positioning tolerances during reflow
soldering. Compared to the PUTs with an amplitude variation of ±50%, the MEMS microphones feature a
smaller deviation on average.
Next, the relative phase errors for different frequencies over the individual microphones are consid-
ered (Fig. 6.22). After the phase compensation due to the path differences to the transducer, a systematic
error depending on the microphone position does not arise here either, whereas such an error is clearly evident without active compensation. In the latter case, the microphones indicate an increasingly negative phase shift with increasing
distance from the array center. The compensated relative phases are in the range ±25◦ for all frequencies,
which is an improvement over the PUTs, which have an expected phase deviation of up to ±60◦ . Thus, the
MEMS microphone characteristics are out-of-the-box relatively similar to each other and an error calibration
has a less significant impact on the beamforming capability, but can still improve it.

[(a)-(c) Receive beam patterns at 25 kHz, 32.8 kHz and 40 kHz; (d) compound pattern; (e) side views]
Figure 6.23: Measured receive mode directivity patterns for three separate frequencies (a),(b),(c) and the
resulting compound pattern (d), as well as the corresponding side views, showing the highest
side and main lobe level along a specific azimuth direction (e). The compound pattern features
more evenly distributed side lobes and a generally lower MSLL.

In the following, the beam patterns for the different frequencies are examined separately and afterwards in
combination. For this, the identical measurement setup as described above is used. The transducer sends a
pulse (200 cycles, 20 Vpp) to the microphones, whose signals are processed according to the procedure in Section 6.3.1 in order to obtain the beam pattern with a single transmitting source at the center. This process
is repeated for all three frequencies. Afterward, the compound beam pattern is formed and analyzed.
All beam patterns measured feature the intended two-scale profile consisting of a main lobe with a narrow
peak and a wider base, as well as a balanced distribution of side lobes without particularly high or low
levels (Fig. 6.23). As the frequency increases, the beam pattern shrinks radially to the center, resulting in
slight variations in the positions of the side lobes. However, the main lobe positions are fixed and only vary in
terms of width. In the combined beam pattern measured, the side lobes are thus evenly distributed, which
resembles a smearing effect. Thus, in combination with the main lobe accumulation, the MSLL reduces to
−15.1 dB and an MLW of 5.1◦ is formed. Both results are in good agreement with the simulation with an MSLL
of −16.3 dB and an MLW of 5◦ . As expected, the measured MLW of the combined pattern is narrower than
the 25 kHz MLW, wider than the 40 kHz MLW, and approximately matches the 32.8 kHz MLW. Overall, the
beam pattern compounding significantly reduces the MSLL compared to the respective single-frequency beam
patterns, i.e. from −11.1 dB at 25 kHz, −9 dB at 32.8 kHz and −9.7 dB at 40 kHz to the compound MSLL of
−15.1 dB.
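The compounding principle can be sketched as follows: the normalized single-frequency patterns are averaged in the linear power domain, so that the coinciding main lobes accumulate while the frequency-shifted side lobes smear out and the MSLL drops. This is an illustrative sketch with synthetic patterns, not the measurement code; the function names are chosen freely.

```python
import numpy as np

def compound_pattern(patterns_db):
    """Combine normalized single-frequency beam patterns (dB re. peak) into a
    compound pattern by averaging in linear power and re-normalizing."""
    lin = 10.0 ** (np.asarray(patterns_db) / 10.0)  # dB -> linear power
    comp = lin.mean(axis=0)                         # main lobes add, side lobes smear
    comp /= comp.max()                              # normalize to 0 dB peak
    return 10.0 * np.log10(comp)

def msll(pattern_db, main_lobe_mask):
    """Maximum side lobe level: highest value outside the main lobe region."""
    return pattern_db[~main_lobe_mask].max()
```

For three synthetic patterns whose side lobes sit at different angles, the compound MSLL ends up several dB below each single-frequency MSLL, mirroring the measured behavior.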

Pulse-echo experiments
After the transmit and receive characteristics have been analyzed separately, the pulse-echo mode is addressed
in this section. Here, all three transducers are used for the transmission of a pulse packet as described in
Section 6.3.1, which impinges on one or more passive targets. First, the directivity and the beampattern are
investigated in the same way as in the previous sections, which, in terms of imaging, corresponds to the field
of view and the point spread function. Subsequently, the angular resolution using two spheres, as well as the
imaging quality in a multi-target scene are analyzed, with the latter being compared to images obtained by
the classic sunflower spiral array.

[Figure 6.24(b),(c) plots: magnitude (dB) vs. direction (°); curves: transmit comb., pulse echo comb., receive comb.]

Figure 6.24: Setup for measuring the directivity in the pulse-echo mode (a), corresponding to the imaging
field-of-view. One hollow steel sphere (⌀10 cm) is positioned at a distance of 2 m. All three
transducer frequencies are used to obtain the compound pulse-echo directivity (b), which
is in good agreement with the combination of the compound transmit-only and receive-only
directivities (c).

The setup for measuring the pulse-echo directivity is similar to that of the receive mode, except that a
passive target, namely a hollow steel sphere with a diameter of 10 cm positioned at a distance of 2 m, is
used (Fig. 6.24). The mounting of the sphere is covered with foam to suppress interfering reflections. The

orientation of the array system is varied from −90◦ to 90◦ in the horizontal plane in 1◦ steps. For each
orientation, 50 pulse packets are sequentially transmitted and their echoes are received by the microphones.
The pulse packets consist of the pulse sequence of the three frequencies, each excited by a 1 ms square-wave
burst signal with 70 Vpp followed by a 1 ms excitation pause to allow the respective transducer to decay. The
echoes received with different frequencies are shifted in time using the matched filter, which also adjusts their
relative levels to each other. The level of the combined echo amplitude in the steady state is averaged over 50
firing events and the standard deviation is determined. The combined echo level normalized to its maximum
is plotted over the direction in the directivity pattern (Fig. 6.25).
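The matched-filter step can be sketched as follows, assuming a known burst template per frequency; normalizing the template to unit energy equalizes the relative levels across the frequency channels, and the correlation peak yields the time-of-flight. This is a generic SciPy-based sketch, not the thesis implementation.

```python
import numpy as np
from scipy.signal import correlate

def matched_filter(echo, template, fs):
    """Cross-correlate the received echo with the known burst template.

    The unit-energy template equalizes the levels of the different frequency
    channels; the lag of the correlation peak is the time-of-flight estimate."""
    template = np.asarray(template, dtype=float)
    template = template / np.linalg.norm(template)
    corr = correlate(np.asarray(echo, dtype=float), template, mode="full")
    lags = np.arange(-len(template) + 1, len(echo))
    peak = np.argmax(np.abs(corr))
    return np.abs(corr), lags[peak] / fs
```

For instance, a 25 kHz burst embedded in a longer record with a 2.4 ms delay is recovered at exactly that delay by the correlation peak.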
The pulse-echo directivity measured is similar to the combined directivity of the previously measured
separate transmit and receive characteristics, as expected. Therefore, the combined echo level of the sphere
is above the −3 dB limit for directions within approximately ±60◦ and above the −6 dB limit for ±85◦ . The
prominent peaks for peripheral directions at approximately ±75◦ are higher for the pulse-echo directivity
compared to the separate transmit and receive characteristics with a relative increase of up to +3 dB, as these
occur and overlap for both the MEMS microphones and the waveguide-attached transducers. Overall, by
using the waveguides and the nearly unidirectional microphones, a particularly wide field-of-view is achieved.
This way, peripheral objects are only slightly attenuated, so that they can be detected even if equally strong
reflectors in the center are present.
Next, the pulse-echo point spread function is measured by positioning the sphere at 2 m distance. The array
is not rotated for this measurement, but sends the pulse packet centrally to the sphere. The resulting echo
signals from all microphones are processed as described in Section 6.3.1 and all directions of the 2D beam
pattern are evaluated using receive beamforming.
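The receive beamforming over all directions can be illustrated by a narrow-band phase-shift (delay-and-sum) beamformer operating on complex baseband snapshots; the sketch below evaluates a 1D azimuth cut only and is a generic textbook formulation, not the processing chain of the prototype.

```python
import numpy as np

def das_beam_pattern(positions, snapshots, f0, c, angles_deg):
    """Narrow-band delay-and-sum receive beamforming.

    positions: (M, 2) element coordinates in m, snapshots: (M, T) complex
    baseband samples at carrier f0. Returns the normalized beam power (dB)
    for each steering angle of a horizontal (azimuth) cut."""
    k = 2 * np.pi * f0 / c                               # wavenumber
    ang = np.deg2rad(np.asarray(angles_deg, dtype=float))
    d = np.stack([np.sin(ang), np.cos(ang)], axis=1)     # (A, 2) unit directions
    steer = np.exp(-1j * k * d @ positions.T)            # (A, M) phase compensation
    y = steer @ snapshots / positions.shape[0]           # (A, T) beam outputs
    p = np.mean(np.abs(y) ** 2, axis=1)
    return 10 * np.log10(p / p.max())
```

For an 8-element half-wavelength line array, the resulting pattern peaks at the direction of an incident plane wave, as expected.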

[Figure 6.25(a)-(c) panels: pulse echo comb., simulation comb., receive comb.; (d) side view: magnitude (dB) vs. azimuth (°).]

Figure 6.25: Comparison of the pulse-echo (a) to the simulated (b) and receive-only (c) multi-frequency
PSFs, as well as the corresponding side views, showing the respective highest level per azimuth
direction (d). All PSFs are in good agreement, particularly the main lobe shapes and MSLLs.

The measured pulse-echo 2D beam pattern is then compared with that of the receive-only case and the
single-point-source simulation. For a clear comparison, the side view is considered as well, which shows the
highest side lobes per azimuth direction [Fig. 6.25(d)].
All three 2D beam patterns agree well with each other. In particular, the shape of the main lobe, i.e. its
width and base level, are nearly identical to each other and to the simulation. There are minor differences in
the height and location of the MSLL. The simulation has the lowest MSLL (−16.3 dB), followed by the MSLL
of the pulse echo beam pattern (−15.2 dB) and the receive MSLL (−15.1 dB). The remaining differences in
the side lobe positions can be attributed to the moderate relative phase and amplitude errors of the array
elements, which, however, have only minor effects on the MSLL and MLW.
The next experiment is used to determine the angular resolution of the compound imaging using three
frequencies. For this, the above-described measurement setup is used with the difference that two spheres
with a smaller diameter of 5 cm are utilized, which are mounted on a lateral profile on the sled at a distance
of 1 m from the array (Fig. 6.26). The lateral profile, as well as the mounts of the spheres are covered with
sound absorbers to avoid interfering reflections. The distance between the two spheres is sequentially reduced
from 20 cm in 1 cm steps. After each step a compound image is generated and the separability is evaluated.
The separability is fulfilled if the two detected echoes in the image form two distinct local maxima. The level
of the local minimum between the two maxima is used as a metric for the separability contrast.
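The separability criterion can be sketched as a peak analysis on a horizontal section of the image: two distinct local maxima must exist, and the level of the valley between them serves as the contrast metric. The sketch below uses SciPy's `find_peaks` as a simplified stand-in for the evaluation.

```python
import numpy as np
from scipy.signal import find_peaks

def separation_contrast(profile_db):
    """Check whether a 1D image section (dB, normalized) contains two distinct
    local maxima; if so, return the level of the local minimum between the two
    strongest peaks (the separability contrast), otherwise None."""
    peaks, _ = find_peaks(profile_db)
    if len(peaks) < 2:
        return None                                   # echoes merged
    top = peaks[np.argsort(profile_db[peaks])[-2:]]   # two strongest peaks
    lo, hi = np.sort(top)
    return profile_db[lo:hi + 1].min()                # valley level between them
```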

[Figure 6.26(b) plot: magnitude (dB) vs. direction (°) for sphere spacings of 20 cm (11.4°), 16 cm (9.1°), 12 cm (6.9°), 8 cm (4.6°), 6 cm (3.4°) and 5 cm (2.9°); bottom row: images (elevation vs. azimuth) for spacings of 16, 8, 6 and 5 cm.]

Figure 6.26: Setup for measuring the angular resolution (a) using two horizontally adjacent steel spheres
(⌀5 cm). The spacing between the spheres is sequentially decreased and the horizontal sections
(b) of the respective images are evaluated. The spheres are separable for a minimum spacing
of 3.4◦ corresponding to the angular resolution of the system.

As expected, the highest separation contrast of −15 dB is achieved for the largest spacing between the
spheres of 20 cm. The level of the local minima increases with decreasing sphere spacing. The last detected
minimum has a level of −0.6 dB and occurs at a spacing of 6 cm, which corresponds to an angular separation
of 3.4◦ , determining the angular resolution. At a spacing of 5 cm (2.9◦ ), the echoes merge and are no longer

distinguishable. In practice, the −3 dB limit is a commonly used threshold for segmentation, which is reached
at a sphere spacing of 12 cm (6.9◦ ). Another characteristic is the bright zone surrounding the two detected
echoes in the image. This zone is a result of the broad main lobe base of the two-scale geometry PSF.
Finally, the compound imaging for a multi-reflector scenario is considered consisting of 13 corner reflectors,
each with an edge length of 140 mm, which together form a reflector pattern (Fig. 6.27). This corner reflector
pattern spans an area with a diameter of 1 m, where the distances between reflectors are 37 cm in the outer
ring, 25 cm in the straight lines, and minimally 17 cm in the inner lower quarter ring. The distance between
the reflector pattern and the array is varied in 0.5 m steps and an image with a separation threshold of −3 dB
is formed for each distance. Here, the separability of the individual reflections and the formation of artifacts
due to the side lobe accumulation are investigated.
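The threshold segmentation of such an image can be sketched as binarization followed by connected-component labeling; each labeled region's peak position is then a detection. This is an illustrative SciPy-based sketch, not the thesis code.

```python
import numpy as np
from scipy import ndimage

def segment_image(img_db, threshold_db=-3.0):
    """Binarize a normalized sonar image (dB re. max) at the given threshold,
    label the connected regions and return each region's peak position."""
    mask = img_db >= threshold_db
    labels, n = ndimage.label(mask)                  # connected components
    peaks = ndimage.maximum_position(img_db, labels, range(1, n + 1)) if n else []
    return labels, n, peaks
```

Reflectors are counted as separated if they fall into different labeled regions, which directly corresponds to the merging behavior described above.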

[Figure 6.27 panels: (a) setup, (b) 1 m, (c) 1.5 m, (d) 2 m, (e) 2.5 m.]

Figure 6.27: Multi-reflector measurement setup consisting of 13 corner reflectors with an edge length of
140 mm. The reflector pattern is imaged and evaluated at multiple distances. With increasing
distance, the detections gradually merge due to the broad main lobe base of the PSF. Side lobe
artifacts far from the actual reflections do not arise, preventing false detections.

At a distance of 1 m and 1.5 m, all reflectors are clearly separated and sidelobe artifacts are not formed
outside the reflector locations. However, smaller artifacts occur at a distance of 1.5 m, but in direct proximity
to the reflector detection, causing a minor distortion of the shape of the detection. Another noticeable
characteristic is that the detections are of different strengths, which can be attributed to the slightly different
orientations of the reflectors, as well as to the interference and accumulation of side lobes with the reflector
main lobes. At a distance of 2 m, a pronounced merging between the reflectors of the inner ring and outer
ring is evident. At 2.5 m, the individual reflectors are, for the most part, not imaged separately as bridging

occurs between the individual detections. Remarkably, artifacts do not form that are significantly distant from
the true reflections. In addition to the general low MSLL, this effect is due to the wide main lobe bases, which
overlap for close reflectors and lift the overall main lobe level, preventing side lobe artifacts from emerging
outside of the true reflection. Finally, it is emphasized that the images presented are obtained using a single
firing event, such that high frame rates can be achieved.
Last, a comparison of the images of the reflector pattern at a distance of 1 m obtained with the multi-
frequency two-scale array and with the previous classic sunflower array approach is provided to highlight the
differences in the one-way beamforming case (Fig. 6.28). Both array geometries have the identical overall
aperture size and number of elements. In addition to the raw images obtained, their segmentation using a
−3 dB threshold is considered as well. Compared to the classic array, the side lobe level background of the
two-scale method is more evenly filled. Here, the reflector pattern is recognizable even without segmentation
applied. In contrast, for the classic sunflower array, the side lobes and surrounding minima alternate, so
the image appears darker altogether. However, the actual reflections are not clearly distinguishable. In
the segmented classic sunflower images, several artifacts arise due to side lobe accumulation, which are
considerably distant from the actual reflector positions, leading to false detections. In comparison, the reflectors
in the two-scale image are well separated and artifacts do not arise. In summary, the multi-frequency two-scale
approach is therefore particularly suitable for high-frame-rate imaging using one-way beamforming.


Figure 6.28: Comparison of the multi-reflector pattern imaged with the two-scale multi-frequency array (a)
and the classic sunflower array (b), both having identical aperture sizes. The respective image
segmentations on the right are formed using a −3 dB threshold. Due to the higher MSLL of the
classic sunflower geometry, side lobe artifacts arise far from the true reflections, whereas the
two-scale approach images the pattern correctly.

6.4 Chapter summary and conclusions
This chapter covered the identification of an advantageous trade-off between several competing optimization
goals, i.e. low system complexity and costs, high frame rates, resolution and contrast. Here, a key factor is to
enable the use of sparse spiral array geometries in conjunction with the high-frame-rate multi-line-acquisition
imaging technique (Section 2.4.2) based on one-way beamforming. Therefore, a reduction of the MSLL in the
PSF is necessarily required in order to improve contrast, for which two design strategies have been elaborated.
Both strategies are based on the same principle, i.e. the combination of diversified PSFs, which are obtained
by different approaches.
The first strategy is a modification of the classic sunflower array geometry, combining two sub-arrays with
two different aperture sizes and element densities, which are constant within each sub-array, in contrast
to previous modification approaches using density window functions. Instead, these two-scale arrays are
designed to exploit the advantageous combination of the respective altered main, side and grating lobe zones.
The positions of the PSF zones can be estimated prior to field simulation using the deterministic and flexible
4-parameter design method presented. This way, the search for well-matching array configurations within
predefined design constraints can be narrowed. The comparison to the previous modification approaches
confirmed an improved achievable performance in terms of MLW and MSLL, particularly for 64-element arrays.
Due to these excellent one-way PSF characteristics, the two-scale method proposed is valuable not only for
imaging applications, where high frame rates are of great importance, but also for transmit- and receive-only
applications.
In addition to the two-scale geometry, the second strategy uses multiple narrow-band transmission fre-
quencies, each used to form an individual image, which are subsequently combined to a compound image.
In this case, the PSFs are varied as well due to the different frequencies, so that particularly the side lobe
positions of the respective images do not coincide. As a result, the side lobes are distributed more evenly
and accumulate only marginally in the compound PSF. In contrast, the positions of the respective main
lobes primarily coincide, so that their level significantly accumulates when combined. Thus, the benchmark
performance of the optimum multi-frequency two-scale geometries in terms of MSLL per MLW is further
improved compared to the single-frequency two-scale approach. Compared to the previous single-frequency
classic sunflower method, the combination of both design strategies achieves a reduction in MSLL from −8.8 dB
to −16.3 dB given the same MLW of 5°, enabling the use of the high-frame-rate MLA imaging technique.
In order to validate the results of the numerical simulations and benchmarks, a prototype system is built,
which includes three waveguided PUTs with different narrow-band resonant frequencies and 64 digital
broad-band MEMS microphones in a two-scale configuration. The radiation patterns of the three dedicated
PUTs, equalized and broadened by the respective waveguides, enable a particularly wide imaging field of
view, in combination with the almost-unidirectional MEMS microphones. Furthermore, by spreading the
radiated power over multiple frequency bands and compounding the respective images, a general increase of
SNR is achieved, which extends the range of view compared to single-frequency systems, in addition to the
contrast enhancement. The imaging contrast and resolution achieved enable the use of the MLA technique,
such that even multi-reflector scenes can be imaged with high frame rates without causing false detections,
in contrast to the previous single-frequency classic sunflower array. Additionally, the use of digital MEMS
microphones and the omission of transmit beamforming reduce the system complexity and cost compared to
the required transceiver electronics described in Chapter 4. However, the system complexity of the two-scale
multi-frequency design is higher compared to the minimalistic single-frequency system variants (Chapter
5), due to the higher driving voltage and multi-frequency excitation. Nevertheless, the former significantly
improves the angular resolution from 14° (Section 5.2) or 15.2° (Section 5.3) down to 3.6°, while maintaining
high frame rates of 30 Hz.
Overall, by combining several design strategies, i.e. the utilization of heterogeneous transducer technologies,

waveguides for directivity broadening and equalization, two-scale spiral geometries and the transmission
of multiple frequencies, the sonar system presented jointly improves multiple optimization goals. These
include high frame rates, contrast, and resolution, for which, however, certain trade-offs must be made as
well, e.g. a degraded resolution compared to the classic sunflower transceiver array or the increased system
complexity compared to the hexagonal or T-geometry dense arrays. Future work can explore additional
features enabled by the available differing frequencies, such as simultaneous communication during imaging,
as well as dynamically adjusting imaging characteristics in dependence of the current situation. For example,
the frame rate can be further increased at the expense of a contrast reduction by transmitting the individual
frequencies interleaved rather than in a pulse packet, in order to directly use the generated images without
merging them into a compound image. Apart from that, the impact of additional density tapering of the
two-scale sub-arrays can be investigated, potentially enabling further improvement.

7 Deep-learned sonar image enhancement

Parts of this chapter have been published in


[39] “Deep-Learned Air-Coupled Ultrasonic Sonar Image Enhancement and Object Localization”,
in Proc. IEEE Sensors Conference, 2022.

Motivated by the great advances in neural-network-based processing of optical images, enabling fast
segmentation, pattern recognition, and classification of objects in challenging scenes and even on hardware-
limited systems, the central idea in this section is to apply this technology to ultrasonic sonar images. Here,
the goal is to train a neural network on the characteristics of ultrasound images and achieve an improvement
in contrast and resolution, and, additionally, provide a robust object localization. Greatest thanks go to my
student Stefan Schulte, who contributed a major part to the implementation of the idea in the course of his
bachelor thesis [173] and presented the project at the IEEE Sensors conference [39].
In detail, the concept of image enhancement is based on the following considerations. Section 2.4.6
explains that the ultrasound images obtained using conventional beamforming are composed of the true
information on the direction of reflectors, convolved by the point spread function, which depends primarily
on the array geometry. Therefore, the point spread function results in the ideal image, only containing the
true reflector positions, being degraded in two ways. On the one hand, the contrast is lowered because side
lobes form adjacently to the main detection, which can additionally accumulate from multiple reflectors and,
consequently, reduce the relative detectable dynamic range. In the worst case, these accumulated side lobes
can result in false detections, also called side lobe artifacts. On the other hand, the true object reflection
is imaged with a broader outline due to the main lobe width, limiting the achievable angular resolution.
Furthermore, independent of the array-geometry-specific PSF, but caused by the temporal length of the
narrow-band illumination pulse, there is an additional radial broadening of the true object reflection, which
impairs the range resolution as well as the accurate detection of the distance. All these negative effects on the
image quality lead to a particularly challenging object localization in real-world multi-target environments as
investigated in the previous chapters.
So far, this work addressed the problem by designing advanced sparse and large-aperture array geometries
in combination with the use of multiple frequencies to improve the PSF (Chapter 4, 6). However, in particular,
the compact and low-cost system designs (Chapter 5), which are valuable for size- and hardware-constrained
mobile applications, are most affected by the degraded resolution due to the small aperture size required.
Therefore, this chapter focuses on the post-processing of CBF images using a deep neural auto-encoder
network, which can be universally applied to all array geometries considered so far.
The specific objective is to train the network with various labeled multi-target scenes, so that it learns the
characteristic main lobe, side lobe and pulse shape patterns, as well as their accumulation in the multi-target
case, in order to generate a cleaned image, which ideally contains only the positions of reflection points.
Hence, this operation corresponds to an inverse convolution removing the PSF pattern along the azimuth and
elevation directions, as well as removing the pulse shape pattern along the distance, such that the procedure
can also be considered as deep-learned deconvolution.
Traditional deconvolution techniques, such as Wiener deconvolution, are based on the principle that a
convolution is equivalent to a multiplication in the Fourier domain. Therefore, in this method, a division by
the point spread function in the Fourier domain is used to restore the original image, although a regularization

term must be included to avoid high amplification of noise, where the PSF is small or zero [174]. Another
popular method is the Richardson-Lucy deconvolution, which involves an iterative Bayesian-based approach
for image restoration and outperforms the Wiener deconvolution [175]. Both techniques assume a known,
constant and shift-invariant PSF, which is therefore independent of the origin direction of an impinging echo.
However in practice, particularly for real-world arrays, the PSF is subject to variations due to, e.g., directional
amplitude and phase errors of the array elements, so that the model assumptions of the deconvolution
techniques are not fully satisfied.
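The Fourier-domain principle of Wiener deconvolution can be sketched as follows; the constant `k` is the regularization term that prevents noise amplification where the PSF spectrum is small or zero. This is a textbook sketch under the shift-invariant-PSF assumption, not the thesis implementation.

```python
import numpy as np

def wiener_deconvolve(image, psf, k=1e-3):
    """Wiener deconvolution: regularized division by the PSF spectrum in the
    Fourier domain (assumes a known, shift-invariant PSF)."""
    H = np.fft.fft2(psf, s=image.shape)          # PSF transfer function
    G = np.fft.fft2(image)
    F = G * np.conj(H) / (np.abs(H) ** 2 + k)    # regularized inverse filter
    return np.real(np.fft.ifft2(F))
```

Applied to a synthetic point map blurred by a small kernel, the reconstruction restores sharp peaks at the true point positions; frequency components where the PSF spectrum vanishes remain suppressed by the regularization.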
Therefore, a neural-network-based approach is expected to achieve an improved performance compared
to the traditional deconvolution techniques due to its ability to adapt to these errors, noise and the general
high complexity of real-world multi-target sonar images with overlap of many reflections. In addition to the
image enhancement, the network can provide further information, e.g., to directly extract the coordinates of
localized objects using a clustering algorithm, particularly valuable for the obstacle avoidance and path finding
of mobile autonomous vehicles. In the following, the processing and neural network architecture is described
that extends the work in [176] by using a state-of-the-art U-Net [177] alternative utilizing Xception [178].

[Figure 7.1 diagram: CBF sonar image (752×96×1) → auto-encoder network with Xception-based 2D-convolutional down-sampling (752×96×32, 376×48×64, 188×24×128, 96×12×256) and up-sampling back to the enhanced image (751×91×1); a flatten/dense branch estimates the number of reflectors; thresholding and Gaussian mixture model clustering yield the estimated reflector coordinates. Inset (a): MA40S4S ultrasonic transducer and 36 microphones at λ/2 pitch within a Ø 42 mm aperture.]

Figure 7.1: The processing architecture proposed leading to the final object coordinate extraction. The
auto-encoder neural network estimates the amount of reflectors and enhances the sonar image.
The ultrasonic sonar, on which the simulation is based on, consists of one 40-kHz ultrasound
transducer and 36 MEMS microphones in a hexagonal array configuration (a) as presented in
Chapter 5.2 [39].

7.1 Neural auto-encoder network architecture


The architecture presented consists of an auto-encoder network and a clustering algorithm (Fig. 7.1). The
auto-encoder network enhances the input image by removing the typical side and main lobe characteristics.
In addition, it also returns an estimate of the number of individual reflections detected in the image. In
comparison to [176], the auto-encoder network is based on the Xception structure [178]. This way, an

improved performance is achieved, considering the superior performance of Xception to InceptionV3 [178],
the effective use of Inception in [179] and the similar performance of U-Net [177] to InceptionV3 [180] in
comparable segmentation tasks [181].
In the best case, the enhanced image obtained from the network only contains high values at the locations of
the theoretical reflection origins. Since the segmentation can be ambiguous, Gaussian mixture model (GMM)
clustering is used for the exact localization of the origins of the reflections. In this way, a sufficient estimation
of the reflection origins is still possible, even if the output from the network does not already specify a unique
position for each reflection origin. The GMM requires an estimate of the number of clusters which is provided
by the network as well. The final estimates of the reflection origins are determined using the cluster centers.
Before applying the clustering algorithm to the enhanced sonar image, the segmentation is binarized by
a threshold. Due to the suppression of noise, PSF and pulse shape characteristics by the auto-encoder, the
relative dynamic range of the enhanced image is increased compared to the original image. Therefore, the
threshold can be set significantly lower.
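This coordinate extraction stage can be sketched as follows, assuming the network output is a normalized image; the threshold value and function name are illustrative, and scikit-learn's GaussianMixture is used as the GMM implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def extract_coordinates(enhanced, n_reflectors, threshold=0.1):
    """Binarize the enhanced image, then fit a Gaussian mixture model with the
    reflector count predicted by the network; the cluster means are the
    estimated reflection origins in (row, col) pixel coordinates."""
    pts = np.argwhere(enhanced >= threshold)             # pixels above threshold
    gmm = GaussianMixture(n_components=n_reflectors, random_state=0)
    gmm.fit(pts)
    return gmm.means_
```

Even if each reflection is smeared over several pixels in the network output, the cluster means still localize the origins.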
The loss-function used for the segmentation path of the network is the Weighted Hausdorff Distance [176],
which takes the distance to the closest target image point into account, to assess the result for each image
point in the network output. As the ideal segmentation only contains one image point with a high value per
target location, widely used loss-functions, e.g. Binary Cross-Entropy, would lead to a less efficient learning
process. The loss-function for the prediction of the number of targets is the mean squared error.
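A simplified NumPy sketch of this loss is given below, corresponding to the α → −∞ variant of the Weighted Hausdorff Distance in [176]; the full loss uses a generalized mean and is evaluated on tensors during training, so this is only meant to convey the structure of the two terms.

```python
import numpy as np

def weighted_hausdorff(p, coords, targets, d_max=100.0, eps=1e-6):
    """Simplified Weighted Hausdorff Distance.

    p: per-pixel probabilities (P,), coords: their (row, col) positions (P, 2),
    targets: ground-truth points (T, 2). Both terms vanish for a perfect
    point prediction."""
    d = np.linalg.norm(coords[:, None, :] - targets[None, :, :], axis=2)  # (P, T)
    # pixels with high probability should lie near some target
    term1 = (p * d.min(axis=1)).sum() / (p.sum() + eps)
    # every target should be covered by a high-probability pixel;
    # low-probability pixels are pushed toward the penalty d_max
    term2 = np.min(p[:, None] * d + (1.0 - p[:, None]) * d_max, axis=0).mean()
    return term1 + term2
```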

7.2 Data synthesis, training and testing


The large amounts of CBF images required for training and testing are generated using a numerical simulation
model. In order to highlight the improvements for small-aperture arrays, a 36-element hexagonal array
geometry is used for the simulation, where the elements are positioned on an equidistant triangular grid with
an inter-element spacing of half wavelength as in the low-cost embedded prototype system in [36] [Fig. 7.1(a)]
and Chapter 5.2. A diversified training and testing is ensured by generating a large set of different scenes and
the corresponding ultrasonic images, where several parameters are randomly varied, i.e.

• the number of reflectors,

• the reflector positions,

• the reflector echo amplitudes, and

• the general signal-to-noise ratio (SNR).

This way, the network is trained on 30000 labeled CBF images with scenes containing up to six simultaneous
reflectors for which the random parameters are constrained to the settings given in (Table 7.1).

Table 7.1: Simulation settings for training data [39].

Number of CBF images | Number of reflectors | Minimum adjacent angle | Max. adjacent reflectors | SNR   | Max. echo amplitude variation
30000                | 1 to 6               | 5°                     | 4                        | 10 dB | ±50%
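The randomization within these constraints can be sketched as follows; the distance interval is an assumption (the table does not specify it), the SNR and max-adjacent-reflector settings are omitted for brevity, and the actual generator additionally renders the CBF image for each scene.

```python
import numpy as np

def sample_scene(rng, max_reflectors=6, min_angle=5.0, amp_var=0.5):
    """Draw one randomized scene within the Table 7.1 constraints:
    1 to 6 reflectors, at least 5 deg adjacent angular spacing and
    +/-50 % echo amplitude variation."""
    n = int(rng.integers(1, max_reflectors + 1))
    angles = []
    while len(angles) < n:                       # rejection-sample the min spacing
        cand = rng.uniform(-90.0, 90.0)
        if all(abs(cand - a) >= min_angle for a in angles):
            angles.append(cand)
    dists = rng.uniform(0.5, 2.0, size=n)        # assumed range interval
    amps = 1.0 + rng.uniform(-amp_var, amp_var, size=n)
    return np.sort(np.array(angles)), dists, amps
```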

In order to assess the quality of the approach presented, two different tests are performed. First, the
localization precision, i.e. the distance error between the true and estimated position, for the simulated

[Figure 7.2 plot: distance (m) vs. angle (°), annotated with main lobes, side lobes, multiple overlapping main lobes, a weak main lobe overlapping with a side lobe, and true/estimated reflector positions.]

Figure 7.2: Example of a 2D normalized CBF sonar image including multiple targets with different reflection
characteristics and Gaussian noise. In the CBF image, the reflection origin is located at the front
edge of the main lobe of a measured reflection, as expected from the narrow-band transmit pulse
shape [39].

randomized scenes (Fig. 7.2) is evaluated. Second, the angular resolution for separating two independent
reflections within the test scene is determined.
The quality of the final reflector position estimation is assessed by the mean and standard deviation of
the direction and distance error. Here, the error for the localization is only evaluated on scenes, where the
number of targets has been predicted correctly. Otherwise, the GMM algorithm would produce a systematic
error, matching too many or too few clusters to the image points in the segmentation, and, thus, preventing a
differentiation of the two error sources.
In order to evaluate the overall performance of the approach, the localization precision test is performed
on a randomized dataset using the same constraint settings as for the network training (Table 7.1). The
angular resolution test for evaluating the minimum angular spacing between two adjacent reflections required
for which both are reliably and independently detected is conducted by a test series with two reflectors
symmetrically positioned to the center axis of the array. The angular spacing is set to the respective test
spacing, whereas all other parameters are randomly varied, again, within the constraints in (Table 7.1).

7.3 Test results and discussion


The results of the performance evaluation are presented as a first proof of concept. Although the mean error
is sufficiently low, the standard deviation of the error must be improved in further work and is expected to
decrease with a higher amount of training data (Table 7.2). The increased deviation towards the periphery
of the field of view (Fig. 7.3) is a result of the typical degrading angular resolution of the sonar image.
Two centrally located reflectors can be separated even at 2° angular spacing, where in the original CBF sonar
image two distinct local maxima only form if the angular spacing is at least 14°, given the small-aperture
array geometry in [36] [Fig. 7.1(a)]. Thus, the effective angular resolution is significantly improved (Figs.
7.5, 7.4).

[Figure 7.3 plot: 2D histogram of the error of the angular estimation (°) vs. the true angular position (°).]

Figure 7.3: Histogram of the angular error across the entire field of view for every single localization. This
test is performed on the same dataset as used for Table 7.2. The increased error at the periphery
of the field of view is related to the expected decreased resolution of the CBF images in that area
and non-uniform angular sampling [39].

Table 7.2: Test results for random scenes similar to training data [39].

Number of CBF images | Mean angular error | Standard deviation of angular error | Mean distance error | Standard deviation of distance error | Reflector count estimation accuracy
2000                 | −0.61°             | 16.69°                              | −3 mm               | 161 mm                               | 83%

Although the network is only trained with CBF images in which the reflectors have an angular spacing of at
least 5°, it detects objects down to 2° (Fig. 7.5). Thus, the conclusion is that the network is able to extract
solid features that enable some degree of abstraction. All results strongly depend on the amount of training
data, especially for scenes with many simultaneous or very close neighboring objects.

[Figure 7.5 plot: number of false detections (%) vs. angular distance between adjacent reflectors (°).]

Figure 7.5: Error of the detected number of reflectors over the angular spacing between two adjacent re-
flectors. For each angular spacing, 1000 CBF sonar images are assessed. The reflectors are
symmetrically positioned to the center axis of the array, with all other parameters randomly varied
as in the training set (Table 7.1). The accuracy is approximately independent of the angular
spacing [39].

[Figure 7.4 plots: distance (m) vs. angle (°) with true and estimated reflector positions, and normalized amplitude vs. angle (°) showing the main lobe with the true and estimated angular positions marked.]

Figure 7.4: Example of object separation capabilities for close adjacent reflectors. The angular spacing of
the reflectors is 2°. Although only one local maximum is visible in the CBF sonar image, both
reflectors are detectable after processing [39].

7.4 Chapter summary and conclusions


In this chapter, a feasible deep-learned deconvolution technique for enhancing in-air CBF sonar images was
demonstrated, including a suitable coordinate extraction method, which improves the angular resolution, contrast,
and localization precision even in noisy, high-complexity scenes with multiple overlapping reflections. Of course,
the presented architecture can also be utilized for image enhancement only, without coordinate extraction.
Here, the capability to enhance even low-SNR CBF images shows the potential of the employed auto-encoder
for de-noising applications.
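To make the underlying deconvolution principle concrete, the following minimal sketch applies the classical iterative CLEAN algorithm [64], which the learned network replaces end-to-end, to a synthetic 1D beamformed image. The PSF shape, reflector positions, and amplitudes are illustrative assumptions, not the parameters used in this work:

```python
import numpy as np

def clean_deconvolve(image, psf, gain=0.5, n_iter=50, threshold=0.05):
    """Classical CLEAN: iteratively subtract shifted, scaled copies of the
    point spread function (PSF) from the dirty image and accumulate the
    subtracted amounts as point-source components."""
    dirty = image.copy()
    components = np.zeros_like(image)
    center = int(np.argmax(psf))                 # index of the PSF peak
    for _ in range(n_iter):
        peak = int(np.argmax(dirty))
        if dirty[peak] < threshold:              # residual clean enough
            break
        amp = gain * dirty[peak]
        components[peak] += amp
        dirty -= amp * np.roll(psf, peak - center)   # subtract PSF at the peak
    return components, dirty

# Synthetic 1D sonar image: two reflectors blurred by a sinc^2-shaped main lobe
angles = np.linspace(-30.0, 30.0, 121)           # 0.5 deg grid
psf = np.sinc(angles / 5.0) ** 2                 # assumed main lobe, first null at 5 deg
image = (1.0 * np.roll(psf, int(-8 / 0.5))       # reflector at -8 deg
         + 0.8 * np.roll(psf, int(+8 / 0.5)))    # weaker reflector at +8 deg

components, residual = clean_deconvolve(image, psf)
detected = np.sort(angles[np.argsort(components)[-2:]])  # two strongest components
```

In contrast to this peak-picking scheme, the learned approach also separates reflectors whose responses merge into a single maximum of the CBF image (Fig. 7.4), where classical CLEAN fails.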
One of the essential next steps is to increase the amount of data for training in order to further strengthen
and improve the results. Here, the consideration of other error sources, such as amplitude and phase errors
of the elements, is of particular interest to evaluate the robustness and adaptability of the neural network.
Since the ROI has been limited to the horizontal plane for the first proof-of-concept, the expansion to a 3D
deep-learned deconvolution is another key goal. Apart from that, so far only the absolute values of the complex
ultrasound images have been considered, since this keeps the architecture similar to those used for optical images.
Therefore, adapting the architecture to process complex ultrasound images for obtaining an additional
level of information is a promising approach. Other advancements are aimed at the general modification of
the architecture, such as using genetic algorithms to improve the structure or including upsampling beyond
the input image size in the decoder path of the neural network, for potential increases in localization accuracy.
Finally, the evaluation of the network in real-world scenes is planned by gathering training and test data in
the automated measurement setup in the anechoic chamber, as widely used in the previous chapters. Here,
it is of particular interest whether training with synthesized scenes that include appropriate error models
leads to satisfactory results in the real-world case.

8 Conclusion and outlook

This work covered the conception and realization of different air-coupled 3D sonar imaging systems, as well
as their numerical and experimental evaluation. These systems are designed to pursue different optimization
goals, in order to explore the capabilities, trade-offs and limitations in real-world settings. In this context,
the fundamental characteristics of airborne ultrasound pose major challenges to sonar imaging, such as the
relatively slow speed of sound, the frequency-dependent attenuation, and the poor coupling of ultrasound into
air. Under these constraints, providing high frame rates, a large range and field of view, high angular resolution
and contrast, while maintaining reasonable system complexity and size, form the primary optimization goals,
which, however, are partially interdependent and competing. Therefore, some of the investigated system
concepts focused on maximizing the angular resolution, contrast and range, but at the expense of slow image
formation (Chapter 4). Other concepts prioritized high frame rates, a low system complexity, and compact size,
although sacrificing angular resolution and range (Chapter 5). Additionally, a design strategy was investigated
that achieves a favorable trade-off between all optimization metrics (Chapter 6).
This concluding chapter brings together these independently analyzed design concepts into a comprehensive
overview, in which their key capabilities and limitations are highlighted with respect to the primary optimization
goals pursued. Based on this, general design recommendations are derived for the selection and combination of
air-coupled transducer technologies and their excitation strategies, as well as for the design of array geometries
for sonar imaging in air.
First, the different concepts are considered with respect to the optimization of the frame rate. High frame
rates are crucial in order to be able to perceive dynamically changing environments, e.g. due to the motion of
the sensor system itself. Here, the major constraint is the physical limitation resulting from the low propagation
speed of sound in air. For this reason, the system concepts (Hexagon, T-Array and TSMF) that require only a
single fire event for illumination and image formation, i.e. MLA imaging, achieve significantly higher frame
rates than the systems (Waveguided URA and Spiral) that use sequential two-way line-by-line scanning, i.e.
SLA imaging. However, there are further differences between the high-frame-rate MLA methods. While the
Hexagon and TSMF use an approximately omnidirectional pulse radiation to illuminate the entire hemisphere
for creating a 3D image, the T-array uses transmit beamforming to irradiate only the horizontal plane with a
fan-shaped beam and, therefore, provides high frame rates only for 2D scans. The TSMF concept includes
another unique characteristic, where a firing event consists of several pulses of different frequencies, which
are subsequently superimposed.
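The round-trip argument can be quantified with a back-of-the-envelope calculation; the imaging range and the number of scan lines below are assumed, illustrative values:

```python
c = 343.0        # speed of sound in air (m/s) at ~20 degrees C
r_max = 5.0      # assumed maximum imaging range (m)

t_echo = 2 * r_max / c        # each fire event must wait for the farthest echo

fps_mla = 1 / t_echo          # MLA: one fire event yields a full 3D frame
n_lines = 64                  # assumed number of scan directions for SLA
fps_sla = 1 / (n_lines * t_echo)   # SLA: one fire event per scanned line

# roughly 34 frames/s for MLA vs. ~0.5 frames/s for 64-line SLA
```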
Second, the objective of a long range of view is addressed, which, e.g., enables the early detection of
obstacles or determines the installation proximity of the sensor required. The range of view is primarily
limited by the physical sound attenuation in air and is therefore dependent on the radiated power of the sonar
system. For this reason, long-range concepts use multiple transmitters for illumination, although utilized in
two fundamentally different ways. The first approach includes the concepts that exploit transmit beamforming
and the resulting array gain (T-array, Waveguide URA, Spiral). Here, the direct-coupled spiral array achieves a
higher intensity than the waveguided variants, which are impaired by the internal duct reflections and friction
losses. The second approach spreads the radiated power over multiple frequency bands and subsequently
compounds them without using coherent transmit beamforming, as applied in the TSMF system.
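The effect of the transmit array gain on range can be estimated as follows; the attenuation coefficient and transmitter count are illustrative assumptions, since the exact attenuation depends on temperature and humidity [24]:

```python
import math

alpha = 1.3           # assumed attenuation in air at 40 kHz (dB/m); the exact
                      # value varies with temperature and humidity [24]
n_tx = 64             # number of coherently driven transmitters

# On-axis SPL gain of coherent transmit beamforming (pressures add linearly)
tx_array_gain_db = 20 * math.log10(n_tx)

# In pulse-echo operation both the outgoing and the returning path attenuate,
# so the extra SPL buys range on the two-way path:
extra_range_m = tx_array_gain_db / (2 * alpha)

# ~36 dB of array gain, i.e. roughly 14 m of additional two-way range
```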
Third, the objective of providing a wide field of view is discussed, which, similar to the range of view,

[Photographs: Hexagon, Waveguide URA, T-Array, TSMF, and Spiral array]

Figure 8.1: Various sonar system concepts examined throughout this work, including the waveguided URA
(Chapter 3), the low-cost embedded hexagon (Section 5.2) and T-array (Section 5.3), as well as
the large-aperture two-scale multi-frequency (TSMF) (Chapter 6) and spiral array (Chapter 4).

determines the perception zone covered. The field of view depends primarily on the directivity of the individual
transmit and receive elements. Due to the general choice of transducer technologies, which have a relatively
small aperture compared to the wavelength, the achieved field of view is similar for all concepts, although there
are minor differences. The concepts using almost-omnidirectional MEMS microphones for receiving (Hexagon, T-Array, TSMF) feature a particularly wide field of view, especially when combined with waveguided transmitters,
whose apertures are additionally tapered (TSMF, T-Array). The narrowest field of view is given by the spiral
array, whose transceiver PUT elements do not use an additional tapering waveguide, as in the URA, and have
a larger aperture than the MEMS microphones.
Fourth, the optimization objectives of achieving high contrast and angular resolution are considered. The
latter metric determines the level of detail of the obtained images, e.g. whether objects are represented as
simple blobs or if patterns and shapes can be recognized. Furthermore, a high angular resolution enables
the separate detection of adjacent reflectors also at large distances, which is beneficial for path finding, for
example. In the conventional beamforming case considered in this work, the obtainable angular resolution
is primarily determined by the main lobe width. Therefore, concepts using large array apertures achieve
relatively high angular resolutions (Spiral, TSMF), which enable, e.g., a precise imaging of a hand or a reflector
pattern. In order to span these large apertures using a reasonable number of elements, either sparse-only
array geometries (Spiral) or a combination of sparse and dense element positioning are exploited (TSMF).
The uniform dense-only concepts (Waveguide URA, Hexagon, T-array) provide low resolution due to the
λ/2-spacing requirement and the resulting aperture limitation, which is not sufficient for the recognition of
shapes with the examined number of elements (max. 64).
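A first-order estimate illustrates why large apertures are decisive for resolution; the apertures assumed below are illustrative and do not correspond exactly to the built arrays:

```python
import math

c, f = 343.0, 40e3
lam = c / f                      # ~8.6 mm wavelength at 40 kHz

def mlw_deg(aperture_m):
    """First-order main-lobe-width estimate, theta ~ lambda / D (small angles)."""
    return math.degrees(lam / aperture_m)

d_dense = 7 * lam / 2            # 8 elements at lambda/2 spacing (~3 cm aperture)
d_sparse = 0.20                  # assumed 20 cm sparse-array aperture

# dense: ~16 deg -> targets appear as blobs; sparse: ~2.5 deg -> shapes resolvable
```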
At higher resolution, multiple targets are imaged separately, so that their main lobe levels do not
accumulate; their side lobes, however, do superimpose, and the resulting increased level leads to false
detections and even unrecognizable images. Therefore, most importantly, the optimization of resolution must not be addressed
alone, but the contrast must be considered as well, which is primarily determined by the MSLL. A particu-
larly low MSLL is achieved by concepts that utilize the SLA imaging technique and two-way beamforming
(Waveguide URA, Spiral), due to the two-fold spatial filtering. If the MLA imaging technique is used, contrast
enhancement is achieved by multi-frequency image compounding (TSMF). Apart from that, given that the
identical imaging technique and number of transducers are used, dense array geometries, whether uniform or
non-uniform, achieve a lower MSLL than sparse arrays. Furthermore, by using two-scale arrays, which consist

of an advantageous combination of dense and sparse sub-arrays, a favorable and flexibly selectable trade-off
between resolution and contrast is possible (TSMF).
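The dense-versus-sparse contrast trade-off can be reproduced with a simple uniformly weighted line-array beam pattern; the element count and spacings are illustrative and do not match the 2D geometries examined in this work:

```python
import numpy as np

def msll_db(n_elem, spacing_wl):
    """Maximum side lobe level (dB) of a uniformly weighted line array at
    broadside, evaluated over visible space u = sin(theta)."""
    u = np.linspace(-1.0, 1.0, 4001)
    pos = spacing_wl * np.arange(n_elem)            # element positions (wavelengths)
    af = np.abs(np.exp(2j * np.pi * np.outer(u, pos)).sum(axis=1)) / n_elem
    u_null = 1.0 / (n_elem * spacing_wl)            # first null bounds the main lobe
    return 20 * np.log10(af[np.abs(u) >= u_null].max())

dense_msll = msll_db(16, 0.5)    # lambda/2 spacing: ~ -13 dB (uniform-taper limit)
sparse_msll = msll_db(16, 2.0)   # 2-lambda spacing, 4x aperture: grating lobes at 0 dB
```

With the same 16 elements, widening the spacing quadruples the aperture (narrower main lobe) but raises the worst side lobe from roughly −13 dB to 0 dB, which is the resolution-contrast conflict that the two-scale design mitigates.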
Fifth, the optimization in terms of system complexity and overall size is highlighted, which determine
the cost-effectiveness, viability, and feasibility of sonar imaging systems. In general, the concepts with a
small number of required components, i.e. only a few transmitter and receiver elements, analog and digital
ICs, as well as external signal processors, result in a low system complexity. Particularly outstanding is the
T-array concept, which requires only 8 transmitters and 12 receivers, and handles signal processing and image
formation on a single microprocessor, albeit limited to 2D scans. Similarly low system requirements are
achieved with the Hexagon concept, which uses one transmitter and 36 receivers, but requires an external
single-board computer for 3D image formation. A moderately increased system complexity results from the
TSMF design, which, in addition to the larger number of receivers, i.e. 64, also drives three multi-frequency
transmitters. All three concepts are based on the compact MEMS microphone receivers, each featuring a
digital interface, such that further electronic components for analog signal conditioning and digitization are
not required. In comparison, the highest system complexity results from the SLA imaging concepts (Waveguide
URA, Spiral), which are both based on identical transceiver electronics. These electronics feature a large
number of analog transmit and receive channels and, thus, requires a high amount of additional electronic
components, apart from the 64 transceiver PUT elements. Clearly, the large-aperture array concepts (Spiral,
TSMF) demand an increased installation area, while the small-aperture Waveguide concept demands a
large installation volume instead.

Design recommendations
Based on these results, general recommendations are derived for the selection and combination of transmit
and receive transducer technologies and their excitation schemes. Moreover, suggestions for the design of
array geometries are given to meet different requirements for sonar imaging in air.
Concerning the selection of air-coupled ultrasonic transmitters, the capacitive broad-band technologies
available to date, e.g. Senscomp Series 600 [167], are not advisable for imaging purposes, due to their large
aperture and thus narrow directivity, as well as their relatively low SPL, despite the considerable
driving requirements. Instead, the narrow-band bending-plate PUTs are particularly recommended due to
their wide directivity, compact size, simple driving, and relatively high SPL in resonant operation. Considering
that high frame rates are required, these PUT transmitters are only to be used in an array configuration
for transmit beamforming operation if the ROI is constrained, e.g. to the horizontal plane. In this case,
the array gain allows higher sound pressure levels and ranges. However, for 3D high-frame-rate imaging,
a combination of multi-frequency PUTs offers advantages and potential for additional features, including
increased range and contrast via image compounding, flexibility in the excitation scheme for dynamic frame
rate adjustments, and the option of combining imaging and communications. Further PUT modifications by
attaching waveguides enable dense array configurations, allow directivities to be broadened and equalized,
and offer protection in harsh environments. However, the advantages of waveguides come with significant
SPL losses, in particular, if used in transmit and receive mode due to the two-way friction and reflections
caused by the output port impedance mismatch. Therefore, the use of waveguides must be carefully assessed
against an application-specific transducer design.
Next, the transducer technologies for receiving are considered. Despite their excellent transmit charac-
teristics, using the very same narrow-band PUTs for transceiving is not advisable, due to the differing open-
and short-circuit resonant frequencies, which highly attenuate the signal in either transmit or receive mode.
Instead, the MEMS microphones, used throughout this work, are the preferable receiver technology choice, as
they provide many benefits, including their almost-omnidirectional sensitivity, small size, a digital interface,
low-power consumption, minor amplitude and phase variances, as well as their broad bandwidth, which

covers the ultrasonic and audible range. In particular, the latter makes it possible to exploit different transmission
frequencies or, in addition to imaging, to localize audible sound sources. All in all, the use of PUT transmitters
in combination with MEMS microphone arrays is suggested, even in cases where a PUT array is already used
for transmit beamforming.
The selection of the array geometry depends on the available installation space and the angular resolution
required, as well as on the applicable number of elements and the contrast to be obtained. Given a fixed number
of array elements, the lowest MSLL and thus highest contrast is achievable with uniform or non-uniform
dense arrays, where the inter-element spacings are less than half the wavelength. If uniform dense arrays
are utilized, the hexagonal array is favored over the rectangular array due to lower MSLL at the same MLW.
These uniform array types are advantageous, as their signals can be decorrelated, which is required for advanced
DoA algorithms, e.g. Capon and MUSIC. The non-uniform dense spiral array achieves approximately identical
results in terms of MLW and MSLL as the dense hexagonal type, while providing the capability to be used at
higher frequencies as well, without the formation of ambiguous grating lobes. If a trade-off in favor of angular
resolution at the expense of contrast and size is pursued, non-uniform sparse arrays are a viable solution. In
this case, the two-scale design strategy, combining two different element densities, provides a particularly
versatile and favorable trade-off over comparable density tapering approaches.
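The λ/2 rule follows from the classic grating-lobe condition for uniform arrays; the sketch below uses the 40 kHz wavelength, with steering angles as illustrative assumptions:

```python
import math

def max_spacing_m(steer_deg, lam):
    """Largest element spacing that keeps grating lobes out of visible space
    when steering a uniform array to +/- steer_deg, via the classic condition
    d <= lambda / (1 + |sin(theta_0)|)."""
    return lam / (1 + abs(math.sin(math.radians(steer_deg))))

lam = 343.0 / 40e3                   # ~8.6 mm at 40 kHz
d_full = max_spacing_m(90, lam)      # full hemispherical steering -> lambda/2
d_30 = max_spacing_m(30, lam)        # a +/-30 deg FOV tolerates wider spacing
```

Since common 40 kHz PUT housings exceed the resulting λ/2 ≈ 4.3 mm pitch, dense configurations require either aperture-reducing waveguides or non-uniform layouts that trade spacing against grating-lobe suppression.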

Future work
In future work, the application-oriented evaluations of the ultrasonic sonar imaging systems are crucial for
identifying further requirements and potentials for improvement. In particular, a promising field of research
is the investigation of the complementary 3D sonar sensor for mobile autonomous robots in conjunction
with other sensor modalities. Here, a key objective is to develop smart sensor data fusion strategies and to
adapt the localization, navigation and mapping algorithms to the novel sonar information. For this purpose,
an autonomous robotic system has been built, which is equipped with a lidar sensor, a 3D camera, and
the Hexagon sonar system, whose data are jointly incorporated into the Robot Operating System (ROS)
environment (Fig. 8.2). First preliminary experiments in an obstacle course have shown that reflective and
transparent surfaces can only be reliably detected and avoided with the sonar system active [182].

Figure 8.2: Autonomous mobile robot equipped with the Hexagon 3D sonar, an Intel RealSense camera and
an RP lidar sensor [182]. Experiment for evaluating the Doppler velocity estimation using the
Hexagon sonar system [183].

A further task is the extraction of additional information from the signals acquired, such as velocity
estimations by exploiting the Doppler effect. In addition to the imaging of the environment, the sonar system
can thus provide velocity measurements for the odometry of mobile systems. In a preliminary experiment using
a pendulum-like sphere, the Doppler velocity estimation obtained with the ultrasonic sonar has been demonstrated
to be in good agreement with an optical reference velocity measurement system [183], [184]. More detailed
investigations, particularly for high velocities, and the optimization of the signal analysis technique are part
of future work. Another type of additional information can be extracted due to the audible frequency range
covered by the microphone arrays. This way, their signals can be used for sound source localization as well,
making it possible to consider voice commands or warning signals such as honking, shouts or sirens in navigation
decisions. Therefore, sonar imaging systems provide a unique dual-use feature over radar, lidar and camera
sensors, i.e. the capability to hear.
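The Doppler-based velocity estimation reduces to the two-way Doppler equation; the frequency shift below is an illustrative value, not a measured result from [183]:

```python
c, f0 = 343.0, 40e3       # speed of sound (m/s), carrier frequency (Hz)

def radial_velocity(f_shift_hz):
    """Reflector velocity from the pulse-echo Doppler shift; the two-way
    geometry doubles the shift: f_d = 2 * v * f0 / c."""
    return f_shift_hz * c / (2 * f0)

v = radial_velocity(466.0)     # an assumed 466 Hz shift corresponds to ~2 m/s
```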
Furthermore, advanced methods for image formation and post-processing are investigated, which can
improve the image quality in terms of angular resolution and contrast without necessarily requiring an increase
of the array aperture or number of elements. The advanced DoA estimators for image formation to be explored
include, e.g., Capon and MUSIC, as well as new sparsity-based methods [185]. In addition, further studies
are in progress to improve the image post-processing based on the deep-learned deconvolution approach [39],
[173] and to extend its capability to enhance 3D images as well.
In order to ensure an optimum performance for all advanced image formation and post-processing methods,
one key aspect is to calibrate the amplitude and phase errors of the real array elements, which cause a deviation
from the ideal model assumptions [35], [130]. Here, the influence of various effects, such as ambient
temperature, aging, self-heating and directivity on the element errors must be investigated. In this context, a
self-calibration procedure is likely to be required to ensure a long-term error compensation.
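The principle behind such an element-error calibration can be sketched with a noiseless reference-source measurement; the error values are illustrative, and the actual procedure in [35], [130] is more involved:

```python
import numpy as np

# Assumed per-element amplitude/phase errors of a 4-element array (illustrative)
g = np.array([1.0, 0.8, 1.2, 0.9]) * np.exp(1j * np.array([0.0, 0.5, -0.4, 0.8]))

a = np.ones(4, complex)       # ideal steering vector of a broadside reference source
x = g * a                     # element signals actually measured from that source

gain_raw = abs(np.vdot(a, x)) / 4          # coherent sum without calibration
g_est = x / a                              # errors estimated from the reference
gain_cal = abs(np.vdot(a, x / g_est)) / 4  # corrected sum recovers full array gain
```

In this idealized, noise-free case, dividing out the estimated complex errors restores the full coherent array gain, while the uncalibrated sum loses part of it to phase misalignment.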
Moreover, ongoing advances in transducer technologies lead to further variations of sonar concepts in
the future. These include small-scale PMUT and CMUT transmitters, receivers or transceivers, as well as
ferroelectret-based transducers [171]. Particularly advantageous for in-air sonar imaging is the development
of narrow-band transducers with multiple different resonant frequencies or generally broad-band technologies
featuring small apertures. Beyond that, the design of highly specialized ASICs for signal and image formation,
as well as the additional integration of transmitters and receivers into a system-on-chip, enable
cost-effective and compact solutions. All in all, ultrasonic 3D sonar sensors have the potential to become widely
popular, as they provide valuable perceptual information that surpasses existing 1D range finders, meeting
the requirements of emerging technologies in autonomous vehicles, robotics, and industrial environments.

Bibliography

[1] Amazon.com, Inc. “Meet amazon scout”. (2019), [Online]. Available: https://www.aboutamazon.
com/news/transportation/meet-scout (visited on 02/28/2023).
[2] P. Fankhauser and M. Hutter, “ANYmal: A Unique Quadruped Robot Conquering Harsh Environments”,
ETH Zurich, 2018.
[3] Fraunhofer-Institut für Produktionstechnik und Automatisierung. “Care-o-bot 3”. (2010), [Online].
Available: https://www.care-o-bot.de (visited on 02/28/2023).
[4] B. W. Drinkwater and P. D. Wilcox, “Ultrasonic arrays for non-destructive evaluation: A review”, NDT
& E Int., vol. 39, Oct. 2006.
[5] S. N. Ramadas, J. C. Jackson, J. Dziewierz, R. O’Leary, and A. Gachagan, “Application of conformal
map theory for design of 2-D ultrasonic array structure for ndt imaging application: A feasibility
study”, IEEE Trans. Ultrason. Ferroelectr. Freq. Control, vol. 61, Mar. 2014.
[6] T. Marhenke, J. Neuenschwander, R. Furrer, P. Zolliker, J. Twiefel, J. Hasener, J. Wallaschek, and
S. J. Sanabria, “Air-Coupled Ultrasound Time Reversal (ACU-TR) For Subwavelength Nondestructive
Imaging”, IEEE Trans. Ultrason. Ferroelectr. Freq. Control, vol. 67, Mar. 2020.
[7] S. Surappa, M. Tao, and F. L. Degertekin, “Analysis and Design of Capacitive Parametric Ultrasonic
Transducers for Efficient Ultrasonic Power Transfer Based on a 1-D Lumped Model”, IEEE Trans.
Ultrason., Ferroelect., Freq. Contr., vol. 65, no. 11, pp. 2103–2112, Nov. 2018.
[8] A. S. Rekhi, B. T. Khuri-Yakub, and A. Arbabian, “Wireless Power Transfer to Millimeter-Sized Nodes
Using Airborne Ultrasound”, IEEE Trans. Ultrason., Ferroelect., Freq. Contr., vol. 64, no. 10, pp. 1526–
1541, Oct. 2017.
[9] D. Yosra, D. Certon, F. Vander Meulon, S. Callé, T. Hoang, G. Ferin, and B. Rosinski, “Contactless
Acoustic Power Transmission through air/Skin interface : A feasibility study”, in Proc. IEEE Int. Ultrason.
Symp. (IUS), Sep. 2020.
[10] D. Sancho-Knapik, H. Calas, J. J. Peguero-Pina, A. Ramos Fernandez, E. Gil-Pelegrin, and T. E. G.
Alvarez-Arenas, “Air-coupled ultrasonic resonant spectroscopy for the study of the relationship between
plant leaves’ elasticity and their water content”, IEEE Trans. Ultrason. Ferroelectr. Freq. Control, vol. 59,
Feb. 2012.
[11] B. C. Barry, L. Verstraten, F. T. Butler, P. M. Whelan, and W. M. Wright, “The Use of Airborne Ultrasound
for Varroa Destructor Mite Control in Beehives”, in Proc. IEEE Int. Ultrason. Symp. (IUS), Oct. 2018.
[12] A. M. Ginel and T. G. Alvarez-Arenas, “Air-coupled Transducers for Quality Control in the Food
Industry”, in Proc. IEEE Int. Ultrason. Symp. (IUS), Oct. 2019.
[13] L. Fariñas, M. Contreras, V. Sanchez-Jimenez, J. Benedito, and J. V. Garcia-Perez, “Use of air-coupled
ultrasound for the non-invasive characterization of the textural properties of pork burger patties”, J.
Food Eng., vol. 297, May 2021.
[14] K. Hasegawa and H. Shinoda, “Aerial Vibrotactile Display Based on Multiunit Ultrasound Phased
Array”, IEEE Trans. Haptics, vol. 11, Jul. 2018.

[15] R. Hirayama, D. Martinez Plasencia, N. Masuda, and S. Subramanian, “A volumetric display for visual,
tactile and audio presentation using acoustic trapping”, Nature, vol. 575, Nov. 2019.
[16] R. Morales, I. Ezcurdia, J. Irisarri, M. A. B. Andrade, and A. Marzo, “Generating Airborne Ultrasonic
Amplitude Patterns Using an Open Hardware Phased Array”, Appl. Sci, vol. 11, Mar. 2021.
[17] A. Marzo, S. A. Seah, B. W. Drinkwater, D. R. Sahoo, B. Long, and S. Subramanian, “Holographic
acoustic elements for manipulation of levitated objects”, Nat. Commun., vol. 6, Dec. 2015.
[18] A. Watanabe, K. Hasegawa, and Y. Abe, “Contactless Fluid Manipulation in Air: Droplet Coalescence
and Active Mixing by Acoustic Levitation”, Sci. Rep., vol. 8, Dec. 2018.
[19] T. Kasai, T. Furumoto, and H. Shinoda, “Rotation and Position Control of a Cubic Object Using Airborne
Ultrasound”, in Proc. IEEE Int. Ultrason. Symp. (IUS), Sep. 2020.
[20] I. Wygant, M. Kupnik, J. Windsor, W. Wright, M. Wochner, G. Yaralioglu, M. Hamilton, and B. Khuri-
Yakub, “50 kHz capacitive micromachined ultrasonic transducers for generation of highly directional
sound with parametric arrays”, IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control,
vol. 56, no. 1, pp. 193–203, Jan. 2009.
[21] K. Nakagawa, C. Shi, and Y. Kajikawa, “Beam Steering of Portable Parametric Array Loudspeaker”, in
Proc. Asia-Pac. Signal and Inf. Process. Assoc. Annu. Summit and Conf. (APSIPA ASC), Nov. 2019.
[22] Z. Shao, S. Pala, Y. Liang, Y. Peng, and L. Lin, “A Single Chip Directional Loudspeaker Based on
PMUTS”, in Proc. IEEE Int. Conf. on Micro Electro Mechan. Syst. (MEMS), Jan. 2021.
[23] P. R. Hoskins, K. Martin, and A. Thrush, Eds., Diagnostic Ultrasound: Physics and Equipment (Cambridge
Medicine), 2nd ed. Cambridge, UK ; New York: Cambridge University Press, 2010.
[24] H. E. Bass, L. C. Sutherland, and A. J. Zuckerwar, “Atmospheric absorption of sound: Update”, The
Journal of the Acoustical Society of America, vol. 88, no. 4, pp. 2019–2021, Oct. 1990.
[25] A. Ens, L. M. Reindl, T. Janson, and C. Schindelhauer, “Low-power simplex ultrasound communication
for indoor localization”, in Proc. 22nd European Signal Processing Conference (EUSIPCO), 2014.
[26] C. Shi and W.-S. Gan, “Development of Parametric Loudspeaker”, IEEE Potentials, vol. 29, no. 6,
pp. 20–24, Nov. 2010.
[27] M. Legg and S. Bradley, “Ultrasonic Arrays for Remote Sensing of Pasture Biomass”, Remote Sens.,
vol. 12, Dec. 2019.
[28] S. Suzuki, S. Inoue, M. Fujiwara, Y. Makino, and H. Shinoda, “AUTD3: Scalable Airborne Ultrasound
Tactile Display”, IEEE Transactions on Haptics, 2021.
[29] C. Haugwitz, C. Hartmann, G. Allevato, M. Rutsch, J. Hinrichs, J. Brötz, D. Bothe, P. F. Pelz, and
M. Kupnik, “Multipath Flow Metering of High-Velocity Gas Using Ultrasonic Phased-Arrays”, IEEE
Open Journal of Ultrasonics, Ferroelectrics, and Frequency Control, 2022.
[30] J. Hinrichs, M. Sachsenweger, M. Rutsch, G. Allevato, W. M. D. Wright, and M. Kupnik, “Lamb
waves excited by an air-coupled ultrasonic phased array for non-contact, non-destructive detection
of discontinuities in sheet materials”, in Proc. IEEE International Ultrasonics Symposium (IUS), Sep.
2021.
[31] G. Allevato, J. Hinrichs, D. Grosskurth, M. Rutsch, J. Adler, A. Jäger, M. Pesavento, and M. Kupnik, “3D
imaging method for an air-coupled 40 kHz ultrasound phased-array”, in Proc. International Congress
on Acoustics, 2019.
[32] G. Allevato, J. Hinrichs, M. Rutsch, J. P. Adler, A. Jäger, M. Pesavento, and M. Kupnik, “Real-Time 3-D
Imaging Using an Air-Coupled Ultrasonic Phased-Array”, IEEE Transactions on Ultrasonics, Ferroelectrics,
and Frequency Control, vol. 68, no. 3, pp. 796–806, Mar. 2021.

[33] G. Allevato, M. Rutsch, J. Hinrichs, E. Sarradj, M. Pesavento, and M. Kupnik, “Spiral Air-Coupled
Ultrasonic Phased Array for High Resolution 3D Imaging”, in Proc. IEEE International Ultrasonics
Symposium (IUS), Sep. 2020.
[34] G. Allevato, M. Rutsch, J. Hinrichs, C. Haugwitz, R. Müller, M. Pesavento, and M. Kupnik, “Air-Coupled
Ultrasonic Spiral Phased Array for High-Precision Beamforming and Imaging”, IEEE Open Journal of
Ultrasonics, Ferroelectrics, and Frequency Control, vol. 2, pp. 40–54, 2022.
[35] G. Allevato, T. Frey, C. Haugwitz, M. Rutsch, J. Hinrichs, R. Müller, M. Pesavento, and M. Kupnik,
“Calibration of Air-Coupled Ultrasonic Phased Arrays. Is it worth it?”, in Proc. IEEE International
Ultrasonics Symposium (IUS), Oct. 2022.
[36] G. Allevato, M. Rutsch, J. Hinrichs, M. Pesavento, and M. Kupnik, “Embedded Air-Coupled Ultrasonic
3D Sonar System with GPU Acceleration”, in Proc. IEEE Sensors, 2020.
[37] T. Maier, “Luftgekoppeltes embedded low-cost sonar basierend auf einer waveguide-struktur”, B.S.
thesis, Technische Universität Darmstadt, Darmstadt, 2021.
[38] G. Allevato, C. Haugwitz, M. Rutsch, R. Müller, M. Pesavento, and M. Kupnik, “Two-Scale Sparse
Spiral Array Design”, IEEE Open Journal of Ultrasonics, Ferroelectrics, and Frequency Control, 2023.
[39] S. Schulte, G. Allevato, C. Haugwitz, and M. Kupnik, “Deep-Learned Air-Coupled Ultrasonic Sonar
Image Enhancement and Object Localization”, in Proc. IEEE Sensors, Oct. 2022.
[40] G. F. Masters and S. F. Gregson, “Coordinate System Plotting for Antenna Measurements”, in Proc.
AMTA Annual Meeting and Symposium, 2007.
[41] B. Van Veen and K. Buckley, “Beamforming: A versatile approach to spatial filtering”, IEEE ASSP
Magazine, vol. 5, no. 2, pp. 4–24, Apr. 1988.
[42] W. L. Stutzman and G. A. Thiele, Antenna Theory and Design, 3rd ed. Hoboken, NJ: Wiley, 2013.
[43] C. A. Balanis, Antenna Theory: Analysis and Design, Fourth edition. Hoboken, New Jersey: Wiley, 2016.
[44] T. L. Szabo, Diagnostic Ultrasound Imaging: Inside Out, Second edition. Amsterdam ; Boston: Else-
vier/AP, Academic Press is an imprint of Elsevier, 2014.
[45] H. L. Van Trees, Optimum Array Processing. New York, USA: John Wiley & Sons, Inc., Mar. 2002.
[46] A. D. Pierce, Acoustics: An Introduction to Its Physical Principles and Applications, 1989 ed. Woodbury,
N.Y: Acoustical Society of America, 1989.
[47] D. H. Johnson and D. E. Dudgeon, Array Signal Processing: Concepts and Techniques (Prentice-Hall
Signal Processing Series). Englewood Cliffs, NJ: P T R Prentice Hall, 1993.
[48] A. M. Zoubir, M. Viberg, R. Chellappa, and S. Theodoridis, Eds., Array and Statistical Signal Processing
(Academic Press Library in Signal Processing v. 3), First edition. Amsterdam: Academic Press, 2014.
[49] L. W. Schmerr, Fundamentals of Ultrasonic Phased Arrays (Solid Mechanics and Its Applications 215).
Cham: Springer, 2015.
[50] J. W. Strutt, The Theory of Sound, First. Cambridge University Press, Jun. 2011.
[51] M. A. Richards, Fundamentals of Radar Signal Processing, Second edition. New York: McGraw-Hill
Education, 2014.
[52] R. Lerch, G. Sessler, and D. Wolf, Technische Akustik. Berlin, Heidelberg: Springer Berlin Heidelberg,
2009.
[53] P.-O. Persson and G. Strang, “A Simple Mesh Generator in MATLAB”, SIAM Review, vol. 46, no. 2,
pp. 329–345, Jan. 2004.

134
[54] H. Rahman, Fundamental Principles of Radar. Boca Raton: Taylor & Francis, 2019.
[55] R. J. Mailloux, Phased Array Antenna Handbook (Artech House Antennas and Propagation Library),
2nd ed. Boston: Artech House, 2005.
[56] G. Matrone, A. Ramalli, and P. Tortoli, Ultrasound B-mode Imaging: Beamforming and Image Formation
Techniques. Basel, Switzerland: MDPI, 2019.
[57] G. Allevato, J. Hinrichs, M. Rutsch, J. Adler, A. Jager, M. Pesavento, and M. Kupnik, “Real-Time 3D
Imaging using an Air-Coupled Ultrasonic Phased-Array”, IEEE Transactions on Ultrasonics, Ferroelectrics,
and Frequency Control, pp. 1–1, 2020.
[58] P. Hoskins, K. Martin, and A. Thrush, Diagnostic Ultrasound: Physics and Equipment, Second. Cambridge:
Cambridge University Press, 2010.
[59] J. Capon, “High-resolution frequency-wavenumber spectrum analysis”, Proceedings of the IEEE, vol. 57,
no. 8, pp. 1408–1418, Aug. 1969.
[60] R. Schmidt, “Multiple emitter location and signal parameter estimation”, IEEE Transactions on Antennas
and Propagation, vol. 34, no. 3, pp. 276–280, Mar. 1986.
[61] R. Roy and T. Kailath, “ESPRIT-estimation of signal parameters via rotational invariance techniques”,
IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 984–995, Jul. 1989.
[62] C. Dalitz, R. Pohle-Frohlich, and T. Michalk, “Point spread functions and deconvolution of ultrasonic
images”, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 62, no. 3, pp. 531–
544, Mar. 2015.
[63] C. Bahr and L. Cattafesta, “Wavespace-Based Coherent Deconvolution”, in Proc. 18th AIAA/CEAS
Aeroacoustics Conference (33rd AIAA Aeroacoustics Conference), Jun. 2012.
[64] J. A. Högbom, “Aperture Synthesis with a Non-Regular Distribution of Interferometer Baselines”,
Astronomy and Astrophysics Supplement Series, vol. 15, p. 417, Jun. 1974.
[65] O. M. H. Rindal, A. Austeng, A. Fatemi, and A. Rodriguez-Molares, “The Effect of Dynamic Range
Alterations in the Estimation of Contrast”, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency
Control, vol. 66, no. 7, pp. 1198–1208, Jul. 2019.
[66] G. S. Kino, Acoustic Waves: Devices, Imaging, and Analog Signal Processing (Prentice-Hall Signal
Processing Series). Englewood Cliffs, N.J: Prentice-Hall, 1987.
[67] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 4th ed., global ed. New York,
NY: Pearson, 2018.
[68] A. V. Oppenheim, R. W. Schafer, and J. R. Buck, Discrete-Time Signal Processing, 2nd ed. New Jersey:
Prentice Hall, 1999.
[69] A. Jäger, “Airborne ultrasound phased arrays”, Ph.D. dissertation, Technische Universität Darmstadt,
Darmstadt, 2019.
[70] M. Rutsch, “Duct acoustics for air-coupled ultrasonic phased arrays”, Ph.D. dissertation, Technische
Universität Darmstadt, Darmstadt, 2023.
[71] J. Hinrichs, “Entwicklung und Aufbau einer Empfangselektronik für ein Ultraschall-Phased-Array in
Luft”, M.S. thesis, Technische Universität Darmstadt, Darmstadt, 2017.
[72] T. Carter, S. A. Seah, B. Long, B. Drinkwater, and S. Subramanian, “UltraHaptics: Multi-point mid-air
haptic feedback for touch surfaces”, in Proc. of the 26th Annual ACM Symposium on User Interface
Software and Technology - UIST, 2013.

[73] Y. Monnai, K. Hasegawa, M. Fujiwara, K. Yoshino, S. Inoue, and H. Shinoda, “HaptoMime: Mid-air
haptic interaction with a floating virtual screen”, in Proc. of the 27th Annual ACM Symposium on User
Interface Software and Technology - UIST, 2014.
[74] A. Marzo, T. Corkett, and B. W. Drinkwater, “Ultraino: An Open Phased-Array System for Narrowband
Airborne Ultrasound Transmission”, IEEE Trans. Ultrason., Ferroelect., Freq. Contr., vol. 65, no. 1,
pp. 102–111, Jan. 2018.
[75] W. Beasley, B. Gatusch, D. Connolly-Taylor, C. Teng, A. Marzo, and J. Nunez-Yanez, “Ultrasonic Levita-
tion with Software-Defined FPGAs and Electronically Phased Arrays”, in Proc. NASA/ESA Conference
on Adaptive Hardware and Systems (AHS), Jul. 2019.
[76] Y. Hatano, C. Shi, and Y. Kajikawa, “Compensation for Nonlinear Distortion of the Frequency Modulation-
Based Parametric Array Loudspeaker”, IEEE/ACM Transactions on Audio, Speech, and Language Pro-
cessing, vol. 25, no. 8, pp. 1709–1717, Aug. 2017.
[77] A. Aminot, P. Shirkovskiy, M. Fink, and R. K. Ing, “Non-Contact Surface Wave Elastography Using 40
kHz Airborne Ultrasound Surface Motion Camera”, in Proc. IEEE International Ultrasonics Symposium
(IUS), 2022.
[78] J. F. Pazos-Ospina, J. L. Ealo, and E. E. Franco, “Characterization of phased array-steered acoustic
vortex beams”, The Journal of the Acoustical Society of America, vol. 142, no. 1, pp. 61–71, Jul. 2017.
[79] D. H. Johnson and D. E. Dudgeon, Array Signal Processing: Concepts and Techniques, 1st ed. New
Jersey: Prentice Hall, 1993, pp. 89–91.
[80] T. Leighton, “Ultrasound in Air: New applications need improved measurement methods and proce-
dures, and appreciation of any adverse effects on humans”, in Proc. International Congress on Acoustics,
2019.
[81] M. Moebus and A. M. Zoubir, “Three-Dimensional Ultrasound Imaging in Air using a 2D Array on a
Fixed Platform”, in Proc. IEEE International Conference on Acoustics, Speech and Signal Processing -
ICASSP, 2007.
[82] K. K. Park and B. T. Khuri-Yakub, “3-D airborne ultrasound synthetic aperture imaging based on
capacitive micromachined ultrasonic transducers”, Ultrasonics, vol. 53, no. 7, pp. 1355–1362, Sep.
2013.
[83] J. A. Jensen, S. I. Nikolov, K. L. Gammelmark, and M. H. Pedersen, “Synthetic aperture ultrasound
imaging”, Ultrasonics, vol. 44, e5–e15, Dec. 2006.
[84] S. Harput and A. Bozkurt, “Ultrasonic Phased Array Device for Acoustic Imaging in Air”, IEEE Sensors
Journal, vol. 8, no. 11, pp. 1755–1762, Nov. 2008.
[85] S. Kumar and H. Furuhashi, “Long-range measurement system using ultrasonic range sensor with
high-power transmitter array in air”, Ultrasonics, vol. 74, pp. 186–195, 2017.
[86] J. Steckel, A. Boen, and H. Peremans, “Broadband 3-D Sonar System Using a Sparse Array for Indoor
Navigation”, IEEE Transactions on Robotics, vol. 29, no. 1, pp. 161–172, Feb. 2013.
[87] R. Kerstens, D. Laurijssen, and J. Steckel, “eRTIS: A Fully Embedded Real Time 3D Imaging Sonar
Sensor for Robotic Applications”, in Proc. International Conference on Robotics and Automation (ICRA),
May 2019.
[88] G. Massimino, A. Colombo, R. Ardito, F. Quaglia, F. Foncellino, and A. Corigliano, “Air-Coupled Array
of Pmuts at 100 kHz with PZT Active Layer: Multiphysics Model and Experiments”, in Proc. 20th
International Conference on Thermal, Mechanical and Multi-Physics Simulation and Experiments in
Microelectronics and Microsystems (EuroSimE), Mar. 2019.

[89] X. Chen, J. Xu, H. Chen, H. Ding, and J. Xie, “High-Accuracy Ultrasonic Rangefinders via pMUTs
Arrays Using Multi-Frequency Continuous Waves”, Journal of Microelectromechanical Systems, vol. 28,
no. 4, pp. 634–642, Aug. 2019.
[90] S. Na, L. L. Wong, A. I. Chen, Z. Li, M. Macecek, and J. T. Yeow, “A CMUT array based on annular
cell geometry for air-coupled applications”, in Proc. IEEE International Ultrasonics Symposium (IUS),
Sep. 2016.
[91] P. Webb and C. Wykes, “High-resolution beam forming for ultrasonic arrays”, IEEE Transactions on
Robotics and Automation, vol. 12, no. 1, pp. 138–146, Feb. 1996.
[92] P. L. M. J. van Neer, A. W. F. Volker, A. P. Berkhoff, H. B. Akkerman, T. Schrama, A. van Breemen, and
G. H. Gelinck, “Feasibility of Using Printed Polymer Transducers for Mid-Air Haptic Feedback”, in Proc.
IEEE International Ultrasonics Symposium (IUS), 2022.
[93] S. Takahashi and H. Ohigashi, “Ultrasonic imaging using air-coupled P(VDF/TrFE) transducers at
2MHz”, Ultrasonics, vol. 49, no. 4-5, pp. 495–498, May 2009.
[94] L. Capineri, A. Bulletti, M. Calzolai, and P. Giannelli, “An Airborne Ultrasonic Imaging System Based
on 16 Elements: 150 kHz Piezopolymer Transducer Arrays—Preliminary Simulated and Experimental
Results for Cylindrical Targets Detection”, Sensing and Imaging, vol. 17, no. 1, Dec. 2016.
[95] J. Ealo, J. Camacho, and C. Fritsch, “Airborne ultrasonic phased arrays using ferroelectrets: A new
fabrication approach”, IEEE Transactions on Ultrasonics, Ferroelectrics and Frequency Control, vol. 56,
no. 4, pp. 848–858, Apr. 2009.
[96] T. Takahashi, R. Takahashi, and S. Jeong, “Ultrasonic phased array sensor for electric travel aids
for visually impaired people”, in Proc. International Workshop and Conference on Photonics and
Nanotechnology 2007, M. Sasaki, G. Choi Sang, Z. Li, R. Ikeura, H. Kim, and F. Xue, Eds., Dec. 2007.
[97] A. Jäger, D. Großkurth, M. Rutsch, A. Unger, R. Golinske, H. Wang, S. Dixon, K. Hofmann, and M.
Kupnik, “Air-coupled 40 kHz Ultrasonic 2D-Phased Array Based on a 3D-printed waveguide structure”,
in Proc. IEEE International Ultrasonics Symposium, 2017.
[98] C. Haugwitz, A. Jäger, G. Allevato, J. Hinrichs, A. Unger, S. Saul, J. Brötz, B. Matyschok, P. Pelz, and
M. Kupnik, “Flow Metering of Gases Using Ultrasonic Phased-Arrays at High Velocities”, in Proc. IEEE
International Ultrasonics Symposium (IUS), 2019.
[99] A. Jäger, J. Hinrichs, G. Allevato, M. Sachsenweger, S. Kadel, D. Stasevich, W. Gebhard, G. Hubschen,
T. Hahn-Jose, W. M. D. Wright, and M. Kupnik, “Non-contact ultrasound with optimum electronic
steering angle to excite Lamb waves in thin metal sheets for mechanical stress measurements”, in
Proc. IEEE International Ultrasonics Symposium (IUS), 2019.
[100] M. Karaman, I. O. Wygant, Ö. Oralkan, and B. T. Khuri-Yakub, “Minimally Redundant 2-D Array
Designs for 3-D Medical Ultrasound Imaging”, IEEE Transactions on Medical Imaging, vol. 28, no. 7,
2009.
[101] H. Hasegawa and H. Kanai, “High frame rate echocardiography using diverging beams”, in Proc. IEEE
International Ultrasonics Symposium, Oct. 2011.
[102] J. Provost, C. Papadacci, J. E. Arango, M. Imbault, M. Fink, J.-L. Gennisson, M. Tanter, and M. Pernot,
“3D ultrafast ultrasound imaging in vivo”, Physics in Medicine and Biology, vol. 59, no. 19, pp. L1–L13,
Oct. 2014.
[103] E. Boni, A. C. H. Yu, S. Freear, J. A. Jensen, and P. Tortoli, “Ultrasound Open Platforms for Next-
Generation Imaging Technique Development”, IEEE Transactions on Ultrasonics, Ferroelectrics, and
Frequency Control, vol. 65, no. 7, pp. 1078–1092, Jul. 2018.

[104] P. Weber, R. Schmitt, B. Tylkowski, and J. Steck, “Optimization of random sparse 2-D transducer
arrays for 3-D electronic beam steering and focusing”, in Proc. IEEE Ultrason. Symp., Oct. 1994.
[105] R. Davidsen, “Two-Dimensional Random Arrays for Real Time Volumetric Imaging”, Ultrasonic Imaging,
vol. 16, Jul. 1994.
[106] A. Austeng and S. Holm, “Sparse 2-D arrays for 3-D phased array imaging - design methods”, IEEE
Trans. Ultrason. Ferroelectr. Freq. Control, vol. 49, Aug. 2002.
[107] D. W. Boeringer, “Phased array including a logarithmic spiral lattice of uniformly spaced radiating
and receiving elements”, US6433754B1, Aug. 2002.
[108] M. C. Viganó, G. Toso, G. Caille, C. Mangenot, and I. E. Lager, “Sunflower Array Antenna with
Adjustable Density Taper”, Int. J. of Antennas Propag., 2009.
[109] E. Sarradj, “A Generic Approach to Synthesize Optimal Array Microphone Arrangements”, in Proc.
Berlin Beamforming Conference, 2016.
[110] A. Vallecchi, M. Cerretelli, M. Linari, and G. B. Gentili, “Investigation of optimal array configurations
for full azimuth scan HF skywave radars”, in Proc. of the 6th Eur. Radar Conf., 2009.
[111] Q. Cheng, Y. Liu, H. Zhang, and Y. Hao, “A Generic Spiral MIMO Array Design Method for Short-Range
UWB Imaging”, IEEE Antennas Wirel. Propag. Lett., vol. 19, May 2020.
[112] L. H. Gabrielli and H. E. Hernandez-Figueroa, “Aperiodic Antenna Array for Secondary Lobe Suppres-
sion”, IEEE Photonics Technol. Lett., vol. 28, Jan. 2016.
[113] Z. Lei and K. Yang, “Sound sources localization using compressive beamforming with a
spiral array”, in Proc. Int. Conf. Inf. Commun. Technol. (ICT), 2015.
[114] S. Luesutthiviboon, A. Malgoezar, M. Snellen, P. Sijtsma, and D. Simons, “Improving Source Discrimi-
nation Performance by using an Optimized Acoustic Array and Adaptive High-Resolution CLEAN-SC
Beamforming”, in Proc. Berlin Beamforming Conf., 2018.
[115] O. Martínez-Graullera, C. J. Martín, G. Godoy, and L. G. Ullate, “2D array design based on Fermat
spiral for ultrasound imaging”, Ultrasonics, vol. 50, Feb. 2010.
[116] A. Ramalli, E. Boni, A. S. Savoia, and P. Tortoli, “Density-tapered spiral arrays for ultrasound 3-D
imaging”, IEEE Trans. Ultrason. Ferroelectr. Freq. Control, vol. 62, Aug. 2015.
[117] H. Yoon and T.-K. Song, “Sparse Rectangular and Spiral Array Designs for 3D Medical Ultrasound
Imaging”, Sensors, vol. 20, Dec. 2019.
[118] A. Ramalli, E. Boni, C. Giangrossi, P. Mattesini, A. Dallai, H. Liebgott, and P. Tortoli, “Real-Time 3-D
Spectral Doppler Analysis With a Sparse Spiral Array”, IEEE Trans. Ultrason. Ferroelectr. Freq. Control,
vol. 68, May 2021.
[119] A. Price and B. Long, “Fibonacci Spiral Arranged Ultrasound Phased Array for Mid-Air Haptics”, in
Proc. IEEE Int. Ultrason. Symp. (IUS), Oct. 2018.
[120] A. Movahed, T. Waschkies, and U. Rabe, “Air Ultrasonic Signal Localization with a Beamforming
Microphone Array”, Adv. Acoust. Vib., Feb. 2019.
[121] B. Parr, M. Legg, S. Bradley, and F. Alam, “Occluded Grape Cluster Detection and Vine Canopy
Visualisation Using an Ultrasonic Phased Array”, Sensors, vol. 21, Mar. 2021.
[122] M. Harter, J. Hildebrandt, A. Ziroff, and T. Zwick, “Self-Calibration of a 3-D-Digital Beamforming Radar
System for Automotive Applications With Installation Behind Automotive Covers”, IEEE Transactions
on Microwave Theory and Techniques, vol. 64, no. 9, 2016.

[123] M. Passoni, N. Petrini, S. Sanvito, and F. Quaglia, “Real-Time Control of the Resonance Frequency
of a Piezoelectric Micromachined Ultrasonic Transducer for Airborne Applications”, in Proc. IEEE
International Ultrasonics Symposium (IUS), Sep. 2021.
[124] Y. Kusano, Q. Wang, G.-L. Luo, Y. Lu, R. Q. Rudy, R. G. Polcawich, and D. A. Horsley, “Effects of DC
Bias Tuning on Air-Coupled PZT Piezoelectric Micromachined Ultrasonic Transducers”, Journal of
Microelectromechanical Systems, vol. 27, no. 2, Apr. 2018.
[125] C. Buyle, L. De Strycker, and L. Van Der Perre, “An Accurately Steerable, Compact Phased Array
System for Narrowbeam Ultrasound Transmission”, IEEE Sensors Journal, 2022.
[126] M. Ingram, A. Gachagan, A. J. Mulholland, A. Nordon, J. Dziewierz, M. Hegarty, and E. Becker,
“Calibration of ultrasonic phased arrays for industrial applications”, in Proc. IEEE Sensors, Oct. 2017.
[127] D. Duxbury, J. Russell, and M. Lowe, “The effect of variation in phased array element performance for
Non-Destructive Evaluation (NDE)”, Ultrasonics, vol. 53, no. 6, Aug. 2013.
[128] J. Zhang, B. W. Drinkwater, and P. D. Wilcox, “Effects of array transducer inconsistencies on total
focusing method imaging performance”, NDT & E International, vol. 44, no. 4, 2011.
[129] C. Nageswaran, “Coping with failed elements on an array: A modelling approach to the technical
justification”, Insight-Non-Destructive Testing and Condition Monitoring, vol. 52, no. 7, 2010.
[130] T. Frey, “Analyse und Kalibrierung von Amplituden- und Phasenabweichungen von Ultraschall-
Phased-Arrays”, B.S. thesis, Technische Universität Darmstadt, Darmstadt, 2022.
[131] T. Maier, G. Allevato, M. Rutsch, and M. Kupnik, “Single Microcontroller Air-coupled Waveguided
Ultrasonic Sonar System”, in Proc. IEEE Sensors, Oct. 2021.
[132] M. B. Alatise and G. P. Hancke, “A Review on Challenges of Autonomous Mobile Robot and Sensor
Fusion Methods”, IEEE Access, vol. 8, pp. 39 830–39 846, 2020.
[133] A. Ziebinski, R. Cupek, H. Erdogan, and S. Waechter, “A Survey of ADAS Technologies for the Future
Perspective of Sensor Fusion”, in Computational Collective Intelligence, N. T. Nguyen, L. Iliadis, Y.
Manolopoulos, and B. Trawiński, Eds., vol. 9876, Cham: Springer International Publishing, 2016,
pp. 135–146.
[134] S.-W. Yang and C.-C. Wang, “On Solving Mirror Reflection in LIDAR Sensing”, IEEE/ASME Transactions
on Mechatronics, vol. 16, no. 2, pp. 255–265, Apr. 2011.
[135] H. Wei, X. Li, Y. Shi, B. You, and Y. Xu, “Fusing Sonars and LRF Data to Glass Detection for Robotics
Navigation”, in Proc. IEEE International Conference on Robotics and Biomimetics (ROBIO), Dec. 2018.
[136] R. Singh and K. S. Nagla, “Multi-data sensor fusion framework to detect transparent object for the
efficient mobile robot mapping”, International Journal of Intelligent Unmanned Systems, vol. 7, no. 1,
pp. 2–18, Jan. 2019.
[137] N. Gageik, P. Benz, and S. Montenegro, “Obstacle Detection and Collision Avoidance for a UAV With
Complementary Low-Cost Sensors”, IEEE Access, vol. 3, 2015.
[138] X.-C. Lai, C.-Y. Kong, S. S. Ge, and A. Al Mamun, “Online map building for
autonomous mobile robots by fusing laser and sonar data”, in Proc. IEEE International Conference
Mechatronics and Automation, vol. 2, 2005.
[139] D. Browne and L. Kleeman, “An advanced sonar ring design with 48 channels of continuous echo
processing using matched filters”, in Proc. IEEE/RSJ International Conference on Intelligent Robots and
Systems, Oct. 2009.

[140] D. Bank and T. Kampke, “High-Resolution Ultrasonic Environment Imaging”, IEEE Transactions on
Robotics, vol. 23, no. 2, pp. 370–381, Apr. 2007.
[141] K.-W. Jörg, “World modeling for an autonomous mobile robot using heterogenous sensor information”,
Robotics and Autonomous Systems, vol. 14, no. 2-3, pp. 159–170, May 1995.
[142] B. Kreczmer, “Estimation of the Azimuth Angle of the Arrival Direction for an Ultrasonic Signal by Using
Indirect Determination of the Phase Shift”, Archives of Acoustics, vol. 44, no. 3, pp. 585–601, 2019.
[143] F. Guarato, V. Laudan, and J. F. Windmill, “Ultrasonic sonar system for target localization with one
emitter and four receivers: Ultrasonic 3D localization”, in Proc. IEEE Sensors, Oct. 2017.
[144] C. Walter and H. Schweinzer, “Locating of objects with discontinuities, boundaries and intersections
using a compact ultrasonic 3D sensor”, in Proc. International Conference on Indoor Positioning and
Indoor Navigation (IPIN), Oct. 2014.
[145] A. Diosi and L. Kleeman, “Advanced sonar and laser range finder fusion for simultaneous localization
and mapping”, in Proc. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS),
vol. 2, 2004.
[146] B. Barshan and R. Kuc, “A bat-like sonar system for obstacle localization”, IEEE Transactions on Systems,
Man, and Cybernetics, vol. 22, no. 4, pp. 636–646, Jul.-Aug. 1992.
[147] N. Seckel and A. Singh, “Physics of 3D Ultrasonic Sensors”, Toposens GmbH, Tech. Rep., 2019,
www.toposens.com.
[148] K. Herman and L. Stach, “Analysis and Elaboration of an Air-Coupled Ultrasound Wideband Sensor
Array”, Acta Physica Polonica A, vol. 123, no. 1, p. 31, 2013.
[149] M. Turqueti, V. Kunin, B. Cardoso, J. Saniie, and E. Oruklu, “Acoustic sensor array for sonic imaging
in air”, in Proc. IEEE International Ultrasonics Symposium, Oct. 2010.
[150] H. Furuhashi, Y. Uchida, and M. Shimizu, “Imaging sensor system using a composite ultrasonic array”,
in Proc. IEEE Sensors, Oct. 2009.
[151] M. Clapp and R. Etienne-Cummings, “Ultrasonic bearing estimation using a MEMS microphone array
and spatiotemporal filters”, in Proc. IEEE International Symposium on Circuits and Systems, vol. 1,
2002.
[152] S. Anzinger, F. Lickert, A. Fusco, G. Bosetti, D. Tumpold, C. Bretthauer, and A. Dehe, “Low Power
Capacitive Ultrasonic Transceiver Array for Airborne Object Detection”, in Proc. IEEE 33rd International
Conference on Micro Electro Mechanical Systems (MEMS), Jan. 2020.
[153] E. Hogenauer, “An economical class of digital filters for decimation and interpolation”, IEEE Transactions
on Acoustics, Speech, and Signal Processing, vol. 29, no. 2, pp. 155–162, Apr. 1981.
[154] M. Rutsch, G. Allevato, J. Hinrichs, C. Haugwitz, R. Augenstein, T. Kaindl, and M. Kupnik, “A compact
acoustic waveguide for air-coupled ultrasonic phased arrays at 40 kHz”, in Proc. IEEE International
Ultrasonics Symposium (IUS), Oct. 2022.
[155] E. Hogenauer, “An economical class of digital filters for decimation and interpolation”, IEEE Transactions
on Acoustics, Speech, and Signal Processing, vol. 29, no. 2, pp. 155–162, 1981.
[156] A. S. Savoia, G. Matrone, R. Bardelli, P. Bellutti, F. Quaglia, G. Caliano, A. Mazzanti, P. Tortoli, B.
Mauti, L. Fanni, A. Bagolini, E. Boni, A. Ramalli, F. Guanziroli, S. Passi, and M. Sautto, “A 256-Element
Spiral CMUT Array with Integrated Analog Front End and Transmit Beamforming Circuits”, in Proc.
IEEE International Ultrasonics Symposium (IUS), Oct. 2018.

[157] H. J. Vos, E. Boni, A. Ramalli, F. Piccardi, A. Traversi, D. Galeotti, E. C. Noothout, V. Daeichin, M. D.
Verweij, P. Tortoli, and N. de Jong, “Sparse Volumetric PZT Array with Density Tapering”, in Proc.
IEEE International Ultrasonics Symposium (IUS), Oct. 2018.
[158] L. Wei, E. Boni, A. Ramalli, F. Fool, E. Noothout, A. F. W. van der Steen, M. D. Verweij, P. Tortoli,
N. De Jong, and H. J. Vos, “Sparse 2-D PZT-on-PCB Arrays With Density Tapering”, IEEE Transactions
on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 69, no. 10, pp. 2798–2809, Oct. 2022.
[159] A. Ramalli, S. Harput, S. Bezy, E. Boni, R. J. Eckersley, P. Tortoli, and J. D’Hooge, “High-Frame-Rate
Tri-Plane Echocardiography With Spiral Arrays: From Simulation to Real-Time Implementation”, IEEE
Trans. Ultrason. Ferroelectr. Freq. Control, vol. 67, Jan. 2020.
[160] S. Harput, K. Christensen-Jeffries, A. Ramalli, J. Brown, J. Zhu, G. Zhang, C. H. Leow, M. Toulemonde,
E. Boni, P. Tortoli, R. J. Eckersley, C. Dunsby, and M.-X. Tang, “3-D Super-Resolution Ultrasound
Imaging With a 2-D Sparse Array”, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency
Control, vol. 67, no. 2, pp. 269–277, Feb. 2020.
[161] L. Wei, G. Wahyulaksana, B. Meijlink, A. Ramalli, E. Noothout, M. D. Verweij, E. Boni, K. Kooiman,
A. F. W. van der Steen, P. Tortoli, N. de Jong, and H. J. Vos, “High Frame Rate Volumetric Imaging
of Microbubbles Using a Sparse Array and Spatial Coherence Beamforming”, IEEE Transactions on
Ultrasonics, Ferroelectrics, and Frequency Control, vol. 68, no. 10, pp. 3069–3081, Oct. 2021.
[162] R. Maffett, E. Boni, A. J. Y. Chee, B. Y. S. Yiu, A. S. Savoia, A. Ramalli, P. Tortoli, and A. C. H. Yu,
“Unfocused Field Analysis of a Density-Tapered Spiral Array for High-Volume-Rate 3D Ultrasound
Imaging”, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2022.
[163] H. Vogel, “A better way to construct the sunflower head”, Mathematical Biosciences, vol. 44, no. 3-4,
pp. 179–189, Jun. 1979.
[164] H. L. Van Trees, Optimum array processing: Part IV of detection, estimation, and modulation theory.
John Wiley & Sons, 2004.
[165] S. Bae, J. Park, and T.-K. Song, “Contrast and Volume Rate Enhancement of 3-D Ultrasound Imaging
Using Aperiodic Plane Wave Angles: A Simulation Study”, IEEE Transactions on Ultrasonics, Ferroelectrics,
and Frequency Control, vol. 66, no. 11, pp. 1731–1748, Nov. 2019.
[166] S. F. Liew, H. Noh, J. Trevino, L. Dal Negro, and H. Cao, “Localized photonic band edge modes and
orbital angular momenta of light in a golden-angle spiral”, Optics Express, vol. 19, no. 24, 2011.
[167] SensComp, Series 600, Datasheet, 2022.
[168] CUI Devices, CUSA-T80-12-2600-TH, Datasheet, 2022.
[169] ProWave, 328ST/R160, Datasheet, 2022.
[170] Murata, MA40S4S, Datasheet, 2023.
[171] O. B. Dali, S. Zhukov, M. Rutsch, C. Hartmann, H. Von Seggern, G. M. Sessler, and M. Kupnik,
“Biodegradable 3D-printed ferroelectret ultrasonic transducer with large output pressure”, in Proc.
IEEE International Ultrasonics Symposium (IUS), Sep. 2021.
[172] Knowles, SPH0641LU4H-1, Datasheet, 2015.
[173] S. Schulte, “Ultraschall Phased-Array Objektlokalisierung mittels Machine Learning”, B.S. thesis,
Technische Universität Darmstadt, Darmstadt, 2021.
[174] P. A. Lynn, An Introduction to the Analysis and Processing of Signals. London: Macmillan Education UK,
1982.

[175] J. Verbeeck and G. Bertoni, “Deconvolution of core electron energy loss spectra”, Ultramicroscopy,
vol. 109, no. 11, pp. 1343–1352, Oct. 2009.
[176] J. Ribera, D. Guera, Y. Chen, and E. J. Delp, “Locating Objects Without Bounding Boxes”, in Proc.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2019.
[177] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image
Segmentation”, in Proc. Medical Image Computing and Computer-Assisted Intervention (MICCAI), May
2015.
[178] F. Chollet, “Xception: Deep Learning with Depthwise Separable Convolutions”, in Proc. IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), Apr. 2017.
[179] Y. Wang, Z. Jiang, Y. Li, J.-N. Hwang, G. Xing, and H. Liu, “RODNet: A Real-Time Radar Object
Detection Network Cross-Supervised by Camera-Radar Fused Object 3D Localization”, IEEE Journal of
Selected Topics in Signal Processing, vol. 15, no. 4, pp. 954–967, Jun. 2021.
[180] C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, “Rethinking the Inception Architecture for
Computer Vision”, in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jun.
2016.
[181] T. Pécot, M. C. Cuitiño, R. H. Johnson, C. Timmers, and G. Leone, “Deep learning tools and modeling
to estimate the temporal expression of cell cycle proteins from 2D still images”, PLOS Computational
Biology, vol. 18, no. 3, Mar. 2022.
[182] H. Kaiser, “Integration und Evaluation von 3D Sensoren für autonome mobile Robotikplattform”, M.S.
thesis, Technische Universität Darmstadt, Darmstadt, 2020.
[183] F. Bauer, “Signalverarbeitung zur 3D Doppler-Geschwindigkeitsbestimmung mit einem luftgekoppelten
Ultraschall Phased-Array”, M.S. thesis, Technische Universität Darmstadt, Darmstadt, 2020.
[184] A. Schäfer, “Lidar basiertes Referenzsystem zur Evaluierung der Doppler Geschwindigkeitsschätzung
eines 3D Ultraschallsonar”, B.S. thesis, Technische Universität Darmstadt, Darmstadt, 2021.
[185] R. Müller, D. Schenck, G. Allevato, M. Rutsch, J. Hinrichs, M. Kupnik, and M. Pesavento, “Dictionary-
Based Learning for 3D-Imaging with Air-Coupled Ultrasonic Phased Arrays”, in Proc. IEEE International
Ultrasonics Symposium (IUS), Sep. 2020.

List of acronyms

3D three-dimensional

ADC analog-to-digital converter

API application programming interface

ASIC application specific integrated circuit

CBF conventional beamforming

CIC cascaded-integrator-comb

CPMUT capacitive-micromachined-ultrasonic-transducer

CR contrast ratio

CUDA compute unified device architecture

CW continuous-wave

DAS delay-and-sum

dB decibel

DC direct current

DoA direction of arrival

DOF depth of field

FFT Fast Fourier Transform

FIFO first in first out

FoV field-of-view

FPGA field-programmable-gate-array

FWHM full-width-half-maximum

GLSL OpenGL shading language

GLZ grating lobe zone

GMM Gaussian mixture model

GPGPU general purpose computing on graphics processing unit

GPU graphics processing unit

HPBW half-power-beam-width

IC integrated circuit

IES inter-element spacing

MC Monte Carlo

MCU micro controller unit

MEMS micro-electro-mechanical-system

MLA multi line acquisition

MLW main lobe width

MLZ main lobe zone

MSLL maximum side lobe level

NDT non-destructive testing

OpenGL open graphics library

PCB printed-circuit-board

PCM pulse code modulation

PDM pulse density modulation

PE pulse echo

PMUT piezoelectric-micromachined-ultrasonic-transducer

PRF pulse repetition frequency

PSF point-spread-function

PSRAM pseudo static random access memory

PUT piezoelectric ultrasonic transducer

PVDF polyvinylidene fluoride

PWM pulse width modulation

PZT lead zirconate titanate

QSPI quad serial peripheral interface

RAM random access memory

ROI region-of-interest

ROS robot operating system

RoV range-of-view

RX receive

SLA single line acquisition

SLL side lobe level

SLZ side lobe zone

SNR signal-to-noise-ratio

SoC system-on-chip

SPL sound pressure level

ToF time-of-flight

TR transmit receive

TSMF two-scale-multi-frequency

TX transmit

ULA uniform line array

URA uniform rectangular array

List of symbols

Symbol Unit Description


r m Coordinate vector
rP m Wave field coordinate vector
r0 m Focus point coordinate vector
rm m Element coordinate vector
x, y, z m Cartesian coordinates
xm , y m , z m m Element coordinates
R m Distance
θ rad Azimuth angle
ϕ rad Elevation angle
R0 m Focus distance
θ0 rad Focus azimuth angle
ϕ0 rad Focus elevation angle
Rnat m Natural focus distance
u, v Direction cosine coordinates
c m/s Speed of sound
f0 Hz Wave frequency
λ m Wavelength
m Element index
M Number of elements
Mx , M y Number of elements in x and y
Min Number of inner sub-aperture elements
d m Inter-element spacing
d̄ m Average inter-element spacing
Dap,x,y m Aperture diameter or dimension in x and y
Dap, in m Aperture diameter of inner sub-array
Rap m Aperture radius
RP,m m Euclidean distance from element to coordinate
∆R m Distance difference
∆T s Time difference
∆φ rad Phase difference
τ s Time delay due to distance
s0 (t) Base excitation signal
A(t) Amplitude envelope of base excitation signal
∆A Amplitude difference
stx,rx Array transmit / received signals vector
Stx,rx Array transmit / received signals matrix
srx, bp Bandpass-filtered received signals vector
Srx, bp Bandpass-filtered received signals matrix

p(rP , r0 ) Superimposed wave / Spatial filtered signal value
p(rP , r0 ) Superimposed wave / Spatial filtered signal vector
P(rP , r0 ) Superimposed wave / Spatial filtered signal matrix
Penv Envelope of superimposed wave / spatial filtered signal matrix
a(rP ) Steering vector
A(rP ) Steering matrix
wH far (θ0 , ϕ0 ) Far-field beamforming vector
WH far (θ0 , ϕ0 ) Far-field beamforming matrix
wH far (r0 ) Beamforming vector
WH far (r0 ) Beamforming matrix
L Number of mesh point sources
l Mesh point source index
G Number of reflectors/sources
g Reflector/source index
R m Range of view vector
Rmin , Rmax m Minimum / maximum view distance
Tmin , Tmax s Minimum / maximum acquisition time window
D rad Field of view vector
θmin , θmax rad Minimum, maximum azimuth angle of field of view
θstep rad Azimuth angular step size of field of view
K Number of scanning directions
Kθ Number of azimuth scanning directions
Kϕ Number of elevation scanning directions
k Scanning direction index
fs Hz Sample rate
n Time sample index
N Number of time samples
MLW3,6,null rad Main lobe width at −3 dB, −6 dB, and to first-order minima
Dmin m Minimum spacing for separation
∆θres rad Angular resolution
CRpp Peak-to-peak contrast ratio
Bpulse Hz Bandwidth of transmit pulse
TPRF s Pulse repetition time
fPRF Hz Pulse repetition frequency / Frame rate

List of Figures

1.1 Examples of next-gen mobile robots. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

2.1 Coordinate system and array terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8


2.2 Sketch of unfocused beam steering with a 1x2 array . . . . . . . . . . . . . . . . . . . . . . . . 9
2.3 Unfocused beam steering with an 8x8 array . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.4 Unfocused beam steering with different aperture sizes . . . . . . . . . . . . . . . . . . . . . . 12
2.5 Sketch of focused beam steering with an (1 × 8)-(λ/2)-array . . . . . . . . . . . . . . . . . . . 13
2.6 Focused beam forming at different focal distances . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.7 Sketch of grating lobe formation with a 1x2 array . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.8 Grating lobe formation with an 8x8 0.6 λ array . . . . . . . . . . . . . . . . . . . . . . . . 16
2.9 Grating lobe beam patterns highlighting the impact of the grating lobe width . . . . . . . . . . 17
2.10 Meshing and beam pattern of non-zero element aperture . . . . . . . . . . . . . . . . . . . . . 18
2.11 Impact of the element aperture size on beam steering . . . . . . . . . . . . . . . . . . . . . . . 19
2.12 Unfocused receive beamforming to θ = 0◦ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.13 Unfocused receive beamforming to θ0 = 20◦ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.14 Focused receive beamforming to Rfoc = 1 λ, θ = 0◦ . . . . . . . . . . . . . . . . . . . . . . . . 23
2.15 Spatial filter response in θϕ- and uv-domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.16 Discretization of the region-of-interest in θϕ- and uv-domain . . . . . . . . . . . . . . . . . . . 26
2.17 Pulse-echo image formation process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.18 Image quality metrics: range resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.19 Image quality metrics: angular resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.20 Image quality metrics: contrast ratio . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.21 Point spread function in θϕ-domain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.22 Convolution of point spread function in uv-domain . . . . . . . . . . . . . . . . . . . . . . . . 35
2.23 Circular convolution characteristic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.24 Multi Line Acquisition vs. Single Line Transmission . . . . . . . . . . . . . . . . . . . . . . . . 38

3.1 Waveguided phased array geometry with inserted PUT transducers and electronics. . . . . . . 41
3.2 Transceiver phased array system block diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.3 Anechoic chamber measurement setups for the waveguided array. . . . . . . . . . . . . . . . . 46
3.4 Pulse envelope of the waveguided and bare transducer. . . . . . . . . . . . . . . . . . . . . . . 47
3.5 Transmit characteristic of a single bare and waveguided transducer. . . . . . . . . . . . . . . . 48
3.6 Receive characteristics of the waveguide array using the MLA method. . . . . . . . . . . . . . . 48
3.7 B-Scans and sectional patterns of two radial and angular spheres for measuring range and
angular resolution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

4.1 Sparse spiral air-coupled ultrasonic PUT array. . . . . . . . . . . . . . . . . . . . . . . . . . . . 54


4.2 Anechoic chamber measurement setups for evaluating the sparse spiral array. . . . . . . . . . . 56
4.3 Two-dimensional directivity patterns for TX, RX, PE mode and simulation. . . . . . . . . . . . 58
4.4 Sectional directivity patterns for varying focal angles . . . . . . . . . . . . . . . . . . . . . 60

4.5 Key parameter behaviour of the sectional directivity patterns for varying focal angles . . . 61
4.6 Radial on-axis pattern and key parameters for varying focal distances . . . . . . . . . . . . . . 63
4.7 Angular resolution evaluation and simulation. . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.8 Far-field multi-reflector scene and corresponding image (TUDA). . . . . . . . . . . . . . . . . . 66
4.9 Near-field multi-reflector scene and corresponding image (hand). . . . . . . . . . . . . . . . . 66
4.10 2D sectional image of the hand for different threshold levels. . . . . . . . . . . . . . . . . . . . 67
4.11 Amplitude and phase errors of the sparse spiral PUT array. . . . . . . . . . . . . . . . . . . . . 69
4.12 MA40S4S Butterworth-van-Dyke model, frequency and phase response variations. . . . . . . . 70
4.13 Ideal vs. degraded beam pattern due to amplitude and phase errors. . . . . . . . . . . . . . . 70
4.14 Results of the MC simulations for either multiple amplitude or phase deviation limits. . . . . . 72
4.15 Duty-cycled excitation to reduce the transmit amplitude and resulting phase shifts. . . . . . . 73
4.16 Relative amplitude and phase errors before and after the calibration. . . . . . . . . . . . . . . 74
4.17 Comparison of the ideal beam pattern to the non-calibrated case and the beam pattern of the
calibrated array. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

5.1 Sonar head of the embedded 3D system consisting of a four-layer PCB stack. . . . . . . . . . . 78
5.2 Block diagram of the electronics and system components. . . . . . . . . . . . . . . . . . . . . . 78
5.3 Experimental measurement setup and angular resolution results. . . . . . . . . . . . . . . . . 79
5.4 Measured frame rates vs. number of directions for the USB and WiFi communication. . . . . . 80
5.5 Waveguided single-microcontroller sonar system with T-array configuration. . . . . . . . . . . 81
5.6 Block diagram of the electronics and system components of the single-microcontroller sonar
system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.7 Measurement setup in the anechoic chamber, directivity and angular resolution measurements. 83

6.1 MSLL and MLW over aperture diameter for multiple numbers of array elements. . . . . . . . . 88
6.2 Two sunflower arrays with a large and small diameter and corresponding PSFs. . . . . . . . . 88
6.3 Voronoi tessellation and Delaunay triangulation of a sunflower array and PSF zone estimation. . 90
6.4 Example two-scale 64-element array nesting a denser inner sub-array with a sparser outer
sub-array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.5 Comparison of angular resolution of a classic sunflower and two-scale spiral array. . . . . . . . 93
6.6 Different density-tapered spiral array geometries of the benchmark and their corresponding
density window functions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.7 Benchmark results of the different density-tapered array geometries. . . . . . . . . . . . . . . 96
6.8 Examples of optimum two-scale array geometry types. . . . . . . . . . . . . . . . . . . . . . . 98
6.9 Three PUT transducers using different resonant frequencies. . . . . . . . . . . . . . . . . . . . 101
6.10 Pulse packet echo signal and matched filtering. . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.11 Multi-frequency signal processing architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.12 Selected prototype geometry based on the optimum array comparison. . . . . . . . . . . . . . 103
6.13 Two-scale multi-frequency system design front view. . . . . . . . . . . . . . . . . . . . . . . . . 104
6.14 System components and electronic modules on the backside of the PCB. . . . . . . . . . . . . . 105
6.15 Transmit measurement setup and waveguides. . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.16 Directivity of the waveguided transducers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.17 Comparison of SPL loss due to waveguides. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.18 Comparison of pulse shape alterations due to the waveguides. . . . . . . . . . . . . . . . . . . 109
6.19 Measurement setup for the receive mode characteristics and microphone directivity results. . . 110
6.20 Broad-band frequency response of the MEMS microphone sensitivity. . . . . . . . . . . . . . . 111
6.21 Analysis of relative receive amplitude errors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.22 Analysis of the compensated relative phase errors. . . . . . . . . . . . . . . . . . . . . . . . . . 112

6.23 Measured receive mode directivity patterns. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
6.24 Setup for measuring the directivity in the pulse-echo mode and results. . . . . . . . . . . . . . 114
6.25 Comparison of the pulse-echo to the simulated and receive-only PSF. . . . . . . . . . . . . . . 115
6.26 Angular resolution measurement and results. . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.27 Multi-reflector measurement setup forming a peace logo and imaging results. . . . . . . . . . 117
6.28 Comparison of the peace pattern imaged with the two-scale and the classic sunflower geometry. 118

7.1 Auto-encoder network architecture for image enhancement. . . . . . . . . . . . . . . . . . . . 122


7.2 Example of a 2D normalized CBF sonar image including multiple targets with different reflection
characteristics and Gaussian noise. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
7.3 Histogram of the angular error across the entire field of view for every single localization. . . 125
7.5 Error of the detected number of reflectors over the angular spacing between two adjacent
reflectors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
7.4 Example of object separation capabilities for close adjacent reflectors. . . . . . . . . . . . . . . 126

8.1 Summary of sonar system concepts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128


8.2 Future work using mobile robots and Doppler velocity estimation. . . . . . . . . . . . . . . . . 130

List of Tables

2.1 Far-field angular resolution approximations for one- and two-way beamforming using ULA and
circular apertures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

3.1 Summary of the measurement results for the MLA and SLA method . . . . . . . . . . . . . . . 51

4.1 Main characteristics of the spiral and waveguide array. . . . . . . . . . . . . . . . . . . . . . . 68


4.2 Worst-case MC simulation results for the PUT-typical combined amplitude (p̂) and phase (φ)
deviation limits (±0.5, ±60◦ ). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.3 Experimental calibration results of the example PUT array (64 pre-selected transducers). . . . 75

7.1 Simulation settings for training data. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123


7.2 Test results for random scenes similar to training data. . . . . . . . . . . . . . . . . . . . . . . 125

