The Urban Soundscapes of the World database currently contains more than 100 high-quality audiovisual recordings made in 8 cities worldwide, with more recordings underway. One-minute fragments are available online. Contact us if you need longer fragments. The recordings are free to use for research and educational purposes.
Each recording consists of a 360-degree video file (4096 x 2048 resolution, 30 fps), a 4-channel first-order ambisonics (ACN/SN3D) audio file and/or a binaural audio file. All audio files have a sample rate of 48 kHz and are 24-bit PCM encoded. All audio and video files are time-synchronized. A YouTube preview is available for all recordings.
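As a rough illustration of the audio format described above, the following Python sketch checks a downloaded file against the stated sample rate, bit depth and channel count. It assumes the audio is delivered as a WAV file that Python's standard `wave` module can open, which is not stated here; the function name `check_audio_format` and the `kind` labels are hypothetical.

```python
import wave

# Values taken from the format description; the WAV container is an assumption.
EXPECTED_RATE_HZ = 48_000        # 48 kHz sample rate
EXPECTED_SAMPLE_WIDTH = 3        # 24-bit PCM = 3 bytes per sample
EXPECTED_CHANNELS = {"ambisonics": 4, "binaural": 2}


def check_audio_format(path, kind):
    """Return a list of mismatches between a WAV file and the stated format.

    kind is "ambisonics" or "binaural" (hypothetical labels for this sketch).
    An empty list means the header matches the description.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width is {8 * wav.getsampwidth()} bit")
        if wav.getnchannels() != EXPECTED_CHANNELS[kind]:
            problems.append(f"channel count is {wav.getnchannels()}")
    return problems
```

Calling `check_audio_format("fragment.wav", "ambisonics")` on a conforming file should return an empty list. The check only reads the header, so it says nothing about channel ordering or normalization.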
Recordings are made during the day, in favorable weather conditions with little to no wind. Note that the recordings always present a snapshot in time.
Audio and video are recorded simultaneously using a portable, stationary recording setup, as shown in the picture. The setup consists of the following components (from top to bottom):
- First-order ambisonics: Core Sound TetraMic with windshield and Tascam DR-680 MkII 4-channel recording device;
- 360-degree video camera: GoPro Omni spherical camera system;
- Binaural audio: HEAD acoustics HSU III.2 artificial head with windshield and SQobold 2-channel recording device.
The ears of the artificial head, the video camera system and the ambisonics microphone are located at heights of about 1.5 m, 1.7 m and 1.9 m, respectively. The recording setup is highly portable and takes only about 10 minutes to assemble/disassemble.
At each location, the recording system is oriented towards the most important sound source and/or the most prominent visual scene. This orientation defines the initial frontal viewing direction for the 360-degree video and ambisonics recordings, and the fixed orientation for the binaural recordings.
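Because the ambisonics recordings only have an initial frontal direction, the sound field can be turned during playback, whereas the binaural recordings keep their fixed orientation. The sketch below shows one way to rotate a single first-order frame about the vertical axis. It assumes ACN channel order (W, Y, Z, X) and the common convention that azimuth increases counterclockwise from the front; the function name `rotate_foa_yaw` is hypothetical and this is not necessarily how any particular player does it.

```python
import math


def rotate_foa_yaw(frame, yaw_deg):
    """Rotate one first-order ambisonics frame about the vertical axis.

    frame is (W, Y, Z, X) in ACN channel order. A positive yaw_deg moves
    every source counterclockwise (to the left) as seen from above.
    W (omnidirectional) and Z (vertical) are unaffected by a yaw rotation.
    """
    w, y, z, x = frame
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    x_rot = x * c - y * s
    y_rot = x * s + y * c
    return (w, y_rot, z, x_rot)
```

For example, a source straight ahead, `(1.0, 0.0, 0.0, 1.0)`, rotated by 90 degrees ends up on the Y axis, i.e. to the listener's left. To compensate for a viewer turning their head by some angle, the field would be rotated by the opposite angle. SN3D normalization does not affect this step at first order, since X, Y and Z share the same scaling.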
More details on the recording setup and protocol can be found in our publications.