Music, Cognition, and
Computerized Sound
Sound Waves and Sine Waves
Introduction to Pitch Perception
Gwangyu Lee
Sound Waves and Sine Waves
Sound and Sine Waves
Sound is a Longitudinal Wave
How Fast Does Sound Travel?
344
meters per second
at 20 °C in air (1,128 ft/s)
Why can we hear
each instrument separately?
🎸 🎹
Because Air is a Linear System
Linear System
In1 → Linear System → Out1
In1 + In2 → Same System → Out1 + Out2
Linear Amp
Gain ×2, clean
Nonlinear Amp
Gain ×2, clipping
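The linear and nonlinear amps above can be compared with a superposition test. This is a minimal sketch: the function names, the clip limit of 1.0, and the input values are illustrative assumptions, not taken from the slides.

```python
def linear_amp(x, gain=2.0):
    """Clean amplifier: output is always gain * input."""
    return gain * x


def clipping_amp(x, gain=2.0, limit=1.0):
    """Same gain, but the output is clipped to [-limit, +limit] (assumed limit)."""
    return max(-limit, min(limit, gain * x))


def obeys_superposition(system, a, b, tol=1e-12):
    """True if system(a + b) equals system(a) + system(b) for these inputs."""
    return abs(system(a + b) - (system(a) + system(b))) <= tol


in1, in2 = 0.4, 0.3  # two hypothetical input samples

print(obeys_superposition(linear_amp, in1, in2))    # True
print(obeys_superposition(clipping_amp, in1, in2))  # False: the sum is clipped
```

With the clipping amp, each input passes cleanly on its own, but their sum exceeds the limit, so the output is no longer Out1 + Out2.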
Nonlinearity in Percussion
Harder strike → new high frequencies appear (nonlinear)
Air Has Limits
Fully Nonlinear
💣
Explosion
Sound in air is linear at musical intensities
The simplest regular, repeating motion is a sine wave.
0.5 Hz
Three Properties of a Sine Wave
| Property | Units | Description |
| Amplitude | cm, Pa, dB | Maximum displacement from center |
| Frequency | Hz (cycles/sec), Period (1/f) | Number of cycles per second |
| Phase | Degrees (°), Radians | Starting position in the cycle |
Sound Pressure
Atmospheric pressure: 101,325 Pa
Sound adds tiny pressure variations on top of this.
| Source | Sound Pressure |
| Threshold of hearing | 0.00002 Pa |
| Normal conversation | 0.02 Pa |
| Concert | 2 Pa |
| Jet engine | 200 Pa |
Range: 0.00002 ~ 200 Pa — a 10,000,000× difference
Why Decibels?
The range of sound pressure is too wide for a linear scale.
| Pa | dB SPL | Example |
| 0.00002 | 0 dB | Threshold |
| 0.002 | 40 dB | |
| 0.02 | 60 dB | Conversation |
| 2 | 100 dB | Concert |
| 200 | 140 dB | Jet engine |
10,000,000× compressed into 0 – 140 dB
Decibel Formula
dB = 20 log10( A1 / A2 )
A1 = measured sound pressure
A2 = reference sound pressure = 0.00002 Pa (threshold of hearing)
Conversation: 20 log10( 0.02 / 0.00002 ) = 20 log10( 1000 ) = 20 × 3 = 60 dB SPL
Concert: 20 log10( 2 / 0.00002 ) = 20 log10( 100,000 ) = 20 × 5 = 100 dB SPL
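The two worked examples can be checked with a few lines of code. This is a minimal sketch; the function name `db_spl` and the constant name `P_REF` are illustrative choices.

```python
import math

P_REF = 0.00002  # reference sound pressure in Pa (threshold of hearing)


def db_spl(pressure_pa):
    """Sound pressure level in dB SPL: 20 * log10(A1 / A2)."""
    return 20 * math.log10(pressure_pa / P_REF)


for label, pa in [("Threshold", 0.00002), ("Conversation", 0.02),
                  ("Concert", 2), ("Jet engine", 200)]:
    print(f"{label}: {db_spl(pa):.0f} dB SPL")
```

Each factor of 10 in pressure adds 20 dB, which is how a 10,000,000× range fits into 0 to 140 dB.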
60 dB at 100 Hz and 60 dB at 1000 Hz
Do they sound equally loud?
Equal Loudness Curves
Same dB, different frequency → different perceived loudness
Musical Sound = Sum of Sine Waves
Because sound in air is a linear system
String Vibration Modes
Musical Notes, Frequencies, and Wavelengths
| Note Name | Frequency (Hz) | Wavelength |
| A0 | 27.5 | 12.5 m |
| A1 | 55 | 6.3 m |
| A2 | 110 | 3.1 m |
| A3 | 220 | 1.6 m |
| A4 | 440 | 0.78 m |
| A5 | 880 | 0.39 m |
| A6 | 1760 | 0.20 m |
| A7 | 3520 | 0.10 m |
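The wavelengths in the table follow from wavelength = speed of sound / frequency. A minimal sketch, assuming the 344 m/s figure given earlier for air at 20 °C (the function name is illustrative):

```python
SPEED_OF_SOUND = 344.0  # m/s in air at 20 °C, as stated earlier in the slides


def wavelength_m(frequency_hz):
    """Wavelength in meters for a sound of the given frequency in air."""
    return SPEED_OF_SOUND / frequency_hz


# The A notes are one octave apart, so each frequency is double the last.
for octave in range(8):
    freq = 27.5 * 2 ** octave
    print(f"A{octave}: {freq:g} Hz -> {wavelength_m(freq):.2f} m")
```

Doubling the frequency halves the wavelength, which is why the last column shrinks by a factor of two per octave.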
Additive Synthesis
Harmonics: All vs Odd Only
All Harmonics
Piano-like timbre
Odd Harmonics Only
Clarinet-like timbre
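Additive synthesis with all harmonics versus odd harmonics only can be sketched as a sum of sine waves. The 1/k amplitude rolloff, the choice of 8 harmonics, and the function name are assumptions made for illustration; the slides do not specify them.

```python
import math


def additive(t, f0, harmonics):
    """Sum of sine harmonics at time t (seconds); harmonic k has amplitude 1/k (assumed)."""
    return sum(math.sin(2 * math.pi * k * f0 * t) / k for k in harmonics)


F0 = 220.0                      # fundamental in Hz (A3)
PERIOD = 1.0 / F0
ALL_HARMONICS = range(1, 9)     # 1, 2, 3, ..., 8
ODD_HARMONICS = range(1, 9, 2)  # 1, 3, 5, 7

t = PERIOD / 8
# Odd-only spectra have half-wave symmetry: x(t + T/2) == -x(t).
print(additive(t, F0, ODD_HARMONICS) + additive(t + PERIOD / 2, F0, ODD_HARMONICS))
# With even harmonics present, that symmetry is broken.
print(additive(t, F0, ALL_HARMONICS) + additive(t + PERIOD / 2, F0, ALL_HARMONICS))
```

The first printed value is zero up to rounding error, and the second is clearly nonzero. That symmetry is one visible difference between the two waveforms.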
What if the motion is not a simple sine?
A Brief History of FFT
1805, Gauss: discovered the core idea of the FFT but never published it
1807, Fourier: "Any function can be expressed as a sum of sine waves"
1965, Cooley & Tukey: published the FFT algorithm, with instant impact
📡 Signal Processing
🔬 Spectrum Analysis
🏥 MRI
🎧 Audio Engineering
📱 Telecommunications
Fourier Transform: Winding Machine
Center of Mass vs Frequency
Discrete Fourier Transform
$$X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}$$
The Problem: O(N²) Computation
DFT requires N × N operations
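The N × N cost is easy to see in a direct implementation of the DFT sum: one loop over output bins k, one loop over input samples n. This is a minimal sketch; the operation counter and the function name are additions for illustration.

```python
import cmath


def naive_dft(x):
    """Direct DFT: X[k] = sum over n of x[n] * exp(-j*2*pi*k*n/N).

    Returns (spectrum, ops), where ops counts multiply-accumulate steps.
    """
    N = len(x)
    ops = 0
    spectrum = []
    for k in range(N):          # every output bin ...
        acc = 0j
        for n in range(N):      # ... visits every input sample
            acc += x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
            ops += 1
        spectrum.append(acc)
    return spectrum, ops


# One cycle of a cosine across 8 samples.
signal = [cmath.cos(2 * cmath.pi * n / 8).real for n in range(8)]
spectrum, ops = naive_dft(signal)
print(ops)                              # 64, i.e. 8 * 8
print([round(abs(v), 6) for v in spectrum])
```

The energy lands in bins 1 and 7 (the positive and negative frequency of the cosine), and the counter confirms N² steps.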
DFT O(N²) vs FFT O(N log N)
DFT vs FFT: How They Compute
DFT
Visit every cell — O(N²)
FFT
Divide & conquer, reuse — O(N log N)
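The divide-and-conquer idea can be sketched as a recursive radix-2 FFT: split into even and odd samples, transform each half, then combine with one "butterfly" per pair of outputs. This is an illustrative textbook form that requires a power-of-two length. It is not an optimized implementation, and the butterfly counter is an addition for comparison with N².

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT. Returns (spectrum, butterflies)."""
    N = len(x)
    if N == 1:
        return list(x), 0
    if N % 2:
        raise ValueError("length must be a power of two")
    even, ops_even = fft(x[0::2])   # half-size transform of even samples
    odd, ops_odd = fft(x[1::2])     # half-size transform of odd samples
    out = [0j] * N
    ops = ops_even + ops_odd
    for k in range(N // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / N) * odd[k]
        out[k] = even[k] + twiddle            # reuse the same product ...
        out[k + N // 2] = even[k] - twiddle   # ... for two output bins
        ops += 1
    return out, ops


for size in (8, 1024):
    _, butterflies = fft([0.0] * size)
    print(size, "samples:", butterflies, "butterflies vs", size * size, "DFT steps")
```

The butterfly count is (N/2) log2 N: 12 for N = 8 and 5,120 for N = 1024, against 64 and 1,048,576 steps for the direct sum.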
Gibbs Phenomenon
The overshoot at discontinuities never disappears, no matter how many harmonics you add.
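$$\text{Overshoot} = \frac{1}{\pi}\int_0^{\pi} \frac{\sin x}{x}\,dx - \frac{1}{2} \approx 8.95\%$$ of the jump height (the limiting value as the number of harmonics N grows)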
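The 8.95% figure can be checked numerically on a square wave that jumps from −1 to +1, with the overshoot expressed as a fraction of that full jump of 2. This is a minimal sketch: the function names are illustrative, and it relies on the standard result that the first peak of a partial sum with M odd harmonics lies at t = π/(2M).

```python
import math


def square_partial_sum(t, terms):
    """Fourier partial sum of a ±1 square wave using the first `terms` odd harmonics."""
    return (4 / math.pi) * sum(
        math.sin((2 * j - 1) * t) / (2 * j - 1) for j in range(1, terms + 1)
    )


def gibbs_overshoot(terms):
    """Overshoot of the first peak, as a fraction of the full jump (height 2)."""
    peak = square_partial_sum(math.pi / (2 * terms), terms)
    return (peak - 1.0) / 2.0


for terms in (5, 50, 500):
    print(terms, "harmonics:", f"{100 * gibbs_overshoot(terms):.2f}%")
```

Adding harmonics moves the peak closer to the discontinuity, but its height settles near 9% of the jump instead of shrinking toward zero.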
Gibbs in Digital Audio
Sharp filter cutoff causes ringing — Signal: 110 Hz + 275 Hz + 495 Hz
The Solution: Windowed Filters
Sharp in frequency → Ringing in time / Smooth in frequency → Clean in time
Oversampling
More samples per second → smoother filter, fewer artifacts
1× (44.1 kHz)
Nyquist at 22.05 kHz — needs sharp filter
4× (176.4 kHz)
Nyquist at 88.2 kHz — gentle filter is enough
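The room that oversampling gives the filter can be put in numbers. This sketch assumes the audio band ends at 20 kHz, a conventional figure that the slides do not state; the function names are illustrative.

```python
BASE_RATE_HZ = 44_100
AUDIO_BAND_HZ = 20_000  # assumed upper edge of the audible band


def nyquist_hz(base_rate_hz, factor):
    """Nyquist frequency (half the sample rate) at a given oversampling factor."""
    return base_rate_hz * factor / 2


def transition_band_hz(base_rate_hz, factor, band_edge_hz=AUDIO_BAND_HZ):
    """Width available for the filter to roll off between band edge and Nyquist."""
    return nyquist_hz(base_rate_hz, factor) - band_edge_hz


for factor in (1, 4):
    print(f"{factor}x: Nyquist {nyquist_hz(BASE_RATE_HZ, factor):g} Hz, "
          f"transition band {transition_band_hz(BASE_RATE_HZ, factor):g} Hz")
```

Under that assumption, the filter has about 2 kHz to roll off at 1× and about 68 kHz at 4×, which is why a gentle slope is enough.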
The Effect of Phase on Waveform
16 harmonics, equal amplitudes — only the phase differs
Equal phases: all start at 0°
Schroeder phases: p(n) = πn(n−1)/N
Random phases: different each time
Same amplitude spectrum — waveforms look and sound different at low frequencies
How Phase Changes the Waveform
Equal Phases (all 0°)
All harmonics peak at the same time
Schroeder Phases
Each harmonic shifted by πn(n-1)/N
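The effect of the phase choice on waveform shape can be measured with the crest factor (peak divided by RMS). This is a minimal sketch using 16 equal-amplitude cosine harmonics and the Schroeder formula πn(n−1)/N as given above. The use of cosines, the 1024 samples per period, and the function names are assumptions for illustration.

```python
import math


def waveform(phases, samples=1024):
    """One period of a sum of equal-amplitude cosine harmonics with the given phases."""
    return [
        sum(math.cos(2 * math.pi * n * i / samples + phases[n - 1])
            for n in range(1, len(phases) + 1))
        for i in range(samples)
    ]


def crest_factor(x):
    """Peak absolute value divided by RMS."""
    peak = max(abs(v) for v in x)
    rms = math.sqrt(sum(v * v for v in x) / len(x))
    return peak / rms


N = 16
equal_phases = [0.0] * N
schroeder_phases = [math.pi * n * (n - 1) / N for n in range(1, N + 1)]

crest_equal = crest_factor(waveform(equal_phases))
crest_schroeder = crest_factor(waveform(schroeder_phases))
print(f"equal phases:     crest factor {crest_equal:.2f}")
print(f"Schroeder phases: crest factor {crest_schroeder:.2f}")
```

Both signals have the same amplitude spectrum and therefore the same RMS. With equal phases all harmonics peak together into one tall spike, while the Schroeder phases spread the energy across the period and give a much flatter waveform.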
How can we
present a changing spectrum
in a way that is informative to the eye?
Spectrogram
Amplitude is represented by degree of darkness
Windowing
How do we cut a signal for analysis?
Sampling & Aliasing
Sample Rate: 64 Hz
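At a 64 Hz sample rate, any tone above 32 Hz folds back to a lower frequency. A minimal sketch of that folding, with the test frequencies chosen only for illustration:

```python
import math

SAMPLE_RATE_HZ = 64


def alias_hz(freq_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Apparent frequency after sampling: distance to the nearest multiple of the sample rate."""
    return abs(freq_hz - sample_rate_hz * round(freq_hz / sample_rate_hz))


for freq in (10, 40, 70):
    print(f"{freq} Hz sampled at {SAMPLE_RATE_HZ} Hz looks like {alias_hz(freq)} Hz")

# The samples themselves are identical: a 70 Hz cosine and a 6 Hz cosine
# cannot be told apart once sampled at 64 Hz.
samples_70 = [math.cos(2 * math.pi * 70 * n / SAMPLE_RATE_HZ) for n in range(16)]
samples_6 = [math.cos(2 * math.pi * 6 * n / SAMPLE_RATE_HZ) for n in range(16)]
print(max(abs(a - b) for a, b in zip(samples_70, samples_6)))
```

The last line prints a value at rounding-error level, showing that the two tones produce the same sample values.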
Filter Bank
Splitting a signal into frequency bands
Phase Vocoder: Pitch Shifting
Time Stretching: Phase Correction
How the phase vocoder stretches time without changing pitch
Wavelet vs Fourier
Fourier Transform
Equal bandwidth — same resolution at all frequencies
Wavelet Transform
Doubling bandwidth — adaptive time-frequency resolution
Introduction to
Pitch Perception
"How do we hear pitch?"
Are these two sounds
the same pitch?
Pitch
A perceptual quality, not a physical measurement
A machine that detects pitch must match human judgments
A simple frequency detector is not enough
Pitch vs Brightness
Pitch
→ Depends on periodicity
→ "Higher / Lower"
Brightness
→ Depends on high-frequency energy
→ "Brighter / Darker"
At the same pitch, /i/ (beet) is brighter than /u/ (boot)
and a trumpet is brighter than a French horn
How Many Cycles to Hear Pitch?
4-cycle tone burst vs 25-cycle tone burst
Missing Fundamental
Fletcher (1924): Remove f₀ with a filter — pitch doesn’t change!
Small radio can’t play low bass — yet you hear the pitch correctly
Why Does the Missing Fundamental Work?
Fletcher: 3 consecutive harmonics → brain infers fundamental
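The arithmetic behind the inference can be sketched with a greatest common divisor: consecutive harmonics share one fundamental even when it is absent from the signal. This is only an illustration of the numbers, not a model of how the auditory system does it. The example frequencies and the function name are hypothetical.

```python
from functools import reduce
from math import gcd


def implied_fundamental_hz(harmonics_hz):
    """Largest frequency of which every given (integer) partial is a whole multiple."""
    return reduce(gcd, harmonics_hz)


# Three consecutive harmonics with the fundamental filtered out.
partials = [600, 800, 1000]
print(implied_fundamental_hz(partials))  # 200
```

The spacing between neighboring partials (200 Hz here) matches the implied fundamental, which is the pitch listeners report.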
Virtual Pitch
Terhardt (1972)
Two Pitch Mechanisms
10 harmonics: large peaks = low-freq mechanism, wiggles = high-freq mechanism
Evidence for Two Mechanisms
Same three waveforms, heard at two different base frequencies
Odd Harmonics Only
Clarinet, closed organ pipes: f₀, 3f₀, 5f₀, 7f₀…
Chimes & Bells
Hemony bell partials (1644)
Shepard Tone
Roger Shepard (1964): Endlessly rising pitch illusion
Risset Paradox
Jean-Claude Risset: Double all frequencies — pitch goes down
Pitch perception is not absolute — it depends on context and partial relationships