5 · mix — addition, decibels, and click-free changes

Three small building blocks that show up in every audio program: mixing (adding signals), volume in decibels, and parameter smoothing (changing a value without a click).

The original controls volume live over the network with OSC messages from a tablet. Offline we have no live controller, so we treat a parameter as something that changes along the timeline (automation). The smoothing math is identical either way — and it is the real lesson here.

To combine two sounds, add them sample by sample. That is the whole idea — sound is a sequence of pressure numbers, and adding numbers adds the waves.

fn mix(in1: []const f32, in2: []const f32, out: []f32) void {
    for (out, in1, in2) |*o, a, b| o.* = a + b;
}

Zig note — multi-sequence for. for (out, in1, in2) |*o, a, b| walks three slices together; Zig requires them to be the same length and checks it. *o is a pointer (we write the result), while a and b are read-only copies of each input sample.

Three sine oscillators at the ratio 1 : 1.25 : 1.5 (e.g. 220, 275, 330 Hz) sum to a major chord. But beware: adding can push the total outside $[-1, 1]$, which clips. The fix is to turn things down: volume control.

Math note — headroom. Two full-scale signals sum to ±2.0, double the ceiling. So leave headroom: scale each voice down (the × 0.2–0.3 we used in earlier chapters) or scale the sum by about 1/N for N similar voices. We will quantify “turn down” properly with decibels next.
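The 1/N rule above can be sketched as a small helper. This is a minimal illustration, not code from the original chapter: the name `mixScaled` is hypothetical, and it assumes at least one voice and that every voice is as long as `out`.

```zig
const std = @import("std");

/// Hypothetical helper: sum N voices, then scale the sum by 1/N so that
/// N full-scale inputs cannot push the result outside [-1, 1].
fn mixScaled(voices: []const []const f32, out: []f32) void {
    // Assumption: at least one voice, each the same length as `out`.
    std.debug.assert(voices.len > 0);
    const scale = 1.0 / @as(f32, @floatFromInt(voices.len));

    // Start from silence, then add every voice sample by sample.
    @memset(out, 0.0);
    for (voices) |voice| {
        for (out, voice) |*o, x| o.* += x;
    }

    // Apply the headroom scaling once, on the sum.
    for (out) |*o| o.* *= scale;
}
```

Scaling by 1/N is the safe worst case. Voices that rarely peak together can usually be mixed hotter, which is a judgment call made by ear.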

Changing volume is multiplying by a gain:

for (out, in) |*o, x| o.* = vol * x;

vol = 1 is unity gain (unchanged); vol = 0.5 halves the amplitude. But 0.5 does not sound “half as loud,” because hearing is logarithmic — the pressure at the threshold of pain is about a million times that at the threshold of hearing. So audio level is measured in decibels (dB):

$$\text{dB} = 10\log_{10}\!\left(\frac{P}{P_0}\right) = 20\log_{10}\!\left(\frac{A}{A_0}\right)$$

(Power is amplitude squared, and $\log(A^2) = 2\log A$, which is where the 10 becomes 20.) In digital audio the reference $A_0$ is full scale (1.0), so the scale is called dBFS. Converting a plain gain factor to/from dB:

fn db2lin(g: f32) f32 {
    return std.math.pow(f32, 10.0, g * 0.05); // 0.05 = 1/20
}
fn lin2db(g: f32) f32 {
    return 20.0 * std.math.log10(g);
}

Math note — the 6 dB rule (again). $20\log_{10}(0.5) \approx -6$ dB and $20\log_{10}(2) \approx +6$ dB, so ±6 dB = half/double the amplitude. Halving twice is −12 dB, and so on. Working in dB matches how you hear: equal dB steps feel like equal loudness steps, which is why every fader is calibrated in dB. (Note g * 0.05 is just g / 20 written as a multiply.)
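"Halving twice is −12 dB" is a special case of a general rule: linear gains multiply, so their dB values add. The sketch below restates the two converters so it stands alone and adds `chainDb`, a hypothetical helper that is not part of the original chapter.

```zig
const std = @import("std");

fn db2lin(g: f32) f32 {
    return std.math.pow(f32, 10.0, g * 0.05); // 0.05 = 1/20
}

fn lin2db(g: f32) f32 {
    return 20.0 * std.math.log10(g);
}

/// Hypothetical helper: overall level, in dB, of several gain stages in series.
/// It multiplies the linear gains and converts the product back to dB,
/// which should match simply adding the dB values.
fn chainDb(stages_db: []const f32) f32 {
    var lin: f32 = 1.0;
    for (stages_db) |db| lin *= db2lin(db);
    return lin2db(lin);
}
```

For example, `chainDb(&[_]f32{ -6.0, -6.0 })` comes out at about −12 dB, a linear gain near 0.25.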

A fixed −6 dB volume stage:

const gain = db2lin(-6.0); // ≈ 0.501
for (out, in) |*o, x| o.* = gain * x;

The original receives live volume changes over OSC (Open Sound Control — small UDP messages like /vol f -6 from a phone or tablet). It is essentially HTTP-for-control: an address (/vol) plus typed arguments. A handler updates a global vol whenever a message arrives.

Offline we have no live sender, so a parameter is simply a value we change at known points on the timeline — for example, “−6 dB for the first second, then 0 dB.” The instant we change it, though, we hit the same problem the live version has: a click.
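One way to write down such a timeline is a sorted list of breakpoints. This is only a sketch under assumptions: `Breakpoint` and `gainAt` are hypothetical names, the list is assumed sorted by start sample, and the gain defaults to unity before the first breakpoint.

```zig
const std = @import("std");

/// Hypothetical breakpoint: from sample `start` onward, the level is `db`.
const Breakpoint = struct {
    start: usize,
    db: f32,
};

fn db2lin(g: f32) f32 {
    return std.math.pow(f32, 10.0, g * 0.05);
}

/// Returns the linear gain in force at sample index `n`.
/// Assumes `points` is sorted by `start`.
fn gainAt(points: []const Breakpoint, n: usize) f32 {
    var db: f32 = 0.0; // unity until the first breakpoint applies
    for (points) |p| {
        if (p.start > n) break; // later breakpoints have not started yet
        db = p.db;
    }
    return db2lin(db);
}
```

With breakpoints `{ 0, -6 dB }` and `{ 48000, 0 dB }` at 48 kHz, this yields about 0.501 for the first second and 1.0 afterwards. The value returned is a raw step, so on its own it still has the click problem.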

Jumping vol from one value to another mid-signal creates a step — a discontinuity — in the output. Discontinuities are broadband energy: they click. The cure is to glide the value over a few milliseconds instead of jumping. Two standard ways.

Ramp from the current value to the target over a fixed number of samples (about 2000 samples, ≈ 42 ms at 48 kHz, is enough to kill the click). Split vol into a target and a current, plus a step size and a countdown:

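const VOL_STEPS = 2000; // ramp length in samples
const LinearSmoother = struct {
    curr: f32 = 1.0, // gain applied right now
    target: f32 = 1.0, // gain we are gliding toward
    step: f32 = 0.0, // per-sample increment
    ctr: u32 = 0, // samples left in the ramp

    // Start a new ramp from the current value to g.
    fn setTarget(self: *LinearSmoother, g: f32) void {
        self.target = g;
        self.step = (g - self.curr) / @as(f32, VOL_STEPS);
        self.ctr = VOL_STEPS;
    }

    // Advance one sample and return the gain to use for it.
    fn tick(self: *LinearSmoother) f32 {
        if (self.ctr > 0) {
            self.ctr -= 1;
            self.curr += self.step;
        }
        return self.curr;
    }
};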

Use it per sample: out[i] = smoother.tick() * in[i];.

Math note — a straight line. step = (target − curr) / N is “total distance ÷ number of steps,” so adding step each sample draws a straight line from curr to target over exactly N samples, then stops. Simple and predictable; it always finishes in a known time.
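Putting the pieces together, an offline render loop might look like the sketch below. `renderWithRamp` and its parameters are hypothetical, and the smoother is repeated so the block stands alone. It assumes a single gain change at a known sample index.

```zig
const std = @import("std");

const VOL_STEPS = 2000; // ramp length in samples

const LinearSmoother = struct {
    curr: f32 = 1.0,
    target: f32 = 1.0,
    step: f32 = 0.0,
    ctr: u32 = 0,

    fn setTarget(self: *LinearSmoother, g: f32) void {
        self.target = g;
        self.step = (g - self.curr) / @as(f32, VOL_STEPS);
        self.ctr = VOL_STEPS;
    }

    fn tick(self: *LinearSmoother) f32 {
        if (self.ctr > 0) {
            self.ctr -= 1;
            self.curr += self.step;
        }
        return self.curr;
    }
};

/// Hypothetical render loop: unity gain until sample `change_at`,
/// then a linear glide to `new_gain` over VOL_STEPS samples.
fn renderWithRamp(input: []const f32, output: []f32, change_at: usize, new_gain: f32) void {
    var smoother = LinearSmoother{};
    for (output, input, 0..) |*o, x, i| {
        // The automation event only sets a target; tick() does the gliding.
        if (i == change_at) smoother.setTarget(new_gain);
        o.* = smoother.tick() * x;
    }
}
```

The largest change in gain between neighbouring samples is one `step`, here (0.5 − 1.0) / 2000 for a drop to half amplitude, instead of the full jump.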

There is an even shorter method that you will meet everywhere in DSP. The entire smoother is one line:

const OnePole = struct {
    mem: f32 = 0.0,
    fn tick(self: *OnePole, target: f32) f32 {
        self.mem = 0.001 * target + 0.999 * self.mem;
        return self.mem;
    }
};

Each sample the output takes a tiny step (0.1 %) toward the target and keeps the rest of its old value, so a sudden jump in target barely moves the output: it eases in. Writing $a = 0.999$:

$$y[n] = (1-a)\,x[n] + a\,y[n-1]$$

Math note — why it is an exponential. Suppose the target is held constant at 0 and we start at $y[0] = 1$. Then $y[1] = a$, $y[2] = a^2$, and in general $y[n] = a^{\,n}$. That is exponential decay. For a general jump from $A$ to $B$, the same recurrence gives $y[n] = B + (A-B)\,a^{\,n}$: it heads toward $B$, covering a fixed fraction of the remaining distance each sample (like a cooling cup of coffee). Smaller a → faster glide; larger a → gentler. If clicks remain, raise a toward 1.
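To put numbers on the decay, solve $a^{\,n} = r$ for the sample count $n$ at which a fraction $r$ of the jump remains. A worked example for $a = 0.999$, assuming a 48 kHz sample rate:

$$
\begin{aligned}
a^{\,n} = e^{-1} &\;\Rightarrow\; n = \frac{-1}{\ln 0.999} \approx 1000 \text{ samples} \approx 21 \text{ ms},\\
a^{\,n} = 0.01 &\;\Rightarrow\; n = \frac{\ln 0.01}{\ln 0.999} \approx 4603 \text{ samples} \approx 96 \text{ ms}.
\end{aligned}
$$

So the glide covers about 63 % of the distance in roughly 21 ms and is within 1 % of the target after roughly 96 ms.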

Zig note — one struct, reused everywhere. This OnePole is the single most copy-pasted object in practical DSP — smooth a gain, a cutoff, a pan, a delay time, anything. Make one per parameter. (The original writes the constants as double on purpose, a quick trick to dodge denormal slowdowns; a cleaner fix is its own bonus chapter.)

Sample-rate independence. $y[n] = a^n$ is counted in samples, so the same a glides twice as fast at 96 kHz as at 48 kHz. To fix the time of the glide, compute a from a time constant and the sample rate, exactly the tool the next chapter introduces.
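A common way to do this uses the relation $a = e^{-1/(\tau f_s)}$, where $\tau$ is the time in seconds for the output to cover about 63 % of a jump. The sketch below assumes that relation. `coeffFromTime` and `TimedOnePole` are hypothetical names, and the next chapter may present the idea differently.

```zig
const std = @import("std");

/// Hypothetical helper: one-pole coefficient for a time constant `tau_seconds`
/// at `sample_rate` Hz, using a = exp(-1 / (tau * fs)).
fn coeffFromTime(tau_seconds: f32, sample_rate: f32) f32 {
    return @exp(-1.0 / (tau_seconds * sample_rate));
}

/// Hypothetical variant of the one-pole smoother whose coefficient is
/// derived from a time constant instead of being hard-coded.
const TimedOnePole = struct {
    a: f32,
    mem: f32 = 0.0,

    fn init(tau_seconds: f32, sample_rate: f32) TimedOnePole {
        return .{ .a = coeffFromTime(tau_seconds, sample_rate) };
    }

    fn tick(self: *TimedOnePole, target: f32) f32 {
        // Same recurrence as before: y[n] = (1 - a) x[n] + a y[n-1].
        self.mem = (1.0 - self.a) * target + self.a * self.mem;
        return self.mem;
    }
};
```

With τ ≈ 20.8 ms at 48 kHz this gives a ≈ 0.999, the constant used above. At 96 kHz the same τ gives a value closer to 1, so the glide takes the same time in seconds.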

Slider scaling (cube it). A linear-in-dB fader spends as much travel on inaudible −60 dB as on the useful region near 0 dB. A common fix maps a linear slider position 0..1 through a cube:

fn sliderToGain(pos: f32) f32 {
    return pos * pos * pos; // amp = position³, more resolution near unity
}

Cubing bunches the fine control near the top (unity gain) and stretches out the quiet end, matching how hardware faders feel. PulseAudio and OBS both use this.

The value never quite arrives. Floating-point steps eventually get too small to change mem, so the one-pole can stall just short of the target. A pragmatic guard: if mem stopped changing, snap it to the target.

// A method of OnePole: place it inside the struct next to tick.
fn tickSnap(self: *OnePole, target: f32) f32 {
    const prev = self.mem;
    self.mem = 0.001 * target + 0.999 * self.mem;
    if (self.mem == prev) self.mem = target; // resolution reached → snap
    return self.mem;
}

  1. Sum three sines at 1 : 1.25 : 1.5, scale by 1/3, and write a chord. Remove the 1/3 and listen for clipping.
  2. Render a tone whose gain jumps −∞ → 0 dB at the halfway point, once raw (click) and once through OnePole (smooth). Hear the difference.
  3. Print lin2db(pos³) for pos = 0, 0.25, 0.5, 0.75, 1.0 and see how cube scaling spaces the dB values.
  4. Compare a = 0.999 vs a = 0.99 in OnePole: which glides faster, and why (think $a^n$)?

The one-pole smoother is also called a leaky integrator or single-pole low-pass filter; the time-domain view here (giving an explicit $a^n$ formula) is the most intuitive. Original chapter with the full OSC/liblo plumbing and decay plots: mu.krj.st/mix. For the filter perspective, see Julius O. Smith, “One-Pole,” and The Scientist and Engineer’s Guide to DSP, ch. 19.


Next: 6 · adsr — envelopes, and making that smoother sample-rate independent.