Non-music is an umbrella category for recorded audio whose primary purpose is not musical performance. It encompasses spoken word, speeches, interviews, poetry readings, comedy, audio documentaries, instructional recordings, field recordings, sound effects, and other forms of organized sound meant to inform, narrate, document, or entertain without relying on melody or conventional song structure.
Rather than emphasizing harmony, rhythm, and instrumentation, non-music foregrounds voice, text, ambient sound, narrative flow, informational content, and sonic texture. Its aesthetics range from raw, unedited actuality to highly produced studio works, and its scope spans archival preservation, education, performance art, and mass entertainment.
With the advent of commercial cylinders and discs in the late 19th and early 20th centuries, record companies issued many non-musical titles: speeches, comic monologues, recitations, language lessons, and dramatic sketches. These releases used the phonograph and gramophone as media for documentation and education as much as entertainment, laying the foundation for a category distinct from music.
As radio matured, its scripted drama and reportage shaped recorded audio. Labels pressed educational records, self-help and language LPs, and curated collections of historic speeches. Poets and authors began issuing live and studio readings, while comedy albums turned stand-up into a popular recorded form.
Portable recorders encouraged field recording, sound effects libraries, and ethnographic documentation. Performance art and conceptual practices embraced voice, tape, and location sound, blurring lines with sound art and spoken word. Meanwhile, comedy, interviews, and instructional materials flourished on LP and cassette, later migrating to CD with expansive archival and effects compilations.
Digital distribution enabled an explosion of non-music formats: long-form interviews, audio documentaries, ASMR, lecture series, archival transfers, and curated field recordings. Production values diversified, from ultra-clean studio dialogue to intentionally raw actuality, while metadata and searchability made libraries of speech, ambience, and effects integral to media production.
Decide whether the piece is a speech, interview, documentary segment, comedy monologue, poetry reading, field recording, or sound-effects construction. This determines script depth, recording environment, and post-production needs.
Write for the ear: use clear structure, signposting, and concise sentences. Coach vocal delivery for pacing, diction, dynamics, and microphone technique. For interviews, prepare open-ended prompts and allow space for natural pauses and follow-ups.
Choose a mic suited to the voice or environment (e.g., a dynamic broadcast mic for studio speech, a shotgun or mid–side setup for location, lavaliers for mobility). Capture clean room tone and minimize HVAC and handling noise. Record at 24-bit depth and 48 kHz or higher; the 24-bit depth lets you set conservative input levels, leaving headroom for unexpected peaks while keeping the noise floor low enough for dialogue editing.
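The relationship between bit depth and headroom can be made concrete with a little arithmetic. The sketch below is illustrative only: it uses the standard approximation for ideal linear PCM (about 6.02 dB per bit plus 1.76 dB for a full-scale sine wave) and assumes samples are floats normalized to the range -1.0 to 1.0. The function names are hypothetical and not drawn from any particular audio library.

```python
import math
from typing import Sequence


def theoretical_dynamic_range_db(bit_depth: int) -> float:
    """Approximate dynamic range of ideal linear PCM at the given bit depth.

    Uses the textbook figure of 6.02 dB per bit plus 1.76 dB
    (full-scale sine wave versus quantization noise).
    """
    return 6.02 * bit_depth + 1.76


def peak_dbfs(samples: Sequence[float]) -> float:
    """Peak level in dBFS, assuming samples are normalized to [-1.0, 1.0]."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return float("-inf")  # digital silence has no finite peak level
    return 20 * math.log10(peak)


def headroom_db(samples: Sequence[float]) -> float:
    """Distance in dB between the recorded peak and full scale (0 dBFS)."""
    return -peak_dbfs(samples)
```

Under this approximation, 24-bit capture offers roughly 48 dB more theoretical range than 16-bit, which is why a take peaking well below full scale can still sit comfortably above the quantization noise floor. Real converters and preamps fall short of the theoretical figure.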
Edit for clarity and narrative flow: remove redundancies and tighten pauses while preserving natural speech rhythm. Use fades, crossfades, and gentle dynamics control; apply EQ to enhance intelligibility (e.g., presence boost around 3–5 kHz) and de-ess sibilance. Maintain consistent loudness across segments.
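As one possible illustration of the crossfades mentioned above, the following sketch joins two mono clips with an equal-power crossfade using only the standard library. It assumes both clips are sequences of float samples at the same sample rate; the function name and the overlap parameter are hypothetical. Equal-power curves are one common choice for unrelated material such as two different takes or ambiences; a linear fade may suit highly similar material better.

```python
import math
from typing import List, Sequence


def equal_power_crossfade(
    outgoing: Sequence[float],
    incoming: Sequence[float],
    overlap: int,
) -> List[float]:
    """Join two clips, blending the last `overlap` samples of `outgoing`
    with the first `overlap` samples of `incoming`.

    The gains follow cosine and sine curves, so the squared gains always
    sum to 1 and perceived level stays steady for uncorrelated material.
    """
    if overlap <= 0 or overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap must be between 1 and the shorter clip length")

    fade_start = len(outgoing) - overlap
    result = list(outgoing[:fade_start])  # untouched part of the first clip

    for i in range(overlap):
        # Position within the fade, sampled at the centre of each step.
        t = (i + 0.5) / overlap
        gain_out = math.cos(t * math.pi / 2)
        gain_in = math.sin(t * math.pi / 2)
        result.append(outgoing[fade_start + i] * gain_out + incoming[i] * gain_in)

    result.extend(incoming[overlap:])  # untouched part of the second clip
    return result
```

For speech edits the overlap is typically short, on the order of a few milliseconds to a few tens of milliseconds of samples, so the join is inaudible without smearing consonants; the right length depends on the material.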
Augment with ambient beds, archival clips, or foley only when they add context and do not distract from the voice. For field recordings and effects, document location, date, mic setup, and conditions; avoid overprocessing unless aesthetics demand it.
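The documentation fields listed above (location, date, mic setup, conditions) can be kept in a small sidecar file next to each recording. The sketch below is one hypothetical layout, not an established standard: the class name, field names, and JSON format are all assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class FieldRecordingNote:
    """Hypothetical sidecar record for a single field recording."""

    filename: str
    location: str
    recorded_on: str  # ISO 8601 date, e.g. "2024-05-01"
    mic_setup: str
    conditions: str
    # Any processing applied after capture, listed in order.
    processing: List[str] = field(default_factory=list)


def to_sidecar_json(note: FieldRecordingNote) -> str:
    """Serialize the note as human-readable JSON."""
    return json.dumps(asdict(note), indent=2, ensure_ascii=False)


def from_sidecar_json(text: str) -> FieldRecordingNote:
    """Rebuild a note from JSON produced by to_sidecar_json."""
    return FieldRecordingNote(**json.loads(text))
```

Listing processing steps explicitly makes it easy to tell later whether a file is raw actuality or has been altered, which matters for archival and documentary use.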
Normalize to a dialogue loudness target appropriate for the medium; for example, around -16 LUFS is a common recommendation for podcasts, while EBU R 128 sets -23 LUFS for broadcast. Include thorough metadata (speakers, topics, locations, rights) and transcripts for accessibility. Sequence tracks to support a coherent arc, even in non-narrative compilations.
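Once a program's integrated loudness has been measured, the normalization step reduces to simple gain arithmetic. The sketch below assumes the measurement comes from an external loudness meter (measuring LUFS properly requires an ITU-R BS.1770 style meter, which is not implemented here). The function names, the target value in the usage note, and the -1 dBFS default peak ceiling are illustrative assumptions.

```python
def gain_to_target_db(measured_lufs: float, target_lufs: float) -> float:
    """Gain in dB needed to move a program from its measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


def would_exceed_ceiling(
    peak_dbfs: float, gain_db: float, ceiling_dbfs: float = -1.0
) -> bool:
    """Check whether applying the gain pushes the peak above the ceiling.

    The -1.0 dBFS default is an assumed safety margin, not a universal rule.
    """
    return peak_dbfs + gain_db > ceiling_dbfs
```

For instance, a program measured at -20 LUFS needs 4 dB of gain to reach an assumed -16 LUFS target; if its peak already sits at -3 dBFS, that gain would push the peak past a -1 dBFS ceiling, so limiting or a lower target would be needed first.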