Audio waveform visualizations with WebAudio API for Mic Input

In this article, we look at what the WebAudio API is and how to use it to build an audio waveform chart for mic input. The focus is mainly on audio waveforms.

Let’s get into it…

To start with,

What is the WebAudio API?

The Web Audio API is a high-level JavaScript API for processing and synthesizing audio in web applications. This API provides a powerful and versatile system for controlling audio on the Web, and it allows developers to choose audio sources, add effects to audio, create audio visualizations, apply spatial effects (such as panning) and much more.

What will this article cover?

The WebAudio API provides features for handling audio on the web. One of the most interesting is the ability to extract frequency, waveform, and other data from an audio source, which can then be used to create visualizations. This article explains how the WebAudio API works and walks through a basic use case, much like a tutorial. We will focus on using the WebAudio API for mic input.

WebAudio concepts:

The WebAudio API involves handling audio operations inside an audio context, and has been designed to allow modular routing. Basic audio operations are performed with audio nodes, which are linked together to form an audio routing graph. Several sources — with different types of channel layout — are supported even within a single context. This modular design provides the flexibility to create complex audio functions with dynamic effects.
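As a rough illustration of the routing graph described above, the sketch below links a list of nodes in order using AudioNode.connect(). The helper name connectChain is hypothetical and not part of the WebAudio API; it only assumes that each entry has a connect() method, as audio nodes do.

```javascript
// Hypothetical helper: connect each node to the next one in the list,
// forming a simple linear routing graph (first node -> ... -> last node).
function connectChain(nodes) {
  for (var i = 0; i < nodes.length - 1; i++) {
    nodes[i].connect(nodes[i + 1]);
  }
  // Return the last node so callers can keep extending the graph.
  return nodes[nodes.length - 1];
}

// Browser usage (assumes an AudioContext `ctx` and a mic `stream` exist):
//   var source = ctx.createMediaStreamSource(stream);
//   var gain = ctx.createGain();
//   var analyser = ctx.createAnalyser();
//   connectChain([source, gain, analyser]);
```

Keeping the wiring in one place like this makes it easier to see the shape of the graph at a glance, which helps once more nodes are added.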

A simple, typical workflow for web audio looks as follows:

[Figure: typical Web Audio API workflow]

The WebAudio API has a lot of interfaces for controlling audio functionality and responding to audio events.

For more information you can check this link: https://developer.mozilla.org/en-US/docs/Web/API/Web_Audio_API

Here we are going to use the AudioContext to connect our audio source, the mic input.

So, what is AudioContext?

The AudioContext interface represents an audio-processing graph built from audio modules linked together, each represented by an AudioNode. An audio context controls both the creation of the nodes it contains and the execution of the audio processing, or decoding. You need to create an AudioContext before you do anything else, as everything happens inside a context.
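A minimal sketch of creating a context is shown below. The helper names createAudioContext and ensureRunning are hypothetical. The sketch assumes that older WebKit-based browsers expose the constructor as webkitAudioContext, and that a browser may start a context in the "suspended" state until the user interacts with the page.

```javascript
// Hypothetical helper: create an audio context from a global object
// (pass `window` in the browser). Returns null when unsupported.
function createAudioContext(globalObj) {
  var Ctor = globalObj.AudioContext || globalObj.webkitAudioContext;
  return Ctor ? new Ctor() : null;
}

// Hypothetical helper: if the context is suspended, ask it to start
// running. Typically called from a click or key handler.
function ensureRunning(ctx) {
  if (ctx && ctx.state === 'suspended' && typeof ctx.resume === 'function') {
    return ctx.resume();
  }
  return Promise.resolve();
}

// Browser usage:
//   var aCtx = createAudioContext(window);
//   startButton.onclick = function () { ensureRunning(aCtx); };
```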

Let us look at an example:

Before we can extract data, we need access to the user's media input (a mic, for example). For that:


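// Called when access is denied or no input device is available.
function error(err) {
  alert('Stream generation failed: ' + err);
}

function getUserMedia(dictionary, callback) {
  try {
    // Callback-based, prefixed variants for older browsers. Note that
    // navigator.getUserMedia is deprecated in favor of
    // navigator.mediaDevices.getUserMedia.
    navigator.getUserMedia =
      navigator.getUserMedia ||
      navigator.webkitGetUserMedia ||
      navigator.mozGetUserMedia;
    navigator.getUserMedia(dictionary, callback, error);
  } catch (e) {
    alert('getUserMedia threw exception: ' + e);
  }
}

// Request the mic with Chrome's legacy constraints, which switch off
// the browser's built-in audio processing.
getUserMedia(
  {
    "audio": {
      "mandatory": {
        "googEchoCancellation": "false",
        "googAutoGainControl": "false",
        "googNoiseSuppression": "false",
        "googHighpassFilter": "false"
      },
      "optional": []
    }
  },
  gotStream
);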

getUserMedia requires a secure context, so it won't work on a page served over plain HTTP. Therefore, we should serve the page over HTTPS.
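For current browsers, the same request can be sketched with the promise-based navigator.mediaDevices.getUserMedia() and the standard constraint names. The helper names rawMicConstraints and requestMic are hypothetical. The sketch leaves out any equivalent of googHighpassFilter, because there is no standard constraint of that kind, and a browser may ignore constraints it does not support.

```javascript
// Standard constraint names that ask the browser to leave the mic
// signal unprocessed.
function rawMicConstraints() {
  return {
    audio: {
      echoCancellation: false,
      autoGainControl: false,
      noiseSuppression: false
    }
  };
}

// Hypothetical wrapper: `mediaDevices` is expected to be
// navigator.mediaDevices; onStream receives the MediaStream.
function requestMic(mediaDevices, onStream, onError) {
  if (!mediaDevices || typeof mediaDevices.getUserMedia !== 'function') {
    onError(new Error('getUserMedia is not available (is the page served over HTTPS?)'));
    return;
  }
  mediaDevices.getUserMedia(rawMicConstraints()).then(onStream, onError);
}

// Browser usage:
//   requestMic(navigator.mediaDevices, gotStream, function (e) { alert(e); });
```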

To extract audio data, we need to create an AnalyserNode, using the AudioContext.createAnalyser() method:


var aCtx = new (window.AudioContext || window.webkitAudioContext)();
var analyser = aCtx.createAnalyser();

To connect the audio source, we use the createMediaStreamSource() method of the AudioContext.


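// Kept outside the function so the source node stays referenced.
var source;

function gotStream(stream) {
  // Wrap the mic stream in a source node and feed it to the analyser.
  source = aCtx.createMediaStreamSource(stream);
  source.connect(analyser);
  // The analyser works without being connected to an output, which also
  // avoids playing the mic back through the speakers.
  updatePitch();
}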

The analyser node captures audio data using a Fast Fourier Transform (FFT). The size of the transform window is set by the AnalyserNode.fftSize property (if no value is specified, the default is 2048). AnalyserNode.frequencyBinCount is always half of fftSize.
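The relationship between these values can be sketched with a few small functions. The function names are hypothetical; the rules they encode come from the Web Audio specification, which requires fftSize to be a power of two between 32 and 32768.

```javascript
// fftSize must be a power of two between 32 and 32768 (inclusive);
// assigning any other value to AnalyserNode.fftSize throws.
function isValidFftSize(n) {
  return Number.isInteger(n) && n >= 32 && n <= 32768 && (n & (n - 1)) === 0;
}

// frequencyBinCount is always half of fftSize.
function binCount(fftSize) {
  return fftSize / 2;
}

// Width of each frequency bin in Hz for a given sample rate.
function binWidthHz(sampleRate, fftSize) {
  return sampleRate / fftSize;
}

// Example: at a 48000 Hz sample rate with fftSize 2048 there are
// 1024 bins, each 23.4375 Hz wide.
```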

To capture audio data, we can use AnalyserNode.getFloatFrequencyData() and AnalyserNode.getByteFrequencyData() for frequency data, and AnalyserNode.getByteTimeDomainData() and AnalyserNode.getFloatTimeDomainData() for waveform data. Each method copies the data into the array you pass in: a Uint8Array for the byte methods and a Float32Array for the float methods. Here we are going to use AnalyserNode.getFloatTimeDomainData(), so we copy the data into a Float32Array.


analyser.fftSize = 2048;
var bufferLength = analyser.frequencyBinCount;
var dataArray = new Float32Array(bufferLength);

To retrieve the data and copy it into our array, we call the data collection method with the array passed as its argument. For example:


analyser.getFloatTimeDomainData(dataArray);
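The byte and float waveform methods describe the same signal at different scales. As a sketch, based on how the Web Audio specification defines the byte form (the float sample scaled and offset so that 128 represents silence), the conversion looks roughly like this. The function names are hypothetical.

```javascript
// Convert a byte sample (0..255, where 128 is silence) to the float
// range used by getFloatTimeDomainData().
function byteToFloat(b) {
  return (b - 128) / 128;
}

// Convert a float sample to the byte form: floor(128 * (1 + x)),
// clamped to the 0..255 range.
function floatToByte(x) {
  var v = Math.floor(128 * (1 + x));
  return Math.max(0, Math.min(255, v));
}
```

The float form keeps more precision, which is one reason to prefer it when the values feed further calculations rather than going straight to the screen.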

We now have the audio data. To visualize it, we need an HTML5 canvas element.
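Before drawing, each sample has to be turned into a canvas coordinate. The sketch below shows one way to do that. The function names are hypothetical, and the mapping is the same form as the 128 + (sample * 128) expression used in the drawing code later in this article, which assumes a canvas 256 pixels tall.

```javascript
// Map a float sample (-1..1) to a y coordinate on a canvas of the
// given height; a sample of 0 lands on the vertical center.
function sampleToY(sample, height) {
  var half = height / 2;
  return half + sample * half;
}

// Convert a run of samples into points spread evenly across the width.
function samplesToPoints(samples, width, height) {
  var step = samples.length > 1 ? width / (samples.length - 1) : 0;
  var points = [];
  for (var i = 0; i < samples.length; i++) {
    points.push({ x: i * step, y: sampleToY(samples[i], height) });
  }
  return points;
}
```

Separating the arithmetic from the canvas calls makes it possible to check the mapping without a browser.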

 

Let’s look at the example below:

Creating a waveform with live input:

To create an oscilloscope visualisation, we first need to set up the buffer.


analyser.fftSize = 2048;
var bufferLength = analyser.frequencyBinCount;
// getFloatTimeDomainData() needs a Float32Array.
var dataArray = new Float32Array(bufferLength);

Next, we need to get the canvas and clear it:


var wavecanvas = document.getElementById("wavecanvas");
var canvasCtx = wavecanvas.getContext("2d");

canvasCtx.clearRect(0, 0, wavecanvas.width, wavecanvas.height);

We now define the updatePitch() function:


function updatePitch() {
  ......
  ......
}

We use requestAnimationFrame() to keep looping the drawing:


updatePitchVisual = requestAnimationFrame(updatePitch);
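The value returned by requestAnimationFrame() can be passed to cancelAnimationFrame() to stop the loop. A minimal sketch of a loop that can be stopped is shown below. The helper name startLoop and its raf and caf parameters are hypothetical, added only so the scheduling functions can be swapped out.

```javascript
// Hypothetical helper: runs drawFrame once per animation frame until the
// returned stop() function is called. `raf` and `caf` default to the
// browser's requestAnimationFrame and cancelAnimationFrame.
function startLoop(drawFrame, raf, caf) {
  raf = raf || function (cb) { return requestAnimationFrame(cb); };
  caf = caf || function (id) { cancelAnimationFrame(id); };
  var handle = null;
  var running = true;

  function tick() {
    if (!running) return;
    drawFrame();
    handle = raf(tick);
  }

  handle = raf(tick);

  return function stop() {
    running = false;
    caf(handle);
  };
}

// Browser usage:
//   var stop = startLoop(drawWave);   // drawWave is your drawing function
//   stopButton.onclick = stop;
```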

Here, updatePitch() analyses the audio and draws the wave line.


// Drawing state shared across frames.
var j = 0;       // number of samples drawn so far
var prex = 0;    // x of the previously drawn point
var prey = 128;  // y of the previously drawn point (128 = silence)
var updatePitchVisual;

function updatePitch() {
  // Schedule the next frame so the loop keeps running.
  updatePitchVisual = requestAnimationFrame(updatePitch);
  // Copy the current waveform (floats in the range -1..1) into dataArray.
  analyser.getFloatTimeDomainData(dataArray);
  canvasCtx.strokeStyle = "red";
  canvasCtx.lineWidth = 2;
  // Plot every second sample of the first 512, 5px apart.
  for (var i = 0; i < 512; i += 2) {
    var x = j * 5;
    var y = 128 + (dataArray[i] * 128);
    if (wavecanvas.width < x) {
      // Past the right edge: shift the existing image 5px to the left
      // and draw the new segment at the right edge.
      x = wavecanvas.width - 5;
      var previousImage = canvasCtx.getImageData(5, 0, wavecanvas.width, wavecanvas.height);
      canvasCtx.putImageData(previousImage, 0, 0);
      prex = prex - 5;
    }
    // Draw a segment from the previous point to the new one.
    canvasCtx.beginPath();
    canvasCtx.moveTo(prex, prey);
    canvasCtx.lineTo(x, y);
    canvasCtx.stroke();
    prex = x;
    prey = y;
    j++;
  }
}

The output looks like this:

[Figure: waveform output]

And that’s it! We have created an audio waveform visualization in the browser for audio captured from a mic. You can try it out with the code linked below:

GitHub repository for the example: https://github.com/agiratech-mani/picth-liveinput.git

I hope this has been useful. In the next post, we shall see how to do the same with audio files (recorded audio as input) and display waveform visualisations for them.