Working with multi-channel audio-signals

Today's digital mixing consoles offer a high number of audio channels for processing. I was curious how modern devices deal with real-time digital audio data and had a look into some of the protocols they use.

I2S, TDM and UltraNet

Even if you don't work with professional audio equipment, you have probably used at least one of the following systems: 5.1, 7.1 (surround) or up to 11.1.4 audio (Dolby Atmos) in your living room's AVR. Here 6 to 16 audio channels are embedded in a compressed Dolby Digital or DTS audio stream. As typical audio mixing consoles deal with even more digital channels, a compressed data stream would have some drawbacks besides the additional computational load. So, most of the time, uncompressed PCM samples with 16 or 24 bit are transferred at sample rates of 44.1 kHz to 96 kHz. A single mono signal at 48 kHz and 24 bit already amounts to a raw data rate of 1.152 Mbit/s, which then has to be embedded in a protocol suitable for the individual connection.
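The 1.152 Mbit/s figure is simply sample rate times word length. As a rough worked calculation (raw payload only, any protocol framing comes on top), the same arithmetic scaled to 16 channels looks like this:

```latex
\begin{align*}
R_{\text{mono}} &= 48\,000\ \tfrac{\text{samples}}{\text{s}} \times 24\ \tfrac{\text{bit}}{\text{sample}}
                 = 1\,152\,000\ \tfrac{\text{bit}}{\text{s}} = 1.152\ \text{Mbit/s} \\
R_{16\,\text{ch}} &= 16 \times R_{\text{mono}} = 18.432\ \text{Mbit/s}
\end{align*}
```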

I2S is a common interface for audio ADCs. In the picture above the important signals can be seen: a clock signal (bit clock), an identifier for left and right (word clock) and the data itself. TDM extends this principle to 8 or more channels on the same data line, with the word clock marking the start of each transmission.

Besides these signals with separate data and clock lines, there is a more compact signal type that combines clock and data information in one single signal: the Sony/Philips Digital Interface (S/PDIF) and its closely related professional counterpart AES3 (also known as AES/EBU). The information of I2S is encoded in a single signal using biphase mark code, a form of differential Manchester encoding. Behringer built its UltraNet signal on AES3, as it was specified for more than 2 channels. Using two AES3 streams, each carrying 4 stereo pairs, UltraNet transmits 16 audio channels with 24 bit at 48 kHz.
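To make the encoding a bit more tangible, here is a minimal sketch of a biphase mark encoder: the line level toggles at every bit boundary and toggles once more in the middle of the bit cell when the data bit is '1'. The entity name, the port names and the 2x clock are assumptions for illustration only. This is not taken from the UltraNet receiver, and it does not generate the AES3 preambles, which deliberately break this rule.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity bmc_encoder is
  port (
    clk2x   : in  std_logic; -- hypothetical clock running at twice the bit rate
    data_in : in  std_logic; -- bit to encode, held stable for two clk2x cycles
    bmc_out : out std_logic  -- biphase mark encoded line signal
  );
end entity bmc_encoder;

architecture rtl of bmc_encoder is
  signal phase : std_logic := '0'; -- '0' = first half of the bit cell, '1' = second half
  signal level : std_logic := '0'; -- current line level
begin
  encode : process(clk2x)
  begin
    if rising_edge(clk2x) then
      if phase = '0' then
        level <= not level;   -- always a transition at the bit boundary
      elsif data_in = '1' then
        level <= not level;   -- additional mid-bit transition encodes a '1'
      end if;
      phase <= not phase;     -- alternate between the two halves of the bit cell
    end if;
  end process;

  bmc_out <= level;
end architecture rtl;
```

Because only the transitions carry the information, the receiver can recover both clock and data from the one signal, and the polarity of the line does not matter.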

Implementation

Receiving I2S (stereo) or UltraNet (8 channels per connection) in VHDL is straightforward with three processes, all clocked by a 100 MHz clock:

Detect the edges of the bitclock-signal


detect_edge : process(clk)
begin
  if rising_edge(clk) then
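    zbclk <= bclk; -- remember the bit-clock level for the next clk cycle
    if (zbclk = '0' and bclk = '1') then
      -- previous level low, current level high: rising edge of the bit-clock
      bclk_pos_edge <= '1';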
    else
      bclk_pos_edge <= '0';
    end if;
  end if;
end process;
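One practical detail the process above leaves open: the bit clock usually originates outside the 100 MHz clock domain. A common precaution is to pass it through two flip-flops before looking for edges. The following is a minimal sketch of that idea, not the code used in the original receiver; the entity name and the `sync` register are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity bclk_edge_detect is
  port (
    clk           : in  std_logic; -- 100 MHz system clock
    bclk          : in  std_logic; -- bit clock, asynchronous to clk
    bclk_pos_edge : out std_logic  -- high for one clk cycle per rising bclk edge
  );
end entity bclk_edge_detect;

architecture rtl of bclk_edge_detect is
  -- sync(0) and sync(1) form the two-stage synchronizer,
  -- sync(2) keeps the previous synchronized value for the comparison
  signal sync : std_logic_vector(2 downto 0) := (others => '0');
begin
  detect_edge : process(clk)
  begin
    if rising_edge(clk) then
      sync <= sync(1 downto 0) & bclk; -- shift the pin value through the chain
      if sync(2) = '0' and sync(1) = '1' then
        bclk_pos_edge <= '1'; -- older value low, newer value high: rising edge
      else
        bclk_pos_edge <= '0';
      end if;
    end if;
  end process;
end architecture rtl;
```

The strobe arrives a few clk cycles later than with the direct comparison, so the data and word-clock inputs would need the same delay to stay aligned with it.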

Read the individual data bits into a shift register on each rising edge of the bit clock


get_data : process(clk)
begin
  if rising_edge(clk) then
    if (bclk_pos_edge = '1') then
      -- in AES3 (AES/EBU) the first bit after the preamble is the LSB,
      -- so new bits enter at the left and shift towards the right
      sample_data <= sdata & sample_data(sample_data'high downto 1);
    end if;
  end if;
end process;

Copy the data to the output register on every rising or falling edge of the word clock


detect_sample : process(clk)
begin
  if rising_edge(clk) then
    zwclk <= wclk; -- save current word-clock
    if (wclk /= zwclk) then
      -- rising or falling edge of word-clock
      
      -- advance channel-counter (should be reset on Z-Preamble of AES3-signal)
      if (chn_cnt < 7) then
        chn_cnt <= chn_cnt + 1; -- next channel on each wordclock-edge (rising or falling)
      else
        chn_cnt <= 0; -- reset channel-counter
      end if;
      
      -- set output-signals
      sample_out <= sample_data(23 downto 0); -- copy 24-bit audio-sample to output
      channel_out <= to_unsigned(chn_cnt, channel_out'length); -- output current channel-counter
      sync_out <= '1'; -- set sync-output
    else
      sync_out <= '0';
    end if;
  end if;
end process;

From then on, we have the audio data as 24-bit logic vectors within the FPGA and can do whatever we want with it. At a sample rate of 48 kHz, each channel's sample vector is updated every 20.833 µs (1/48 kHz).

I2S carries only 2 channels, so the word clock directly tells us whether a sample belongs to the left or the right channel. Behringer's UltraNet, in contrast, encodes all 8 channels of one connection as a pseudo-192 kHz AES3 stream, which decodes to pseudo-192 kHz I2S data. Instead of alternating left/right samples, we receive consecutive channel pairs: ch1/ch2, ch3/ch4, ch5/ch6 and ch7/ch8, with the word clock alternating within each pair. So you have to keep a channel counter to assign each sample to the right channel. The Z preamble of the AES3 signal can be used to detect channel 1 and reset the channel counter.
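The counter reset could look roughly like the sketch below. It assumes that the AES3 decoder provides a one-cycle strobe whenever it has seen a Z preamble, and that this strobe arrives before the word-clock edge that completes the sample of channel 1. The `z_preamble` port and the entity name are hypothetical and not part of the listings above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity channel_counter is
  port (
    clk         : in  std_logic;            -- 100 MHz system clock
    wclk        : in  std_logic;            -- word clock from the AES3 decoder
    z_preamble  : in  std_logic;            -- hypothetical strobe: '1' for one clk cycle per Z preamble
    channel_out : out unsigned(2 downto 0); -- 0..7 corresponds to ch1..ch8
    sync_out    : out std_logic             -- '1' for one clk cycle when channel_out is updated
  );
end entity channel_counter;

architecture rtl of channel_counter is
  signal zwclk   : std_logic := '0';
  signal chn_cnt : integer range 0 to 7 := 0;
begin
  count_channels : process(clk)
  begin
    if rising_edge(clk) then
      zwclk    <= wclk; -- remember the word clock for edge detection
      sync_out <= '0';  -- default, overridden below on a word-clock edge

      if wclk /= zwclk then
        -- rising or falling edge: one sample is complete
        channel_out <= to_unsigned(chn_cnt, channel_out'length);
        sync_out    <= '1';
        if chn_cnt < 7 then
          chn_cnt <= chn_cnt + 1;
        else
          chn_cnt <= 0;
        end if;
      end if;

      -- the Z preamble marks channel 1, so it re-aligns the counter;
      -- as the last assignment in the process it wins over the increment
      if z_preamble = '1' then
        chn_cnt <= 0;
      end if;
    end if;
  end process;
end architecture rtl;
```

AES3 sends the Z preamble only once per block of 192 frames, so between two resets the counter runs freely and simply wraps from 7 back to 0.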

Youtube-Video

In October 2023 I published a video about this UltraNet receiver on YouTube. In that episode I bring a broken Behringer P16-I back to life to build an FPGA-based audio mixer.

Projects where I used this receiver

I've implemented a working UltraNet receiver in several of my projects on GitHub. You can have a look at the UltraNet-Receiver project directly. There I'm using the Arduino MKR Vidor4000 to decode a 3.3 V UltraNet signal into 16 individual audio channels and mix them into a stereo signal for output. In the Audioplayer project, I'm using this receiver to get additional audio channels into my audio mixer, next to the stereo ADC and an SD-card audio stream. The receiver could also be added to the FPGA-Audio-over-IP-Sender to send all 16 UltraNet audio channels to a specific computer for recording, live streaming or something totally different.

Chris.Dev.Blog
