Digital Audio: Part 13 - Disk-based Audio
The random-access characteristic of the disk drive made it attractive for audio editing, and when drive prices fell as computers became popular, the attraction grew even stronger.
Compared to a traditional audio medium, a hard disk drive is completely alien. A traditional medium has a beginning and an end, and a speed. If the medium is propelled at the right speed from the beginning, the sound comes out right.
In contrast the hard drive doesn't have the equivalent of the speed of a traditional medium. Whilst it spins at some nominal rpm, the speed is not closely controlled and it has nothing to do with the time base inherent in any audio data stored there. The high speed of the disk is intended to shorten the time it takes for a wanted block of data to present itself under the heads. The heads sit on an air film and are not in contact with the disk. As a result there is no wear mechanism and disk drives can stay on line for extended periods.
The hard drive came into being in the early days of computing when memory was very expensive and the only alternative was tape, which was too slow. The disk fits somewhere in between. The disk has its entire data surface exposed. The surface is broken up into concentric rings, each of which is divided into a number of data blocks. Early drives had one head per track, and finding a specific block was a matter of selecting the correct head and waiting for the block to pass under it, a process known as a search.
It was found to be much more cost-effective to have a single head per surface that could be moved to the required track by a positioner in a process called a seek. A combination of a seek and a search will locate any block on the disk and blocks can be retrieved in any order.
Fig.1 - The data retrieval from a hard drive is interrupted by gaps between blocks and by longer gaps due to head positioning. A substantial area of memory is used as a buffer to allow a steady flow of audio data in real time.
The hard drive is best described as a rotating block-based random-access memory. In order to get a block of data from a hard drive, it must be addressed. A specific physical location on the drive must be selected and a combination of the action of a positioner and the rotation of the disk will position the head(s) at the selected block. The time taken to do that, the latency, is not constant but follows a distribution.
In comparison with traditional linear media such as tape, which have to wind forwards or backwards to locate a wanted part of the recording, the latency of the hard drive is practically negligible and results in a saving of time.
The demand for hard drives resulted in intense competition and the recording density and the cost per bit improved at a fantastic rate. Instead of having drives with unbelievable capacity, progress resulted in drives getting smaller, as that resulted in faster access. A number of small drives works better than one big one, because, with suitable arrangement of the data, one drive can be seeking and getting ready whilst another is transferring data. Overlapped seeks improve data throughput.
When a data block has been read, there will be an interruption to the data transfer for a time that also follows a distribution. If the next physical block happens to be a continuation of the same audio file, the interruption will be brief, whereas if the drive needs to seek to a different cylinder to get the next block the interruption will be longer.
In order to play an audio file at a constant sampling rate, the instantaneous transfer rate from the drive must be higher than the sampling rate and there must be a buffer memory between the drive and the DAC. Fig.1 shows how a series of data transfers with random gaps between them are converted into a continuous flow of audio samples by the buffer memory. The capacity of the memory must be enough to allow it to provide audio samples during the longest anticipated latency.
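As a rough worked example of that sizing rule, the sketch below computes how many bytes the DAC consumes while the drive delivers nothing. The function name and the figures used (48 kHz stereo, 24-bit samples, a 50 ms worst-case gap) are illustrative assumptions, not values from any particular product.

```python
def min_buffer_bytes(sample_rate_hz, channels, bytes_per_sample, worst_latency_ms):
    """Smallest buffer (in bytes) that can feed the DAC through the longest gap."""
    # Rate at which the DAC empties the buffer, in bytes per second.
    drain_rate = sample_rate_hz * channels * bytes_per_sample
    # Integer ceiling of drain_rate * latency, with latency given in milliseconds.
    return (drain_rate * worst_latency_ms + 999) // 1000


# Assumed figures: 48 kHz, 2 channels, 3 bytes per sample, 50 ms worst case.
print(min_buffer_bytes(48_000, 2, 3, 50))
```

A practical design would add margin on top of this minimum, since the figure only covers a single worst-case gap starting from a full buffer.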
As soon as the memory has free capacity equal to one block, the drive provides another block so the memory is kept as full as possible. This mechanism is not automatic. Left to its own devices, the drive will do nothing, and every data block has to be requested.
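The refill rule can be illustrated with a small simulation. This is a minimal sketch under simplifying assumptions: time advances in ticks, the DAC drains one unit of audio per tick, only one request is outstanding at a time, and each entry in the hypothetical `fetch_ticks` list is how long that request takes to deliver a block.

```python
def plays_without_underrun(capacity, block, fetch_ticks, prefill):
    """Return True if the buffer never runs dry before the requests are used up."""
    level = prefill              # units of audio currently in the buffer
    pending = None               # ticks left on the outstanding request, if any
    requests = iter(fetch_ticks)
    while True:
        # The drive does nothing unless asked: request a block once one fits.
        if pending is None and capacity - level >= block:
            pending = next(requests, None)
            if pending is None:
                return True      # no more data to fetch; playback survived
        if pending is not None:
            pending -= 1
            if pending <= 0:     # the block has arrived
                level += block
                pending = None
        if level == 0:
            return False         # underrun: the DAC has nothing to play
        level -= 1               # the DAC consumes one unit per tick


# The same latency pattern, including one slow fetch of 6 ticks:
print(plays_without_underrun(8, 4, [2, 2, 6, 2], 8))    # small buffer
print(plays_without_underrun(16, 4, [2, 2, 6, 2], 16))  # larger buffer
```

With these assumed numbers the smaller buffer empties during the slow fetch, whereas the larger one rides through it, which is the reason the capacity is chosen from the longest anticipated latency.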
Fig.2 shows how the timing structure of a hard drive based audio store operates. There is a master clock that produces the audio sampling rate and this is divided down to operate a time-code generator. The format of the disk is arranged so that a whole number of disk blocks, known as a cluster, stores the audio data corresponding to one time code frame.
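To give a feel for the numbers, the following sketch works out how many disk blocks a cluster would need for one time code frame. The sampling rate, frame rate, word length and block size are assumed for illustration, and rounding up to a whole block (leaving some padding) is also an assumption about how a format might be arranged.

```python
def blocks_per_cluster(sample_rate_hz, frames_per_second, channels,
                       bytes_per_sample, block_bytes):
    """Whole number of disk blocks needed to hold one time code frame of audio."""
    samples, remainder = divmod(sample_rate_hz, frames_per_second)
    if remainder:
        raise ValueError("sampling rate is not a whole multiple of the frame rate")
    frame_bytes = samples * channels * bytes_per_sample
    # Ceiling division: a partly used final block still occupies a whole block.
    return -(-frame_bytes // block_bytes)


# Assumed: 48 kHz, 25 frames per second, stereo, 24-bit, 512-byte blocks.
# 1920 samples per frame -> 11,520 bytes -> 22.5 blocks, rounded up.
print(blocks_per_cluster(48_000, 25, 2, 3, 512))
```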
When the recording is made, whenever a cluster of data is available in memory the disk directory is consulted to find a set of free blocks. When the recording is complete, there will be a table linking each time code frame with a physical address of a cluster of data blocks on the drive.
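The recording step can be sketched as follows. The `Recorder` class, its free list standing in for the disk directory, and the dictionary standing in for the drive are all hypothetical simplifications, intended only to show how the table ends up linking each time code frame to a physical cluster address.

```python
class Recorder:
    """Toy model of recording: one cluster is written per time code frame."""

    def __init__(self, free_clusters):
        self.free = list(free_clusters)  # stand-in for the disk directory
        self.table = {}                  # time code frame -> cluster address
        self.disk = {}                   # cluster address -> stored audio data

    def write_cluster(self, frame, data):
        if not self.free:
            raise RuntimeError("disk full")
        address = self.free.pop(0)       # take the next free cluster
        self.disk[address] = data
        self.table[frame] = address
        return address


# Free clusters need not be contiguous or in order on the disk.
rec = Recorder(free_clusters=[7, 2, 5])
for frame, data in [(100, "audio A"), (101, "audio B"), (102, "audio C")]:
    rec.write_cluster(frame, data)
print(rec.table)
```

Consecutive frames may land at scattered addresses, which is harmless because playback looks each frame up in the table rather than relying on physical order.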
In order to play back the recording without interruption, the buffer memory must be pre-filled with audio data so that samples can then be read out at the sampling rate and sent to the DAC in an unbroken stream. As the buffer memory acts as a delay, there are essentially two time codes involved. The earlier, or advanced, time code is responsible for fetching data from the disk to the buffer memory, and the later time code is locked to the output of the memory.
The offset between the two time codes is the average delay of the memory. A single time code generator can be used with a constant subtracted so that two time codes with a fixed shift between them can be generated.
Fig.2 - The timing structure of a disk-based system locks the sampling rate to a time code generator. Time code then drives the addressing system so that the drive is made to access audio data blocks along a time line.
In order to replay a recording, the time code generator is jammed to the starting time code of the clip. This results in the table being accessed to find the address of a cluster of data corresponding to that time code and a read command being issued to the drive. The drive will provide the data, which will start to fill up the memory.
At this stage nothing happens to the data because the time code controlling the memory output has an offset subtracted so the starting time code has yet to be reached.
One time code frame later the process repeats and the read of another cluster of data is initiated. Eventually the time code at the memory output corresponds to the starting time code of the clip and samples are sent to the DAC so the audio can be heard. By that time the memory will have been pre-filled by the advanced time code.
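The playback sequence described above can be modelled in a few lines. This is a simplified sketch: it assumes every disk read completes within one time code frame, and the names `play`, `table`, `disk` and `offset_frames` are illustrative rather than taken from a real system.

```python
def play(table, disk, start_frame, offset_frames, ticks):
    """Step a time code generator and return the clusters that reach the DAC."""
    buffer = {}   # clusters fetched by the advanced time code, keyed by frame
    heard = []    # clusters released to the DAC by the output time code
    for tick in range(ticks):
        advanced = start_frame + tick        # generator jammed to the clip start
        output = advanced - offset_frames    # same generator, constant subtracted
        if advanced in table:                # look up the cluster address and read it
            buffer[advanced] = disk[table[advanced]]
        if output in buffer:                 # output code has reached buffered audio
            heard.append(buffer.pop(output))
    return heard


table = {10: 7, 11: 2, 12: 5}               # time code frame -> cluster address
disk = {7: "a", 2: "b", 5: "c"}             # cluster address -> audio data
print(play(table, disk, start_frame=10, offset_frames=2, ticks=2))
print(play(table, disk, start_frame=10, offset_frames=2, ticks=5))
```

For the first two frames nothing is heard because the output time code has not yet reached the start of the clip. By the time it does, the buffer already holds the clusters fetched by the advanced time code.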
As the playback process is completely controlled by time code, a number of things immediately follow:
Firstly, the time code can come from an external source, such as a video recorder or an automation system, and the audio replay will automatically synchronize to it. Secondly, playback can take place from anywhere in the clip simply by jamming the time code generator to the appropriate code.
In a file server intended to store radio station jingles and commercials the mechanism described above would be enough to access and play them. Where editing is contemplated, there are some additional complexities.
In editing, one of the key processes is the location of edit points, where an audio clip will begin or end or be faded into another one. Editing machines need larger memories. The mechanism outlined above still takes place, but when the playback is stopped, the disk drive carries on fetching data to memory for a while. The result is that the memory then contains audio data both before and after the point where playback was stopped.
Using a scrub wheel that the operator can turn forwards or backwards, audio samples can be read from the buffer memory at any speed and in either direction. These are supplied to a rate convertor that then feeds the DAC so the audio can be heard. It is common to display the samples in the area as a waveform on the screen so that sonic features can be seen as well as heard.
Using this system, the buffer memory is arranged logically as a ring where samples are stored both before and after the scrub wheel pointer. The ideal is where the gap in the ring memory is directly opposite the scrub wheel pointer. If the scrub wheel is turned forward the system works much as it does in playback and future audio data are brought from the hard drive to a memory area after the scrub pointer. However, if the scrub wheel is reversed, earlier audio data must be brought from the hard drive to a memory area prior to the scrub wheel pointer.
In this way audio can be reproduced forwards and backwards as desired to locate an event or an edit point.
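The ring arrangement might be sketched as below. This is a minimal model under stated assumptions: the `ScrubRing` class is hypothetical, a Python list stands in for the audio file on disk, and keeping the scrub pointer centred in the buffered window is used as the equivalent of keeping the gap in the ring opposite the pointer.

```python
class ScrubRing:
    """Ring memory holding audio both before and after the scrub pointer."""

    def __init__(self, source, size, position):
        self.source = source          # stand-in for the audio file on disk
        self.size = size
        self.ring = [None] * size
        self.lo = self.hi = 0         # file range [lo, hi) currently in memory
        self.seek(position)

    def seek(self, position):
        """Re-centre the window on the pointer; return how many samples were fetched."""
        half = self.size // 2
        new_lo = max(0, min(position - half, len(self.source) - self.size))
        new_hi = min(len(self.source), new_lo + self.size)
        fetched = 0
        for i in range(new_lo, new_hi):
            if not (self.lo <= i < self.hi):       # only fetch what is missing
                self.ring[i % self.size] = self.source[i]
                fetched += 1
        self.lo, self.hi = new_lo, new_hi
        return fetched

    def read(self, position):
        if not (self.lo <= position < self.hi):
            raise IndexError("sample is not in the ring memory")
        return self.ring[position % self.size]


source = list(range(100))             # sample values equal their positions
ring = ScrubRing(source, size=8, position=50)
print(ring.seek(51))                  # wheel turned forward: later audio fetched
print(ring.seek(49))                  # wheel reversed: earlier audio fetched
print([ring.read(p) for p in (50, 49, 48)])   # samples read in reverse order
```

Only the samples that fall outside the previous window are fetched on each move, so small movements of the wheel in either direction cost very little disk activity.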