

## FEATURES

$8 \times 8$ Transform size.
$8 \times 8$ DCT calculation time $=3.2 \mu \mathrm{~s}$.
DC to 20 MHz pixel rate.
9 bit add/subtract input.
12 bit input/output.
14 bit fixed coefficients.
Multifunction capability (DCT, IDCT, Filter).
Full internal precision, for each dimension.
Fully synchronous interface.
High speed CMOS implementation.
TTL compatible.
Single $+5 \mathrm{~V} \pm 10 \%$ supply.
Power dissipation < 1.5 Watt.
44 pin plastic package.

## DESCRIPTION

The IMS A121 is a device for computing the Discrete Cosine Transform (DCT and IDCT). It will also function as a 2-D linear filter or perform matrix transposition. These 4 functions operate on blocks of data with a fixed size of 64 samples $(8 \times 8)$. The IMS A121 has other functions aimed specifically at the implementation of video codecs; on-chip subtraction and addition functions may be selected to reduce system chip count.

The main computation is performed by two identical multiplication arrays, each of which performs an $8 \times 8$ matrix multiplication in 64 cycles, with no internal rounding. The DCT/filter coefficients (14 bit) are stored in 4 banks of fixed ROM. The intermediate $8 \times 8$ matrix result is rounded to 16 bits and stored in the transposition RAM between the two multiplication arrays. The device is fully pipelined, with data sampled on the input at the clock frequency and the resultant output appearing 128 clock cycles later.

### 5.1 OVERALL DEVICE OPERATION

The IMS A121 is a device for computing the Discrete Cosine Transform (DCT) and the Inverse Discrete Cosine Transform (IDCT). It can also perform a simple low-pass filter operation.
The IMS A121 processes blocks of data which are 64 samples long and represent an $8 \times 8$ matrix. Data is sampled on the Din port every cycle and data is output every cycle on the Dout port.
The GO signal is used to indicate the start of a block. When it is sampled high the data on the Din port is the first sample of the block. The mode select signals SEL[2-0] are sampled at the same time. The remainder of the block of data is sampled on the Din port for the subsequent 63 cycles and during this time the GO signal and the SEL port are ignored. Each consecutive group of eight samples is treated as a column, eight such columns making a block.
The computation is in two stages, between which the block of 64 intermediate samples is stored in the transposition RAM. The transposition RAM serves a dual function of storing the intermediate results and transposing the data from column order into row order. This permits the two matrix computation elements to be identical although the first stage does the column computations and the second stage does the row computations.
Data is output on the Dout port in blocks of 64 samples. However, each consecutive group of eight samples now represents a row because of the internal transposition of data. The first sample of the block is output on the Dout port 128 cycles after the first sample of the block was sampled on the Din port.

An auxiliary port, Dx is provided. The data on the Dx port is optionally subtracted from the data on the Din port (DCT mode) or added to the output (IDCT mode).

The IMS A121 views input data in column order and (because of the internal transposition) output data in row order. However, this convention is only used to define the arithmetic which the IMS A121 performs. The system in which the IMS A121 is a component may well view the data going into the IMS A121 in row order and the data coming out in column order.
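The ordering convention described above can be sketched as simple index arithmetic. The following Python fragment is only an illustration under the column-in, row-out convention stated in the text; the function names are hypothetical and are not part of the device specification.

```python
LATENCY = 128  # cycles from first Din sample to first Dout sample (from the text)

def din_position(n):
    """Map input sample index n (0..63) to (row, col).

    Each consecutive group of eight input samples is one column.
    """
    col, row = divmod(n, 8)
    return row, col

def dout_position(n):
    """Map output sample index n (0..63) to (row, col).

    Each consecutive group of eight output samples is one row.
    """
    row, col = divmod(n, 8)
    return row, col

def dout_cycle(first_din_cycle, n):
    """Cycle on which output sample n appears, given the cycle of the first input sample."""
    return first_din_cycle + LATENCY + n
```

Under these assumptions the second input sample belongs to row 1 of column 0, while the second output sample belongs to column 1 of row 0.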

### 5.1.1 The fixed ROM coefficients

There are four sets of fixed ROM coefficients, each corresponding to one of the four possible functions the device can perform. The two main functions which the device can perform are the DCT and the IDCT. The other two functions provide assistance for the implementation of a video codec. The filter function is provided at very little overhead because the device is essentially a 2-D filter. The transposition function, which is a unity multiplication, provides a simple method of switching out the filter without any external logic.

### 5.1.2 Number formats

All numbers input to the IMS A121 are signed integers. The Din and Dout ports use 12 bit signed integers, while the Dx port uses 9 bit signed integers. In both cases the number format is twos complement binary. Little Endian format is assumed throughout, so that, for example, Din[0] is the least significant bit of the Din port and Din[11] the most significant (sign) bit. When a nine bit number is transferred over one of the 12 bit ports the most significant nine bits are used. The lowest three bits of the Din port are ignored and the lowest three bits of the Dout port will be zero.
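A nine bit value carried on a 12 bit port can be illustrated as follows. This is a minimal sketch assuming the bit placement described above (nine bit value on bits 11 to 3); the helper names are hypothetical.

```python
def to_port12(value9):
    """Place a 9 bit signed value on the top nine bits of a 12 bit port word."""
    if not -256 <= value9 <= 255:
        raise ValueError("value does not fit in 9 bit twos complement")
    return (value9 & 0x1FF) << 3  # bits 2..0 are left at zero

def from_port12_top9(word12):
    """Recover the 9 bit signed value from bits 11..3, ignoring bits 2..0."""
    raw = (word12 >> 3) & 0x1FF
    return raw - 512 if raw & 0x100 else raw
```

The low three bits are ignored on reading, so any 12 bit word maps to exactly one nine bit value.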

### 5.1.3 Internal Bit-field Selectors and Rounding

The transforms are implemented by a matrix multiplication with no truncation or rounding. This yields a 33 bit result, with bit-field selectors provided to select the parts of the result which are of interest. 16 bits are selected from the output of the first matrix multiplication, which are stored in the matrix transposition RAM. Either 9 bits or 12 bits are selected from the output of the second matrix multiplication (depending on the selected mode).

Bits below the selected range are discarded although the result is rounded not truncated. This is a simple round towards $+\infty$; if the most significant bit of those bits which have been discarded is set then one is added to the bits which are retained.
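The rounding rule can be written as a one-line integer operation. The sketch below assumes arbitrary-precision signed integers (as in Python, where `>>` on a negative number is an arithmetic shift); the function name is hypothetical.

```python
def round_select(value, dropped_bits):
    """Discard the lowest `dropped_bits` bits of a signed integer.

    Adding half of the discarded range before shifting is the same as adding
    one to the retained bits whenever the most significant discarded bit is set.
    """
    if dropped_bits == 0:
        return value
    return (value + (1 << (dropped_bits - 1))) >> dropped_bits
```

With one bit dropped, 5 (2.5 in the retained scale) becomes 3 and -5 (-2.5) becomes -2, so ties always move towards $+\infty$.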

### 5.1.4 Overflow, Saturation and Clipping

Overflow can occur in the subtraction unit, the two bit-field selectors or the addition unit. Overflow occurs whenever there are insufficient bits in the result to represent the number. When overflow occurs the result is replaced by the most positive or the most negative number which can be represented (depending on the sign of the correct result).
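Saturation as described here amounts to clamping to the representable twos complement range. A minimal sketch, with a hypothetical function name:

```python
def saturate(value, bits):
    """Clamp a signed integer to the range of a `bits` wide twos complement number."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return max(lowest, min(highest, value))
```

For a nine bit result, for example, any value above 255 is replaced by 255 and any value below -256 by -256.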

The device will normally be used in a feedback system. If either positive or negative overflow occurs, then inaccuracies have been introduced. However, the system will remain stable.

In some of the IDCT modes the output is clipped so that all negative numbers are replaced by zero and every result is non-negative. This ensures that the output is a valid (8 bit) pixel, between 0 and 255.

### 5.1.5 Subtraction with the DCT function

When the IMS A121 is used to perform the DCT, it is possible to enable the on-chip subtraction unit, so that before the DCT the data on the Dx port is subtracted from the data on the Din port. The data is presented to the Dx port at exactly the same time as to the Din port.

In DCT mode the data on the Din port is a nine bit number (the lowest 3 of 12 bits are ignored). The result of the subtraction is saturated to nine bits before being passed to the matrix multiplier.

### 5.1.6 Addition with the IDCT function

When the IMS A121 is used to perform the IDCT, it is possible to enable the on-chip addition unit, so that after the IDCT of the data has been done, the result may be added to the data on the Dx port. The timing requires careful consideration because of the latency of the device (128 cycles). The first sample of a block must be presented on the Dx port 124 cycles after the first sample was presented to Din. The data presented to the Dx port should be transposed and is thus in the same order as it will come out of Dout four cycles later.

The result of the addition is saturated to nine bits and then clipped so that all negative numbers are replaced by zero. The nine bit result is presented on Dout[11-3], while Dout[2-0] will be zero. Dout[11] will be zero because all the numbers are positive.
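The add, saturate and clip sequence can be sketched as below. This is an illustrative model of the arithmetic only, assuming nine bit signed operands; the function name is hypothetical.

```python
def idct_post_add(idct_value, dx_value):
    """Model the post-IDCT addition: add, saturate to 9 bits, clip, place on Dout."""
    total = idct_value + dx_value
    total = max(-256, min(255, total))  # saturate to nine bit signed range
    total = max(0, total)               # clip: negative results become zero
    return total << 3                   # result on Dout[11-3], Dout[2-0] zero
```

Because the clipped result never exceeds 255, the returned 12 bit word is at most 2040 and bit 11 is always zero, as stated above.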

Two modes are provided which perform the IDCT without addition. One of these modes disables the adder completely so that nine bit signed results appear on Dout. The other mode does NOT add on the value on the Dx port but still clips the result so that only positive values appear on Dout.

### 5.1.7 Resetting

The IMS A121 does not have a reset pin. At power-on the internal state will be undefined and as a result the first three blocks processed are not guaranteed correct. GO must be held low for at least 63 cycles to ensure that when it does go high it is interpreted as the start of a block.

### 5.2 DCT FUNCTION

The DCT function is selected when SEL[2-0]=000 or 100 (mode 0 or 4).

### 5.2.1 Internal number format

The input for the DCT is a 9 bit signed integer in the range -256 to +255. This is either an external input or the output of the on-chip subtractor depending on SEL[2-0]. The input is multiplied in the matrix multiplication array by 14 bit signed fixed point numbers in the range -2 to $\left(2-2^{-12}\right)$. The accumulated result of 8 multiply operations is a 26 bit signed integer, the bottom 8 bits of which are rounded (see section 5.1.3) and the top 2 bits used to saturate the output (see section 5.1.4). The result of the first matrix multiply is stored as a 16 bit signed integer and the second matrix multiply performed in exactly the same manner, yielding 33 bit results. The output rounds the bottom 19 bits and saturates the top 2 bits, giving a 12 bit signed integer in the range -2048 to +2047.


Figure 5.1 DCT internal number format

### 5.2.2 Internal data flow



### 5.2.3 The mathematical basis for the DCT

The 1 dimensional equation for the DCT is as follows:

$$
\begin{aligned}
& \text { Forward transform } X(k)=\sqrt{\frac{2}{N}} c(k) \sum_{m=0}^{N-1} x(m) \cos \left[\frac{(2 m+1) k \pi}{2 N}\right] \quad k=0,1, \cdots, N-1 \\
& \text { where } c(k)= \begin{cases}\frac{1}{\sqrt{2}} & \text { for } k=0 \\
1 & \text { for } k=1 \cdots N-1\end{cases}
\end{aligned}
$$

where $x(m)$ represents the input samples and $X(k)$ is the resulting output. The IMS A121 implements the special case $N=8$. The following equation is used to calculate the actual filter coefficients.

$$
\text{DCT coefficients: } \mathrm{Coeff}_{k m}=\sqrt{2}\, c(k) \cos \left[\frac{(2 m+1) k \pi}{2 N}\right]
$$

It should be noticed that the coefficients are $2 \sqrt{2}$ times bigger than in the forward transform equation. This means that the output after the 2 dimensional DCT is 8 times too big (The 1 dimensional transform is applied twice giving $(2 \sqrt{2})^{2}$ magnitude increase). This is in accordance with the 3 bit shift of the output data necessary to give the correct 12 bit signed output.
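As a worked check of this scaling argument, taking $N=8$ as stated above, the ratio of a ROM coefficient to the corresponding term of the forward transform is

$$
\frac{\sqrt{2}\, c(k) \cos [\cdot]}{\sqrt{\frac{2}{N}}\, c(k) \cos [\cdot]}=\frac{\sqrt{2}}{\sqrt{2 / 8}}=2 \sqrt{2}, \qquad(2 \sqrt{2})^{2}=8=2^{3}
$$

so removing the factor of 8 from the two dimensional result is equivalent to a right shift by three bits.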

### 5.2.4 DCT coefficients

$\left[\begin{array}{rrrrrrrr}1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 & 1.0000 \\ 1.3870 & 1.1759 & 0.7857 & 0.2759 & -0.2759 & -0.7857 & -1.1759 & -1.3870 \\ 1.3066 & 0.5412 & -0.5412 & -1.3066 & -1.3066 & -0.5412 & 0.5412 & 1.3066 \\ 1.1759 & -0.2759 & -1.3870 & -0.7857 & 0.7857 & 1.3870 & 0.2759 & -1.1759 \\ 1.0000 & -1.0000 & -1.0000 & 1.0000 & 1.0000 & -1.0000 & -1.0000 & 1.0000 \\ 0.7857 & -1.3870 & 0.2759 & 1.1759 & -1.1759 & -0.2759 & 1.3870 & -0.7857 \\ 0.5412 & -1.3066 & 1.3066 & -0.5412 & -0.5412 & 1.3066 & -1.3066 & 0.5412 \\ 0.2759 & -0.7857 & 1.1759 & -1.3870 & 1.3870 & -1.1759 & 0.7857 & -0.2759\end{array}\right]$

### 5.2.5 DCT coefficients (14 bit signed integers)

$\left[\begin{array}{rrrrrrrr}4096 & 4096 & 4096 & 4096 & 4096 & 4096 & 4096 & 4096 \\ 5681 & 4816 & 3218 & 1130 & -1130 & -3218 & -4816 & -5681 \\ 5352 & 2217 & -2217 & -5352 & -5352 & -2217 & 2217 & 5352 \\ 4816 & -1130 & -5681 & -3218 & 3218 & 5681 & 1130 & -4816 \\ 4096 & -4096 & -4096 & 4096 & 4096 & -4096 & -4096 & 4096 \\ 3218 & -5681 & 1130 & 4816 & -4816 & -1130 & 5681 & -3218 \\ 2217 & -5352 & 5352 & -2217 & -2217 & 5352 & -5352 & 2217 \\ 1130 & -3218 & 4816 & -5681 & 5681 & -4816 & 3218 & -1130\end{array}\right]$
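The number format of section 5.2.1 and the integer coefficients above can be combined into a small bit-accurate model. The following Python sketch is an illustration only, not the device's implementation: it assumes that rounding the formula of section 5.2.3 scaled by 4096 reproduces the ROM table, that the first pass works on columns and the second on rows, and that `block[r][c]` indexing is acceptable; all function names are hypothetical.

```python
import math

def dct_rom():
    """14 bit integer coefficients: 4096 * sqrt(2) * c(k) * cos((2m+1) k pi / 16)."""
    rom = []
    for k in range(8):
        c = 1 / math.sqrt(2) if k == 0 else 1.0
        rom.append([round(4096 * math.sqrt(2) * c
                          * math.cos((2 * m + 1) * k * math.pi / 16))
                    for m in range(8)])
    return rom

def round_and_saturate(value, dropped_bits, kept_bits):
    """Round away the low bits (ties towards +infinity), then clamp to kept_bits."""
    rounded = (value + (1 << (dropped_bits - 1))) >> dropped_bits
    lowest, highest = -(1 << (kept_bits - 1)), (1 << (kept_bits - 1)) - 1
    return max(lowest, min(highest, rounded))

def dct_2d(block):
    """block[r][c] holds 9 bit signed samples; returns 12 bit signed results."""
    rom = dct_rom()
    # First pass: transform each column, keep 16 bits (drop 8 of the 26).
    stage1 = [[round_and_saturate(sum(rom[k][r] * block[r][c] for r in range(8)), 8, 16)
               for c in range(8)] for k in range(8)]
    # Second pass: transform each row, keep 12 bits (drop 19 of the 33).
    return [[round_and_saturate(sum(rom[l][c] * stage1[k][c] for c in range(8)), 19, 12)
             for l in range(8)] for k in range(8)]

flat = dct_2d([[100] * 8 for _ in range(8)])
```

For a constant block of value 100 this model returns 800 at position `[0][0]` and zero elsewhere, which is the value $8 \times 100$ expected from an orthonormal $8 \times 8$ DCT.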

### 5.3 IDCT FUNCTION

The IDCT function is selected when SEL[2-0]=001, 101 or 111 (modes 1, 5 or 7).

### 5.3.1 Internal number format

The input for the IDCT is a 12 bit signed integer in the range -2048 to +2047. The input is multiplied in the matrix multiplication array by 14 bit signed fixed point numbers in the range -2 to $2-2^{-12}$. The accumulated result of 8 multiply operations is a 29 bit signed integer, the bottom 8 bits of which are rounded (see section 5.1.3) and the top 5 bits used to saturate the output (see section 5.1.4). The result of the first matrix multiply is stored as a 16 bit signed integer and the second matrix multiply performed in exactly the same manner, yielding 33 bit results. The output rounds the bottom 19 bits and saturates the top 5 bits, giving a 9 bit signed integer in the range -256 to +255.


Figure 5.2 IDCT internal number format

### 5.3.2 Internal data flow



### 5.3.3 The mathematical basis for the IDCT

The 1 dimensional equation for the IDCT is as follows:

$$
\begin{aligned}
& \text { Inverse transform } x(m)=\sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} X(k) c(k) \cos \left[\frac{(2 m+1) k \pi}{2 N}\right] \quad m=0,1, \cdots, N-1 \\
& \text { where } c(k)= \begin{cases}\frac{1}{\sqrt{2}} & \text { for } k=0 \\
1 & \text { for } k=1 \cdots N-1\end{cases}
\end{aligned}
$$

where $x(m)$ represent the output samples and $X(k)$ is the input. The special case for the IMS A121 is for $N=8$ and the actual filter coefficients are then calculated. The following equation is used to calculate the actual filter coefficients.

$$
\text{IDCT coefficients: } \mathrm{Coeff}_{m k}=\sqrt{2}\, c(k) \cos \left[\frac{(2 m+1) k \pi}{2 N}\right]
$$

It should be noticed that the coefficients are $2 \sqrt{2}$ times bigger than in the inverse transform equation. This means that the output after the 2 dimensional IDCT is 8 times too big (The 1 dimensional transform is applied twice giving $(2 \sqrt{2})^{2}$ magnitude increase). This is in accordance with the 3 bit shift of the output data necessary to give the correct result.

### 5.3.4 IDCT coefficients

$\left[\begin{array}{rrrrrrrr}1.0000 & 1.3870 & 1.3066 & 1.1759 & 1.0000 & 0.7857 & 0.5412 & 0.2759 \\ 1.0000 & 1.1759 & 0.5412 & -0.2759 & -1.0000 & -1.3870 & -1.3066 & -0.7857 \\ 1.0000 & 0.7857 & -0.5412 & -1.3870 & -1.0000 & 0.2759 & 1.3066 & 1.1759 \\ 1.0000 & 0.2759 & -1.3066 & -0.7857 & 1.0000 & 1.1759 & -0.5412 & -1.3870 \\ 1.0000 & -0.2759 & -1.3066 & 0.7857 & 1.0000 & -1.1759 & -0.5412 & 1.3870 \\ 1.0000 & -0.7857 & -0.5412 & 1.3870 & -1.0000 & -0.2759 & 1.3066 & -1.1759 \\ 1.0000 & -1.1759 & 0.5412 & 0.2759 & -1.0000 & 1.3870 & -1.3066 & 0.7857 \\ 1.0000 & -1.3870 & 1.3066 & -1.1759 & 1.0000 & -0.7857 & 0.5412 & -0.2759\end{array}\right]$

### 5.3.5 IDCT coefficients (14 bit signed integers)

$\left[\begin{array}{rrrrrrrr}4096 & 5681 & 5352 & 4816 & 4096 & 3218 & 2217 & 1130 \\ 4096 & 4816 & 2217 & -1130 & -4096 & -5681 & -5352 & -3218 \\ 4096 & 3218 & -2217 & -5681 & -4096 & 1130 & 5352 & 4816 \\ 4096 & 1130 & -5352 & -3218 & 4096 & 4816 & -2217 & -5681 \\ 4096 & -1130 & -5352 & 3218 & 4096 & -4816 & -2217 & 5681 \\ 4096 & -3218 & -2217 & 5681 & -4096 & -1130 & 5352 & -4816 \\ 4096 & -4816 & 2217 & 1130 & -4096 & 5681 & -5352 & 3218 \\ 4096 & -5681 & 5352 & -4816 & 4096 & -3218 & 2217 & -1130\end{array}\right]$
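The IDCT table is the transpose of the DCT table, and one pass of each in sequence scales the data by 8. The sketch below checks this in floating point, using the coefficient formulas of sections 5.2.3 and 5.3.3; it is an illustration under those formulas, with hypothetical names, and does not model the fixed-point rounding.

```python
import math

def coeff(k, m, n=8):
    """sqrt(2) * c(k) * cos((2m+1) k pi / 2N), shared by the DCT and IDCT tables."""
    c = 1 / math.sqrt(2) if k == 0 else 1.0
    return math.sqrt(2) * c * math.cos((2 * m + 1) * k * math.pi / (2 * n))

DCT = [[coeff(k, m) for m in range(8)] for k in range(8)]   # indexed [k][m]
IDCT = [[coeff(k, m) for k in range(8)] for m in range(8)]  # indexed [m][k]

# One IDCT pass applied after one DCT pass: expected to be 8 times the identity.
product = [[sum(IDCT[m][k] * DCT[k][j] for k in range(8)) for j in range(8)]
           for m in range(8)]
```

Since each one dimensional pair contributes a factor of 8, a two dimensional DCT followed by a two dimensional IDCT scales by 64, which the two 3 bit output shifts remove.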

### 5.4 FILTER FUNCTION

The filter function is selected with SEL[2-0]=010 (mode 2).
This filter is intended to be used for image data, taking 9 bit signed input data and giving a 9 bit signed result.

### 5.4.1 Internal number format

The input to the filter is a 9 bit signed integer in the range -256 to +255. The input is multiplied in the matrix multiplication array by 14 bit signed fixed-point numbers in the range -2 to $2-2^{-12}$. The accumulated result of 8 multiply operations is a 26 bit signed integer, the bottom 5 bits of which are rounded (see section 5.1.3) and the top 5 bits are used to saturate the output (see section 5.1.4). The result of the first matrix multiply is stored as a 16 bit signed integer and the second matrix multiply performed in exactly the same manner, yielding 33 bit results. The output rounds the bottom 19 bits and saturates the top 5 bits, giving a 9 bit signed integer in the range -256 to +255.


Figure 5.3 Filter and Transpose internal number format

### 5.4.2 Internal data flow



### 5.4.3 Definition of filter

The filter is a simple $\frac{1}{4}-\frac{1}{2}-\frac{1}{4}$ filter applied in both dimensions which means that the overall filter kernel is:

$$
\frac{1}{16}\left[\begin{array}{lll}
1 & 2 & 1 \\
2 & 4 & 2 \\
1 & 2 & 1
\end{array}\right]
$$

i.e. an output pixel is calculated from the corresponding pixel in the input field and its eight closest neighbours by evaluating

$$
\frac{1}{16}\left(4 \times \text { pixel }+2 \times\left(\sum \text { four adjacent pixels }\right)+1 \times\left(\sum \text { four diagonal pixels }\right)\right)
$$

However, at the block edges, where some of the pixels would fall outside the block boundary, the filter is modified to $0-1-0$ which means that along the edge the kernel would be:

$$
\frac{1}{16}\left[\begin{array}{lll}
0 & 0 & 0 \\
2 & 4 & 2 \\
1 & 2 & 1
\end{array}\right] \text { (rotated to suit) }
$$

and the corner pixels are passed through unmodified.

### 5.4.4 Filter coefficients

$\left[\begin{array}{llllllll}1.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\ 0.2500 & 0.5000 & 0.2500 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.2500 & 0.5000 & 0.2500 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.2500 & 0.5000 & 0.2500 & 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 & 0.2500 & 0.5000 & 0.2500 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.2500 & 0.5000 & 0.2500 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.2500 & 0.5000 & 0.2500 \\ 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 1.0000\end{array}\right]$

### 5.4.5 Filter coefficients (14 bit signed integers)

$\left[\begin{array}{rrrrrrrr}4096 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1024 & 2048 & 1024 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1024 & 2048 & 1024 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1024 & 2048 & 1024 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1024 & 2048 & 1024 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1024 & 2048 & 1024 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1024 & 2048 & 1024 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 4096\end{array}\right]$
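The coefficient matrix of section 5.4.4 can be checked against the kernel definition of section 5.4.3 for interior and corner pixels. The sketch below applies the matrix in both dimensions using floating point; it is an illustration only, with hypothetical names, and ignores the device's rounding and saturation.

```python
def filter_matrix():
    """The 8 x 8 filter coefficient matrix as real numbers."""
    f = [[0.0] * 8 for _ in range(8)]
    f[0][0] = f[7][7] = 1.0                      # first and last samples pass through
    for i in range(1, 7):
        f[i][i - 1], f[i][i], f[i][i + 1] = 0.25, 0.5, 0.25
    return f

def filter_block(x):
    """Apply the one dimensional filter down the columns, then along the rows."""
    f = filter_matrix()
    tmp = [[sum(f[i][k] * x[k][j] for k in range(8)) for j in range(8)]
           for i in range(8)]
    return [[sum(tmp[i][k] * f[j][k] for k in range(8)) for j in range(8)]
            for i in range(8)]

# Arbitrary test pattern (an assumption, chosen only to avoid symmetry).
x = [[(3 * r + 5 * c * c) % 17 for c in range(8)] for r in range(8)]
y = filter_block(x)
```

In this model an interior output such as `y[3][4]` equals the $3 \times 3$ kernel sum over the neighbours of `x[3][4]`, and the four corner samples are returned unchanged.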

### 5.5 TRANSPOSER FUNCTION

The transposition function is selected with SEL[2-0]=011 (mode 3).
It takes 9 bit signed input data and gives a 9 bit signed result. Data is passed through unmodified. The function is intended to be used in conjunction with the filter function (SEL[2-0]=010), so that by toggling SEL[0] the filter can be switched in and out.

### 5.5.1 Internal number format and data flow

The internal number format and data flow for the transpose function are the same as for the filter function. Refer to sections 5.4.1 and 5.4.2.

### 5.5.2 Transposition coefficients

$\left[\begin{array}{llllllll}1.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 1.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 1.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 & 1.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 & 0.0000 & 1.0000 & 0.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 1.0000 & 0.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 1.0000 & 0.0000 \\ 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 0.0000 & 1.0000\end{array}\right]$

### 5.5.3 Transposition coefficients (14 bit signed integers)

$\left[\begin{array}{rrrrrrrr}4096 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 4096 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 4096 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 4096 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 4096 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 4096 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 4096 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 4096\end{array}\right]$

### 5.6 PIN DESIGNATIONS

## System services

| Pin | In/out | Function |
| :--- | :---: | :--- |
| VCC, GND |  | Power supply and return |
| CLK | In | Input clock |

## Synchronous input/output

| Pin | In/out | Function |
| :--- | :---: | :--- |
| GO | In | Initiate input/computation/output cycle |
| Din[11-0] | In | Data input port |
| Dout[11-0] | Out | Data output port |
| Dx[11-3] | In | Addition/subtraction port |
| SEL[2-0] | In | Mode select input port |

### 5.6.1 System services

## Power

Power is supplied to the device via the VCC and GND pins. Several of each are provided to minimise inductance within the package. All supply pins must be connected. The supply must be decoupled close to the chip by at least one 100 nF low inductance (e.g. ceramic) capacitor between VCC and GND. Four layer boards are recommended; if two layer boards are used, extra care should be taken in decoupling.

Input voltages must not exceed specification with respect to VCC and GND.

## CLK

The clock input signal CLK controls the timing of input and the output on the three dedicated interfaces, and controls the progress of data through the addition/subtraction units, multipliers and transposition RAM. Since the IMS A121 is fully static, the clock can be stopped in either phase without corrupting data.

### 5.6.2 Synchronous input/output

## GO

The GO signal is active high and is sampled on the rising edge of the input clock. If the device is processing a previous block of data, the GO signal is ignored. Otherwise, the processing of a block of 64 pixels commences and the GO signal is ignored for a further 63 cycles. Data is always assumed to be valid for the 64 cycles from the start of a major cycle. Blocks of data may be processed at any time and with any spacing between the major blocks, by toggling the GO signal as necessary.
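The way GO is accepted or ignored can be sketched as a small counter model. This is an illustration of the behaviour described above, assuming one GO sample per rising clock edge; the function name is hypothetical.

```python
def block_starts(go_samples):
    """Return the cycle numbers on which a GO sample starts a new block.

    After a block starts, GO is ignored for the following 63 cycles.
    """
    starts = []
    ignore_remaining = 0
    for cycle, go in enumerate(go_samples):
        if ignore_remaining > 0:
            ignore_remaining -= 1      # still inside the current block
        elif go:
            starts.append(cycle)       # first sample of a new block
            ignore_remaining = 63
    return starts

# GO sampled high on cycles 0, 10 and 64 (cycle 10 falls inside the first block).
go = [0] * 65
go[0] = go[10] = go[64] = 1
starts = block_starts(go)
```

In this example the GO sample on cycle 10 is ignored, and the next block begins on cycle 64, the earliest cycle at which a back-to-back block can start.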

## Din[11-0]

The data input port is sampled 64 times on successive clock cycles, commencing when GO is sampled high. Data must be valid on the rising edge of CLK for each of the 64 cycles. The block of data may be considered as an $8 \times 8$ matrix, where each group of 8 samples represents a column, and the 8 columns are sampled consecutively until the block is complete. The data is twos complement, Little Endian so that Din[11] gives sign information, and Din[0] is the least significant bit.

## Dout[11-0]

The data output port will be valid for periods spanning 64 clock cycles. The data will be valid on the rising edge of the clock, exactly 128 cycles (the latency) after the data was sampled on the input. This output data, which may be considered as an $8 \times 8$ matrix, is transposed with respect to the input data. The data is twos complement, Little Endian like the input data.

Blocks of data may follow directly after one another, so that the first data of a block is presented exactly 64 cycles after the first data of the preceding block. However, if there is a gap between blocks, zero will appear on the data output port between blocks of data.

## Dx[11-3]

The addition/subtraction port is sampled on each clock cycle in exactly the same way as the data input port. The data on this port will either be subtracted from the signal on the data input port before matrix multiplication, or added to the result of matrix multiplication prior to output. The addition and subtraction functions can never be used together. The function selected is determined by the SEL[2-0] signals. The data is twos complement, Little Endian like the Din/Dout data. Note however, that although the Dx port has a different width, Dx[10] has the same bitwise significance as Din[10]/Dout[10].

The timing of data on the Dx port is different depending on the selected mode.
In the case of subtraction in the DCT mode, SEL[1-0]=00, data is presented on the Dx port on the same cycle as the corresponding data (from which it will be subtracted) is presented on the Din port.

In the case of addition in the IDCT mode, SEL[1-0]=01, data is presented on the Dx port exactly 4 cycles before the corresponding data (to which it will have been added) appears on the Dout port.

## SEL[2-0]

The mode select input port is sampled on the rising edge of CLK, when GO is active, at the start of a block of data. This fixes the selected mode for the entire block of data.

| SEL[2-0] | Mode | Function | PreSubtract | PostAdd | Clipping | Din width | Dout width |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| 000 | 0 | DCT | Disabled | Disabled | Disabled | 9 | 12 |
| 001 | 1 | IDCT | Disabled | Disabled | Disabled | 12 | 9 |
| 010 | 2 | Filter | Disabled | Disabled | Disabled | 9 | 9 |
| 011 | 3 | Transpose | Disabled | Disabled | Disabled | 9 | 9 |
| 100 | 4 | DCT | Enabled | Disabled | Disabled | 9 | 12 |
| 101 | 5 | IDCT | Disabled | Enabled | Enabled | 12 | 9 |
| 110 | 6 | Reserved - Do not use |  |  |  |  |  |
| 111 | 7 | IDCT | Disabled | Disabled | Enabled | 12 | 9 |
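The mode table can be expressed as a lookup for use in test benches or driver code. The sketch below simply transcribes the table above; the `Mode` type and `decode_sel` function are hypothetical names introduced for illustration.

```python
from collections import namedtuple

Mode = namedtuple(
    "Mode", "function pre_subtract post_add clipping din_width dout_width")

MODES = {
    0b000: Mode("DCT", False, False, False, 9, 12),
    0b001: Mode("IDCT", False, False, False, 12, 9),
    0b010: Mode("Filter", False, False, False, 9, 9),
    0b011: Mode("Transpose", False, False, False, 9, 9),
    0b100: Mode("DCT", True, False, False, 9, 12),
    0b101: Mode("IDCT", False, True, True, 12, 9),
    # 0b110 is reserved and deliberately absent.
    0b111: Mode("IDCT", False, False, True, 12, 9),
}

def decode_sel(sel):
    """Return the Mode for a SEL[2-0] value, rejecting the reserved code."""
    if sel not in MODES:
        raise ValueError(f"SEL value {sel!r} is reserved or out of range")
    return MODES[sel]
```

Rejecting the reserved code explicitly is a design choice of this sketch; the table itself only says that mode 6 must not be used.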

### 5.7 ELECTRICAL SPECIFICATION

### 5.7.1 DC electrical characteristics

## Absolute maximum ratings

| Symbol | Parameter | Min | Max | Units | Notes (1) |
| :--- | :--- | :---: | :---: | :---: | :---: |
| VCC | DC supply voltage | 0 | 7.0 | V | 2 |
| VI, VO | Voltage on input and output pins | -1.0 | VCC +0.5 | V | 2 |
| TA | Temperature under bias | -40 | 85 | ${ }^{\circ} \mathrm{C}$ | 2 |
| TS | Storage temperature | -65 | 150 | ${ }^{\circ} \mathrm{C}$ | 2 |
| PDmax | Power dissipation |  | 1.5 | W | 2 |

1 All voltages are with respect to GND.
2 This is a stress rating only and functional operation of the device at these or any other conditions above those indicated in the operational sections of this specification is not implied. Stresses greater than those listed may cause permanent damage to the device. Exposure to absolute maximum rating conditions for extended periods may affect reliability.

## DC operating conditions

| Symbol | Parameter | Min. | Nom. | Max. | Units | Notes (1) |
| :--- | :--- | :---: | :---: | :---: | :---: | :---: |
| VCC | DC supply Voltage | 4.5 | 5.0 | 5.5 | V |  |
| VIH | Input Logic '1' Voltage | 2.0 |  | VCC +0.5 | V | 2 |
| VIL | Input Logic '0' Voltage | -0.5 |  | 0.8 | V | 2 |
| TA | Ambient Operating Temperature | 0 |  | 70 | ${ }^{\circ} \mathrm{C}$ | 3 |

## Notes

1 All voltages are with respect to GND. All GND pins must be connected to GND.
2 Input signal transients 10 ns wide are permitted in the voltage ranges GND - 0.5 V to GND - 1.0 V and VCC + 0.5 V to VCC + 1.0 V.
3 400 linear ft/min transverse air flow.

## DC characteristics

| Symbol | Parameter | Min. | Max. | Units | Notes (1,2) |
| :--- | :--- | :---: | :---: | :---: | :---: |
| VOH | Output Logic '1' Voltage | 2.4 | VCC | V | $\mathrm{IO} \leq-4.4 \mathrm{~mA}$ |
| VOL | Output Logic '0' Voltage | 0 | 0.4 | V | $\mathrm{IO} \leq 4.4 \mathrm{~mA}$ |
| IIN | Input leakage current (any input) |  | $\pm 10$ | $\mu \mathrm{~A}$ | 3 |
| ICC | Average power supply current |  | 300 | mA | 4 |

## Notes

1 All voltages are with respect to GND. All GND pins must be connected to GND.
2 Under the conditions specified by the DC operating conditions.
3 VCC $=$ VCC (max), GND $\leq$ VIN $\leq$ VCC
4 This applies at 20 MHz and will be less at slower clock rates

### 5.7.2 A.C. timing characteristics

All timings are given for a load of 30pF unless otherwise stated.

## Clock requirements

| Symbol | Parameter | Min | Max | Units | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: |
| tCHCL | Clock Pulse High Width | 20 |  | ns |  |
| tCLCH | Clock Pulse Low Width | 20 |  | ns |  |
| tCHCH | Clock Period | 50 |  | ns |  |
| tr | Clock rise time | 0 | 50 | ns | 1 |
| tf | Clock fall time | 0 | 50 | ns | 1 |

## Notes

1 The clock edges should be monotonic between VIL and VIH.


## Synchronous input and output (Din, Dout, Dx)

| Symbol | Parameter | Min | Max | Units | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: |
| tCHQV | CLK high to Dout Valid |  | 38 | ns |  |
| tCHQX | Dout hold time after CLK | 2 |  | ns |  |
| tDVCH | Din/Dx setup time to CLK high | 10 |  | ns |  |
| tCHDX | Din/Dx hold time to CLK high | 0 |  | ns |  |



## Synchronous control (GO, SEL[2-0])

| Symbol | Parameter | Min | Max | Units | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: |
| tGHCH | GO/SEL hold to clock high | 0 |  | ns |  |
| tGSCH | GO/SEL setup to clock high | 10 |  | ns |  |



### 5.8 PACKAGE SPECIFICATIONS

### 5.8.1 44 pin PLCC package


Figure 5.4 IMS A121 44 pin PLCC J-bend package pinout

## Note

All VCC pins must be connected to the 5 Volt power supply.
All GND pins must be connected to ground.


Figure 5.5 44 pin PLCC J-bend package dimensions

| DIM | Millimetres |  | Inches |  | Notes |
| :---: | ---: | :---: | :---: | :---: | :---: |
|  | NOM | TOL | NOM | TOL |  |
| A | 17.577 | $\pm$ | 0.692 | $\pm$ |  |
| B | 16.612 | $\pm$ | 0.654 | $\pm$ |  |
| C | 17.577 | $\pm$ | 0.692 | $\pm$ |  |
| D | 16.612 | $\pm$ | 0.654 | $\pm$ |  |
| F | 1.143 |  | 0.045 |  |  |
| G | 3.861 |  | 0.152 |  |  |
| H | 4.369 | $\pm$ | 0.172 | $\pm$ |  |
| J | 15.748 | $\pm$ | 0.620 | $\pm$ |  |
| K | 15.748 | $\pm$ | 0.620 | $\pm$ |  |
| L | 0.457 |  | 0.018 |  |  |
| M | 1.270 |  | 0.050 |  |  |

Table 5.1 44 pin PLCC J-bend package dimensions

## PLCC thermal characteristics

| Symbol | Parameter | Min | Nom | Max | Units | Notes |
| :---: | :--- | :---: | :---: | :---: | :---: | :---: |
| $\theta_{JA}$ | Junction to ambient thermal resistance |  |  |  | ${ }^{\circ} \mathrm{C} / \mathrm{W}$ | 1,2 |

## Notes

1 Measured at 400 linear $\mathrm{ft} / \mathrm{min}$ transverse air flow.
2 This parameter is sampled and not $100 \%$ tested.

### 5.9 ORDERING DETAILS

The following table indicates the designation of the IMS A121 variants.

| INMOS designation | Package | Clock speed | Military/commercial |
| :---: | :--- | :---: | :--- |
| IMS A121-J20S | Plastic LCC | 20 MHz | commercial |

