Tag: VLSI

A Multiplication-Free Algorithm and A Parallel Architecture for Affine Transformation

Affine transformation is widely used in image processing, and it was recently recommended by MPEG-4 for video motion compensation. This paper presents a novel low-power parallel architecture for texture warping using affine transformation (AT). The architecture uses a novel multiplication-free algorithm that exploits the algebraic properties of the AT. Low power is achieved at several levels of the design. At the algorithmic level, replacing multiplication operations with bit shifting avoids the power and delay cost of a multiplier. At the architecture level, low power is achieved by using parallel computational units, where the latency constraints and/or the operating latency can be reduced. At the circuit level, using low-power building blocks (such as low-power adders) contributes to the power savings. The proposed architecture is used as a computational kernel in video object coders. It is compatible with the MPEG-4 and VRML standards. The architecture has been prototyped in 0.6 μm CMOS technology with three layers of metal. The performance of the proposed architecture shows that it can be used in mobile and handheld applications.
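As an illustration of how an affine mapping can be evaluated without a multiplier, the sketch below uses generic forward differencing over a raster scan with fixed-point coefficients. It is not taken from the paper, whose algorithm the abstract describes only as replacing multiplication with bit shifting. The function name `affine_scan`, the coefficient names `a` to `f`, and the 8-bit fractional format are assumptions made for this example.

```python
def affine_scan(a, b, c, d, e, f, width, height, frac=8):
    """Map every pixel (x, y) of a width x height grid through
    u = a*x + b*y + c, v = d*x + e*y + f using only additions and shifts.

    All six coefficients are integers already scaled by 2**frac
    (hypothetical fixed-point format chosen for this sketch).
    """
    out = []
    row_u, row_v = c, f              # value of (u, v) at x = 0 of the current row
    for _ in range(height):
        u, v = row_u, row_v
        row = []
        for _ in range(width):
            # Shift right to drop the fractional bits (floor to integer pixel).
            row.append((u >> frac, v >> frac))
            u += a                   # stepping x by 1 adds a to u
            v += d                   # and d to v
        out.append(row)
        row_u += b                   # stepping y by 1 adds b to u
        row_v += e                   # and e to v
    return out


# Example: u = 1.0*x + 0.5*y, v = 1.0*y + 2.0, with 8 fractional bits.
grid = affine_scan(256, 128, 0, 0, 256, 512, width=4, height=4)
print(grid[2][2])
```

The inner loop costs two additions and two shifts per pixel; the multiplications of the direct formula appear only implicitly, as accumulated increments.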

Wael Badawy and Magdy Bayoumi, “A Multiplication-Free Algorithm and A Parallel Architecture for Affine Transformation,” The Journal of VLSI Signal Processing-Systems, Kluwer Academic Publishers, Vol. 31, No. 2, May 2002, pp. 173-184.

wbadmin September 10, 2015 Affine transformation, low-power, MPEG-4, texture mapping, video object, VLSI Journal Papers Comments Off

Architectures for Finite Radon Transform

Two VLSI architectures for the finite Radon transform are presented. The first is a reference architecture using memory blocks and the second is a memoryless architecture. The proposed architectures use 7×7 size image blocks and are prototyped for processing the CIF image sequence. The simulation and synthesis results show that the core speeds of the two proposed architectures are around 100 and 82 MHz, respectively.
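For readers unfamiliar with the transform itself, the following is a minimal software sketch of the finite Radon transform on a p×p block with p prime (7 in the abstract). It sums pixel values along the p + 1 families of modular "lines". The unnormalized form and the index convention used here are assumptions of this example and say nothing about how either hardware architecture organizes the computation.

```python
def finite_radon(img):
    """Unnormalized finite Radon transform of a p x p block, p prime.

    Returns p + 1 projections of p sums each:
      r[k][l] = sum of img[i][(k*i + l) % p] over i, for slopes k = 0..p-1
      r[p][l] = sum of img[l][j] over j (the remaining direction)
    """
    p = len(img)
    r = [[0] * p for _ in range(p + 1)]
    for k in range(p):
        for l in range(p):
            r[k][l] = sum(img[i][(k * i + l) % p] for i in range(p))
    for l in range(p):
        r[p][l] = sum(img[l][j] for j in range(p))
    return r


# Example 7x7 block with arbitrary values.
block = [[(3 * i + j) % 5 for j in range(7)] for i in range(7)]
proj = finite_radon(block)
print(len(proj), len(proj[0]))
```

Because every pixel lies on exactly one line of each slope, each of the p + 1 projections sums to the same total as the block itself, which is a convenient sanity check.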

Published in:

Electronics Letters (Volume: 40, Issue: 15)

Page(s): 931–932
ISSN: 0013-5194
INSPEC Accession Number: 8068176
DOI: 10.1049/el:20040566

Date of Publication: 22 July 2004
Date of Current Version: 02 August 2004
Issue Date: 22 July 2004
Sponsored by: Institution of Engineering and Technology
Publisher: IET

C. A. Rahman and W. Badawy, “Architectures for Finite Radon Transform”, The IEE Electronics Letters, Vol. 40, Issue 15, July 2004, pp. 931-932.

wbadmin September 4, 2015 100 MHz, 82 MHz, CIF image sequence processing, finite Radon transform, image blocks, image resolution, image sequences, integrated logic circuits, memory blocks, memoryless architecture, memoryless systems, parallel architectures, Radon transforms, reference architecture, VLSI, VLSI architectures Journal Papers Comments Off

Algorithm-Based Low Power VLSI Architecture For 2d-Mesh Video Object Motion Tracking

The new VLSI architecture for video object (VO) motion tracking uses a novel hierarchical adaptive structured mesh topology. The structured mesh offers a significant reduction in the number of bits that describe the mesh topology. The motion of the mesh nodes represents the deformation of the VO. Motion compensation is performed using a multiplication-free algorithm for affine transformation, significantly reducing the decoder architecture complexity. Pipelining the affine unit contributes a considerable power saving. The VO motion-tracking architecture is based on a new algorithm. It consists of two main parts: a video object motion-estimation unit (VOME) and a video object motion-compensation unit (VOMC). The VOME processes two consecutive frames to generate a hierarchical adaptive structured mesh and the motion vectors of the mesh nodes. It implements parallel block-matching motion-estimation units to optimize the latency. The VOMC processes a reference frame, mesh nodes and motion vectors to predict a video frame. It implements parallel threads in which each thread implements a pipelined chain of scalable affine units. This motion-compensation algorithm allows the use of one simple warping unit to map a hierarchical structure. The affine unit warps the texture of a patch at any level of the hierarchical mesh independently. The processor uses a memory serialization unit, which interfaces the memory to the parallel units. The architecture has been prototyped using a top-down low-power design methodology. Performance analysis shows that this processor can be used in online object-based video applications such as MPEG-4 and VRML.
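The block-matching step mentioned above can be sketched in software as an exhaustive search that minimizes the sum of absolute differences (SAD). This is the textbook full-search formulation, not the VOME design: the abstract does not state the matching criterion, block size, or search range, so the SAD criterion and the names `sad` and `full_search` are assumptions of this example.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n, search):
    """Return ((dx, dy), best_sad) over all displacements within +/- search
    that keep the candidate block inside the reference frame."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue             # candidate block falls outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Example: the current frame is the reference shifted by (2, 1).
ref = [[x * x + 3 * y * y + x * y for x in range(16)] for y in range(16)]
cur = [[ref[y + 1][x + 2] if y + 1 < 16 and x + 2 < 16 else 0
        for x in range(16)] for y in range(16)]
print(full_search(cur, ref, bx=4, by=4, n=4, search=3))
```

Each candidate displacement is independent of the others, which is what makes this search a natural fit for the parallel matching units the abstract describes.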

Wael Badawy and Magdy Bayoumi, “Algorithm-Based Low Power VLSI Architecture For 2d-Mesh Video Object Motion Tracking,” IEEE Transactions on Circuits and Systems for Video Technology, Vol. 12, No. 4, April 2002, pp. 227-237.

wbadmin August 31, 2015 2D mesh video-object motion tracking, Affine transformation, block matching, computational complexity, decoder architecture, decoding, image matching, image texture, low power VLSI architecture, memory serialization, mesh generation, Motion compensation, Motion estimation, MPEG-4, multiplication-free algorithm, online object-based video applications, optical tracking, parallel processing, parallel threads, pipeline processing, pipelined chain, power consumption, structured mesh topology, video coding, VLSI, VRML Journal Papers Comments Off

A Proposed Hardware Reference Model for Spatial Transformation and Quantization in H.264

This paper presents three Very Large Scale Integration prototypes to exploit spatial redundancy in the H.264 standard. The proposed architectures are: (1) forward 4 × 4 integer approximation of DCT transform and quantization, which is applied to all blocks of a frame, (2) the 4 × 4 Hadamard transform and quantization that is applied to the DC coefficients of the luma component when the macroblock is encoded in 16 × 16 intra prediction mode, and (3) the 2 × 2 Hadamard transform and quantization that is applied to the DC coefficients of the chroma component as a second level in the transformation hierarchy. The developed algorithms are adopted by the H.264 standard. A performance analysis shows that the architectures satisfy the real-time constraints required by different digital video applications.
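To make the first of the three transforms concrete, here is a small software sketch of the H.264 forward 4 × 4 integer core transform, Y = C·X·Cᵀ with C = [[1, 1, 1, 1], [2, 1, −1, −2], [1, −1, −1, 1], [1, −2, 2, −1]], computed with add-and-shift butterflies. Quantization and the two Hadamard stages are deliberately left out, and the function names are assumptions of this example, not the structure of the paper's hardware.

```python
def core_1d(x0, x1, x2, x3):
    """One 4-point butterfly of the H.264 forward core transform.
    Uses only additions, subtractions and left shifts by one."""
    s0, s1 = x0 + x3, x1 + x2        # sums of the outer and inner pairs
    d0, d1 = x0 - x3, x1 - x2        # differences of the outer and inner pairs
    return (s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1))


def forward_core_4x4(block):
    """Apply the 1-D butterfly to every row, then to every column."""
    rows = [core_1d(*r) for r in block]
    cols = [core_1d(*(rows[i][j] for i in range(4))) for j in range(4)]
    # cols[j][i] is output coefficient (i, j); transpose back to row-major.
    return [[cols[j][i] for j in range(4)] for i in range(4)]


# Example: a flat block puts all of its energy into the DC coefficient.
flat = [[1] * 4 for _ in range(4)]
print(forward_core_4x4(flat))
```

For a block of all ones, the only nonzero output is the DC term, which equals 16. The DC terms of several such blocks are what the 4 × 4 and 2 × 2 Hadamard stages in the abstract operate on.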

I. Amer, W. Badawy, G. Jullien, “A Proposed Hardware Reference Model for Spatial Transformation and Quantization in H.264,” Elsevier Journal of Visual Communication and Image Representation, Volume 17, Issue 2, April 2006, Pages 533-552.

wbadmin August 26, 2015 advanced video coding, ASIC, DCT, FPGA, H.264, Hadamard, hardware, quantization, transform, video coding, VLSI Journal Papers Comments Off

A Simplified 8×8 Transformation and Quantization Real-Time IP-Block for MPEG-4 H.264/AVC Applications: A New Design Flow Approach

Abstract

Current multimedia design processes suffer from the excessive time spent on testing new IP-blocks against references based on large video encoder specifications (usually several thousand lines of code). Properly testing a single IP-block may require converting the entire encoder from software to hardware, which is difficult to complete within the short, competition-driven time-to-market that accompanies the adoption of a new video coding standard. This paper presents a new design flow to accelerate the conformance testing of an IP-block using the H.264/AVC software reference model. An example block of the simplified 8 × 8 transformation and quantization, which is adopted in FRExt, is provided as a case study demonstrating the effectiveness of the approach.

Ihab Amer, Wael Badawy, Graham Jullien, Marco Mattavelli, and Robert Turney, “A Simplified 8×8 Transformation and Quantization Real-Time IP-Block for MPEG-4 H.264/AVC Applications: A New Design Flow Approach,” Journal of Circuits, Systems, and Computers, Vol. 16, No. 6 (2007), pp. 1011–1026.

wbadmin August 23, 2015 advanced video coding, DCT, Design flow, FPGA, H.264, hardware, IP-block, platform, quantization, rapid prototyping, SystemC, transform, video coding, VLSI Journal Papers Comments Off

CAVLC Encoder Design for Real-Time Mobile Video Applications

Abstract

This brief presents a new context-based adaptive variable length coding (CAVLC) architecture. The prototype is designed for the H.264/AVC baseline profile entropy coder. The proposed design saves area by reducing the size of the statistic buffer. The arithmetic table elimination technique reduces the area further. The split VLC tables simplify bit-stream generation and also save some area. The proposed architecture is implemented on a Xilinx Virtex II field-programmable gate array (2v3000fg676-4). Simulation results show that the architecture can process common intermediate format (CIF) and quarter-CIF (QCIF) frame sequences in real time at a core speed of 50 MHz with 6.85-K logic gates.
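As background for what a CAVLC encoder has to gather before any table lookup, the sketch below extracts the standard per-block statistics from a zigzag-ordered 4 × 4 coefficient list: the number of nonzero coefficients, the trailing ±1 count (capped at three), the zeros before the last nonzero coefficient, and the levels and zero runs in reverse scan order. It stops short of bit-stream generation, since the VLC tables are not reproduced here, and the function name and dictionary keys are assumptions of this example rather than names from the paper.

```python
def cavlc_symbols(coeffs):
    """Collect CAVLC statistics for one block given in zigzag scan order."""
    nz = [i for i, c in enumerate(coeffs) if c != 0]   # positions of nonzeros
    if not nz:
        return {"total_coeff": 0, "trailing_ones": 0, "total_zeros": 0,
                "levels": [], "run_before": []}

    # Trailing ones: consecutive +/-1 values at the high-frequency end, max 3.
    trailing_ones = 0
    for i in reversed(nz):
        if abs(coeffs[i]) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break

    # Zeros that precede the last nonzero coefficient in scan order.
    total_zeros = nz[-1] + 1 - len(nz)

    # Nonzero values and the zero run directly before each, highest frequency first.
    levels = [coeffs[i] for i in reversed(nz)]
    run_before = [nz[k] - (nz[k - 1] + 1 if k > 0 else 0)
                  for k in range(len(nz) - 1, -1, -1)]

    return {"total_coeff": len(nz), "trailing_ones": trailing_ones,
            "total_zeros": total_zeros, "levels": levels,
            "run_before": run_before}


# Example block in zigzag order.
block = [0, 3, 0, 1, -1, -1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(cavlc_symbols(block))
```

In an actual encoder the trailing ones are sent as sign bits only and the last run is implied by `total_zeros`; the sketch lists every level and run for clarity.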

Published in:

Circuits and Systems II: Express Briefs, IEEE Transactions on (Volume: 54, Issue: 10)

Page(s): 873–877
ISSN: 1549-7747
INSPEC Accession Number: 9633945
DOI: 10.1109/TCSII.2007.902215

Date of Publication: Oct. 2007
Date of Current Version: 15 October 2007
Issue Date: Oct. 2007
Sponsored by: IEEE Circuits and Systems Society
Publisher: IEEE

C. A. Rahman and W. Badawy, “CAVLC Encoder Design for Real-time Mobile Video Applications”, The IEEE Trans. on Circuits and Systems II, Oct. 2007 Vol 54, Issue: 10, pp. 873-877.

wbadmin August 22, 2015 Arithmetic, arithmetic table elimination, AUTHOR KEYWORDS, Automatic voltage control, bit-stream generation, CAVLC encoder design, Context-based adaptive variable length coding (CAVLC), context-based adaptive variable length coding architecture, digital arithmetic, entropy coding, field programmable gate arrays, field-programmable gate array, frequency 50 MHz, H.264/AVC baseline profile entropy coder, H264/AVC, hardware, hardware encoder, IEEE TERMS, INSPEC: CONTROLLED INDEXING, INSPEC: NON CONTROLLED INDEXING, logic design, logic gate, Logic gates, mobile communication, Prototypes, real-time mobile video application, real-time VLSI architecture, statistic buffer, Statistics, Table lookup, video coding, Video compression, VLSI, Xilinx Virtex II Journal Papers Comments Off

Dr. Wael Badawy, P.Eng. SIEEE SACM
