
abisheksethu/opencl-implementation

opencl-implementation

This project describes an implementation of the OpenCL standard for the Zedboard using the Portable Computing Language, POCL. The implementation supports both the CPU and the FPGA as OpenCL devices. FPGA support is provided by implementing OpenCL C APIs on top of the Xillybus project.

Table of Contents

1) General Information

1.1: OpenCL

Open Computing Language, OpenCL, is an open, royalty-free standard for general-purpose parallel programming across CPUs, GPUs and accelerators. It enables software developers to take full advantage of heterogeneous processing platforms.

1.2: POCL

Portable Computing Language, POCL, aims to become an MIT-licensed open source implementation of the OpenCL standard. It can be easily adapted to new devices. The OpenCL compiler is built from:

  • Clang as the OpenCL C frontend
  • LLVM for the kernel compiler implementation
  • Any LLVM backend, which makes it easy to target a new device

1.3: Xilinux Distribution

This project is implemented and tested on the Xilinux distribution. Xilinux is a demo kit that combines Ubuntu 12.04 on the Processing System (PS) with the Xillybus IP core in the Programmable Logic (PL). The PS sends data to and receives data from the PL through the Xillybus Linux drivers.

2) Pre-Requisites

2.1: Zedboard Pre-Requisites

  • Booting Xilinux on the Zedboard: To install Xilinux on the Zedboard with the base Xillybus bitstream, follow the instructions in the "Getting Started" guide on the Xillybus website.

  • Xillybus bitstream: With the Xillybus IP core kit, any application logic in the PL can be connected to Xillybus through plain FIFOs.

2.2: POCL Pre-Requisites

This project uses pocl-0.11 with llvm-3.6. Building llvm-3.6 requires a host C++ toolchain of gcc-4.7 or newer. The following dependencies must be installed before pocl:

  • LLVM, Clang and compiler-rt (pre-compiled binaries available)
  • Host compiler toolchain: gcc-4.9 and g++-4.9
  • libhwloc-dev 1.8 (available only when compiled from source)
  • ocl-icd 2.2.10 (available only when compiled from source)
  • cmake-3.7, to build the software packages
  • Other dependencies: libz-dev, libffi-dev, autoconf, libtool, ruby1.8-dev, libtinfo-dev
  • PAPI and XLWT, for timing and profiling information

Reference wiki page: Installing-POCL-dependencies-on-Ubuntu-linux-based-targets

All dependency packages are stored in this repository. A script is also provided that installs the necessary packages together with pocl-0.11 and configures gcc-4.9 and g++-4.9 as the default host toolchain.

3) Adding new device in POCL

ACCELERATOR as a device:

  • The POCL software architecture keeps its device layer implementations in pocl-0.11/lib/CL/devices. The CPU device implementation is in pthread. POCL also provides a basic device implementation as a starting point for custom devices; it can be customized in pocl-0.11/lib/CL/devices/basic/basic.c.
  • We use the Xillybus Linux drivers to develop the basic device layer in POCL.
  • Currently, two OpenCL APIs are supported: clEnqueueWriteBuffer and clEnqueueReadBuffer. These APIs are independent of the OpenCL compiler implementation.
  • In POCL's basic layer, the hardware-specific implementations behind these two APIs are in pocl_basic_read() and pocl_basic_write().
  • The device name is configured as "xillybus" and the device type as CL_DEVICE_TYPE_ACCELERATOR.
  • After compiling POCL with the host toolchain, the implementation is available as a shared object in /usr/local/lib/libpocl.so, which supports both the CPU and the ACCELERATOR as devices.
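
To make the idea concrete, here is a minimal sketch of what the data path behind pocl_basic_write() and pocl_basic_read() could look like when it forwards to Xillybus streams. It is not the code from this repository: the struct, the function names xilly_dev_write() and xilly_dev_read(), and the device file names in the comments (taken from the Xillybus demo bundle) are assumptions for illustration. The only point it shows is that read() and write() on a stream may move fewer bytes than requested, so a buffer transfer has to loop until all bytes are moved.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical per-device state: one file descriptor per Xillybus stream. */
struct xilly_dev {
    int wr_fd; /* host -> FPGA, e.g. /dev/xillybus_write_32 (assumed name) */
    int rd_fd; /* FPGA -> host, e.g. /dev/xillybus_read_32 (assumed name) */
};

/* Send cb bytes from host memory to the FPGA stream.
 * Returns 0 on success, -1 on error. */
int xilly_dev_write(struct xilly_dev *dev, const void *host_ptr, size_t cb)
{
    const char *p = host_ptr;
    while (cb > 0) {
        ssize_t n = write(dev->wr_fd, p, cb);
        if (n < 0 && errno == EINTR)
            continue;       /* interrupted before any data moved: retry */
        if (n <= 0)
            return -1;      /* real error, or no progress possible */
        p += n;
        cb -= (size_t)n;
    }
    return 0;
}

/* Receive cb bytes from the FPGA stream into host memory.
 * Returns 0 on success, -1 on error or if the stream ends early. */
int xilly_dev_read(struct xilly_dev *dev, void *host_ptr, size_t cb)
{
    char *p = host_ptr;
    while (cb > 0) {
        ssize_t n = read(dev->rd_fd, p, cb);
        if (n < 0 && errno == EINTR)
            continue;       /* interrupted before any data moved: retry */
        if (n <= 0)
            return -1;      /* error, or end of stream before cb bytes */
        p += n;
        cb -= (size_t)n;
    }
    return 0;
}
```

In a real device layer the two descriptors would be opened once when the device is initialized, and the hooks would call these helpers with the host pointer and byte count they receive from the OpenCL runtime.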

4) OpenCL C Application Example

The customized OpenCL C APIs are tested using host_app/pocl_test.c, an OpenCL C application that is compiled against the pocl library. The APIs are profiled with the Performance API (PAPI). The application can run on either the CPU or the FPGA device, selected by exporting the corresponding device name in the POCL_DEVICES variable.

The example uses an OpenCL kernel that adds the input to itself. The corresponding Xillybus bitstream is available in the /xillybus directory.
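
As a rough illustration, a kernel of this kind could look like the OpenCL C source embedded below. The kernel name add_self, the argument order and the int element type are assumptions and may differ from the kernel in this repository. The plain C function next to it computes the same result on the host, which is handy for checking what comes back from either device.

```c
#include <stddef.h>

/* OpenCL C source for a kernel that adds each input element to itself.
 * Kernel name, argument order and element type are assumptions. */
const char add_self_src[] =
    "__kernel void add_self(__global const int *in, __global int *out)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    out[i] = in[i] + in[i];\n"
    "}\n";

/* Host-side reference of the same computation, for result checking. */
void add_self_reference(const int *in, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] + in[i];
}
```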

  • CPU as a device : POCL_DEVICES=pthread

  • FPGA as a device : POCL_DEVICES=xillybus
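
The variable can also be set from inside the host program instead of the shell. The sketch below assumes that POCL reads POCL_DEVICES when it first initializes its devices, so the helper must be called before the first OpenCL call; the function name select_pocl_device() is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Choose the POCL device for this process.
 * use_fpga != 0 selects the "xillybus" accelerator,
 * otherwise the "pthread" CPU device.
 * Returns 0 on success, -1 if the environment could not be updated. */
int select_pocl_device(int use_fpga)
{
    const char *name = use_fpga ? "xillybus" : "pthread";
    return setenv("POCL_DEVICES", name, 1 /* overwrite existing value */);
}
```

Calling select_pocl_device(1) at the top of main() has the same effect as running the application with POCL_DEVICES=xillybus in the environment.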

The script hostapp/run.sh automates the test case for input lengths from 1K to 16K. A comparison of the two APIs on pthread and xillybus is shown below.
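
For readers who want to reproduce a similar measurement without PAPI, the sketch below times a callback for each input length using clock_gettime() instead. It is a stand-in, not the method used by run.sh: the doubling of sizes between 1K and 16K, the callback type and all function names are assumptions. In a real test the callback would wrap a clEnqueueWriteBuffer or clEnqueueReadBuffer call; here record_elems() is only a placeholder.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical callback type for the operation being timed. */
typedef void (*transfer_fn)(void *ctx, size_t n_elems);

/* Placeholder operation: adds the requested element count to *ctx. */
void record_elems(void *ctx, size_t n_elems)
{
    *(size_t *)ctx += n_elems;
}

/* Wall-clock duration of one call to fn, in nanoseconds. */
int64_t time_call_ns(transfer_fn fn, void *ctx, size_t n_elems)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fn(ctx, n_elems);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
         + (int64_t)(t1.tv_nsec - t0.tv_nsec);
}

/* Time fn for 1K, 2K, 4K, 8K and 16K elements (doubling is an assumption).
 * Stores one duration per size in out_ns, up to cap entries.
 * Returns the number of sizes measured. */
size_t sweep_1k_to_16k(transfer_fn fn, void *ctx, int64_t *out_ns, size_t cap)
{
    size_t count = 0;
    for (size_t n = 1024; n <= 16 * 1024 && count < cap; n *= 2)
        out_ns[count++] = time_call_ns(fn, ctx, n);
    return count;
}
```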

Figure: Timing graph for clEnqueueWriteBuffer

Figure: Timing graph for clEnqueueReadBuffer
