UniBic: Universal Biclustering Algorithm

Algorithm

UniBic is an elementary method for identifying biologically meaningful trend-preserving biclusters in large, noisy, and complex data. The basic idea is to apply the longest common subsequence (LCS) framework to selected pairs of rows in an index matrix derived from the input data matrix, in order to locate a seed for each bicluster to be identified.
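
As a rough illustration of the idea above, the following Python sketch derives an index row from each data row (the column indices sorted by increasing value) and computes the longest common subsequence of two index rows. The names `index_row` and `lcs` are hypothetical and do not correspond to functions in the UniBic source. How the program selects row pairs and expands a seed into a full bicluster is not shown here.

```python
def index_row(values):
    """Return column indices ordered by increasing value (ties keep column order)."""
    return sorted(range(len(values)), key=lambda j: values[j])


def lcs(a, b):
    """Return one longest common subsequence of sequences a and b."""
    n, m = len(a), len(b)
    # dp[i][j] holds the LCS length of a[i:] and b[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk the table from the start to recover one optimal subsequence.
    out, i, j = [], 0, 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return out


# Two made-up expression rows over five conditions (illustrative data only).
row_a = [0.1, 0.9, 0.4, 0.7, 0.2]
row_b = [1.0, 5.0, 3.0, 4.0, 9.0]

# Columns on which both rows rise in the same order: a candidate seed.
seed_columns = lcs(index_row(row_a), index_row(row_b))
print(seed_columns)
```

In this toy case both rows increase across the columns returned in `seed_columns`, taken in that order, which is what "trend-preserving" means for a pair of rows.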

Citing us: Wang, Z., Li, G., Robinson, R. W., Huang, X. (2016). UniBic: Sequential row-based biclustering algorithm for analysis of gene expression data. Scientific Reports, 6.

Usage

This software provides a biclustering module for microarray data. Given a set of genes and a set of conditions, the program outputs block-like structures, each showing a uniform trend-preserving pattern within the block. Each block contains only a subset of the given genes under a subset of the given conditions.

Parts of the code use open-source data structure libraries, including:

fib http://resnet.uoregon.edu/~gurney_j/jmpc/fib.html, copyright information in fib.c
Mark A. Weiss's data structure codes http://www.cs.fiu.edu/~weiss/

Installation

Put "unibic1.0.tar.gz" in any directory and unpack it:

$ tar zxvf unibic1.0.tar.gz

Then enter the folder "unibic1.0" and type "make". The compiled program is placed in the same directory as the source.

Inputs and outputs

The main program in the package is unibic. It can parse two kinds of input files, discrete data and continuous data, and an example of each is provided. Run the help command to see all available options:

$ ./unibic -h

Take a look at toy_example (discrete data) first, then try to run the clustering on it:

$ ./unibic -i toy_example -d

The -d option is important here because it tells the program that the input is discrete data.

Then look at the continuous data file "example" and try to run:

$ ./unibic -i example -f .25

The -f .25 option ensures that no two blocks overlap by more than 0.25 of the size of each one. All other parameters keep their default values.
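
The overlap restriction can be pictured with the short Python sketch below. It is only one plausible reading, not the check implemented in UniBic: a block is assumed to be a pair (set of rows, set of columns), the overlap is assumed to be the number of shared cells, and two blocks are assumed to conflict when that number exceeds the fraction `f` of the size of either block. The function names are hypothetical.

```python
def overlap_cells(block_a, block_b):
    """Number of matrix cells shared by two blocks given as (rows, cols) sets."""
    rows_a, cols_a = block_a
    rows_b, cols_b = block_b
    return len(rows_a & rows_b) * len(cols_a & cols_b)


def overlaps_too_much(block_a, block_b, f):
    """True if the shared cells exceed fraction f of either block's size (assumed rule)."""
    shared = overlap_cells(block_a, block_b)
    size_a = len(block_a[0]) * len(block_a[1])
    size_b = len(block_b[0]) * len(block_b[1])
    return shared > f * size_a or shared > f * size_b


# Three made-up 4x4 blocks (illustrative data only).
a = ({0, 1, 2, 3}, {0, 1, 2, 3})
b = ({2, 3, 4, 5}, {2, 3, 4, 5})  # shares 2 rows x 2 cols = 4 cells with a
c = ({1, 2, 3, 4}, {1, 2, 3, 4})  # shares 3 rows x 3 cols = 9 cells with a

print(overlaps_too_much(a, b, 0.25))
print(overlaps_too_much(a, c, 0.25))
```

Under these assumptions, blocks a and b share exactly a quarter of their cells and are both kept, while a and c share more than a quarter and would not both be reported.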

For each input file, the program generates three output files: a '.blocks' file, a '.chars' file, and a '.rules' file.

The '.blocks' file lists all the biclusters the program found. Within each bicluster, a blank line separates the positively correlated genes from the negatively correlated genes (if any).

The '.chars' file provides the qualitative matrix of the microarray data, and the '.rules' file gives some details of how the data were discretized. You can find further details about how to represent a microarray dataset with a qualitative matrix in our paper.

Change log

Version 1.0

latest version

Contact

Questions, problems, and bug reports are welcome and should be sent to Zhenjia Wang '[email protected]'

Creation: Dec. 22, 2014
