Export counted values in petals #13

Open
RipollJ opened this issue Oct 15, 2021 · 0 comments

Hi,

pyvenn is very useful, but for my use case it lacks a function to export the members of each petal, not just their counts.
The code below is a suggestion for such a function. It returns the results as a table (a pandas DataFrame).
Feel free to adapt it if you would rather not depend on pandas.

import pandas as pd
from venn import *

def generate_logics(n_sets):
    """Generate intersection identifiers in binary (0010 etc)"""
    for i in range(1, 2**n_sets):
        yield bin(i)[2:].zfill(n_sets)


def generate_group_labels(datasets, fmt="{size}"):
    """Return a DataFrame with the label and the members of each petal"""
    datasets = list(datasets)
    n_sets = len(datasets)
    dataset_union = set.union(*datasets)
    universe_size = len(dataset_union)
    rows = []
    for logic in generate_logics(n_sets):
        included_sets = [
            datasets[i] for i in range(n_sets) if logic[i] == "1"
        ]
        excluded_sets = [
            datasets[i] for i in range(n_sets) if logic[i] == "0"
        ]
        # Elements present in every included set and in none of the excluded sets
        petal_set = (
            set.intersection(*included_sets) -
            set.union(set(), *excluded_sets)
        )
        label = fmt.format(
            logic=logic, size=len(petal_set),
            percentage=(100*len(petal_set)/max(universe_size, 1))
        )
        # Keep the petal as a real set instead of formatting it into a string
        rows.append((logic, label, petal_set))

    # Build the table once, after all petals have been collected
    return pd.DataFrame(rows, columns=["Comparison", "Count", "Groups"])
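
For anyone who would rather avoid pandas, here is a minimal sketch of the same idea using only the standard library. The names `petal_members` and `petals_to_csv` are my own (hypothetical, not part of pyvenn), and joining the members with `;` is just one possible choice for the CSV layout.

```python
import csv
import io


def petal_members(datasets):
    """Map each binary petal identifier to the elements found only in that petal."""
    datasets = [set(d) for d in datasets]
    n_sets = len(datasets)
    petals = {}
    for i in range(1, 2 ** n_sets):
        logic = bin(i)[2:].zfill(n_sets)
        included = [datasets[j] for j in range(n_sets) if logic[j] == "1"]
        excluded = [datasets[j] for j in range(n_sets) if logic[j] == "0"]
        # In every included set, in none of the excluded sets
        petals[logic] = set.intersection(*included) - set().union(*excluded)
    return petals


def petals_to_csv(petals, handle):
    """Write one row per petal: identifier, size, and sorted members."""
    writer = csv.writer(handle)
    writer.writerow(["Comparison", "Count", "Groups"])
    for logic, members in petals.items():
        writer.writerow([logic, len(members), ";".join(sorted(members))])


# Small two-set example written to an in-memory buffer
petals = petal_members([{"a", "b"}, {"b", "c"}])
buffer = io.StringIO()
petals_to_csv(petals, buffer)
print(buffer.getvalue())
```

Passing an open file handle instead of the `io.StringIO` buffer should write the same rows to disk.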

Example:

groupset = {
    "Cells 1": {'RBM15', 'FAM76B', 'TNIK', 'TUSC3', 'ZFYVE27'},
    "Cells 2": {'RBM15', 'TMEM8A', 'ARRDC2', 'TUSC3', 'ZFYVE27'},
    "Cells 3": {'ARRDC2', 'UNC50', 'TNIK', 'TUSC3', 'TMEM8A'},
    "Cells 4": {'RBM15', 'UNC50', 'TNIK', 'TUSC3'}
}

dfset = generate_group_labels(groupset.values())
dfset

Output:

Comparison  Count  Groups
0001        0      set()
0010        0      set()
0011        1      {'UNC50'}
0100        0      set()
0101        0      set()
0110        2      {'TMEM8A', 'ARRDC2'}
0111        0      set()
1000        1      {'FAM76B'}
1001        0      set()
1010        0      set()
1011        1      {'TNIK'}
1100        1      {'ZFYVE27'}
1101        1      {'RBM15'}
1110        0      set()
1111        1      {'TUSC3'}
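
The binary identifiers in the Comparison column read left to right in the order the sets were passed in, so the first digit is "Cells 1" and the last is "Cells 4". A small helper can turn them back into set names. This is only a sketch: `logic_to_names` is a hypothetical name of mine, and it assumes the list of names is given in the same order as the sets.

```python
def logic_to_names(logic, names):
    """Return the names of the sets whose digit is '1' in a petal identifier."""
    return [name for bit, name in zip(logic, names) if bit == "1"]


names = ["Cells 1", "Cells 2", "Cells 3", "Cells 4"]
print(logic_to_names("0110", names))  # ['Cells 2', 'Cells 3']
print(logic_to_names("1101", names))  # ['Cells 1', 'Cells 2', 'Cells 4']
```

With the example above, `list(groupset.keys())` would give the names in the right order, since dictionaries keep insertion order in Python 3.7 and later.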

Best regards,

JR
