Sliding window across spatial neighbourhood #829

Open
FrancescaDr opened this issue May 22, 2024 · 2 comments

@FrancescaDr
Contributor

Description

Implementation of sliding window (inspired by Kasumi) over a spatial neighbourhood or view.

Input:

  • window size
  • overlap

Output:

  • assignment of cells to sliding window
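To make the input/output contract concrete, here is a minimal one-dimensional sketch. It assumes that windows advance by a stride of `window_size - overlap`; this is only one possible reading of "overlap", and the helper name `windows_for_position` is hypothetical. A 2D assignment would pair the window indices obtained along x with those obtained along y.

```python
import math


def windows_for_position(x, window_size, overlap, x_min=0.0):
    """Return the indices k of all windows containing x.

    Assumption: window k covers [x_min + k*stride, x_min + k*stride + window_size)
    with stride = window_size - overlap.
    """
    if not 0 <= overlap < window_size:
        raise ValueError("overlap must satisfy 0 <= overlap < window_size")
    stride = window_size - overlap
    rel = x - x_min
    if rel < 0:
        return []  # position lies before the first window
    # Last window whose start is at or before the position
    k_max = math.floor(rel / stride)
    # First window whose (exclusive) end lies beyond the position
    k_min = max(0, math.floor((rel - window_size) / stride) + 1)
    return list(range(k_min, k_max + 1))


# A cell at x=120 with 100-unit windows overlapping by 50 falls in two windows
membership = windows_for_position(120, window_size=100, overlap=50)
```

With `overlap=0` every position maps to exactly one window, which matches the plain grid case.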
@richierip

Hi Francesca,

If all you need is a labeling function and nothing fancier, I think I can help. I'm a bioinformatics analyst at the KF Cancer Center in Boston, and I've done this before in R for a lung cancer analysis. I wrote up new code that does what you ask and takes in an AnnData object. I haven't tested it thoroughly, so keep that in mind.

The two pictures below show the result of running apply_grid_to_cells(adata, window_size=2500, overlap=500) on example data from a NanoString CosMx dataset. Since each cell can be in multiple windows when overlap > 0, we need multiple columns to keep track of these assignments (up to 25 per cell for these parameters). The output is 25 new columns in adata.obs with unique labels for the windows. You can use pd.melt to convert adata.obs to a long-form dataframe and treat each window's cells as unique, so that later collapsing into window-per-row data is easy. Here's the function:

import math
import anndata as ad


def apply_grid_to_cells(adata, window_size=2500, overlap=0, coord_columns=('globalX', 'globalY')):
    ''' Label cells into "windows" for a coarser spatial analysis.
        If overlap == 0, the windows form a grid and each cell is assigned to one unique window.
        Otherwise, each cell is assigned to multiple windows, so multiple columns are needed.
        Here overlap is the offset between successive shifted grids.
        Change coord_columns to match the column names for XY coordinates in adata.obs.
        adata is modified in place. '''
    x, y = coord_columns
    if overlap == 0:
        # No overlap: a single grid anchored at the minimum coordinates
        xgrid = (adata.obs[x] - adata.obs[x].min()) // window_size
        ygrid = (adata.obs[y] - adata.obs[y].min()) // window_size
        # Number of grid columns (max index + 1), so row-major labels are unique
        n_cols = xgrid.max() + 1
        adata.obs['Window grid'] = ((ygrid * n_cols) + xgrid).astype(int).astype('category')
    else:
        # One shifted grid per (row, col) offset; each gets its own column in adata.obs
        grid_len = math.ceil(window_size / overlap)
        num_grid_systems = grid_len**2
        for i in range(num_grid_systems):
            grid_col_name = f'Sliding window grid {i+1}'
            row, col = (i // grid_len), (i % grid_len)
            # Start each shifted grid one window before the data so no index is negative
            x_grid_start = adata.obs[x].min() + (overlap * col) - window_size
            y_grid_start = adata.obs[y].min() + (overlap * row) - window_size
            xgrid = (adata.obs[x] - x_grid_start) // window_size
            ygrid = (adata.obs[y] - y_grid_start) // window_size
            n_cols = xgrid.max() + 1
            # Prefix with the grid number so labels are unique across grids
            labels = f'{1+i}_' + ((ygrid * n_cols) + xgrid).astype(int).astype(str)
            adata.obs[grid_col_name] = labels.astype('category')

[Image: grid_cells_overlap1]
[Image: grid_cells_overlap10]
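The pd.melt step mentioned above could look roughly like the sketch below. The small `obs` frame is a made-up stand-in for adata.obs with two of the 'Sliding window grid N' columns, and the names `cell_id`, `grid`, and `window` are illustrative choices, not anything the function produces.

```python
import pandas as pd

# Toy stand-in for adata.obs after labeling (hypothetical values)
obs = pd.DataFrame(
    {
        "Sliding window grid 1": ["1_0", "1_0", "1_1", "1_1"],
        "Sliding window grid 2": ["2_0", "2_1", "2_1", "2_1"],
    },
    index=["cell_a", "cell_b", "cell_c", "cell_d"],
)

# Collect every sliding-window column by its shared prefix
window_cols = [c for c in obs.columns if c.startswith("Sliding window grid")]

# Long form: one row per (cell, grid) pair, with the window label as a value
long_obs = pd.melt(
    obs.rename_axis("cell_id").reset_index(),
    id_vars="cell_id",
    value_vars=window_cols,
    var_name="grid",
    value_name="window",
)

# Collapse to one row per window, here simply counting the cells in each
cells_per_window = long_obs.groupby("window")["cell_id"].count()
```

Any per-window aggregation (mean expression, cell-type proportions, and so on) can replace the final count.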

@FrancescaDr
Contributor Author

@richierip thanks for starting this, I will check it out!

To continue the brainstorm, some additional implementation features need to be considered:

  • z-coordinate
  • storing the sliding windows in .obs will explode the storage space; an alternative is storing them in .obsm
  • Move implementation to SpatialData instead
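As a rough sketch of the first two points, the labeling can be generalised to any number of coordinate axes (so a z-coordinate is just a third column) and returned as one compact integer array instead of many categorical columns. The function name `sliding_window_codes` and the parameter name `step` are hypothetical; `step` plays the role of `overlap` in the function earlier in this thread. An array of this shape could be assigned to an .obsm entry such as `adata.obsm["sliding_windows"]` (the key is an assumption).

```python
import itertools
import math

import numpy as np


def sliding_window_codes(coords, window_size, step):
    """Return an (n_cells, n_grids) integer array of window ids.

    coords: array-like of shape (n_cells, n_dims), e.g. columns x, y, z.
    One shifted grid is built per combination of per-axis offsets,
    giving ceil(window_size / step) ** n_dims grids.
    """
    coords = np.asarray(coords, dtype=float)
    n_dims = coords.shape[1]
    grid_len = math.ceil(window_size / step)
    origin = coords.min(axis=0)
    codes = []
    for shifts in itertools.product(range(grid_len), repeat=n_dims):
        # Start one window before the data so all indices are non-negative
        start = origin + step * np.array(shifts) - window_size
        idx = np.floor((coords - start) / window_size).astype(np.int64)
        shape = idx.max(axis=0) + 1
        # Flatten the per-axis indices into a single id per window
        codes.append(np.ravel_multi_index(tuple(idx.T), tuple(shape)))
    return np.stack(codes, axis=1).astype(np.int32)


# Four cells with x, y, z coordinates (made-up values)
example_coords = [[0, 0, 0], [10, 0, 0], [0, 10, 5], [1, 1, 1]]
example_codes = sliding_window_codes(example_coords, window_size=10, step=5)
```

Window ids are only unique within a grid (a column of the array), so the column index has to be kept alongside the id when comparing windows across grids.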

Getter functions:

  • statistics about min, max, avg number of nodes per window
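Such a getter might be as small as the following sketch, which takes one column of window labels and summarises how many cells fall into each window. The name `window_size_stats` is hypothetical.

```python
import pandas as pd


def window_size_stats(window_labels):
    """Summarise the number of cells per window for one labeling column."""
    counts = pd.Series(window_labels).value_counts()
    # Categorical inputs can report unused categories with a count of zero
    counts = counts[counts > 0]
    return {
        "min": int(counts.min()),
        "max": int(counts.max()),
        "mean": float(counts.mean()),
    }


# Six cells spread over three windows (made-up labels)
stats = window_size_stats(["a", "a", "b", "c", "c", "c"])
```

For the multi-column output, the same function can be applied per column or to the `window` column of the melted long-form table.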
