Access gated raw data

EasyFlowQ offers essential tools for exporting gated samples as raw data in widely-used formats. These formats are compatible with popular programming languages, ensuring seamless integration and accessibility for more complex analysis.

Export as csv, npy(z) and mat

The options for exporting the currently selected samples and subpopulation samples, with selected gate applied, can be access from the menu bar: Data --> Export raw (current gate)

RawExport

Popular formats including csv, npy/npz (numpy array) and mat (matlab data file) are available for export. When exported, each sample (or subpopulation) will be exported as a single file with names of the samples. If the file name exist, either due to file pre-existing in the file system or another sample exported earlier has the same name, EasyFlowQ will attempt to export it under a different name (e.g. sample1_1, sample1_2, etc.).

Access the exported raw data in npy format

npy is the standard file format used by numpy to store single matrix. Due to that regular numpy array does not store metadata (e.g. channel names), the exported npy is in fact a structured numpy array, with channel names as the field name in the array.

Here we provide a simple example importing an exported, gated sample in the npy format, and subsequently performs a simple clustering analysis with a mixed Gaussian model. The notebook of this example is also available in the github repo (here).

Import the required packages

import numpy as np
import matplotlib.pyplot as plt
from sklearn import mixture

Load the npy file (can be downloaded here) and access to the channel names:

flow_data = np.load('./01-Well-B1.npy')
print(flow_data.dtype.names)

This output: ('FSC-H', 'FSC-A', 'SSC-H', 'SSC-A', 'FL1-H', 'FL1-A', 'FL6-H', 'FL6-A', 'FL10-H', 'FL10-A', 'FL11-H', 'FL11-A', 'FSC-Width', 'Time')

Plot the data in the channels that are gated in EasyFlow

fig, ax = plt.subplots(figsize=(5, 3))
fig.dpi = 150
ax.plot(flow_data['FSC-A'], flow_data['SSC-A'], '.', alpha=0.3)
plt.xlabel('FSC-A')
plt.ylabel('SSC-A')

RawExportExample1

Clustering the data based on FL1-A (FITC) vs FSC-A data with mixed Gaussian mode

# convert to regular numpy array, and filter out <0 elements
flow_2d = np.vstack([flow_data['FSC-A'], flow_data['FL1-A']]).T
flow_2d = flow_2d[np.all(flow_2d > 0, axis=1), :]

# transform and standardize data before clustering
cluster_input = flow_2d.copy()
cluster_input = np.log(cluster_input)
cluster_input = (cluster_input - np.mean(cluster_input, axis=0)) / np.std(cluster_input, axis=0)

# fit the Gaussian Mixture Model
clf2 = mixture.GaussianMixture(n_components=2, covariance_type="full")
clf2.fit(cluster_input)

Plot the clustering result with sub-sampling

sub_sample_index = np.random.choice(np.arange(len(flow_2d)), 10000, replace=False)

fig, ax = plt.subplots(figsize=[5, 3])
fig.dpi = 150
ax.scatter(x=flow_2d[sub_sample_index,0], y=flow_2d[sub_sample_index,1], 
           c=clf2.predict(cluster_input[sub_sample_index, :]), 
           alpha=0.3, s=1)
ax.set_yscale('log')
ax.set_xlabel('FSC-A')
ax.set_ylabel('FITC-A')

RawExportExample2