Nodes¶
Data¶
These nodes are for performing general data related operations
LoadFile¶
Loads a saved Transmission file. If you have a Project open it will automatically set the project path according to the open project. Otherwise you must specify the project path. You can specify a different project path to the project that is currently open (this is untested, weird things could happen). You should not merge Transmissions originating from different projects.
Note
You can also load a saved Transmission file by dragging & dropping it into the Flowchart area. It will create a LoadFile node with the name of the dropped file.
Terminal
Description
Out
Transmission loaded from the selected file.
Parameters
Description
load_trn
Button to choose a .trn file (Transmission) to load
proj_trns
Load transmission file located in the project’s “trns” directory
proj_path
Button to select the Mesmerize project that corresponds to the chosen .trn file.
Note
The purpose of specifying the Project Path when you load a saved Transmission file is so that interactive plots and the Datapoint Tracer can find raw data that correspond to datapoints.
LoadProjDF¶
Load the entire Project DataFrame (root) of the project that is currently open, or a sub-DataFrame that corresponds to a tab that you have created in the Project Browser.
Output Data Column (numerical): _RAW_CURVE
Each element in this output column contains a 1-D array representing the trace extracted from an ROI.
Terminal
Description
Out
Transmission created from the Project DataFrame or sub-DataFrame.
Parameters
Description
DF_Name
DataFrame name. List corresponds to Project Browser tabs.
Update
Re-create Transmission from corresponding Project Browser tab.
Apply
Process data through this node
Note
The DF_Name options do not update live with the removal or creation of tabs in the Project Browser. You must create a new node to reflect these types of changes.
Save¶
Save the input Transmission to a file so that it can be re-loaded in the Flowchart later.
Usage: Connect an input Transmission to this node’s In terminal, click the button to choose a path to save a new file to, and then click the Apply checkbox to save the input Transmission to the chosen file.
Terminal
Description
In
Transmission to be saved to file
Parameters
Description
saveBtn
Button to choose a filepath to save the Transmission to.
Apply
Process data through this node
Note
You must always save a Transmission to a new file (pandas with hdf5 exhibits weird behavior if you overwrite; this is the easiest workaround). If you try to overwrite the file you will be presented with an error saying that the file already exists.
Merge¶
Merge multiple Transmissions into a single Transmission. The DataFrames of the individual Transmissions are concatenated using pandas.concat and History Traces are also merged. The History Trace of each individual input Transmission is kept separately.
Warning
At the moment, if you create two separate data streams that originate from the same Transmission and then merge them at a later point, the analysis logs (History Traces) of the individual data streams are not maintained. See the information about data blocks in the Transmission.
Terminal
Description
In
Transmissions to be merged
Out
Merged Transmission
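As a rough illustration of the concatenation step only, the sketch below stacks two small DataFrames with pandas.concat. The column names and values are made up for the example, ignore_index=True is a choice made here rather than something the node is known to do, and the History Trace merging is not modelled.

```python
import pandas as pd

# Two hypothetical Transmission DataFrames that share the same columns
df_a = pd.DataFrame({"SampleID": ["a1", "a2"],
                     "cell_types": ["photoreceptor", "neuron"]})
df_b = pd.DataFrame({"SampleID": ["b1", "b2"],
                     "cell_types": ["neuron", "photoreceptor"]})

# Rows are stacked on top of each other; ignore_index gives the
# merged frame a fresh 0..n-1 index (an assumption for this sketch)
merged = pd.concat([df_a, df_b], ignore_index=True)
```

DataFrames that come from different projects may not share columns, which is one reason such merges are discouraged above.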
ViewTransmission¶
View the input Transmission object using the spyder Object Editor. For example you can explore the Transmission DataFrame and HistoryTrace.
TextFilter¶
Include or Exclude Transmission DataFrame rows according to a text filter in a categorical column.
Usage Example: If you want to select all traces that are from photoreceptor cells and you have a categorical column, named cell_types for example, containing cell type labels, choose “cell_types” as the Column parameter, enter “photoreceptor” as the filter parameter, and select Include. If you want to select everything that is not a photoreceptor, select Exclude.
Note
It is recommended to filter and group your data beforehand using the Project Browser since it allows much more sophisticated filtering.
Terminal
Description
In
Input Transmission
Out
Transmission with its DataFrame filtered according to the parameters
Parameters
Description
Column
Categorical column that contains the text filter to apply
filter
Text filter to apply
Include
Include all rows matching the text filter
Exclude
Exclude all rows matching the text filter
Apply
Process data through this node
HistoryTrace output structure: Dict of all the parameters for this node
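The Include and Exclude options can be pictured as a boolean mask over the chosen column. This is a minimal sketch with made-up data; it assumes the filter text is compared for exact equality, which may differ from how the node actually matches text.

```python
import pandas as pd

# Hypothetical DataFrame with a categorical column named "cell_types"
df = pd.DataFrame({
    "cell_types": ["photoreceptor", "neuron", "photoreceptor", "glia"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

# True for rows whose label equals the filter text (assumed exact match)
mask = df["cell_types"] == "photoreceptor"

included = df[mask]    # "Include": keep the matching rows
excluded = df[~mask]   # "Exclude": keep everything else
```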
SpliceArrays¶
Splice the arrays in the specified numerical data column and place the spliced output arrays in the output column.
Output Data Column (numerical): _SPLICE_ARRAYS
Terminal
Description
In
Input Transmission
Out
Transmission with arrays from the input column spliced and placed in the output column
Parameters
Description
data_column
Numerical data column containing the arrays to be spliced
indices
The splice indices, “start_index:end_index”
Apply
Process data through this node
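To make the “start_index:end_index” format concrete, here is a small sketch that parses such a string and slices an array. The helper name splice is hypothetical, and the sketch assumes the end index is exclusive, as in ordinary Python slicing.

```python
import numpy as np

def splice(arr, indices):
    """Slice a 1-D array using a "start_index:end_index" string."""
    start, end = (int(s) for s in indices.split(":"))
    # Python-style slice: start is included, end is excluded (assumed)
    return arr[start:end]

trace = np.arange(10)
spliced = splice(trace, "2:6")
```

Splicing every array with the same indices is one way to get arrays of equal length, which some plot nodes require.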
DropNa¶
Drop NaNs and Nones (null) from the Transmission DataFrame. Uses DataFrame.dropna and DataFrame.isna methods.
If you choose “row” or “column” as axis, entire rows or columns will be dropped if any or all (see params) of the values are NaN/None.
If you choose to drop NaNs/Nones according to a specific column, it will drop the entire row if that row has a NaN/None value for the chosen column.
Terminal
Description
In
Input Transmission
Out
Transmission with NaNs and Nones removed according to the params
Parameters
Description
axis
Choose to drop rows, columns, or rows according to a specific column.
how
any: Drop if any value in the row/column is NaN/None
all: Drop only if all values in the row/column are NaN/None
Ignored if “axis” parameter is set to a specific column
Apply
Process data through this node
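The three behaviours described above map onto DataFrame.dropna roughly as follows. The DataFrame is made up for the example, and how the node wires its axis and how parameters to these calls is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 6.0],
    "c": ["x", None, "z"],
})

# axis = row, how = any: drop rows with at least one NaN/None
rows_any = df.dropna(axis=0, how="any")

# axis = row, how = all: drop rows only if every value is NaN/None
rows_all = df.dropna(axis=0, how="all")

# axis = a specific column: drop rows that are NaN/None in column "a"
by_column = df.dropna(subset=["a"])
```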
NormRaw¶
Scale the raw data such that the min and max values are set to the min and max values derived from the raw spatial regions of the image sequences they originate from. Only for CNMFE data.
The arrays in the _RAW_CURVE column are scaled and the output is placed in a new column named _NORMRAW
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
Parameter
Description
option
Derive the raw min & max values from one of the following options:
top_5: Top 5 brightest pixels
top_10: Top 10 brightest pixels
top_5p: Top 5% of brightest pixels
top_10p: Top 10% of brightest pixels
top_25p: Top 25% of brightest pixels
full_mean: Full mean of the min and max array
Apply
Process data through this node
Note
If the raw min value is higher than the raw max value the curve will be excluded in the output. You will be presented with a warning box with the number of curves that were excluded due to this.
Display¶
These nodes connect input Transmission(s) to various plots for visualization
The actual Plot Widget instance that these nodes use can be accessed through the plot_widget attribute in the flowchart console.
For example:
# Get a heatmap node that is named "Heatmap.0"
>>> hn = get_nodes()['Heatmap.0']
# the plot widget instance
>>> hn.plot_widget
<mesmerize.plotting.widgets.heatmap.widget.HeatmapTracerWidget object at 0x7f26e5d29678>
BeeswarmPlots¶
Based on pyqtgraph Beeswarm plots.
Visualize data points as a pseudoscatter and as corresponding Violin Plots. This is commonly used to visualize peak features and compare different experimental groups.
For information on the plot widget see Beeswarm Plots
Terminal
Description
In
Input Transmission
The DataFrame column(s) of interest must have single numerical values, not arrays
Heatmap¶
Used for visualizing numerical arrays in the form of a heatmap. Also used for visualizing a hierarchical clustering tree (dendrogram) along with a heatmap with row order corresponding to the order of the leaves of the dendrogram.
For information on the plot widget see Heat Plot
Terminal
Description
In
Input Transmission
The arrays in the DataFrame column(s) of interest must be of the same length
Note
Arrays in the DataFrame column(s) of interest must be of the same length. If they are not, you must splice them using the SpliceArrays node.
CrossCorr¶
Perform Cross-Correlation analysis. For information on the plot widget see CrossCorrelation Plot
Plot¶
For information on the plot widget see Simple Plot
A simple plot.
Terminal
Description
In
Input Transmission
Parameters
Description
data_column
Data column to plot, must contain numerical arrays
Show
Show/hide the plot window
Apply
Process data through this node
Proportions¶
Plot stacked bar chart of one categorical variable vs. another categorical variable.
For information on the plot widget see Proportions Plot
ScatterPlot¶
Create scatter plot of numerical data containing [X, Y] values
For information on the plot widget see Scatter Plot
Signal¶
Routine signal processing functions
I recommend this book by Tom O’Haver if you are unfamiliar with basic signal processing: https://terpconnect.umd.edu/~toh/spectrum/TOC.html
Butterworth¶
Creates a Butterworth filter using scipy.signal.butter and applies it using scipy.signal.filtfilt.
The Wn parameter of scipy.signal.butter is calculated by dividing the sampling rate of the data by the freq_divisor parameter (see below).
Output Data Column (numerical): _BUTTERWORTH
Terminal
Description
In
Input Transmission
Out
Transmission with filtered signals in the output data column
Parameters
Description
data_column
Data column containing numerical arrays to be filtered
order
Order of the filter
freq_divisor
Divisor for dividing the sampling frequency of the data to get Wn
Apply
Process data through this node
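The sketch below shows the two scipy calls on a synthetic signal. It is one possible reading of the description above, not the node's actual code: it assumes the value obtained from sampling rate / freq_divisor is a cutoff in Hz and passes it to scipy.signal.butter together with fs, which requires freq_divisor to be greater than 2 so that the cutoff stays below the Nyquist frequency. The sampling rate and signal are made up.

```python
import numpy as np
from scipy import signal

fs = 10.0          # sampling rate in Hz (hypothetical)
order = 2          # filter order
freq_divisor = 10  # assumed meaning: cutoff = fs / freq_divisor = 1 Hz

# Slow 0.2 Hz component plus a fast 4 Hz component to be filtered out
t = np.arange(200) / fs
slow = np.sin(2 * np.pi * 0.2 * t)
noisy = slow + 0.5 * np.sin(2 * np.pi * 4.0 * t)

# Design a low-pass filter, then apply it forwards and backwards
# so that the output has no phase shift
cutoff_hz = fs / freq_divisor
b, a = signal.butter(order, cutoff_hz, fs=fs)
filtered = signal.filtfilt(b, a, noisy)
```

A larger freq_divisor gives a lower cutoff and therefore a smoother output.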
SavitzkyGolay¶
Savitzky Golay filter. Uses scipy.signal.savgol_filter.
Output Data Column (numerical): _SAVITZKY_GOLAY
Terminal
Description
In
Input Transmission
Out
Transmission with filtered signals in the output data column
Parameters
Description
data_column
Data column containing numerical arrays to be filtered
window_length
Size of windows for fitting the polynomials. Must be an odd number.
polyorder
Order of polynomials to fit into the windows. Must be less than window_length
Apply
Process data through this node
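A short sketch of scipy.signal.savgol_filter with the two parameters listed above. The noisy sine wave and the parameter values are made up for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter

# Synthetic example: one period of a sine wave with added noise
rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0, 2 * np.pi, 101))
noisy = clean + rng.normal(scale=0.1, size=clean.size)

window_length = 11  # must be odd
polyorder = 3       # must be less than window_length

smoothed = savgol_filter(noisy, window_length, polyorder)
```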
PowSpecDens¶
Resample¶
Resample the data in numerical arrays. Uses scipy.signal.resample.
Output Data Column (numerical): _RESAMPLE
Terminal
Description
In
Input Transmission
Out
Transmission with resampled signals in the output data column
Parameters
Description
data_column
Data column containing numerical arrays to be resampled
Rs
New sampling rate in Tu units of time.
Tu
Time unit
Apply
Process data through this node
Note
If Tu = 1, then Rs is the new sampling rate in Hertz.
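One way to read the Rs and Tu parameters is "Rs samples per Tu seconds". The sketch below uses that reading to work out the number of output samples for scipy.signal.resample. The original sampling rate (fps), the signal and the formula for n_out are assumptions made for the example, not taken from the node's code.

```python
import numpy as np
from scipy.signal import resample

fps = 10.0  # original sampling rate in Hz (hypothetical)
x = np.sin(2 * np.pi * 0.5 * np.arange(100) / fps)  # 10 seconds of data

Rs, Tu = 5, 1  # assumed meaning: 5 samples per 1 second, i.e. 5 Hz

# Number of samples needed to cover the same duration at the new rate
duration_s = x.size / fps
n_out = int(round(duration_s * Rs / Tu))

resampled = resample(x, n_out)
```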
ScalerMeanVariance¶
Uses tslearn.preprocessing.TimeSeriesScalerMeanVariance
Output Data Column (numerical): _SCALER_MEAN_VARIANCE
Terminal
Description
In
Input Transmission
Out
Transmission with scaled signals in the output column
Parameters
Description
data_column
Data column containing numerical arrays to be scaled
mu
Mean of the output time series
std
Standard Deviation of the output time series
Apply
Process data through this node
Note
If mu = 0 and std = 1, the output is the z-score of the signal.
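The effect of mu and std on a single signal can be sketched in plain NumPy. This is a stand-in for illustration and does not call tslearn; the function name scale_mean_variance is hypothetical, and a constant signal (zero standard deviation) is not handled here.

```python
import numpy as np

def scale_mean_variance(x, mu=0.0, std=1.0):
    """Rescale a 1-D signal to the requested mean and standard deviation."""
    x = np.asarray(x, dtype=float)
    # Standardize first, then stretch to std and shift to mu
    return (x - x.mean()) / x.std() * std + mu

z = scale_mean_variance([1.0, 2.0, 3.0, 4.0])           # z-score
shifted = scale_mean_variance([1.0, 2.0, 3.0, 4.0], mu=5.0, std=2.0)
```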
Normalize¶
Normalize the signal so that all values are between 0 and 1 based on the min and max of the signal.
Output Data Column (numerical): _NORMALIZE
Terminal
Description
In
Input Transmission
Out
Transmission with scaled signals in the output column
Parameters
Description
data_column
Data column containing numerical arrays to be scaled
Apply
Process data through this node
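Min-max normalization of one signal can be written as below. This is a minimal sketch of the formula described above, with a hypothetical function name; it assumes the signal is not constant, since the denominator would otherwise be zero.

```python
import numpy as np

def normalize(x):
    """Scale a 1-D signal so that its minimum is 0 and its maximum is 1."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min())

out = normalize([2.0, 4.0, 6.0])
```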
RFFT¶
Uses scipy.fftpack.rfft. “Discrete Fourier transform of a real sequence”
Output Data Column (numerical): _RFFT
Terminal
Description
In
Input Transmission
Out
Transmission with the RFFT of signals in the output column
Parameters
Description
data_column
Data column containing numerical arrays
Apply
Process data through this node
iRFFT¶
Uses scipy.fftpack.irfft. “inverse discrete Fourier transform of real sequence x”
Output Data Column (numerical): _IRFFT
PeakDetect¶
Simple Peak Detection using derivatives. The “Differentiation” chapter of Tom O’Haver’s book has a section on Peak Detection which I recommend reading. https://terpconnect.umd.edu/~toh/spectrum/TOC.html
Output Data Column (DataFrame): peaks_bases
Terminal
Description
Derivative
Transmission with derivatives of signals. Must have _DERIVATIVE column.
It’s recommended to use a derivative from a normalized filtered signal.
Normalized
Transmission containing Normalized signals, used for thresholding
See Normalize node
Curve
Transmission containing original signals.
Usually not filtered to avoid distortions caused by filtering
PB_Input (optional)
Transmission containing peaks & bases data (peaks_bases column).
Useful for visualizing a saved Transmission that has peaks & bases data
Out
Transmission with the detected peaks & bases as DataFrames in the output column
Warning
The PB_Input terminal overrides all other terminals. Do not connect inputs to PB_Input and other terminals simultaneously.
Parameter
Description
data_column
Data column of the input Curve Transmission for placing peaks & bases onto
Fictional_Bases
Add bases to beginning and end of signal if first or last peak is lonely
Edit
Open Peak Editor GUI, see Peak Editor
SlopeThr
Slope threshold
AmplThrAbs
Absolute amplitude threshold
AmplThrRel
Relative amplitude threshold
Apply
Process data through this node
PeakFeatures¶
Compute peak features. The DataFrame of the output Transmission contains one row for each peak.
Output Data Column
Description
_pf_peak_curve
array representing the peak
_pf_ampl_rel_b_ix_l
peak amplitude relative to its left base
_pf_ampl_rel_b_ix_r
peak amplitude relative to its right base
_pf_ampl_rel_b_mean
peak amplitude relative to the mean of its bases
_pf_ampl_rel_zero
peak amplitude relative to zero
_pf_area_rel_zero
_pf_area_rel_min
Simpson’s Rule Integral relative to the minimum value of the curve
Subtracts the minimum values of the peak curve before computing the integral
_pf_rising_slope_avg
slope of the line drawn from the left base to the peak
_pf_falling_slope_avg
slope of the line drawn from the right base to the peak
_pf_duration_base
distance between the left and right base
_pf_p_ix
index of the peak maxima in the parent curve
_pf_uuid
peak UUID
_pf_b_ix_l
index of the left base in the parent curve
_pf_b_ix_r
index of the right base in the parent curve
See also
mesmerize/analysis/compute_peak_features for the code that computes the peak features.
Terminal
Description
In
Input Transmission. Must contain a peaks_bases column that contains peaks_bases DataFrames.
Out
Transmission with peak features in various output columns
Parameter
Description
data_column
Data column containing numerical arrays from which to compute peak features.
Apply
Process data through this node
Warning
If there are issues with a particular peak, a user warning will be displayed in the terminal that Mesmerize is running in and the peak will be ignored. This happens when 1) a peak is not flanked by bases on both sides, 2) a peak or base is out of bounds for the parent curve from the chosen data_column, or 3) there are other index issues w.r.t. the peak. In the terminal, the number after the progress bar will show the index of the parent curve, for example here the parent curve is 319: 41%|████▏ | 319/771. The index of the offending peak within the parent curve will be printed below the progress bar along with a statement that may specify the issue with the peak.
Math¶
Nodes for performing basic Math functions
Derivative¶
Computes the first derivative.
Output Data Column (numerical): _DERIVATIVE
Terminal
Description
In
Input Transmission
Out
Transmission with the derivative placed in the output column
Parameter
Description
data_column
Data column containing numerical arrays
Apply
Process data through this node
TVDiff¶
Based on Numerical Differentiation of Noisy, Nonsmooth Data. Rick Chartrand. (2011). Translated to Python by Simone Sturniolo.
XpowerY¶
Raises each element of the numerical arrays in the data_column to the exponent Y
Output Data Column (numerical): _X_POWER_Y
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
Parameter
Description
data_column
Data column containing numerical arrays
Y
Exponent
Apply
Process data through this node
AbsoluteValue¶
Element-wise absolute values of the input arrays. Computes root mean squares if input arrays are complex.
Output Data Column (numerical): _ABSOLUTE_VALUE
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
Parameter
Description
data_column
Data column containing numerical arrays
Apply
Process data through this node
LogTransform¶
Perform Logarithmic transformation of the data.
Output Data Column (numerical): _LOG_TRANSFORM
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
Parameter
Description
data_column
Data column containing numerical arrays
transform
log10: Base 10 logarithm
ln: Natural logarithm
modlog10: \(sign(x) * \log_{10} (|x| + 1)\)
modln: \(sign(x) * \ln (|x| + 1)\)
Apply
Process data through this node
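The two modified log transforms keep the sign of the input and are defined at zero, unlike log10 and ln. A minimal NumPy sketch of the formulas listed above follows; the function names are hypothetical.

```python
import numpy as np

def modlog10(x):
    """sign(x) * log10(|x| + 1), applied element-wise."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log10(np.abs(x) + 1)

def modln(x):
    """sign(x) * ln(|x| + 1), applied element-wise."""
    x = np.asarray(x, dtype=float)
    return np.sign(x) * np.log(np.abs(x) + 1)

values = modlog10([-9.0, 0.0, 9.0])
```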
ArrayStats¶
Perform a few basic statistical functions.
Output Data Column (numerical): Customizable by user entry
Output data are single numbers, not arrays
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
The desired function is applied to each 1D array in the data_column and the output is placed in the Output Data Column.
Parameter
Description
data_column
Data column containing numerical arrays
function
amin: Return the minimum of the input array
amax: Return the maximum of the input array
nanmin: Return the minimum of the input array, ignore NaNs
nanmax: Return the maximum of the input array, ignore NaNs
ptp: Return the range (max - min) of the values of the input array
median: Return the median of the input array
mean: Return the mean of the input array
std: Return the standard deviation of the input array
var: Return the variance of the input array
nanmedian: Return the median of the input array, ignore NaNs
nanmean: Return the mean of the input array, ignore NaNs
nanstd: Return the standard deviation of the input array, ignore NaNs
nanvar: Return the variance of the input array, ignore NaNs
group_by (Optional)
Group by a categorical variable, for example get the mean array of a group
group_by_sec (Optional)
Group by a secondary categorical variable
output_col
Enter a name for the output column
Apply
Process data through this node
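Without grouping, the node's behaviour amounts to applying one NumPy function to every 1D array and storing the resulting single number. The sketch below illustrates that with made-up data and a made-up output column name; looking the function up with getattr is a choice made for this example, and the group_by options are not modelled.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "_RAW_CURVE": [np.array([1.0, 2.0, 6.0]),
                   np.array([4.0, np.nan, 8.0])],
})

# Look up the chosen function by its name in NumPy
func = getattr(np, "nanmean")

# One number per row, placed in a user-named output column
df["CURVE_MEAN"] = [func(arr) for arr in df["_RAW_CURVE"]]
```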
ArgGroupStat¶
Group by a categorical variable and return the value of any other column based on a statistic. Basically creates sub-dataframes for each group and then returns based on the sub-dataframe.
Group by column “group_by” and return value from column “return_col” where data in data_column fits “stat”
Output Data Column (Any): ARG_STAT
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
Parameter
Description
data_column
Data column containing single numbers (not arrays for now)
group_by
Group by column (categorical variables)
return_col
Return value from this column (any data)
stat
“max” or “min”
Apply
Process data through this node
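In pandas terms this is similar to a groupby followed by idxmax or idxmin. The sketch below uses made-up column names and shows the “max” case; how the node lays out its ARG_STAT output column is not modelled.

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],          # group_by column
    "amplitude": [1.0, 5.0, 7.0, 2.0],      # data_column (single numbers)
    "SampleID": ["s1", "s2", "s3", "s4"],   # return_col
})

# Row label of the largest "amplitude" within each group
idx = df.groupby("group")["amplitude"].idxmax()

# Value of the return column at those rows, indexed by group
arg_stat = df.loc[idx, ["group", "SampleID"]].set_index("group")["SampleID"]
```

Here group "a" returns "s2" and group "b" returns "s3", the samples with the largest amplitude in each group.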
ZScore¶
Compute Z-Scores of the data. Uses scipy.stats.zscore. The input data are divided into groups according to the group_by parameter. Z-Scores are computed for the data in each group with respect to the data only in that group.
Output Data Column (numerical): _ZSCORE
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
Parameter
Description
data_column
Input data column containing numerical arrays
group_by
Categorical data column to group by.
Apply
Process data through this node
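A sketch of group-wise z-scoring, under assumptions: the arrays of each group are stacked into a 2D array and scipy.stats.zscore is called with its default axis=0, so each time point is scored against the same time point in the other arrays of that group. Whether the node uses this axis is not confirmed here, and the data and the "group" column name are made up.

```python
import numpy as np
import pandas as pd
from scipy.stats import zscore

df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "_RAW_CURVE": [np.array([1.0, 2.0]), np.array([3.0, 6.0]),
                   np.array([0.0, 10.0]), np.array([4.0, 20.0])],
})

z_rows = {}
for name, sub in df.groupby("group"):
    # Stack this group's arrays into shape (n_rows, n_timepoints)
    stacked = np.vstack(sub["_RAW_CURVE"].tolist())
    # Default axis=0: statistics come only from rows in this group
    z = zscore(stacked)
    z_rows.update(zip(sub.index, z))

# Put the z-scored arrays back in the original row order
df["_ZSCORE"] = [z_rows[i] for i in df.index]
```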
LinRegress¶
Basically uses scipy.stats.linregress
Performs Linear Regression on numerical arrays and returns slope, intercept, r-value, p-value and standard error
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
Parameter
Description
data_column
Data column containing 1D numerical arrays.
The values are used as the y values and indices as the x values for the regression
Output Columns: Single numbers, _SLOPE, _INTERCEPT, _R-VALUE, _P-VALUE, _STDERR as described in scipy.stats.linregress
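For a single array, using the indices as x and the values as y looks like this with scipy.stats.linregress. The array is made up for the example.

```python
import numpy as np
from scipy.stats import linregress

y = np.array([1.0, 3.0, 5.0, 7.0])  # values of one 1D array
x = np.arange(y.size)               # its indices: 0, 1, 2, 3

result = linregress(x, y)

# The five values that correspond to the output columns
slope, intercept = result.slope, result.intercept
r_value, p_value, stderr = result.rvalue, result.pvalue, result.stderr
```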
Biology¶
Nodes for some biologically useful things which I couldn’t categorize elsewhere
ExtractStim¶
Extract the portions of a trace corresponding to stimuli that have been temporally mapped onto it. It outputs one row per stimulus period.
Note: Stimulus extraction is currently quite slow. It will be optimized after some planned changes in the Transmission object.
Output Data Column
Description
ST_TYPE
Stimulus type, corresponds to your Project Config
ST_NAME
Name of the stimulus
_ST_CURVE
The extracted array based on the parameters
_ST_START_IX
Start index of the stimulus period in the parent curve
_ST_END_IX
End index of the stimulus period in the parent curve
ST_uuid
UUID assigned for the extracted stimulus period
Parameter
Description
data_column
Data column containing the signals to be extracted based on the stimulus maps
Stim_Type
Type of stimulus to extract
start_offset
Offset the start index of the stimulus mapping by a value (in frames)
end_offset
Offset the end index of the stimulus mapping by a value (in frames)
zero_pos
Zero index of the extracted signal
start_offset: extraction begins at the start_offset value, stops at the end_offset
stim_end: extraction begins at the end of the stimulus, stops at the end_offset.
stim_center: extraction begins at the midpoint of the stimulus period plus the start_offset, stops at end_offset
DetrendDFoF¶
Uses the detrend_df_f function from the CaImAn library. This node does not use any of the numerical data in a Transmission DataFrame to compute the detrended \(\Delta F / F_0\). It directly uses the CNMF output data for the Samples that are present in the Transmission DataFrame.
Output Data Column (numerical): _DETREND_DF_O_F
StaticDFoFo¶
Perform \(\frac{F - F_0}{F_0}\) without a rolling window. \(F\) is an input array and \(F_0\) is the minimum value of the input array.
Output Data Column (numerical): _STATIC_DF_O_F
Terminal
Description
In
Input Transmission
Out
Transmission with the result placed in the output column
Parameter
Description
data_column
Data column containing numerical arrays
Apply
Process data through this node
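The formula above, with \(F_0\) taken as the minimum of the array, can be sketched for one array as follows. The function name is hypothetical, and the sketch assumes the minimum is non-zero, since the division is otherwise undefined.

```python
import numpy as np

def static_df_o_f(f):
    """(F - F0) / F0, where F0 is the minimum of the input array."""
    f = np.asarray(f, dtype=float)
    f0 = f.min()
    return (f - f0) / f0

out = static_df_o_f([2.0, 3.0, 4.0])
```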
Clustering¶
KShape¶
Perform KShape clustering. For more information see KShape plot.
Hierarchical¶
These nodes allow you to perform Hierarchical Clustering using scipy.cluster.hierarchy.
If you are unfamiliar with Hierarchical Clustering I recommend going through this chapter from Michael Greenacre: http://www.econ.upf.edu/~michael/stanford/maeb7.pdf
Note
Some of these nodes do not use Transmission objects for some inputs/outputs.
Linkage¶
Compute a linkage matrix which can be used to form flat clusters using the FCluster node.
Based on scipy.cluster.hierarchy.linkage
Terminal
Description
In
Input Transmission
Out
dict containing the Linkage matrix and parameters, not a Transmission object
Parameters
Description
data_column
Numerical data column used for computing linkage matrix
method
linkage method
metric
metric for computing distance matrix
optimal_order
minimize distance between successive leaves, more intuitive visualization
Apply
Process data through this node
FCluster¶
“Form flat clusters from the hierarchical clustering defined by the given linkage matrix.”
Based on scipy.cluster.hierarchy.fcluster
Output Data Column (categorical): FCLUSTER_LABELS
Terminal
Description
Linkage
Linkage matrix, output from Linkage node.
Data
Input Transmission, usually the same input Transmission used for the Linkage node.
IncM (optional)
Inconsistency matrix, output from Inconsistent
Monocrit (optional)
Output from MaxIncStat or MaxInconsistent
Out
Transmission with clustering data that can be visualized using the Heatmap
Parameters: Exactly as described in scipy.cluster.hierarchy.fcluster
HistoryTrace output structure: Dict of all the parameters for this node, as well as the parameters used for creating the linkage matrix and the linkage matrix itself from the Linkage node.
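The Linkage and FCluster nodes wrap two scipy functions, which can be chained directly as in the sketch below. The data points, the method and metric, and the fcluster settings are made up for illustration, and the dict that the Linkage node passes between nodes is not modelled.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Four hypothetical observations forming two well-separated pairs
data = np.array([[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0]])

# Linkage step: method, metric and optimal ordering as parameters
Z = linkage(data, method="complete", metric="euclidean",
            optimal_ordering=True)

# FCluster step: cut the tree into at most 2 flat clusters
labels = fcluster(Z, t=2, criterion="maxclust")
```

Each entry of labels is the cluster number for the corresponding row of data, which is the kind of information stored in FCLUSTER_LABELS.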
Inconsistent¶
MaxIncStat¶
MaxInconsistent¶
Transform¶
Nodes for transforming data
LDA¶
Perform Linear Discriminant Analysis. Uses sklearn.discriminant_analysis.LinearDiscriminantAnalysis
Terminal
Description
train_data
Input Transmission containing the training data
predict
Input Transmission containing data on which to predict
T
Transmission with Transformed data and decision function. Output columns outlined below:
_LDA_TRANSFORM: The transformed data, can be visualized with a Scatter Plot for instance
_LDA_DFUNC: Decision function (confidence scores). Can be visualized with a Heatmap
coef
Transmission with LDA Coefficients. Output columns outlined below:
classes: The categorical labels that were trained against
_COEF: LDA Coefficients (weight vectors) for the classes. Can be visualized with a Heatmap
means
Transmission with LDA Means. Output columns outlined below:
classes: The categorical labels that were trained against
_MEANS: LDA means for the classes. Can be visualized with a Heatmap
predicted
Transmission containing predicted class labels for the data.
The class labels are placed in a column named LDA_PREDICTED_LABELS
The names of the class labels correspond to the labels from the training labels
optional
Parameter
Description
train_data
Single or multiple data columns that contain the input features.
labels
Data column containing categorical labels to train to
solver
svd: Singular Value Decomposition
lsqr: Least Squares solution
eigen: Eigen decomposition
shrinkage
Can be used with lsqr or eigen solvers.
shrinkage_val
shrinkage value if shrinkage is set to “value”
n_components
Number of components to output
tol
Tolerance threshold exponent. The used value is 10^<tol>
score
Displays mean score of the classification (read only)
predict_on
Single or multiple data columns that contain the data that are used for predicting on
Usually the same name as the data column(s) used for the training data.
optional
HistoryTrace output structure: Dict of all the parameters for this node