The JRCLUST spike-sorting pipeline

The JRCLUST pipeline can be broken down into the following steps:

  1. config file creation (bootstrapping)
  2. spike detection/feature extraction
  3. clustering
  4. manual curation of automatic results

Bootstrapping

Your config file records the parameter choices you make and describes the relevant probe configuration. You can create a config file by invoking jrc bootstrap or jrc bootstrap /path/to/metafile.meta from the MATLAB command window.

If you don’t specify a SpikeGLX meta file (.meta), you will be asked to select one. JRCLUST will then collect recording-specific information from your meta file and set default parameters. (The default parameters may be inspected on the JRCLUST parameters page.) The location of your raw recording will be inferred from your meta file, so be sure the two files are similarly named (this is the SpikeGLX default) and placed together in the same directory!

If you don’t have a .meta file or your .meta file is missing some data, JRCLUST will ask you for the following information (each item notes where it is normally found in a .meta file):

  • Sampling rate (Hz): read from imSampRate (or niSampRate for NI recordings) in .meta file
  • Number of channels in file: read from nSavedChans in .meta file
  • μV/bit: computed from imAiRangeMax, imAiRangeMin (or niAiRangeMax, niAiRangeMin, and niMNGain for NI recordings) in .meta file
  • Header offset (bytes): set to 0 for SpikeGLX recordings since no header is stored in the .bin file
  • Data type: select from int16, uint16, single, or double (SpikeGLX files are saved as int16)
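
As a rough illustration of where these values come from, here is a minimal Python sketch (JRCLUST itself runs in MATLAB, so this is not its code). It assumes the .meta file is a list of key=value lines and that one ADC step is the voltage range divided by 2^n_bits and by the amplifier gain. The function names, the gain of 500, and the 10-bit depth are hypothetical values chosen for the example, not values read from the file.

```python
def parse_meta(text):
    """Parse SpikeGLX-style key=value lines into a dict of strings."""
    meta = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '='
            meta[key.strip()] = value.strip()
    return meta


def microvolts_per_bit(range_max, range_min, gain, n_bits):
    """Size of one ADC step in microvolts.

    Assumes the ADC spans [range_min, range_max] volts in 2**n_bits steps
    and that the signal was amplified by `gain` before digitization.
    """
    return (range_max - range_min) / (2 ** n_bits) / gain * 1e6


# A made-up meta file fragment for illustration only.
example = "\n".join([
    "imSampRate=30000",
    "nSavedChans=385",
    "imAiRangeMax=0.6",
    "imAiRangeMin=-0.6",
])

meta = parse_meta(example)
sample_rate = float(meta["imSampRate"])
n_channels = int(meta["nSavedChans"])
# gain=500 and n_bits=10 are assumed here; they are not in the fragment above.
uv_per_bit = microvolts_per_bit(
    float(meta["imAiRangeMax"]), float(meta["imAiRangeMin"]), gain=500, n_bits=10
)
```

If your recording does not match these assumptions (a different bit depth or gain, for instance), the resulting scale factor will differ, which is why JRCLUST asks you to confirm it when it cannot be read from the .meta file.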

You will also be asked to confirm your config filename and path to your raw recording. If you have multiple recordings, full paths will be separated by commas.

Spike detection

Once you have a config file, you can detect spikes in your recording. Any of the following commands will detect spikes:

  • detect will perform spike detection/feature extraction and save results to disk.
  • detect-sort will do all of the above and additionally cluster the spikes.
  • full is the same as detect-sort, but will also pull up the curation GUI after clustering is completed.

(See the usage section for how to invoke these commands.)

For each recording you specify in your config file, JRCLUST will:

  1. Denoise and filter the samples, then perform common-average referencing on the filtered samples.
  2. Compute a detection threshold from the filtered samples (if you have not already supplied a threshold).
  3. Detect peaks, i.e., points exceeding the threshold which are also genuine turning points.
  4. Merge peaks detected at multiple neighboring sites, searching over a spatiotemporal window. Where larger peaks are detected within this window, weaker peaks are removed.
  5. Extract spatiotemporal windows around spiking events, in both raw and filtered samples.
  6. Compute low-dimensional features from the resulting waveforms.
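
Steps 2 through 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not JRCLUST's MATLAB implementation: the threshold is taken to be a multiple of a robust noise estimate (median absolute value divided by 0.6745), spikes are taken to be negative-going, and sites are compared by index rather than by physical distance. The function names and the parameters factor, max_dt, and max_dsite are hypothetical.

```python
from statistics import median


def detection_threshold(samples, factor=5.0):
    """Threshold as `factor` times a robust noise estimate, median(|x|) / 0.6745."""
    noise = median(abs(x) for x in samples) / 0.6745
    return factor * noise


def detect_peaks(samples, threshold):
    """Indices of negative-going local minima that fall below -threshold."""
    peaks = []
    for i in range(1, len(samples) - 1):
        x = samples[i]
        is_turning_point = samples[i - 1] > x <= samples[i + 1]
        if x < -threshold and is_turning_point:
            peaks.append(i)
    return peaks


def merge_peaks(peaks, max_dt, max_dsite):
    """Drop any peak that has a larger peak inside its spatiotemporal window.

    `peaks` holds (time_index, site_index, amplitude) tuples, with amplitude
    given as a positive magnitude. Equal-amplitude peaks are both kept.
    """
    kept = []
    for t, site, amp in peaks:
        dominated = any(
            abs(t - t2) <= max_dt and abs(site - s2) <= max_dsite and amp2 > amp
            for t2, s2, amp2 in peaks
        )
        if not dominated:
            kept.append((t, site, amp))
    return kept


# One filtered trace with a single clear spike at index 4.
trace = [0, 1, -1, 0, -12, -3, 1, 0, -1, 1, 0, 2, -2, 0]
threshold = detection_threshold(trace)
spike_indices = detect_peaks(trace, threshold)

# The same event seen on two neighboring sites, plus a later, separate event.
candidates = [(100, 3, 80.0), (101, 4, 50.0), (400, 3, 60.0)]
merged = merge_peaks(candidates, max_dt=5, max_dsite=1)
```

In this toy example the weaker peak on site 4 is discarded because a larger peak on site 3 occurs one sample earlier, while the event at sample 400 is far enough away in time to survive.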

A detailed description of the steps in the detection process is indexed below.

Spike clustering

After the spiking events have been detected, they must be clustered by the features extracted from them. Any of the following commands will sort spikes:

  • sort will cluster spiking events using spikes you have detected previously. If JRCLUST can’t find a previous detection, it will also detect spikes for you.
  • detect-sort will cluster spiking events after detecting them. If you have previously detected spikes, JRCLUST will overwrite them.
  • full is the same as detect-sort, but will also pull up the curation GUI after clustering is completed.

(See the usage section for how to invoke these commands.)

JRCLUST will cluster the spike features using a variant of the clustering algorithm of Rodriguez and Laio, also known as density-peak clustering or \(\rho\)-\(\delta\) clustering. The general algorithm computes pairwise distances in the feature space \(\mathbb{R}^n\), and, given a cutoff distance \(d\), assigns to each point \(x_i\) in the feature space a density

\[\rho_i := \sum_{j} I(\|x_i - x_j\|_2 < d),\]

where \(I\) is the indicator function, giving 1 if the condition is true and 0 otherwise. In plain English, \(\rho_i\) is the number of points within a ball of radius \(d\) centered at \(x_i\).
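
To make the definition concrete, here is a brute-force Python sketch of the density computation. It follows the formula literally, so the sum runs over every \(j\), including \(j = i\), and each point therefore counts itself. The function name local_density is hypothetical, and the quadratic loop is only suitable for small examples.

```python
from math import dist


def local_density(points, d):
    """rho_i = number of points x_j with ||x_i - x_j||_2 < d, over all j."""
    return [sum(1 for q in points if dist(p, q) < d) for p in points]


# Two small groups of points in the plane.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6)]
rho = local_density(points, d=1.5)
```

With a cutoff of 1.5, each of the first three points sees the other two plus itself, and each of the last two sees its one neighbor plus itself.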

Once each point has been assigned a density, we then find the distance to the nearest neighbor of higher density,

\[\delta_i := \min_{\rho_i < \rho_j} \|x_i - x_j\|_2.\]

(If there is no \(j\) such that \(\rho_i < \rho_j\), i.e., \(x_i\) is a maximally dense point, then \(\delta_i\) is defined to be \(\max_j \|x_i - x_j\|_2\).)
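
A small Python sketch of this step, again brute force and only for illustration: for each point it looks for the closest point of strictly higher density and records both the distance and that neighbor's index. The function name is hypothetical, and the densities in the example are hand-picked rather than computed.

```python
from math import dist


def delta_and_neighbor(points, rho):
    """Distance to, and index of, the nearest point of strictly higher density.

    A point with no denser neighbor gets delta = max_j ||x_i - x_j||_2
    and a neighbor of None.
    """
    deltas, neighbors = [], []
    for i, p in enumerate(points):
        denser = [(dist(p, q), j) for j, q in enumerate(points) if rho[j] > rho[i]]
        if denser:
            d_min, j_min = min(denser)
            deltas.append(d_min)
            neighbors.append(j_min)
        else:
            deltas.append(max(dist(p, q) for q in points))
            neighbors.append(None)
    return deltas, neighbors


# Five points on a line with hand-picked densities.
points = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,)]
rho = [2, 5, 3, 4, 1]
delta, neighbor = delta_and_neighbor(points, rho)
```

Here the point at 1.0 is the densest, so its \(\delta\) is its distance to the farthest point (10). The point at 10.0 is dense but its only denser neighbor is 9 away, which is what makes it stand out as a second candidate center.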

If the point \(x_i\) is both sufficiently dense (i.e., \(\rho_i\) is large enough) and sufficiently far from any other point with higher density (i.e., \(\delta_i\) is large enough), then \(x_i\) is deemed a cluster center. Every remaining point is assigned to the same cluster as its nearest neighbor of higher density, so all points that lead back to \(x_i\), directly or through a chain of such neighbors, are assigned to the cluster centered around \(x_i\).
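
The selection and assignment step can be sketched as follows. This Python example follows the original Rodriguez and Laio procedure, in which labels propagate from each center along chains of nearest denser neighbors; JRCLUST's own variant may differ in its details. The cutoffs rho_min and delta_min are hypothetical stand-ins for "sufficiently dense" and "sufficiently far", and the input values are hand-picked.

```python
def assign_clusters(rho, delta, neighbors, rho_min, delta_min):
    """Pick cluster centers and label every point.

    Returns (labels, centers). A label of -1 marks a point that could not
    be traced back to any center.
    """
    n = len(rho)
    labels = [-1] * n
    centers = [i for i in range(n) if rho[i] >= rho_min and delta[i] >= delta_min]
    for label, i in enumerate(centers):
        labels[i] = label
    # Visit points from densest to sparsest, so that each point's denser
    # neighbor has already received its label.
    for i in sorted(range(n), key=lambda k: rho[k], reverse=True):
        if labels[i] == -1 and neighbors[i] is not None:
            labels[i] = labels[neighbors[i]]
    return labels, centers


# Hand-picked densities, distances, and nearest denser neighbors for five points.
rho = [2, 5, 3, 4, 1]
delta = [1.0, 10.0, 1.0, 9.0, 1.0]
neighbors = [1, None, 1, 1, 3]
labels, centers = assign_clusters(rho, delta, neighbors, rho_min=4, delta_min=5.0)
```

Points 1 and 3 pass both cutoffs and become centers. Point 4 has point 3 as its nearest denser neighbor and joins its cluster, while points 0 and 2 attach to point 1.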

We have left the terms sufficiently dense and sufficiently far somewhat vague, to say nothing of how the cutoff distance \(d\) is determined. These, along with the variations JRCLUST makes to accommodate the large volume of data, are elaborated here.

Cluster curation

Manual verification and correction of the automatic clustering can be done using the GUI curation tool.

Any of the following commands will curate spike clusters:

  • manual will pull up the curation GUI. If JRCLUST can’t find a previous clustering, it will ask if you want to cluster your spikes.
  • full will detect and cluster spikes, pulling up the curation GUI after clustering is completed. If you have previously detected spikes, JRCLUST will overwrite them.

(See the usage section for how to invoke these commands.)

In addition to deleting, splitting, or merging clusters, you may also annotate them as noise or MUA (multi-unit activity), or attach arbitrary notes to them. For a good primer on why you might want to do these things, take a look at Phy’s documentation. A detailed description of the different views onto the data is indexed below.