Data Wrangling

Miscellaneous utilities for wrangling data from neuprint for various purposes.

syndist_matrix(syndist[, rois, syn_columns, ...])

Pivot a synapse ROI counts table (one row per body).

bilateral_syndist(syndist[, bodies, rois, ...])

Aggregate synapse counts for corresponding left and right ROIs.

assign_sides_in_groups(neurons, syndist[, ...])

Determine which side (left or right) each neuron belongs to, according to a few heuristics.

Reference

neuprint.wrangle.syndist_matrix(syndist, rois=None, syn_columns=['pre', 'post'], flatten_column_index=False)[source]

Pivot a synapse ROI counts table (one row per body).

Given a table of synapse ROI distributions as returned by fetch_neurons(), pivot the ROIs into the columns so the result has one row per body.

Parameters:
  • syndist – DataFrame in the format returned by fetch_neurons()[1]

  • rois – Optionally filter the input table to process only the listed ROIs.

  • syn_columns – Optionally process only the given columns of syndist.

  • flatten_column_index – By default, the result columns will use a MultiIndex (orig_col, roi), e.g. ('pre', 'LO(R)'). If flatten_column_index=True, then the output column index is flattened to a plain index with names like LO(R)-pre.

Returns:

DataFrame indexed by bodyId and with column count C * R, where C is the number of original columns (not counting bodId and roi), and R is the number of unique rois in the input.

Example

In [1]: from neuprint import Client, fetch_neurons, syndist_matrix
   ...: c = Client('neuprint.janelia.org', 'hemibrain:v1.2.1')
   ...: bodies = [786989471, 925548084, 1102514975, 1129042596, 1292847181, 5813080979]
   ...: neurons, syndist = fetch_neurons(bodies)
   ...: syndist_matrix(syndist, ['EB', 'FB', 'PB'])
Out[1]:
            pre           post
roi          EB   FB  PB    EB    FB   PB
bodyId
786989471     0  110  11     0  1598  157
925548084     0  542   0     0   977    0
1102514975    0  236   0     1  1338    0
1129042596    0  139   0     0  1827    0
1292847181  916    0   0  1558     0    0
5813080979  439    0   0   748     0  451
neuprint.wrangle.bilateral_syndist(syndist, bodies=None, rois=None, syn_columns=['pre', 'post'])[source]

Aggregate synapse counts for corresponding left and right ROIs.

Given a synapse distribution table as returned by fetch_neurons() (in its second return value), group corresponding contralateral ROIs (suffixed with (L) and (R)) and aggregate their synapse counts into total ‘bilateral’ counts with the suffix (LR).

ROIs without a suffix (L)/(R) will be returned in the output unchanged.

Parameters:
  • syndist – DataFrame in the format returned by fetch_neurons()[1]

  • bodies – Optionally filter the input table to include only the listed body IDs.

  • rois – Optionally filter the input table to process only the listed ROIs.

  • syn_columns – The names of the statistic columns in the input to process. Others are ignored.

Returns:

DataFrame, similar to the input table but with left/right ROIs aggregated and named with a (LR) suffix.

Example

In [1]: from neuprint import Client, fetch_neurons, bilateral_syndist
   ...: c = Client('neuprint.janelia.org', 'hemibrain:v1.2.1')
   ...: bodies = [786989471, 925548084, 1102514975, 1129042596, 1292847181, 5813080979]
   ...: neurons, syndist = fetch_neurons(bodies)
   ...: bilateral_syndist(syndist, rois=c.primary_rois)
Out[1]:
        bodyId      roi  pre  post
0    786989471  CRE(LR)   77    75
3    786989471       FB  110  1598
1    786989471  LAL(LR)    2     2
14   786989471       PB   11   157
2    925548084  CRE(LR)    1   203
22   925548084       FB  542   977
3    925548084  SMP(LR)    1   171
4   1102514975  CRE(LR)    2   190
35  1102514975       EB    0     1
37  1102514975       FB  236  1338
5   1102514975  ICL(LR)    0     1
6   1102514975  LAL(LR)    0     3
7   1102514975  SMP(LR)    0    74
8   1102514975  b'L(LR)    0     4
55  1129042596       FB  139  1827
9   1129042596  ICL(LR)    0     2
10  1292847181   BU(LR)    5   143
67  1292847181       EB  916  1558
11  1292847181  LAL(LR)    0     1
77  5813080979       EB  439   748
82  5813080979       NO  105   451
86  5813080979       PB    0   451
neuprint.wrangle.assign_sides_in_groups(neurons, syndist, primary_rois=None, min_pre=50, min_post=100, min_bias=0.7)[source]

Determine which side (left or right) each neuron belongs to, according to a few heuristics.

Assigns a column named ‘consensusSide’ to the given neurons table. The consensusSide is only assigned for neurons with an assigned group, and only if every neuron in the group can be assigned a side using the same heuristic.

The neurons are processed in groups (according to the group column). Multiple heuristics are tried:

  • If all neurons in the group have a valid somaSide, then that’s used.

  • Otherwise, if all neurons in the group have an instance ending with _L or _R, then that is used.

  • Otherwise, we inspect the pre- and post-synapse counts in ROIs which end with (L) or (R):

    • If all neurons in the group have significantly more post-synapses on one side, then the balance post-synapse is used to assign the neuron side.

    • Otherwise, if all neurons in the group have significantly more pre-synapses on one side, then that’s used.

    • But we do not use either heuristic if there is any disagreement on the relative lateral direction in which the neurons in the group project. If some seem to project contralaterally and others seem to project ipsilaterally, we do not assign a consensusSide to any neurons in the group.

Parameters:
  • neurons – As produced by fetch_neurons()

  • syndist – As produced by fetch_neurons()

  • primary_rois – To avoid double-counting synapses in overlapping ROIs, it is best to restrict the syndist table to non-overlapping ROIs only (e.g. primary ROIs). Provide the list of such ROIs here, or pre-filter the input yourself.

  • min_pre – When determining a neuron’s side via synapse counts, don’t analyze pre-synapses in neurons with fewer than min_pre pre-synapses.

  • min_post – When determining a neuron’s side via synapse counts, don’t analyze post-synapses in neurons with fewer than min_post post-synapses.

  • min_bias – When determining a neuron’s side via synapse counts, don’t assign a consensusSide unless each neuron in the group has a significant fraction of its lateral synapses on either the left or right, as specified in this argument. By default, only assign a consensusSide if 70% of post-synapses are on one side, or 70% of pre-synapses are on one side (not counting synapses in medial ROIs).

Returns:

DataFrame, indexed by bodyId, with column consensusSide (all values L, R, or None) and various auxiliary columns which indicate how the consensus was determined.