Annotation Tables

Version update

We have released a new public version 1507, as part of our quarterly release schedule. See details at Release Manifests: 1507.

Tutorials remain pinned to v1412 but will be updated in the coming weeks.

The minnie65_public data release includes a number of annotation tables that help label the dataset. This section describes the content of each of these tables; see here for instructions on how to query and filter tables.

Unless otherwise specified (i.e. via desired_resolution), all positions are in voxel units at a resolution of 4, 4, 40 nm/voxel (x, y, z).
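
As a small illustration of this unit convention, the sketch below converts a position in voxel coordinates at 4, 4, 40 nm/voxel into nanometers and micrometers. The function names and the example coordinate are made up for illustration and are not part of any library.

```python
# Voxel size of the default coordinate system, in nm per voxel (x, y, z).
VOXEL_RESOLUTION_NM = (4, 4, 40)


def voxels_to_nm(position, resolution=VOXEL_RESOLUTION_NM):
    """Convert an (x, y, z) position from voxel units to nanometers."""
    return tuple(p * r for p, r in zip(position, resolution))


def voxels_to_um(position, resolution=VOXEL_RESOLUTION_NM):
    """Convert an (x, y, z) position from voxel units to micrometers."""
    return tuple(nm / 1000 for nm in voxels_to_nm(position, resolution))


# Hypothetical position, used only to show the arithmetic.
example_position = (100_000, 50_000, 20_000)
position_nm = voxels_to_nm(example_position)
position_um = voxels_to_um(example_position)
print(position_nm)  # (400000, 200000, 800000)
print(position_um)  # (400.0, 200.0, 800.0)
```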

Common Fields

Several fields (or column names) are common to many tables. These fall into two main classes: spatial point columns, which assign annotations to cells via points in 3d space, and book-keeping columns, which are used internally to track the state of the data.

Spatial Point Columns

Most tables have one or more Bound Spatial Points. A bound spatial point is a location in 3d space that tells the annotation to remain associated with the root id at that location.

Bound spatial points will have one prefix, usually pt (i.e. “point”), and three associated columns with different suffixes: _position, _supervoxel_id, and _root_id.

For a given prefix {pt}, the three columns are as follows:

The {pt}_position indicates the location of the point in 3d space.
The {pt}_supervoxel_id indicates a unique identifier in the segmentation, and is mostly internal bookkeeping.
The {pt}_root_id indicates the root id of the annotation at that location.
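
The naming rule can be written down as a short helper. This is only a sketch of the convention described here; `bound_point_columns` is a hypothetical name, not a function provided by any client library.

```python
def bound_point_columns(prefix="pt"):
    """Return the three column names that make up one bound spatial point."""
    suffixes = ("position", "supervoxel_id", "root_id")
    return [f"{prefix}_{suffix}" for suffix in suffixes]


default_columns = bound_point_columns()
presynaptic_columns = bound_point_columns("pre_pt")
print(default_columns)      # ['pt_position', 'pt_supervoxel_id', 'pt_root_id']
print(presynaptic_columns)  # ['pre_pt_position', 'pre_pt_supervoxel_id', 'pre_pt_root_id']
```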

Book-keeping Columns

Several columns are common to many or all tables, and mostly used as internal book-keeping. Rather than describe these for every table, they will just be mentioned briefly here:

Common columns
Column	Description
`id`	A unique ID specific to the annotation within that table
`created`	The time at which the annotation was created
`valid`	Internal bookkeeping column, should always be `t` for data you can download
`target_id`	Some tables reference other tables, particularly the nucleus table. If present, this column will be the same as `id`
`created_ref` / `valid_ref` / `id_ref` (optional)	For reference tables, the data shows both the created/valid/id of the reference annotation and the target annotation. The values with the `_ref` suffix are those of the reference table (usually something like proofreading state or cell type) and the values without a suffix are those of the target table (usually a nucleus)

Synapse Table

Table name: synapses_pni_2

The only synapse table is synapses_pni_2. This is by far the largest table in the dataset with 337 million entries, one for each synapse. It contains the following columns:

Synapse table column definitions
Column	Description
`pre_pt_position` / `pre_pt_supervoxel_id` / `pre_pt_root_id`	The bound spatial point data for the presynaptic side of the synapse.
`post_pt_position` / `post_pt_supervoxel_id` / `post_pt_root_id`	The bound spatial point data for the postsynaptic side of the synapse.
`size`	The size of the synapse in voxels. This correlates well, but not perfectly, with the surface area of the synapse.
`ctr_pt_position`	A position in the center of the detected synaptic junction. Of all points in the synapse table, this is usually the closest point to the surface (and thus mesh) of both neurons. Because it is at the edge of cells, it is not associated with a root id.

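# Assumes a client created as in the query instructions, e.g.:
#   from caveclient import CAVEclient
#   client = CAVEclient('minnie65_public')
# and that example_root_id holds the root id of a cell of interest.
# Synapse query: outputs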
client.materialize.synapse_query(pre_ids=example_root_id)

# Synapse query: inputs
client.materialize.synapse_query(post_ids=example_root_id)
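
Once synapse rows are in hand, a common next step is to summarize them by partner. The sketch below does this on a few made-up rows that reuse the column names from the table above; the root ids, sizes, and the `summarize_outputs` helper are all hypothetical.

```python
from collections import defaultdict

# Made-up rows with the same column names as the synapse table.
synapses = [
    {"id": 1, "pre_pt_root_id": 111, "post_pt_root_id": 222, "size": 1200},
    {"id": 2, "pre_pt_root_id": 111, "post_pt_root_id": 222, "size": 800},
    {"id": 3, "pre_pt_root_id": 111, "post_pt_root_id": 333, "size": 500},
    {"id": 4, "pre_pt_root_id": 444, "post_pt_root_id": 111, "size": 900},
]


def summarize_outputs(rows, root_id):
    """Count synapses and sum their sizes for each postsynaptic partner of root_id."""
    summary = defaultdict(lambda: {"n_syn": 0, "sum_size": 0})
    for row in rows:
        if row["pre_pt_root_id"] == root_id:
            partner = summary[row["post_pt_root_id"]]
            partner["n_syn"] += 1
            partner["sum_size"] += row["size"]
    return dict(summary)


outputs = summarize_outputs(synapses, root_id=111)
print(outputs)  # {222: {'n_syn': 2, 'sum_size': 2000}, 333: {'n_syn': 1, 'sum_size': 500}}
```

The synapse queries return pandas DataFrames, so in practice the same grouping could likely be written with `groupby` on `post_pt_root_id`.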

Nucleus tables

The ‘nucleus centroid’ of a cell is unlikely to change with proofreading, and so is a useful static identifier for a given cell. The results of automatic nucleus segmentation and neuron detection are available in the following tables. These tables are often the ‘reference’ table for other annotations.

Nucleus Detection Table

Table name: nucleus_detection_v0

Nucleus detection has been used to define unique cells in the dataset. Distinct from the neuronal segmentation, a convolutional neural network was trained to segment nuclei. Each nucleus detection was given a unique ID, and the centroid of the nucleus was recorded as well as its volume. Many other tables in the dataset are reference tables on nucleus_detection_v0, meaning they are linked by the same annotation id. The id of the segmented nucleus, a 6-digit integer, is static across data versions and for this reason is the preferred method to identify the same ‘cell’ across time.

The key columns of nucleus_detection_v0 are:

Nucleus table column definitions
Column	Description
`id`	6-digit number of the segmentation for that nucleus; ‘nucleus ID’
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the nucleus

Note that the id column is the nucleus ID, also called the ‘soma ID’ or the ‘cell ID’.

# Standard query
client.materialize.query_table('nucleus_detection_v0')

# Content-aware query
client.materialize.tables.nucleus_detection_v0(id=example_nucleus_id).query()
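
Because the nucleus ID is static while the root id can change with proofreading, the nucleus ID is a convenient key for comparing data versions. The sketch below assumes two lookups from nucleus ID to `pt_root_id` have already been built from queries at two versions; all IDs shown are made up.

```python
# Hypothetical nucleus ID -> pt_root_id lookups from two data versions.
roots_old = {
    100001: 864691000000000001,
    100002: 864691000000000002,
    100003: 864691000000000003,
}
roots_new = {
    100001: 864691000000000001,
    100002: 864691000000000009,
    100003: 864691000000000003,
}


def changed_root_ids(old, new):
    """Return {nucleus_id: (old_root, new_root)} for cells whose root id differs."""
    return {
        nuc_id: (old[nuc_id], new[nuc_id])
        for nuc_id in old.keys() & new.keys()
        if old[nuc_id] != new[nuc_id]
    }


changed = changed_root_ids(roots_old, roots_new)
print(changed)  # {100002: (864691000000000002, 864691000000000009)}
```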

Nucleus brain area assignment

Table name: nucleus_functional_area_assignment

Given the nucleus detection table nucleus_detection_v0 and the transformation of 2-photon in vivo imaging to the EM structural space (see Functional Coregistration Tables), each cell in the volume has been assigned to one of four visual cortical areas.

The functional brain area is inferred from the position of the nucleus in the EM volume. Area boundaries were estimated from the area-membership assignments of the 2P recorded cells, after transformation to EM space.

The table nucleus_functional_area_assignment is a reference table on nucleus_detection_v0 and adds the following columns:

Functional area assignment
Column	Description
`target_id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the nucleus
`tag`	the brain area label (one of V1, AL, RL, LM)
`value`	the distance to the area boundary (in um), calculated as the mean-distance of the 10 nearest neighbors in a non-matching brain area. A larger value is further from the area boundaries, and can be interpreted as higher confidence in area assignment.

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

# Standard query
client.materialize.query_table('nucleus_functional_area_assignment')

# Content-aware query
client.materialize.tables.nucleus_functional_area_assignment(tag='V1').query()
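
To make the `value` column more concrete, the sketch below computes the mean distance from one cell to its k nearest neighbors that carry a different area label. It follows the description in the table only; the details of the actual calculation (for example, which coordinates were used) are not given here, so the function and toy data are purely illustrative.

```python
import math


def boundary_distance(cell_position, cell_area, others, k=10):
    """Mean distance from a cell to its k nearest neighbors in a different area.

    `others` is an iterable of (position, area) pairs; positions are (x, y, z)
    tuples in micrometers.
    """
    distances = sorted(
        math.dist(cell_position, position)
        for position, area in others
        if area != cell_area
    )
    nearest = distances[:k]
    if not nearest:
        raise ValueError("no cells found in a non-matching area")
    return sum(nearest) / len(nearest)


# Toy data: a V1 cell at the origin and three RL cells along the x-axis.
neighbors = [
    ((10.0, 0.0, 0.0), "RL"),
    ((20.0, 0.0, 0.0), "RL"),
    ((60.0, 0.0, 0.0), "RL"),
    ((1.0, 0.0, 0.0), "V1"),  # same area, ignored
]
value = boundary_distance((0.0, 0.0, 0.0), "V1", neighbors, k=2)
print(value)  # 15.0
```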

Neuron-Nucleus Table

Important

This table is superseded by other cell typing below, but remains here for convenient reference as it is relevant to other tables such as aibs_metamodel_celltypes_v661.

Table name: nucleus_ref_neuron_svm

While the table of centroids for all nuclei is nucleus_detection_v0, this includes neuronal nuclei, non-neuronal nuclei, and some erroneous detections. The table nucleus_ref_neuron_svm shows the results of a classifier that was trained to distinguish neuronal nuclei from non-neuronal nuclei and errors. For the purposes of analysis, we recommend using the nucleus_ref_neuron_svm table to get the broadest collection of neurons in the dataset.

The key columns of nucleus_ref_neuron_svm are:

Nucleus table column definitions
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the nucleus
`classification-system`	Describes how the classification was done. All values will be `is_neuron` for this table
`cell_type`	The output of the classifier. All values will be either `neuron` or `not-neuron` (glia or error) for this table

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

# Standard query
client.materialize.query_table('nucleus_ref_neuron_svm')

# Content-aware query
client.materialize.tables.nucleus_ref_neuron_svm(target_id=example_nucleus_id).query()

Cell Type Tables

There are several tables that contain information about the cell type of neurons in the dataset, with each table representing a different method of doing the classification. Because each method requires a different kind of information, not all cells are present in all tables. Each of the cell type tables has the same format, and in all cases the id column references the nucleus id of the cell in question.
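
Since every cell type table is keyed by nucleus id, labels from different tables can be lined up per cell, with gaps where a cell is absent from a table. The sketch below assumes each table has already been reduced to a mapping from nucleus ID to `cell_type`; the table names are real but the contents are made up.

```python
# Made-up labels keyed by nucleus ID; table names are real, contents are not.
tables = {
    "allen_v1_column_types_slanted_ref": {100001: "23P", 100002: "BC"},
    "aibs_metamodel_celltypes_v661": {100001: "23P", 100002: "BC", 100003: "4P"},
    "aibs_metamodel_mtypes_v661_v2": {100001: "L2a", 100003: "L4a"},
}


def collect_labels(tables):
    """Build {nucleus_id: {table_name: cell_type or None}} across all tables."""
    all_ids = sorted(set().union(*(labels.keys() for labels in tables.values())))
    return {
        nuc_id: {name: labels.get(nuc_id) for name, labels in tables.items()}
        for nuc_id in all_ids
    }


labels_by_cell = collect_labels(tables)
for nuc_id, labels in labels_by_cell.items():
    print(nuc_id, labels)
```

A value of `None` marks a cell that is missing from that table, which is expected given that each method was applied to a different subset of cells.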

Manual Cell Types (V1 Column)

Table name: allen_v1_column_types_slanted_ref

A subset of nucleus detections in a 100 um column (n=2204) in VISp were manually classified by anatomists at the Allen Institute into categories of cell subclasses, first distinguishing cells into classes of non-neuronal, excitatory and inhibitory; then into subclasses.

For the non-neuronal subclasses, see aibs_column_nonneuronal_ref

The key columns are:

AIBS Manual Cell Types, V1 Column
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the cell nucleus
`classification-system`	One of `aibs_coarse_excitatory` or `aibs_coarse_inhibitory` for detected neurons, or `aibs_coarse_nonneuronal` for non-neurons (glia/pericytes).
`cell_type`	One of several cell types, detailed below

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

The cell types in the table are:

Manual Cell Types (neurons)

AIBS Manual Cell Type definitions (neurons)
Cell Type	Subclass	Description
`23P`	Excitatory	Layer 2/3 cells
`4P`	Excitatory	Layer 4 cells
`5P-IT`	Excitatory	Layer 5 intratelencephalic cells
`5P-ET`	Excitatory	Layer 5 extratelencephalic cells
`5P-NP`	Excitatory	Layer 5 near-projecting cells
`6P-IT`	Excitatory	Layer 6 intratelencephalic cells
`6P-CT`	Excitatory	Layer 6 corticothalamic cells
`BC`	Inhibitory	Basket cell
`BPC`	Inhibitory	Bipolar cell. In practice, this was used for all cells thought to be VIP cells, not only those with a bipolar dendrite
`MC`	Inhibitory	Martinotti cell. In practice, this label was used for all inhibitory neurons that appeared to be Somatostatin cells, not only those with a Martinotti cell morphology
`Unsure`	Inhibitory	Unsure. In practice, this label also is used for all likely-inhibitory neurons that did not match other types

# Standard query
client.materialize.query_table('allen_v1_column_types_slanted_ref')

# Content-aware query
client.materialize.tables.allen_v1_column_types_slanted_ref(id=example_nucleus_id).query()

Manual Cell Types (non-neurons)

AIBS Manual Cell Type definitions (non-neurons)
Cell Type	Subclass	Description
`OPC`	Non-neuronal	Oligodendrocyte precursor cell
`astrocyte`	Non-neuronal	Astrocyte
`microglia`	Non-neuronal	Microglia
`pericyte`	Non-neuronal	Pericyte
`oligo`	Non-neuronal	Oligodendrocyte

# Standard query
client.materialize.query_table('aibs_column_nonneuronal_ref')

# Content-aware query
client.materialize.tables.aibs_column_nonneuronal_ref(id=example_nucleus_id).query()

Predictions from soma/nucleus features

Table name: aibs_metamodel_celltypes_v661

This table contains the results of a hierarchical classifier trained on features of the cell body and nucleus of cells. This was applied to most cells in the dataset that had complete cell bodies (e.g. not cut off by the edge of the data), detected by nucleus_detection_v0, with small-objects and multi-soma errors removed. The model was run with cell-based features as of version 661 of the dataset. For more details, see (Elabbady et al. 2025). In general, this does a good job, but sometimes misclassifies layer 5 inhibitory neurons as excitatory.

The key columns are:

AIBS Soma Nuc Metamodel Table
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the cell nucleus
`classification-system`	One of `excitatory_neuron` or `inhibitory_neuron` for detected neurons, or `nonneuron` for non-neurons (glia/pericytes).
`cell_type`	One of several cell types, detailed below

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

The cell types in the table are:

Soma Nuc Metamodel Cell types

AIBS Soma Nuc Metamodel: Cell Type definitions
Cell Type	Subclass	Description
`23P`	Excitatory	Layer 2/3 cells
`4P`	Excitatory	Layer 4 cells
`5P-IT`	Excitatory	Layer 5 intratelencephalic cells
`5P-ET`	Excitatory	Layer 5 extratelencephalic cells
`5P-NP`	Excitatory	Layer 5 near-projecting cells
`6P-IT`	Excitatory	Layer 6 intratelencephalic cells
`6P-CT`	Excitatory	Layer 6 corticothalamic cells
`BC`	Inhibitory	Basket cell
`BPC`	Inhibitory	Bipolar cell. In practice, this was used for all cells thought to be VIP cells, not only those with a bipolar dendrite
`MC`	Inhibitory	Martinotti cell. In practice, this label was used for all inhibitory neurons that appeared to be Somatostatin cells, not only those with a Martinotti cell morphology
`NGC`	Inhibitory	Neurogliaform cell. In practice, this label also is used for all inhibitory neurons in layer 1, many of which may not be neurogliaform cells although they might be in the same molecular family
`OPC`	Non-neuronal	Oligodendrocyte precursor cell
`astrocyte`	Non-neuronal	Astrocyte
`microglia`	Non-neuronal	Microglia
`pericyte`	Non-neuronal	Pericyte
`oligo`	Non-neuronal	Oligodendrocyte

# Standard query
client.materialize.query_table('aibs_metamodel_celltypes_v661')

# Content-aware query
client.materialize.tables.aibs_metamodel_celltypes_v661(id=example_nucleus_id).query()

Previous versions of this table include: aibs_soma_nuc_metamodel_preds_v117 (run on a subset of data, the V1 column) and aibs_soma_nuc_exc_mtype_preds_v117 (using training data labeled by another classifier: see mtypes below).

Coarse prediction from spine detection

Table name: baylor_log_reg_cell_type_coarse_v1

This table contains the results of a logistic regression classifier trained on properties of neuronal dendrites. This was applied to many cells in the dataset, but required more data than soma and nucleus features alone, so fewer cells completed the pipeline. It has very good performance on excitatory vs inhibitory neurons because it focuses on dendritic spines, a characteristic property of excitatory neurons. It is a good table to double check E/I classifications if in doubt.

For details, see (Celii et al. 2025).

The key columns are:

Baylor Coarse Cell Type Table
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the cell nucleus
`classification-system`	`baylor_log_reg_cell_type_coarse` for all entries
`cell_type`	`excitatory` or `inhibitory`

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

# Standard query
client.materialize.query_table('baylor_log_reg_cell_type_coarse_v1')

# Content-aware query
client.materialize.tables.baylor_log_reg_cell_type_coarse_v1(id=example_nucleus_id).query()
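
One way to use this table as a double check is to compare its labels against the excitatory/inhibitory split from the soma/nucleus model, cell by cell. The sketch below assumes both tables have been reduced to mappings keyed by nucleus ID (the `classification-system` value for the soma/nucleus model, the `cell_type` value for this table); the IDs and labels are made up.

```python
# Map the soma/nucleus model's classification-system values onto the
# coarse labels used by the dendrite-based classifier.
SOMA_TO_COARSE = {
    "excitatory_neuron": "excitatory",
    "inhibitory_neuron": "inhibitory",
}


def find_ei_disagreements(soma_labels, coarse_labels):
    """Return nucleus IDs present in both tables whose E/I labels differ.

    Cells labeled as non-neurons by the soma/nucleus model are skipped.
    """
    disagreements = {}
    for nuc_id in sorted(soma_labels.keys() & coarse_labels.keys()):
        soma_ei = SOMA_TO_COARSE.get(soma_labels[nuc_id])
        if soma_ei is not None and soma_ei != coarse_labels[nuc_id]:
            disagreements[nuc_id] = (soma_ei, coarse_labels[nuc_id])
    return disagreements


# Made-up labels keyed by nucleus ID.
soma_labels = {
    100001: "excitatory_neuron",
    100002: "excitatory_neuron",
    100003: "inhibitory_neuron",
    100004: "nonneuron",
}
coarse_labels = {100001: "excitatory", 100002: "inhibitory", 100003: "inhibitory"}

disagreements = find_ei_disagreements(soma_labels, coarse_labels)
print(disagreements)  # {100002: ('excitatory', 'inhibitory')}
```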

Fine prediction from dendritic features

Table name: aibs_metamodel_mtypes_v661_v2

Excitatory neurons and inhibitory neurons were distinguished with the soma-nucleus model (Elabbady et al. 2025), and subclasses were assigned based on a data-driven clustering of the neuronal features. Inhibitory neurons were classified based on how they distributed their synaptic outputs onto target cells, while excitatory neurons were classified based on a collection of dendritic features.

This was applied to most detected neurons in the dataset that had complete cell bodies (e.g. not cut off by the edge of the data), detected by nucleus_detection_v0, with small-objects and multi-soma errors removed. The model was run with cell-based features as of version 661 of the dataset. For more details, see (Schneider-Mizell et al. 2025).

Note that all cell-type labels in this table come from a clustering specific to this paper, and while they are intended to align with the broader literature they are not a direct mapping or a well-established convention.

For a more conventional set of labels on the same set of cells, look at the manual table allen_v1_column_types_slanted_ref. Cell types in that table align with those in the aibs_metamodel_celltypes_v661.

The key columns are:

Allen Motif-type (mtype) Table
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the cell nucleus
`classification-system`	`excitatory` or `inhibitory`
`cell_type`	One of several cell types, detailed below

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

The cell types in the table are:

M-type Cell Type definitions

Cell Type	Subclass	Description
`L2a`	Excitatory	A cluster of layer 2 (upper layer 2/3) excitatory neurons
`L2b`	Excitatory	A cluster of layer 2 (upper layer 2/3) excitatory neurons
`L3a`	Excitatory	A cluster of excitatory neurons transitioning between upper and lower layer 2/3
`L3b`	Excitatory	A cluster of layer 3 (lower layer 2/3) excitatory neurons
`L3c`	Excitatory	A cluster of layer 3 (lower layer 2/3) excitatory neurons
`L4a`	Excitatory	The largest cluster of layer 4 excitatory neurons
`L4b`	Excitatory	Another cluster of layer 4 excitatory neurons
`L4c`	Excitatory	A cluster of layer 4 excitatory neurons along the border with layer 5
`L5a`	Excitatory	A cluster of layer 5 IT neurons at the top of layer 5
`L5b`	Excitatory	A cluster of layer 5 IT neurons throughout layer 5
`L5ET`	Excitatory	The cluster of layer 5 ET neurons
`L5NP`	Excitatory	The cluster of layer 5 NP neurons
`L6a`	Excitatory	A cluster of layer 6 IT neurons at the top of layer 6
`L6b`	Excitatory	A cluster of layer 6 IT neurons throughout layer 6. Note that this is different than the label “Layer 6b” which refers to a narrow band at the border between layer 6 and white matter
`L6c`	Excitatory	A cluster of tall layer 6 cells (unsure if IT or CT)
`L6CT`	Excitatory	A cluster of tall layer 6 cells matching manual CT labels
`L6wm`	Excitatory	A cluster of layer 6 cells along the border with white matter
`PTC`	Inhibitory	Perisomatic targeting cells, a cluster of inhibitory neurons that target the soma and proximal dendrites of excitatory neurons. Approximately corresponds to basket cell
`DTC`	Inhibitory	Dendrite targeting cells, a cluster of inhibitory neurons that target the distal dendrites of excitatory neurons. Most SST cells would be DTC
`STC`	Inhibitory	Sparsely targeting cells, a cluster of inhibitory neurons that don’t concentrate multiple synapses onto the same target neurons. Many neurogliaform cells and layer 1 interneurons fall into this category
`ITC`	Inhibitory	Inhibitory targeting cells, a cluster of inhibitory neurons that preferentially target other inhibitory neurons. Most VIP cells would be ITCs

# Standard query
client.materialize.query_table('aibs_metamodel_mtypes_v661_v2')

# Content-aware query
client.materialize.tables.aibs_metamodel_mtypes_v661_v2(id=example_nucleus_id).query()

Previous versions of this table include: aibs_soma_nuc_exc_mtype_preds_v117 (run at v117), allen_column_mtypes_v1 and allen_column_mtypes_v2 (run on a subset of data, the V1 column)

Proofreading Tables

Proofreading Status and Strategy

Table name: proofreading_status_and_strategy

The table proofreading_status_and_strategy describes the status of cells that have undergone manual proofreading.

Because of the inherent difference in the difficulty and time required for different kinds of proofreading, we describe the status of axons and dendrites separately.

Each compartment status may be either:

FALSE: indicates no comprehensive proofreading has been performed, or is not applicable.
TRUE: indicates that false merges have been comprehensively removed, and the compartment is at least ‘clean’. Consult the strategy column if completeness of the compartment is relevant to your research.

An axon or dendrite labeled as status=TRUE can be trusted to be correct, but may not be complete. The degree of completion can be read from the strategy column. For more information, please see Proofreading and Data Quality, or the microns-explorer page on proofreading strategies.

The key columns are:

Proofreading Status Table
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the cell nucleus
`valid_id`	The root id of the neuron when the proofreading assessment was made. NOTE: if this does not match the `pt_root_id` then the cell has undergone further changes. This is usually an improvement in proofreading, but proceed with caution.
`status_dendrite`	The status of the dendrite proofreading. May be `TRUE` or `FALSE`
`status_axon`	The status of the axon proofreading. May be `TRUE` or `FALSE`
`strategy_dendrite`	The strategy employed to proofread the dendrite. See strategy table below for details
`strategy_axon`	The strategy employed to proofread the axon. See strategy table below for details

The specific strategies are as follows (and will update over time):

Proofreading Strategies

Proofreading Strategy Table
Strategy	Description
`none`	No cleaning, and no extension. Indicates an entry in `proofreading_status` that is `FALSE` for that compartment
`dendrite_clean`	The dendrite had incorrectly-merged axon and dendritic segments comprehensively removed, meaning the input synapses are accurate. The dendrite may be incorrectly truncated by segmentation error. Not all dendrite tips have been checked for extension. No comprehensive attempt was made to re-attach spine heads.
`dendrite_extended`	The dendrite had incorrectly-merged axon and dendritic segments comprehensively removed, meaning the input synapses are accurate. Every tip was identified, manually inspected, and extended if possible. No comprehensive attempt was made to re-attach spine heads.
`axon_column_truncated`	The axon was extended within the V1 cortical column, with a preference for local connections. In some cases the axon was cut at the column boundary and/or the layer boundary, especially the boundary between layers 2/3 and layer 4. Output synapses represent a sampling of potential partners.
`axon_interareal`	The axon was extended with a preference for branches that projected to other brain areas. Some axon branches were fully extended, but local connections may be incomplete. Output synapses represent a sampling of potential partners.
`axon_partially_extended`	The axon was extended outward from the soma, following each branch to its termination. Output synapses represent a sampling of potential partners.
`axon_fully_extended`	Axon was extended outward from the soma, following each branch to its termination. After initial extension, every endpoint was identified, manually inspected, and extended again if possible. Output synapses represent a largely complete sampling of partners.

This table, proofreading_status_and_strategy, supersedes proofreading_status_public_release.

# Standard query
client.materialize.query_table('proofreading_status_and_strategy')

# Content-aware query
client.materialize.tables.proofreading_status_and_strategy(status_axon='t').query()
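
The status, strategy, and `valid_id` columns are often used together: pick cells whose axon meets a required strategy, then separate those whose root id has changed since the assessment. The sketch below works on made-up rows that reuse the column names above. Status values are written as Python booleans, which is an assumption about how the column is loaded, and `select_axons` is a hypothetical helper.

```python
# Made-up rows using the column names of proofreading_status_and_strategy.
rows = [
    {"id": 100001, "pt_root_id": 11, "valid_id": 11,
     "status_axon": True, "strategy_axon": "axon_fully_extended"},
    {"id": 100002, "pt_root_id": 22, "valid_id": 21,
     "status_axon": True, "strategy_axon": "axon_fully_extended"},
    {"id": 100003, "pt_root_id": 33, "valid_id": 33,
     "status_axon": True, "strategy_axon": "axon_partially_extended"},
    {"id": 100004, "pt_root_id": 44, "valid_id": 44,
     "status_axon": False, "strategy_axon": "none"},
]


def select_axons(rows, strategies=("axon_fully_extended",)):
    """Split proofread axons with an accepted strategy by whether valid_id still matches."""
    current, changed = [], []
    for row in rows:
        if row["status_axon"] and row["strategy_axon"] in strategies:
            if row["valid_id"] == row["pt_root_id"]:
                current.append(row["id"])
            else:
                changed.append(row["id"])
    return current, changed


current_ids, changed_ids = select_axons(rows)
print(current_ids)  # [100001]
print(changed_ids)  # [100002]
```

Cells in the second list have been edited since the assessment and may deserve a closer look before use.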

Proofreading status at public release

Important

This table is out-of-date, and remains here for convenient reference

Table name: proofreading_status_public_release

The table proofreading_status_public_release describes the status of cells selected for manual proofreading.

Because of the inherent difference in the challenge and time required for different kinds of proofreading, we describe the status of axons and dendrites separately. Further, we distinguish three different categories of proofreading:

non: No comprehensive proofreading has been performed.
clean: Proofreading has comprehensively removed false merges, but not necessarily added missing parts.
extended: Proofreading has comprehensively removed false merges and attempted to add all or most missing parts.

The key columns are:

Proofreading Status Table
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the cell nucleus
`valid_id`	The root id of the neuron when the proofreading assessment was made
`status_dendrite`	The status of the dendrite proofreading. One of the three categories described above
`status_axon`	The status of the axon proofreading. One of the three categories described above

Deprecated table

This table is deprecated, use proofreading_status_and_strategy instead.

For archival analysis, use:

# Set version to 1078 or before
client.version = 1078

# Standard query
client.materialize.query_table('proofreading_status_public_release')

# Content-aware query
client.materialize.tables.proofreading_status_public_release(status_axon='extended').query()

Functional Coregistration Tables

To relate the structural data to functional data, cell bodies must be coregistered between the functional imaging and EM volumes. As with cell-typing, there are manual and automated methods for doing this, with the former having higher accuracy but lower throughput.

coregistration_manual_v4 : The results of manually verified coregistration. This table is well-verified, but contains fewer ROIs (N=15,352 root ids, 19,181 ROIs).
coregistration_auto_phase3_fwd_apl_vess_combined_v2 : The results of automated functional matching between the EM and 2-p functional data. This table is not manually verified, but contains more ROIs (N=35,466 root ids, N=83,046 ROIs).

Manual coregistration

Table name: coregistration_manual_v4

A table of EM nucleus centroids manually matched to Baylor functional units. A unique functional unit is identified by its session, scan_idx and unit_id. An EM nucleus centroid may have matched to more than one functional unit if it was scanned on more than one imaging field.

The key columns are:

Coregistration table
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the cell nucleus
`session`	The session index from functional imaging
`scan_idx`	The scan index from functional imaging
`unit_id`	The functional unit index from imaging. Only unique within scan and session
`field`	The field index from functional imaging
`residual`	The residual distance between the functional and the assigned structural points after transformation, in microns
`score`	A separation score, measuring the difference between the residual distance to the assigned neuron and the distance to the nearest non-assigned neuron, in microns. This can be negative if the non-assigned neuron is closer than the assigned neuron. Larger values indicate fewer nearby neurons that could be confused with the assigned neuron.

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

Warning

This table includes duplicate entries for the same pt_root_id and nucleus id if the coregistered cell has multiple unit recordings.

# Standard query
client.materialize.query_table('coregistration_manual_v4')

# Content-aware query
client.materialize.tables.coregistration_manual_v4(id=example_nucleus_id).query()
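
Because one nucleus can match several functional units, it helps to group matches per nucleus and identify each unit by its (session, scan_idx, unit_id) triple. The sketch below also filters on `residual` and `score`; the thresholds and rows are illustrative only and are not recommended values.

```python
from collections import defaultdict

# Made-up rows using the column names of the coregistration tables.
matches = [
    {"id": 100001, "session": 4, "scan_idx": 7, "unit_id": 12, "residual": 3.1, "score": 8.0},
    {"id": 100001, "session": 5, "scan_idx": 3, "unit_id": 40, "residual": 4.5, "score": 6.2},
    {"id": 100002, "session": 4, "scan_idx": 7, "unit_id": 13, "residual": 12.0, "score": -1.5},
]


def units_by_nucleus(rows, max_residual, min_score):
    """Group units passing both thresholds, keyed by (session, scan_idx, unit_id), per nucleus."""
    grouped = defaultdict(list)
    for row in rows:
        if row["residual"] <= max_residual and row["score"] >= min_score:
            unit_key = (row["session"], row["scan_idx"], row["unit_id"])
            grouped[row["id"]].append(unit_key)
    return dict(grouped)


# Thresholds are hypothetical, chosen only to demonstrate the filter.
confident = units_by_nucleus(matches, max_residual=10.0, min_score=0.0)
print(confident)  # {100001: [(4, 7, 12), (5, 3, 40)]}
```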

This table coregistration_manual_v4 supersedes previous iterations of this table:

coregistration_manual_v3
coregistration_manual

Automated coregistration

Table name: coregistration_auto_phase3_fwd_apl_vess_combined_v2

A table of EM nucleus centroids automatically matched to Baylor functional units. This table reconciles the following two tables, which each make a best match for the registration using different techniques: coregistration_auto_phase3_fwd_v2 and apl_functional_coreg_vess_fwd. A unique functional unit is identified by its session, scan_idx and unit_id. An EM nucleus centroid may have matched to more than one functional unit if it was scanned on more than one imaging field.

The key columns are:

Coregistration table
Column	Description
`id`	Soma ID for the cell
`pt_position` `pt_supervoxel_id` `pt_root_id`	Bound spatial point columns associated with the centroid of the cell nucleus
`session`	The session index from functional imaging
`scan_idx`	The scan index from functional imaging
`unit_id`	The functional unit index from imaging. Only unique within scan and session
`field`	The field index from functional imaging
`residual`	The residual distance between the functional and the assigned structural points after transformation, in microns
`score`	A separation score, measuring the difference between the residual distance to the assigned neuron and the distance to the nearest non-assigned neuron, in microns. This can be negative if the non-assigned neuron is closer than the assigned neuron. Larger values indicate fewer nearby neurons that could be confused with the assigned neuron.

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

Warning

This table includes duplicate entries for the same pt_root_id and nucleus id if the coregistered cell has multiple unit recordings

# Standard query
client.materialize.query_table('coregistration_auto_phase3_fwd_apl_vess_combined_v2')

# Content-aware query
client.materialize.tables.coregistration_auto_phase3_fwd_apl_vess_combined_v2(id=example_nucleus_id).query()

This table coregistration_auto_phase3_fwd_apl_vess_combined_v2 supersedes previous iterations of this table:

coregistration_auto_phase3_fwd_apl_vess_combined_v2
apl_functional_coreg_forward_v5

Functional properties

Table name: digital_twin_properties_bcm_coreg_v4

A summary of the functional properties for each of the coregistered neurons using the coregistration information from: coregistration_manual_v4. For details, see (Wang et al. 2025) and (Ding et al. 2025).

The key columns are:

Functional properties of coregistered neurons
Column	Description
`cc_abs`	Test set performance of the digital twin model unit, higher is better
`cc_max`	Neuron variability score used to normalize digital twin model unit performance
`cc_norm`	Normalized model unit performance, higher is better
`OSI`	orientation selectivity index
`DSI`	direction selectivity index
`gOSI`	global orientation selectivity index
`gDSI`	global direction selectivity index
`pref_ori`	Preferred orientation in degrees (0 - 180), vertical bar moving right is 0 and orientation increases counter-clockwise
`pref_dir`	Preferred direction in degrees (0 - 360), vertical bar moving right is 0 and direction increases counter-clockwise
`readout_loc_x`	X coordinate of the readout location, an approximation of receptive field center in stimulus space; (-1, -1) bottom-left, (1, 1) top-right
`readout_loc_y`	Y coordinate of the readout location, an approximation of receptive field center in stimulus space; (-1, -1) bottom-left, (1, 1) top-right

Tip

This is a reference table on nucleus_detection_v0, and can be indexed by the same nucleus id.

Warning

This table includes duplicate entries for the same pt_root_id and nucleus id if the coregistered cell has multiple unit recordings.

# Standard query
client.materialize.query_table('digital_twin_properties_bcm_coreg_v4')

# Content-aware query
client.materialize.tables.digital_twin_properties_bcm_coreg_v4(id=example_nucleus_id).query()
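
When a cell has several unit recordings, an analysis may need a single row per nucleus. One possible choice, shown below, is to keep the row with the highest `cc_abs`. This is a sketch of that one rule rather than a recommended procedure, and the rows are made up.

```python
# Made-up rows; a nucleus ID repeats when the cell has several unit recordings.
units = [
    {"id": 100001, "cc_abs": 0.31, "pref_ori": 45.0},
    {"id": 100001, "cc_abs": 0.42, "pref_ori": 50.0},
    {"id": 100002, "cc_abs": 0.25, "pref_ori": 120.0},
]


def best_unit_per_nucleus(rows, key="cc_abs"):
    """Keep, for each nucleus ID, the single row with the largest value of `key`."""
    best = {}
    for row in rows:
        current = best.get(row["id"])
        if current is None or row[key] > current[key]:
            best[row["id"]] = row
    return best


best = best_unit_per_nucleus(units)
preferred_orientation = {nuc_id: row["pref_ori"] for nuc_id, row in best.items()}
print(preferred_orientation)  # {100001: 50.0, 100002: 120.0}
```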

Merged to Automatic Coregistration

Table name: digital_twin_properties_bcm_coreg_auto_phase3_fwd_v2

Follows the same scheme as digital_twin_properties_bcm_coreg_v4, but uses the coregistration information from: coregistration_auto_phase3_fwd_v2.

# Standard query
client.materialize.query_table('digital_twin_properties_bcm_coreg_auto_phase3_fwd_v2')

# Content-aware query
client.materialize.tables.digital_twin_properties_bcm_coreg_auto_phase3_fwd_v2(id=example_nucleus_id).query()

Table name: digital_twin_properties_bcm_coreg_apl_vess_fwd

Follows the same scheme as digital_twin_properties_bcm_coreg_v4, but uses the coregistration information from: apl_functional_coreg_vess_fwd.

# Standard query
client.materialize.query_table('digital_twin_properties_bcm_coreg_apl_vess_fwd')

# Content-aware query
client.materialize.tables.digital_twin_properties_bcm_coreg_apl_vess_fwd(id=example_nucleus_id).query()
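
Since the three digital twin tables share a scheme, they can be pulled in one loop and compared side by side. The helper below is a hypothetical convenience function, not part of CAVEclient. It assumes `client` is an already-initialized CAVEclient and relies only on the `client.materialize.query_table` call used in the standard queries; the `source_table` column is added here for illustration.

```python
import pandas as pd

# Table names as listed in this section.
DIGITAL_TWIN_TABLES = [
    "digital_twin_properties_bcm_coreg_v4",
    "digital_twin_properties_bcm_coreg_auto_phase3_fwd_v2",
    "digital_twin_properties_bcm_coreg_apl_vess_fwd",
]


def query_digital_twin_tables(client, table_names=DIGITAL_TWIN_TABLES):
    """Query each table and stack the results, tagging rows with their source.

    `client` is expected to be an initialized CAVEclient (assumption); only
    `client.materialize.query_table(name)` is called on it.
    """
    frames = []
    for name in table_names:
        df = client.materialize.query_table(name)
        # Record which coregistration table each row came from.
        frames.append(df.assign(source_table=name))
    return pd.concat(frames, ignore_index=True)
```

Calling `query_digital_twin_tables(client)` returns a single DataFrame, which can then be grouped by `source_table` to see how the same nucleus is matched under each coregistration method.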

Overview of relevant tables

heard you like tables–here’s a table for your tables
Table Name	Number of Annotations	Description
`synapses_pni_2`	337,312,429	The locations of synapses and the segment ids of their pre- and post-synaptic sides, from automated synapse detection
`nucleus_detection_v0`	144,120	The locations of nuclei detected via a fully automated method
`nucleus_alternative_points`	8,388	A reference annotation table marking alternative segment_id lookup locations for a subset of nuclei in nucleus_detection_v0 that are more accurate than the centroid locations listed there
`nucleus_ref_neuron_svm`	144,120	Reference annotation indicating the output of a model detecting which nucleus detections are neurons versus which are not
`coregistration_manual_v4`	19,181	A table indicating the association between individual units in the functional imaging data and nuclei in the structural data, derived from human-powered matching. Includes residual and separation scores to help assess confidence
`coregistration_auto_phase3_fwd_apl_vess_combined_v2`	83,046	A table indicating the association between individual units in the functional imaging data and nuclei in the structural data, derived from the automated procedure. Includes residuals and separation scores to help assess confidence
`proofreading_status_and_strategy`	2,020	A table indicating which neurons have been proofread on their axons or dendrites
`aibs_column_nonneuronal_ref`	542	Cell type reference annotations from a human expert of non-neuronal cells located amongst the Minnie Column
`allen_v1_column_types_slanted_ref`	1,357	Neuron cell type reference annotations from human experts of neuronal cells located amongst the Minnie Column
`allen_column_mtypes_v1`	1,357	Neuron cell type reference annotations from data driven unsupervised clustering of neuronal cells
`aibs_metamodel_mtypes_v661_v2`	72,158	Reference annotations indicating the output of a model predicting cell types across the dataset based on the labels from allen_column_mtypes_v1
`aibs_metamodel_celltypes_v661`	94,014	Reference annotations indicating the output of a model predicting cell classes based on the labels from allen_v1_column_types_slanted_ref and aibs_column_nonneuronal_ref
`baylor_log_reg_cell_type_coarse_v1`	55,063	Reference annotations indicating the output of a logistic regression model predicting whether the nucleus is part of an excitatory or inhibitory cell
`baylor_gnn_cell_type_fine_model_v2`	49,051	Reference annotations indicating the output of a graph neural network model predicting the cell type based on the human labels in allen_v1_column_types_slanted_ref
`vortex_astrocyte_proofreading_status`	126	This table reports the status of a manually selected subset of astrocytes within the VISP column. Astrocyte selection and proofreading were performed as part of VORTEX.
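
Many of the tables above are reference annotations on nucleus_detection_v0, so they can be combined with the nucleus table through the nucleus id. The sketch below uses toy DataFrames in place of real query results. It assumes the reference table carries the nucleus id in `target_id` and a label in a column called `cell_type`; the column names and the label values are assumptions made for illustration, so check the actual columns of the table you query.

```python
import pandas as pd

# Toy stand-ins for two query results; all values are invented.
nuclei = pd.DataFrame(
    {"id": [101, 102, 103], "pt_root_id": [9001, 9002, 9003]}
)
cell_types = pd.DataFrame(
    {"target_id": [101, 103], "cell_type": ["23P", "BC"]}  # assumed columns
)

# Left-join so nuclei without a label are kept, with NaN in `cell_type`.
labeled = nuclei.merge(
    cell_types, left_on="id", right_on="target_id", how="left"
).drop(columns="target_id")
print(labeled)
```

A left join keeps every nucleus, which makes it easy to see how much of the dataset a given reference table covers; use `how="inner"` to keep only labeled cells.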

References

Celii, Brendan, Stelios Papadopoulos, Zhuokun Ding, Paul G. Fahey, Eric Wang, Christos Papadopoulos, Alexander B. Kunin, et al. 2025. “NEURD Offers Automated Proofreading and Feature Extraction for Connectomics.” Nature 640 (8058): 487–96. https://doi.org/10.1038/s41586-025-08660-5.

Ding, Zhuokun, Paul G. Fahey, Stelios Papadopoulos, Eric Y. Wang, Brendan Celii, Christos Papadopoulos, Andersen Chang, et al. 2025. “Functional Connectomics Reveals General Wiring Rule in Mouse Visual Cortex.” Nature 640 (8058): 459–69. https://doi.org/10.1038/s41586-025-08840-3.

Elabbady, Leila, Sharmishtaa Seshamani, Shang Mu, Gayathri Mahalingam, Casey M. Schneider-Mizell, Agnes L. Bodor, J. Alexander Bae, et al. 2025. “Perisomatic Ultrastructure Efficiently Classifies Cells in Mouse Cortex.” Nature 640 (8058): 478–86. https://doi.org/10.1038/s41586-024-07765-7.

Schneider-Mizell, Casey M., Agnes L. Bodor, Derrick Brittain, JoAnn Buchanan, Daniel J. Bumbarger, Leila Elabbady, Clare Gamlin, et al. 2025. “Inhibitory Specificity from a Connectomic Census of Mouse Visual Cortex.” Nature 640 (8058): 448–58. https://doi.org/10.1038/s41586-024-07780-8.

Wang, Eric Y., Paul G. Fahey, Zhuokun Ding, Stelios Papadopoulos, Kayla Ponder, Marissa A. Weis, Andersen Chang, et al. 2025. “Foundation Model of Neural Activity Predicts Response to New Stimulus Types.” Nature 640 (8058): 470–77. https://doi.org/10.1038/s41586-025-08829-y.