Data Architecture¶
Documentation covering data architecture for CytoTable.
Sources¶
Data sources for CytoTable are measurement data created from other cell biology image analysis tools.
See below for a brief overview of these sources and data types.
CellProfiler generates CSV and SQLite databases data.
Cytominer-database generates SQLite databases (which includes table data based on CellProfiler CSV’s mentioned above).
DeepProfiler generates NPZ data.
Data structure¶
Compartments and Images¶
Data are organized into tables of generally two categories:
Images: Image tables generally include data about the image itself, including metadata.
Compartments: Compartment data tables (such as Cytoplasm, Cells, Nuclei, Actin, or others) includes measurements specific to an aspect or part of cells found within an image.
Identifying or Key Fields¶
Identifying or key fields for image and compartment tables may include the following:
TableNumber: Provides a unique number based on the file referenced to build CytoTable output to help distinguish from repeated values in ImageNumber, ObjectNumber or other metadata columns which are referenced. Typically useful when using multiple SQLite or CSV-based source datasets.
ImageNumber: Provides specificity on what image is being referenced (there may be many).
ObjectNumber: Provides specificity for a specific compartment object within an ImageNumber.
Parent_Cells: Provides a related Cell compartment ObjectNumber. This field is canonically referenced from the Cytoplasm compartment for joining Cytoplasm and Cell compartment data. (see Cytoplasm Compartment Data Relationships below for greater detail)
Parent_Nuclei: Provides a related Nuclei compartment ObjectNumber. This field is canonically referenced from the Cytoplasm compartment for joining Cytoplasm and Cell compartment data. (see Cytoplasm Compartment Data Relationships below for greater detail)
Relationships¶
Image Data Relationships¶
erDiagram Image ||--o{ Cytoplasm : includes Image ||--o{ Cells : includes Image ||--o{ Nuclei : includes Image ||--o{ Others : includes
The above diagram shows an example of image data relationships found within the data that is used by CytoTable. Namely: Each image may include zero or many compartment objects (Cytoplasm, Cells, Nuclei, etc)objects.
Cytoplasm Compartment Data Relationships¶
erDiagram Cytoplasm { integer ObjectNumber "Cytoplasm object number within image" integer Parent_Cells "Related Cells compartment object number" integer Parent_Nuclei "Related Nuclei compartment object number" } Cells { integer ObjectNumber "Cells object number within image" } Nuclei { integer ObjectNumber "Nuclei object number within image" } Cells ||--|| Cytoplasm : related-to Nuclei ||--|| Cytoplasm : related-to
The above diagram shows canonical relationships of the Cytoplasm compartment data to other compartments found within the data that is used by CytoTable. Each Cytoplasm object is related to Cells via the Parent_Cells field and Nuclei via the Parent_Nuclei field. These Parent_* fields are ObjectNumbers in their respective compartments.