kbase_cdm

Schema for KBase CDM

URI: https://github.com/kbase/cdm-schema

Name: kbase_cdm

Classes

Class Description
Any
Association An association between an object--typically an entity such as a protein or a ...
AssociationXEntity Links associations to entities
AssociationXPublication Links associations to supporting literature
AttributeValue The value of an attribute for a sample
        ControlledTermValue A controlled term or class from an ontology
                ControlledIdentifiedTermValue A controlled term or class from an ontology, requiring the presence of term w...
        Geolocation A normalized value for a location on the earth's surface
        QuantityValue A simple quantity, e.g. ...
        TextValue A basic string value
        TimestampValue A value that is a timestamp
Cluster Represents an individual execution of a clustering protocol
ClusterXProtein Relationship representing membership of a cluster
Contig A contig (derived from the word "contiguous") is a set of DNA segments or seq...
ContigXContigCollection Captures the relationship between a contig and a contig collection; equivalen...
ContigXFeature Captures the relationship between a contig and a feature; equivalent to featu...
ContigCollection A set of individual, overlapping contigs that represent the complete sequence...
        Genome A contig collection with a completeness score of greater than 90% and a conta...
ContigCollectionXFeature Captures the relationship between a contig collection and a feature; equivale...
ContigCollectionXProtein Captures the relationship between a contig collection and a protein; equivale...
Contributor Represents a contributor to the resource
ContributorXExperiment
ContributorXOrganization Captures the organization(s) to which a contributor belongs
ContributorXProject
DataSource The source dataset from which data within the CDM was extracted
EncodedFeature An entity generated from a feature, such as a transcript
EncodedFeatureXFeature Captures the relationship between a feature and its transcription product
EntailedEdge A relation graph edge that is inferred
Entity A database entity
Event Something that happened
Experiment A discrete scientific procedure undertaken to make a discovery, test a hypoth...
Feature A feature localized to an interval along a contig
FeatureXProtein Captures the relationship between a feature and a protein; equivalent to feat...
FeatureAttributes Additional attributes of a feature, parsed from column 9 of a GFF file
GoldEnvironmentalContext Environmental context, described using JGI's five level system
HasHash
HasIdentifiers Adds a multivalued 'identifiers' field to an object
HasNames Adds a multivalued 'names' field to an object
Identifier An external identifier for an entity
Location
Measurement A qualitative or quantitative observation of an attribute of an object or eve...
        ProcessedMeasurement A measurement that requires additional processing to generate a result
MeasurementSet A series of qualitative or quantitative measurements
MixsEnvironmentalContext Environmental context, described using the MiXS convention of broad and local...
Name The name or label for an entity
Organization
Prefix Maps CURIEs to URIs
Project Administrative unit for collecting data related to a certain topic, location,...
Protein Proteins are large, complex molecules made up of one or more long, folded cha...
Protocol Defined method or set of methods
ProtocolParticipant Either an input or an output of a protocol
Publication A publication (e.g. ...
Sample A material entity that can be characterised by an experiment
Statements Represents an RDF triple
Thing A thing in the schema
        NamedThingWithIDs
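
Several classes above follow an "AXB" naming pattern (for example ContigXContigCollection) and are described as capturing the relationship between two other classes. The following is a minimal sketch of how such a link class could be used, not the schema's actual definition: it assumes each link record carries only the two identifier slots `contig_id` and `contig_collection_id`, and the helper `contigs_in_collection` and all sample IDs are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contig:
    contig_id: str  # internal (CDM) unique identifier for a contig
    length: int     # length of the contig in bp


@dataclass(frozen=True)
class ContigXContigCollection:
    # Assumed shape of a link record: one row per (contig, contig collection) pair.
    contig_id: str
    contig_collection_id: str


def contigs_in_collection(collection_id, contigs, links):
    """Return the contigs linked to the given contig collection (hypothetical helper)."""
    wanted = {
        link.contig_id
        for link in links
        if link.contig_collection_id == collection_id
    }
    return [contig for contig in contigs if contig.contig_id in wanted]


# Made-up sample data for illustration only.
contigs = [Contig("c1", 1200), Contig("c2", 800), Contig("c3", 500)]
links = [
    ContigXContigCollection("c1", "cc1"),
    ContigXContigCollection("c2", "cc1"),
    ContigXContigCollection("c3", "cc2"),
]
members = contigs_in_collection("cc1", contigs, links)
```

Keeping the relationship in its own record, rather than as a list on either side, lets one contig belong to several collections without duplicating the contig itself.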

Slots

Slot Description
affiliations List of organizations with which the contributor is affiliated
aggregator_knowledge_source The knowledge source that aggregated the association
altitude_units
altitude_value
annotation_date The date when the annotation was made
asm_score A composite score for comparing contig collection quality
association_id Unique identifier for an association
attribute_name The name of the attribute
attribute_value The value of the attribute
base The base URI a prefix will expand to
cds_phase For features of type CDS, the phase indicates where the next codon begins rel...
checkm2_completeness Estimate of the completeness of a contig collection (MAG or genome), estimate...
checkm2_contamination Estimate of the contamination of a contig collection (MAG or genome), estimat...
cluster_id Internal (CDM) unique identifier
comments Any comments about the association
conditions TBD
contig_bp Total size in bp of all contigs
contig_collection_id Internal (CDM) unique identifier for a contig collection
contig_collection_type The type of contig collection
contig_id Internal (CDM) unique identifier for a contig
contributor_id Internal (CDM) unique identifier for a contributor
contributor_roles List of roles played by the contributor when working on the resource
contributor_type Must be either 'Person' or 'Organization'
created POSIX timestamp for when the entity was created or added to the CDM
created_at The time at which the event started or was created
ctg_L50 Given a set of contigs, the L50 is defined as the sequence length of the shor...
ctg_L90 The L90 statistic is less than or equal to the L50 statistic; it is the lengt...
ctg_logsum The sum of the (length*log(length)) of all contigs, times some constant
ctg_max Maximum contig length
ctg_N50 Given a set of contigs, each with its own length, the N50 count is defined as...
ctg_N90 Given a set of contigs, each with its own length, the N90 count is defined as...
ctg_powsum Powersum of all contigs is the same as logsum except that it uses the sum of ...
data_source How this entity was derived and added to the database
datatype The RDF datatype of the value, for example, xsd:string
date_accessed The date when the data was downloaded from the data source
depth_units
depth_value
description Definition or description
doi The DOI for a protocol
e_value The 'score' of the feature
ecosystem JGI GOLD descriptor representing the top level ecosystem categorization
ecosystem_category JGI GOLD descriptor representing the ecosystem category
ecosystem_subtype JGI GOLD descriptor representing the subtype of ecosystem
ecosystem_type JGI GOLD descriptor representing the ecosystem type
elevation_units
elevation_value
encoded_by The feature(s) that encode this protein
encoded_feature_id Internal (CDM) unique identifier
end The start and end coordinates of the feature are given in positive 1-based in...
entity_id Internal (CDM) unique identifier for an entity in the CDM
entity_type The class of the entity
env_broad_scale Report the major environmental system the sample or specimen came from
env_local_scale Report the entity or entities which are in the sample or specimen's local vic...
env_medium Report the environmental material(s) immediately surrounding the sample or sp...
event_id Internal (CDM) unique identifier
evidence_for_existence The evidence that this protein exists
evidence_type The type of evidence supporting the association
experiment_id Internal (CDM) unique identifier for an experiment
family_name The family name(s) of the contributor
feature_id Internal (CDM) unique identifier for a feature
gap_pct The gap size percentage of all scaffolds
gc_avg The average GC content of the contig collection, expressed as a percentage
gc_content GC content of the contig, expressed as a percentage
gc_std The standard deviation of GC content across the contig collection
generated_by The algorithm or procedure that generated the feature
given_name The given name(s) of the contributor
gold_environmental_context The environmental context for this event
has_participant Participants in an experiment
has_stop_codon Captures whether or not the sequence includes a stop codon
hash A hash value generated from one or more object attributes that serves to ensu...
id An identifier for an element
identifier Fully qualified URL or CURIE used as an identifier for an entity
identifiers URIs or CURIEs used to refer to this entity
inputs The inputs for a protocol; may be software parameters, experimental reagents,...
language The human language in which the value is encoded, e.g. ...
latitude
length Length of the contig in bp
location The location for this event
longitude
maximum_numeric_value
measurement_id Internal (CDM) unique identifier
minimum_numeric_value
mixs_environmental_context The environmental context for this event
n_contigs Total number of contigs
n_scaffolds Total number of scaffolds
name A string used as a name or title
names Names, alternative names, and synonyms for an entity
negated If true, the relationship between the subject and object is negated
numeric_value
object Note the range of this slot is always a node
organization_id Internal (CDM) unique identifier
outputs The outputs of a protocol; may be physical entities, files, etc
p_value The 'score' of the feature
part_of The project to which this experiment belongs
person_id Internal (CDM) unique identifier for a contributor
predicate The predicate of the statement
prefix A standardized prefix such as 'GO' or 'rdf' or 'FlyBase'
primary_knowledge_source The knowledge source that created the association
project_id Internal (CDM) unique identifier for a project
protein_id Internal (CDM) unique identifier for a protein within a cluster
protocol_id Protocol used to generate the cluster
protocol_participant_id Internal (CDM) unique identifier
publication_id Unique identifier for a publication
publications Publications that support the association
quality The quality of the measurement, indicating the confidence that one can have i...
raw_value
sample_id Internal (CDM) unique identifier
scaf_bp Total size in bp of all scaffolds
scaf_L50 Given a set of scaffolds, the L50 is defined as the sequence length of the sh...
scaf_L90 The L90 statistic is less than or equal to the L50 statistic; it is the lengt...
scaf_l_gt50k The total length of scaffolds longer than 50,000 base pairs
scaf_logsum The sum of the (length*log(length)) of all scaffolds, times some constant
scaf_max Maximum scaffold length
scaf_N50 Given a set of scaffolds, each with its own length, the N50 count is defined ...
scaf_N90 Given a set of scaffolds, each with its own length, the N90 count is defined ...
scaf_n_gt50K The number of scaffolds longer than 50,000 base pairs
scaf_pct_gt50K The percentage of the total assembly length represented by scaffolds longer t...
scaf_powsum Powersum of all scaffolds is the same as logsum except that it uses the sum o...
score Output from clustering protocol
sequence The protein amino acid sequence
source The name of the data source
source_database ID of the data source from which this entity came
specific_ecosystem JGI GOLD descriptor representing the most specific level of ecosystem categor...
start The start and end coordinates of the feature are given in positive 1-based in...
strand The strand of the feature
subject The subject of the statement
supporting_objects Objects that support the association
term
type
unit
updated POSIX timestamp for when the entity was updated in the CDM
url The URL from which the data was loaded
value Note the range of this slot is always a string
version For versioned data sources, the version of the dataset
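
Several of the slots above are assembly statistics (gc_content, the ctg_/scaf_ L50, N50, L90 and N90 slots, and the scaffold slots for the 50,000 bp cutoff). The sketch below shows one way such values could be computed from a list of sequence lengths. It is an illustration only: the function names are hypothetical, and because the slot descriptions are truncated, the mapping of the returned length to the L50/L90 slots and of the returned count to the N50/N90 slots is an assumption based on the visible wording ("sequence length" versus "count").

```python
def gc_content(sequence: str) -> float:
    """GC content of a DNA sequence, expressed as a percentage."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def coverage_stats(lengths: list[int], fraction: float = 0.5) -> tuple[int, int]:
    """Walk sequences from longest to shortest until `fraction` of the total
    length is covered. Returns (length of the shortest sequence used,
    number of sequences used)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    target = fraction * sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return length, count
    raise AssertionError("unreachable: the full set always covers the target")


def long_sequence_stats(lengths: list[int], threshold: int = 50_000):
    """Total length, count, and percentage of total length for sequences
    strictly longer than `threshold` bp."""
    long_lengths = [n for n in lengths if n > threshold]
    total = sum(lengths)
    pct = 100.0 * sum(long_lengths) / total if total else 0.0
    return sum(long_lengths), len(long_lengths), pct


# Made-up contig lengths for illustration only.
lengths = [100, 80, 60, 40, 20]
half = coverage_stats(lengths, 0.5)    # assumed to map to (ctg_L50, ctg_N50)
ninety = coverage_stats(lengths, 0.9)  # assumed to map to (ctg_L90, ctg_N90)
```

With the sample lengths, half of the 300 bp total is reached after the two longest contigs, so the 50% call returns a length of 80 and a count of 2.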

Enumerations

Enumeration Description
CdsPhaseType For features of type CDS (coding sequence), the phase indicates where the fea...
ContigCollectionType The type of the contig set; the type of the 'omics data set
ContributorRole The role of a contributor to a resource
ContributorType The type of contributor being represented
EntityType The type of an entity
ProteinEvidenceForExistence The evidence for the existence of a biological entity
RefSeqStatusType RefSeq status codes, taken from https://www
StrandType The strand that a feature appears on relative to a landmark
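
The permissible values of these enumerations are not listed on this page, with one exception: the contributor_type slot description states that the value must be either 'Person' or 'Organization'. As a hedged sketch of how that constraint could be enforced in client code, the following uses Python's standard `enum` module; the member names and the `parse_contributor_type` helper are hypothetical and are not taken from the schema.

```python
from enum import Enum


class ContributorType(Enum):
    # Values taken from the contributor_type slot description;
    # the member names are an assumption.
    PERSON = "Person"
    ORGANIZATION = "Organization"


def parse_contributor_type(raw: str) -> ContributorType:
    """Convert a raw string to a ContributorType, rejecting anything else."""
    try:
        return ContributorType(raw)
    except ValueError:
        allowed = ", ".join(repr(member.value) for member in ContributorType)
        raise ValueError(
            f"contributor_type must be one of {allowed}; got {raw!r}"
        ) from None


parsed = parse_contributor_type("Person")
```

The lookup is case-sensitive, so a value such as "person" is rejected rather than silently normalized.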

Types

Type Description
Boolean A binary (true or false) value
Curie A compact URI
Date A date (year, month and day) in an idealized calendar
DateOrDatetime Either a date or a datetime
Datetime The combination of a date and time
Decimal A real number with arbitrary precision that conforms to the xsd:decimal speci...
Double A real number that conforms to the xsd:double specification
Float A real number that conforms to the xsd:float specification
Integer An integer
Jsonpath A string encoding a JSON Path
Jsonpointer A string encoding a JSON Pointer
LiteralAsStringType
Ncname Prefix part of CURIE
NodeIdType IDs are either CURIEs, IRI, or blank nodes
Nodeidentifier A URI, CURIE or BNODE that represents a node in a model
Objectidentifier A URI or CURIE that represents an object in the model
Sparqlpath A string encoding a SPARQL Property Path
String A character string
Time A time object represents a (local) time of day, independent of any particular...
Uri A complete URI
Uriorcurie A URI or a CURIE
UUID A universally unique ID in the KBase CDM namespace, generated using u...
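
The Curie, Uri and Uriorcurie types relate to the Prefix class ("Maps CURIEs to URIs") and its `prefix` and `base` slots. The sketch below illustrates the usual expansion of a CURIE into a full URI by replacing the prefix with its base. It is a minimal illustration under stated assumptions: `expand_curie` is a hypothetical helper, the prefix map is a plain dictionary rather than a set of Prefix records, and the GO entry is an example mapping rather than something taken from this schema.

```python
def expand_curie(curie: str, prefix_map: dict[str, str]) -> str:
    """Expand a CURIE such as 'GO:0008150' to a full URI.

    `prefix_map` maps each prefix (e.g. 'GO') to the base URI it expands to.
    """
    prefix, separator, local_id = curie.partition(":")
    if not separator:
        raise ValueError(f"not a CURIE (no ':' found): {curie!r}")
    if prefix not in prefix_map:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return prefix_map[prefix] + local_id


# Example mapping for illustration only.
prefix_map = {"GO": "http://purl.obolibrary.org/obo/GO_"}
uri = expand_curie("GO:0008150", prefix_map)
```

Only the first colon is treated as the separator, so local identifiers that themselves contain a colon are passed through unchanged.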

Subsets

Subset Description