Class: Genome
A contig collection with a completeness score greater than 90% and a contamination score less than 5%.
URI: cdm:Genome
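The definition above can be read as a predicate over the `checkm2_completeness` and `checkm2_contamination` slots. The following is a minimal sketch, not part of the CDM schema: `is_genome` is a hypothetical helper, and it assumes that both scores are percentages on a 0 to 100 scale and that a record with a missing score does not qualify.

```python
from typing import Optional

COMPLETENESS_MIN = 90.0   # percent, taken from the class description
CONTAMINATION_MAX = 5.0   # percent, taken from the class description


def is_genome(completeness: Optional[float], contamination: Optional[float]) -> bool:
    """Return True when both scores meet the Genome thresholds.

    Assumption: scores are percentages (0-100). Both slots are optional
    (cardinality 0..1), so a missing value is treated as "not a Genome".
    """
    if completeness is None or contamination is None:
        return False
    # Both comparisons are strict, matching "greater than" and "less than".
    return completeness > COMPLETENESS_MIN and contamination < CONTAMINATION_MAX
```

For example, `is_genome(95.2, 1.3)` is true, while a collection sitting exactly on a threshold, such as `is_genome(90.0, 1.0)`, is not classified as a Genome under this strict reading.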
classDiagram
class Genome
click Genome href "../Genome"
ContigCollection <|-- Genome
click ContigCollection href "../ContigCollection"
Genome : asm_score
Genome : checkm2_completeness
Genome : checkm2_contamination
Genome : contig_bp
Genome : contig_collection_id
Genome : contig_collection_type
Genome --> "0..1" ContigCollectionType : contig_collection_type
click ContigCollectionType href "../ContigCollectionType"
Genome : ctg_L50
Genome : ctg_L90
Genome : ctg_logsum
Genome : ctg_max
Genome : ctg_N50
Genome : ctg_N90
Genome : ctg_powsum
Genome : gap_pct
Genome : gc_avg
Genome : gc_std
Genome : hash
Genome : identifiers
Genome --> "*" Identifier : identifiers
click Identifier href "../Identifier"
Genome : n_contigs
Genome : n_scaffolds
Genome : names
Genome --> "*" Name : names
click Name href "../Name"
Genome : scaf_bp
Genome : scaf_L50
Genome : scaf_L90
Genome : scaf_l_gt50k
Genome : scaf_logsum
Genome : scaf_max
Genome : scaf_N50
Genome : scaf_N90
Genome : scaf_n_gt50K
Genome : scaf_pct_gt50K
Genome : scaf_powsum
Inheritance
- ContigCollection [HasNames, HasIdentifiers, HasHash]
    - Genome
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
asm_score | 0..1 Float | A composite score for comparing contig collection quality | ContigCollection |
checkm2_completeness | 0..1 Float | Estimate of the completeness of a contig collection (MAG or genome), estimate... | ContigCollection |
checkm2_contamination | 0..1 Float | Estimate of the contamination of a contig collection (MAG or genome), estimat... | ContigCollection |
contig_collection_id | 1 UUID | Internal (CDM) unique identifier | ContigCollection |
contig_bp | 0..1 Integer | Total size in bp of all contigs | ContigCollection |
contig_collection_type | 0..1 ContigCollectionType | The type of contig collection | ContigCollection |
ctg_L50 | 0..1 Integer | Given a set of contigs, the L50 is defined as the sequence length of the shor... | ContigCollection |
ctg_L90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | ContigCollection |
ctg_N50 | 0..1 Integer | Given a set of contigs, each with its own length, the N50 count is defined as... | ContigCollection |
ctg_N90 | 0..1 Integer | Given a set of contigs, each with its own length, the N90 count is defined as... | ContigCollection |
ctg_logsum | 0..1 Float | The sum of the (length*log(length)) of all contigs, times some constant | ContigCollection |
ctg_max | 0..1 Integer | Maximum contig length | ContigCollection |
ctg_powsum | 0..1 Float | Powersum of all contigs is the same as logsum except that it uses the sum of ... | ContigCollection |
gap_pct | 0..1 Float | The gap size percentage of all scaffolds | ContigCollection |
gc_avg | 0..1 Float | The average GC content of the contig collection, expressed as a percentage | ContigCollection |
gc_std | 0..1 Float | The standard deviation of GC content across the contig collection | ContigCollection |
n_contigs | 0..1 Integer | Total number of contigs | ContigCollection |
n_scaffolds | 0..1 Integer | Total number of scaffolds | ContigCollection |
scaf_L50 | 0..1 Integer | Given a set of scaffolds, the L50 is defined as the sequence length of the sh... | ContigCollection |
scaf_L90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | ContigCollection |
scaf_N50 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N50 count is defined ... | ContigCollection |
scaf_N90 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N90 count is defined ... | ContigCollection |
scaf_bp | 0..1 Integer | Total size in bp of all scaffolds | ContigCollection |
scaf_l_gt50k | 0..1 Integer | The total length of scaffolds longer than 50,000 base pairs | ContigCollection |
scaf_logsum | 0..1 Float | The sum of the (length*log(length)) of all scaffolds, times some constant | ContigCollection |
scaf_max | 0..1 Integer | Maximum scaffold length | ContigCollection |
scaf_n_gt50K | 0..1 Integer | The number of scaffolds longer than 50,000 base pairs | ContigCollection |
scaf_pct_gt50K | 0..1 Float | The percentage of the total assembly length represented by scaffolds longer t... | ContigCollection |
scaf_powsum | 0..1 Float | Powersum of all scaffolds is the same as logsum except that it uses the sum o... | ContigCollection |
names | * Name | Names, alternative names, and synonyms for an entity | HasNames |
identifiers | * Identifier | URIs or CURIEs used to refer to this entity | HasIdentifiers |
hash | 0..1 String | A hash value generated from one or more object attributes that serves to ensu... | HasHash |
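In this schema the L statistics (`ctg_L50`, `scaf_L90`, and so on) are sequence lengths and the N statistics are sequence counts; some assembly tools and papers use the opposite naming, so it is worth checking which convention a data source follows. The sketch below illustrates the slot definitions as given in the LinkML source on this page. The function names are hypothetical, and treating the "total assembly length" in `scaf_pct_gt50K` as the sum of all scaffold lengths is an assumption.

```python
from typing import Iterable, Tuple


def length_and_count_at(lengths: Iterable[int], fraction: float) -> Tuple[int, int]:
    """Return (L, N) at the given fraction, using this schema's naming.

    L: length of the shortest sequence in the smallest set of longest
       sequences whose combined length reaches `fraction` of the total.
    N: the number of sequences in that set.
    """
    ordered = sorted(lengths, reverse=True)  # longest first
    if not ordered:
        raise ValueError("at least one sequence length is required")
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    # Fallback in case floating-point rounding pushes the target past the total.
    return ordered[-1], len(ordered)


def gt50k_stats(scaffold_lengths: Iterable[int]) -> Tuple[int, int, float]:
    """Return (count, total length, percent of assembly) for scaffolds > 50,000 bp.

    Assumption: the percentage is relative to the sum of all scaffold lengths.
    """
    all_lengths = list(scaffold_lengths)
    long_lengths = [n for n in all_lengths if n > 50_000]
    total = sum(all_lengths)
    pct = 100.0 * sum(long_lengths) / total if total else 0.0
    return len(long_lengths), sum(long_lengths), pct


# Example: seven contigs totalling 300 bp.
contigs = [80, 70, 50, 40, 30, 20, 10]
ctg_L50, ctg_N50 = length_and_count_at(contigs, 0.5)  # length 70, count 2
ctg_L90, ctg_N90 = length_and_count_at(contigs, 0.9)  # length 30, count 5
```

The example also shows the relationship stated in the L90 description: the L90 length (30) is less than or equal to the L50 length (70).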
Identifier and Mapping Information
Schema Source
- from schema: https://github.com/kbase/cdm-schema
Mappings
Mapping Type | Mapped Value |
---|---|
self | cdm:Genome |
native | cdm:Genome |
LinkML Source
Direct
name: Genome
description: A contig collection with a completeness score of greater than 90% and
  a contamination score of less than 5%.
from_schema: https://github.com/kbase/cdm-schema
is_a: ContigCollection
Induced
name: Genome
description: A contig collection with a completeness score of greater than 90% and
  a contamination score of less than 5%.
from_schema: https://github.com/kbase/cdm-schema
is_a: ContigCollection
attributes:
  asm_score:
    name: asm_score
    description: A composite score for comparing contig collection quality
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: asm_score
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  checkm2_completeness:
    name: checkm2_completeness
    description: Estimate of the completeness of a contig collection (MAG or genome),
      estimated by CheckM2 tool
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: checkm2_completeness
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  checkm2_contamination:
    name: checkm2_contamination
    description: Estimate of the contamination of a contig collection (MAG or genome),
      estimated by CheckM2 tool
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: checkm2_contamination
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  contig_collection_id:
    name: contig_collection_id
    description: Internal (CDM) unique identifier.
    from_schema: https://github.com/kbase/cdm-schema
    identifier: true
    alias: contig_collection_id
    owner: Genome
    domain_of:
    - Contig_X_ContigCollection
    - ContigCollection_X_Feature
    - ContigCollection_X_Protein
    - ContigCollection
    range: UUID
    required: true
  contig_bp:
    name: contig_bp
    description: Total size in bp of all contigs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: contig_bp
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  contig_collection_type:
    name: contig_collection_type
    description: The type of contig collection.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: contig_collection_type
    owner: Genome
    domain_of:
    - ContigCollection
    range: ContigCollectionType
  ctg_L50:
    name: ctg_L50
    description: Given a set of contigs, the L50 is defined as the sequence length
      of the shortest contig at 50% of the total contig collection length
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_L50
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  ctg_L90:
    name: ctg_L90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all contigs of that length or longer
      contains at least 90% of the sum of the lengths of all contigs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_L90
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  ctg_N50:
    name: ctg_N50
    description: Given a set of contigs, each with its own length, the N50 count is
      defined as the smallest number_of_contigs whose length sum makes up half of
      contig collection size
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_N50
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  ctg_N90:
    name: ctg_N90
    description: Given a set of contigs, each with its own length, the N90 count is
      defined as the smallest number of contigs whose length sum makes up 90% of contig
      collection size
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_N90
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  ctg_logsum:
    name: ctg_logsum
    description: The sum of the (length*log(length)) of all contigs, times some constant.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_logsum
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  ctg_max:
    name: ctg_max
    description: Maximum contig length
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_max
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  ctg_powsum:
    name: ctg_powsum
    description: Powersum of all contigs is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25)
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_powsum
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  gap_pct:
    name: gap_pct
    description: The gap size percentage of all scaffolds
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: gap_pct
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  gc_avg:
    name: gc_avg
    description: The average GC content of the contig collection, expressed as a percentage
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: gc_avg
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  gc_std:
    name: gc_std
    description: The standard deviation of GC content across the contig collection
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: gc_std
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  n_contigs:
    name: n_contigs
    description: Total number of contigs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: n_contigs
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  n_scaffolds:
    name: n_scaffolds
    description: Total number of scaffolds
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: n_scaffolds
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_L50:
    name: scaf_L50
    description: Given a set of scaffolds, the L50 is defined as the sequence length
      of the shortest scaffold at 50% of the total contig collection length
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_L50
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_L90:
    name: scaf_L90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all scaffolds of that length or longer
      contains at least 90% of the sum of the lengths of all scaffolds.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_L90
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_N50:
    name: scaf_N50
    description: Given a set of scaffolds, each with its own length, the N50 count
      is defined as the smallest number of scaffolds whose length sum makes up half
      of contig collection size
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_N50
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_N90:
    name: scaf_N90
    description: Given a set of scaffolds, each with its own length, the N90 count
      is defined as the smallest number of scaffolds whose length sum makes up 90%
      of contig collection size
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_N90
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_bp:
    name: scaf_bp
    description: Total size in bp of all scaffolds
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_bp
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_l_gt50k:
    name: scaf_l_gt50k
    description: The total length of scaffolds longer than 50,000 base pairs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_l_gt50k
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_logsum:
    name: scaf_logsum
    description: The sum of the (length*log(length)) of all scaffolds, times some
      constant. Increase the contiguity, the score will increase
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_logsum
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  scaf_max:
    name: scaf_max
    description: Maximum scaffold length
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_max
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_n_gt50K:
    name: scaf_n_gt50K
    description: The number of scaffolds longer than 50,000 base pairs.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_n_gt50K
    owner: Genome
    domain_of:
    - ContigCollection
    range: integer
  scaf_pct_gt50K:
    name: scaf_pct_gt50K
    description: The percentage of the total assembly length represented by scaffolds
      longer than 50,000 base pairs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_pct_gt50K
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  scaf_powsum:
    name: scaf_powsum
    description: Powersum of all scaffolds is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25).
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_powsum
    owner: Genome
    domain_of:
    - ContigCollection
    range: float
  names:
    name: names
    description: Names, alternative names, and synonyms for an entity.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: names
    owner: Genome
    domain_of:
    - HasNames
    range: Name
    multivalued: true
  identifiers:
    name: identifiers
    description: URIs or CURIEs used to refer to this entity.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: identifiers
    owner: Genome
    domain_of:
    - HasIdentifiers
    range: Identifier
    multivalued: true
  hash:
    name: hash
    description: A hash value generated from one or more object attributes that serves
      to ensure the entity is unique.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: hash
    owner: Genome
    domain_of:
    - HasHash
    range: string
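
The `logsum` and `powsum` descriptions above give the shape of the formulas but leave the constant and the logarithm base unspecified. The following is a minimal sketch under stated assumptions: the natural logarithm is used, and the constant is exposed as a hypothetical `scale` parameter defaulting to 1.0. Values produced by a particular assembly statistics tool may therefore differ by a constant factor.

```python
import math
from typing import Iterable


def logsum(lengths: Iterable[int], scale: float = 1.0) -> float:
    """scale * sum(length * log(length)) over all sequences.

    Assumptions: natural log; `scale` stands in for the unspecified constant.
    Lengths must be positive integers.
    """
    return scale * sum(n * math.log(n) for n in lengths)


def powsum(lengths: Iterable[int], power: float = 0.25, scale: float = 1.0) -> float:
    """scale * sum(length * length**power), with the documented default P=0.25."""
    return scale * sum(n * n ** power for n in lengths)


# The same 200 bp split into two pieces scores lower than one contiguous piece.
fragmented = logsum([100, 100])
contiguous = logsum([200])
```

Because each length is weighted by a growing function of itself, merging sequences raises both scores, which matches the note in the `scaf_logsum` description that the score increases with contiguity.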