Class: Genome
A contigset with a completeness score of greater than 90% and a contamination score of less than 5%.
URI: kb_cdm:Genome
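The class description can be read as a simple predicate over the two CheckM2 slots. The following is a minimal sketch, assuming both scores are stored as percentages on a 0 to 100 scale; the function and constant names are illustrative and are not part of the schema.

```python
from typing import Optional

# Thresholds taken from the class description above.
# The constant names are hypothetical, chosen for this example only.
MIN_COMPLETENESS = 90.0
MAX_CONTAMINATION = 5.0


def meets_genome_criteria(
    completeness: Optional[float], contamination: Optional[float]
) -> bool:
    """Return True if a contigset satisfies the Genome description.

    Both slots are optional (0..1) in the schema, so a missing value is
    treated here as "cannot be classified as a Genome". That choice is an
    assumption of this sketch.
    """
    if completeness is None or contamination is None:
        return False
    # Strict inequalities: "greater than 90%" and "less than 5%".
    return completeness > MIN_COMPLETENESS and contamination < MAX_CONTAMINATION
```

For example, `meets_genome_criteria(95.2, 1.3)` holds, while a contigset at exactly 90% completeness does not qualify because the comparison is strict.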
classDiagram
class Genome
click Genome href "../Genome"
Contigset <|-- Genome
click Contigset href "../Contigset"
Genome : asm_score
Genome : checkm2_completeness
Genome : checkm2_contamination
Genome : contig_bp
Genome : contigset_id
Genome : ctg_L50
Genome : ctg_L90
Genome : ctg_logsum
Genome : ctg_max
Genome : ctg_N50
Genome : ctg_N90
Genome : ctg_powsum
Genome : description
Genome : gap_pct
Genome : gc_avg
Genome : gc_std
Genome : hash
Genome : identifiers
Genome --> "*" Identifier : identifiers
click Identifier href "../Identifier"
Genome : n_contigs
Genome : n_scaffolds
Genome : names
Genome --> "*" Name : names
click Name href "../Name"
Genome : scaf_bp
Genome : scaf_L50
Genome : scaf_L90
Genome : scaf_l_gt50k
Genome : scaf_logsum
Genome : scaf_max
Genome : scaf_N50
Genome : scaf_N90
Genome : scaf_n_gt50K
Genome : scaf_pct_gt50K
Genome : scaf_powsum
Inheritance
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
asm_score | 0..1 Float | A composite score for comparing contigset quality | Contigset |
checkm2_completeness | 0..1 Float | Estimate of the completeness of a contigset (MAG or genome), estimated by Che... | Contigset |
checkm2_contamination | 0..1 Float | Estimate of the contamination of a contigset (MAG or genome), estimated by Ch... | Contigset |
contigset_id | 1 UUID | Internal (CDM) unique identifier | Contigset |
contig_bp | 0..1 Integer | Total size in bp of all contigs | Contigset |
ctg_L50 | 0..1 Integer | Given a set of contigs, the L50 is defined as the sequence length of the shor... | Contigset |
ctg_L90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | Contigset |
ctg_N50 | 0..1 Integer | Given a set of contigs, each with its own length, the N50 count is defined as... | Contigset |
ctg_N90 | 0..1 Integer | Given a set of contigs, each with its own length, the N90 count is defined as... | Contigset |
ctg_logsum | 0..1 Float | The sum of the (length*log(length)) of all contigs, times some constant | Contigset |
ctg_max | 0..1 Integer | Maximum contig length | Contigset |
ctg_powsum | 0..1 Float | Powersum of all contigs is the same as logsum except that it uses the sum of ... | Contigset |
gap_pct | 0..1 Float | The gap size percentage of all scaffolds | Contigset |
gc_avg | 0..1 Float | The average GC content of the contigset, expressed as a percentage | Contigset |
gc_std | 0..1 Float | The standard deviation of GC content across the contigset | Contigset |
n_contigs | 0..1 Integer | Total number of contigs | Contigset |
n_scaffolds | 0..1 Integer | Total number of scaffolds | Contigset |
scaf_L50 | 0..1 Integer | Given a set of scaffolds, the L50 is defined as the sequence length of the sh... | Contigset |
scaf_L90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | Contigset |
scaf_N50 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N50 count is defined ... | Contigset |
scaf_N90 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N90 count is defined ... | Contigset |
scaf_bp | 0..1 Integer | Total size in bp of all scaffolds | Contigset |
scaf_l_gt50k | 0..1 Integer | The total length of scaffolds longer than 50,000 base pairs | Contigset |
scaf_logsum | 0..1 Float | The sum of the (length*log(length)) of all scaffolds, times some constant | Contigset |
scaf_max | 0..1 Integer | Maximum scaffold length | Contigset |
scaf_n_gt50K | 0..1 Integer | The number of scaffolds longer than 50,000 base pairs | Contigset |
scaf_pct_gt50K | 0..1 Float | The percentage of the total assembly length represented by scaffolds longer t... | Contigset |
scaf_powsum | 0..1 Float | Powersum of all scaffolds is the same as logsum except that it uses the sum o... | Contigset |
hash | 0..1 String | A hash value generated from one or more object attributes that serves to ensu... | UniqueNamedThing |
identifiers | * Identifier | URIs or CURIEs used to refer to this entity | NamedThingWithId |
description | 0..1 String | Definition or description of the entity | NamedThing |
names | * Name | Names, alternative names, and synonyms for an entity | NamedThing |
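The L50/N50 and L90/N90 slots in the table above can be illustrated with a short calculation. This sketch follows the slot descriptions as written on this page, where the L statistic is a sequence length and the N statistic is a count; note that some other tools and publications use the opposite naming. The function name `length_and_count_at` is hypothetical and the input lengths are made up for the example.

```python
from typing import Iterable, Tuple


def length_and_count_at(lengths: Iterable[int], fraction: float) -> Tuple[int, int]:
    """Return (length, count) at the given fraction of the total size.

    Sequences are visited longest first. `count` is the smallest number of
    sequences whose summed length reaches `fraction` of the total, and
    `length` is the length of the shortest sequence in that group.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    # Only reachable through float rounding when fraction is close to 1.
    return ordered[-1], len(ordered)


# Illustrative contig lengths (total = 300 bp), not real data.
contig_lengths = [100, 80, 60, 40, 20]
ctg_L50, ctg_N50 = length_and_count_at(contig_lengths, 0.5)
ctg_L90, ctg_N90 = length_and_count_at(contig_lengths, 0.9)
```

With these inputs the two longest contigs (100 + 80) already cover half of the 300 bp total, so `ctg_L50` is 80 and `ctg_N50` is 2. The same function applies to scaffold lengths for the `scaf_*` slots.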
Identifier and Mapping Information
Schema Source
- from schema: https://github.com/kbase/cdm-schema
Mappings
Mapping Type | Mapped Value |
---|---|
self | kb_cdm:Genome |
native | kb_cdm:Genome |
LinkML Source
Direct
name: Genome
description: A contigset with a completeness score of greater than 90% and a contamination
score of less than 5%.
from_schema: https://github.com/kbase/cdm-schema
is_a: Contigset
Induced
name: Genome
description: A contigset with a completeness score of greater than 90% and a contamination
score of less than 5%.
from_schema: https://github.com/kbase/cdm-schema
is_a: Contigset
attributes:
asm_score:
name: asm_score
description: A composite score for comparing contigset quality
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: asm_score
owner: Genome
domain_of:
- Contigset
range: float
checkm2_completeness:
name: checkm2_completeness
description: Estimate of the completeness of a contigset (MAG or genome), estimated
by CheckM2 tool
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: checkm2_completeness
owner: Genome
domain_of:
- Contigset
range: float
checkm2_contamination:
name: checkm2_contamination
description: Estimate of the contamination of a contigset (MAG or genome), estimated
by CheckM2 tool
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: checkm2_contamination
owner: Genome
domain_of:
- Contigset
range: float
contigset_id:
name: contigset_id
description: Internal (CDM) unique identifier.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
identifier: true
alias: contigset_id
owner: Genome
domain_of:
- Contigset
range: UUID
required: true
contig_bp:
name: contig_bp
description: Total size in bp of all contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: contig_bp
owner: Genome
domain_of:
- Contigset
range: integer
ctg_L50:
name: ctg_L50
description: Given a set of contigs, the L50 is defined as the sequence length
of the shortest contig at 50% of the total contigset length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_L50
owner: Genome
domain_of:
- Contigset
range: integer
ctg_L90:
name: ctg_L90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all contigs of that length or longer
contains at least 90% of the sum of the lengths of all contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_L90
owner: Genome
domain_of:
- Contigset
range: integer
ctg_N50:
name: ctg_N50
description: Given a set of contigs, each with its own length, the N50 count is
defined as the smallest number of contigs whose length sum makes up half of
contigset size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_N50
owner: Genome
domain_of:
- Contigset
range: integer
ctg_N90:
name: ctg_N90
description: Given a set of contigs, each with its own length, the N90 count is
defined as the smallest number of contigs whose length sum makes up 90% of contigset
size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_N90
owner: Genome
domain_of:
- Contigset
range: integer
ctg_logsum:
name: ctg_logsum
description: The sum of the (length*log(length)) of all contigs, times some constant.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_logsum
owner: Genome
domain_of:
- Contigset
range: float
ctg_max:
name: ctg_max
description: Maximum contig length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_max
owner: Genome
domain_of:
- Contigset
range: integer
ctg_powsum:
name: ctg_powsum
description: Powersum of all contigs is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25)
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_powsum
owner: Genome
domain_of:
- Contigset
range: float
gap_pct:
name: gap_pct
description: The gap size percentage of all scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: gap_pct
owner: Genome
domain_of:
- Contigset
range: float
gc_avg:
name: gc_avg
description: The average GC content of the contigset, expressed as a percentage
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: gc_avg
owner: Genome
domain_of:
- Contigset
range: float
gc_std:
name: gc_std
description: The standard deviation of GC content across the contigset
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: gc_std
owner: Genome
domain_of:
- Contigset
range: float
n_contigs:
name: n_contigs
description: Total number of contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: n_contigs
owner: Genome
domain_of:
- Contigset
range: integer
n_scaffolds:
name: n_scaffolds
description: Total number of scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: n_scaffolds
owner: Genome
domain_of:
- Contigset
range: integer
scaf_L50:
name: scaf_L50
description: Given a set of scaffolds, the L50 is defined as the sequence length
of the shortest scaffold at 50% of the total contigset length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_L50
owner: Genome
domain_of:
- Contigset
range: integer
scaf_L90:
name: scaf_L90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all scaffolds of that length or longer
contains at least 90% of the sum of the lengths of all scaffolds.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_L90
owner: Genome
domain_of:
- Contigset
range: integer
scaf_N50:
name: scaf_N50
description: Given a set of scaffolds, each with its own length, the N50 count
is defined as the smallest number of scaffolds whose length sum makes up half
of contigset size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_N50
owner: Genome
domain_of:
- Contigset
range: integer
scaf_N90:
name: scaf_N90
description: Given a set of scaffolds, each with its own length, the N90 count
is defined as the smallest number of scaffolds whose length sum makes up 90%
of contigset size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_N90
owner: Genome
domain_of:
- Contigset
range: integer
scaf_bp:
name: scaf_bp
description: Total size in bp of all scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_bp
owner: Genome
domain_of:
- Contigset
range: integer
scaf_l_gt50k:
name: scaf_l_gt50k
description: The total length of scaffolds longer than 50,000 base pairs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_l_gt50k
owner: Genome
domain_of:
- Contigset
range: integer
scaf_logsum:
name: scaf_logsum
description: The sum of the (length*log(length)) of all scaffolds, times some
constant. The score increases as contiguity increases
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_logsum
owner: Genome
domain_of:
- Contigset
range: float
scaf_max:
name: scaf_max
description: Maximum scaffold length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_max
owner: Genome
domain_of:
- Contigset
range: integer
scaf_n_gt50K:
name: scaf_n_gt50K
description: The number of scaffolds longer than 50,000 base pairs.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_n_gt50K
owner: Genome
domain_of:
- Contigset
range: integer
scaf_pct_gt50K:
name: scaf_pct_gt50K
description: The percentage of the total assembly length represented by scaffolds
longer than 50,000 base pairs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_pct_gt50K
owner: Genome
domain_of:
- Contigset
range: float
scaf_powsum:
name: scaf_powsum
description: Powersum of all scaffolds is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25).
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_powsum
owner: Genome
domain_of:
- Contigset
range: float
hash:
name: hash
description: A hash value generated from one or more object attributes that serves
to ensure the entity is unique.
from_schema: https://github.com/kbase/cdm-schema
alias: hash
owner: Genome
domain_of:
- UniqueNamedThing
range: string
identifiers:
name: identifiers
description: URIs or CURIEs used to refer to this entity.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: identifiers
owner: Genome
domain_of:
- NamedThingWithId
range: Identifier
multivalued: true
description:
name: description
description: Definition or description of the entity.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: description
owner: Genome
domain_of:
- NamedThing
- Event
- Experiment
- Identifier
- Name
- Project
- Protein
- Sample
range: string
names:
name: names
description: Names, alternative names, and synonyms for an entity.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: names
owner: Genome
domain_of:
- NamedThing
range: Name
multivalued: true
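The induced definition above marks only `contigset_id` as required and gives each slot a range. A minimal sketch of how a record might be checked against those constraints follows; it covers only a subset of the slots, assumes the UUID range accepts the standard textual UUID form, and is not a substitute for LinkML's own validation tooling. The function name `check_genome_record` is hypothetical.

```python
import uuid
from typing import Any, Dict, List

# Subset of slots from the induced definition, grouped by range.
INTEGER_SLOTS = {"contig_bp", "ctg_L50", "ctg_N50", "n_contigs", "n_scaffolds", "scaf_bp"}
FLOAT_SLOTS = {"asm_score", "checkm2_completeness", "checkm2_contamination", "gc_avg", "gc_std"}


def check_genome_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    # contigset_id is the only slot marked `required: true`.
    raw_id = record.get("contigset_id")
    if raw_id is None:
        problems.append("contigset_id is required")
    else:
        try:
            uuid.UUID(str(raw_id))
        except ValueError:
            problems.append("contigset_id is not a valid UUID")
    for slot, value in record.items():
        if value is None:
            continue  # optional slots may be absent or empty
        # bool is a subclass of int in Python, so exclude it explicitly.
        if slot in INTEGER_SLOTS and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{slot} should be an integer")
        if slot in FLOAT_SLOTS and (isinstance(value, bool) or not isinstance(value, (int, float))):
            problems.append(f"{slot} should be a number")
    return problems
```

A record with a well-formed `contigset_id` and numeric statistics yields an empty list, while a record lacking `contigset_id` is reported even if every other slot is valid.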