Class: ContigCollection
A set of individual, overlapping contigs that represent the complete sequenced genome of an organism.
URI: cdm:ContigCollection
classDiagram
class ContigCollection
click ContigCollection href "../ContigCollection"
HasNames <|-- ContigCollection
click HasNames href "../HasNames"
HasIdentifiers <|-- ContigCollection
click HasIdentifiers href "../HasIdentifiers"
HasHash <|-- ContigCollection
click HasHash href "../HasHash"
ContigCollection <|-- Genome
click Genome href "../Genome"
ContigCollection : asm_score
ContigCollection : checkm2_completeness
ContigCollection : checkm2_contamination
ContigCollection : contig_bp
ContigCollection : contig_collection_id
ContigCollection : contig_collection_type
ContigCollection --> "0..1" ContigCollectionType : contig_collection_type
click ContigCollectionType href "../ContigCollectionType"
ContigCollection : ctg_L50
ContigCollection : ctg_L90
ContigCollection : ctg_logsum
ContigCollection : ctg_max
ContigCollection : ctg_N50
ContigCollection : ctg_N90
ContigCollection : ctg_powsum
ContigCollection : gap_pct
ContigCollection : gc_avg
ContigCollection : gc_std
ContigCollection : hash
ContigCollection : identifiers
ContigCollection --> "*" Identifier : identifiers
click Identifier href "../Identifier"
ContigCollection : n_contigs
ContigCollection : n_scaffolds
ContigCollection : names
ContigCollection --> "*" Name : names
click Name href "../Name"
ContigCollection : scaf_bp
ContigCollection : scaf_L50
ContigCollection : scaf_L90
ContigCollection : scaf_l_gt50k
ContigCollection : scaf_logsum
ContigCollection : scaf_max
ContigCollection : scaf_N50
ContigCollection : scaf_N90
ContigCollection : scaf_n_gt50K
ContigCollection : scaf_pct_gt50K
ContigCollection : scaf_powsum
Inheritance
- ContigCollection [ HasNames HasIdentifiers HasHash]
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
asm_score | 0..1 Float | A composite score for comparing contig collection quality | direct |
checkm2_completeness | 0..1 Float | Estimate of the completeness of a contig collection (MAG or genome), estimate... | direct |
checkm2_contamination | 0..1 Float | Estimate of the contamination of a contig collection (MAG or genome), estimat... | direct |
contig_collection_id | 1 UUID | Internal (CDM) unique identifier | direct |
contig_bp | 0..1 Integer | Total size in bp of all contigs | direct |
contig_collection_type | 0..1 ContigCollectionType | The type of contig collection | direct |
ctg_L50 | 0..1 Integer | Given a set of contigs, the L50 is defined as the sequence length of the shor... | direct |
ctg_L90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | direct |
ctg_N50 | 0..1 Integer | Given a set of contigs, each with its own length, the N50 count is defined as... | direct |
ctg_N90 | 0..1 Integer | Given a set of contigs, each with its own length, the N90 count is defined as... | direct |
ctg_logsum | 0..1 Float | The sum of the (length*log(length)) of all contigs, times some constant | direct |
ctg_max | 0..1 Integer | Maximum contig length | direct |
ctg_powsum | 0..1 Float | Powersum of all contigs is the same as logsum except that it uses the sum of ... | direct |
gap_pct | 0..1 Float | The gap size percentage of all scaffolds | direct |
gc_avg | 0..1 Float | The average GC content of the contig collection, expressed as a percentage | direct |
gc_std | 0..1 Float | The standard deviation of GC content across the contig collection | direct |
n_contigs | 0..1 Integer | Total number of contigs | direct |
n_scaffolds | 0..1 Integer | Total number of scaffolds | direct |
scaf_L50 | 0..1 Integer | Given a set of scaffolds, the L50 is defined as the sequence length of the sh... | direct |
scaf_L90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | direct |
scaf_N50 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N50 count is defined ... | direct |
scaf_N90 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N90 count is defined ... | direct |
scaf_bp | 0..1 Integer | Total size in bp of all scaffolds | direct |
scaf_l_gt50k | 0..1 Integer | The total length of scaffolds longer than 50,000 base pairs | direct |
scaf_logsum | 0..1 Float | The sum of the (length*log(length)) of all scaffolds, times some constant | direct |
scaf_max | 0..1 Integer | Maximum scaffold length | direct |
scaf_n_gt50K | 0..1 Integer | The number of scaffolds longer than 50,000 base pairs | direct |
scaf_pct_gt50K | 0..1 Float | The percentage of the total assembly length represented by scaffolds longer t... | direct |
scaf_powsum | 0..1 Float | Powersum of all scaffolds is the same as logsum except that it uses the sum o... | direct |
names | * Name | Names, alternative names, and synonyms for an entity | HasNames |
identifiers | * Identifier | URIs or CURIEs used to refer to this entity | HasIdentifiers |
hash | 0..1 String | A hash value generated from one or more object attributes that serves to ensu... | HasHash |
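The contig statistics listed in the Slots table can be illustrated with a short Python sketch. It is not the code used to populate the CDM; it only follows the slot descriptions on this page, in which the L statistics are lengths and the N statistics are counts (some other tools assign the two names the other way round). The function names `stat_at_percent` and `contig_stats` are hypothetical, chosen for this example.

```python
from typing import Dict, List, Tuple


def stat_at_percent(lengths: List[int], percent: int) -> Tuple[int, int]:
    """Return (count, length) at `percent` of the total sequence length.

    count  -> the N-style slot on this page (e.g. ctg_N50): the smallest
              number of sequences whose summed length reaches the threshold.
    length -> the L-style slot on this page (e.g. ctg_L50): the length of
              the shortest sequence in that smallest set.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    ordered = sorted(lengths, reverse=True)  # longest first
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer arithmetic avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return count, length
    raise AssertionError("unreachable for percent <= 100")


def contig_stats(lengths: List[int]) -> Dict[str, int]:
    """Summarise contig lengths using the slot names from this page."""
    n50, l50 = stat_at_percent(lengths, 50)
    n90, l90 = stat_at_percent(lengths, 90)
    return {
        "n_contigs": len(lengths),
        "contig_bp": sum(lengths),
        "ctg_max": max(lengths),
        "ctg_N50": n50,
        "ctg_L50": l50,
        "ctg_N90": n90,
        "ctg_L90": l90,
    }


stats = contig_stats([100, 80, 60, 40, 20])
print(stats)
```

For the five example lengths (300 bp in total), the two longest contigs already cover half of the total, so `ctg_N50` is 2 and `ctg_L50` is 80. Four contigs are needed to reach 90%, so `ctg_N90` is 4 and `ctg_L90` is 40, which is consistent with the statement that L90 is less than or equal to L50. The same function could be applied to scaffold lengths for the `scaf_*` slots.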
Usages
used by | used in | type | used |
---|---|---|---|
ContigXContigCollection | contig_collection_id | range | ContigCollection |
ContigCollectionXFeature | contig_collection_id | range | ContigCollection |
ContigCollectionXProtein | contig_collection_id | range | ContigCollection |
Aliases
- genome
- biological subject
- assembly
- contig collection
- contig set
Identifier and Mapping Information
Schema Source
- from schema: https://github.com/kbase/cdm-schema
Mappings
Mapping Type | Mapped Value |
---|---|
self | cdm:ContigCollection |
native | cdm:ContigCollection |
LinkML Source
Direct
name: ContigCollection
description: A set of individual, overlapping contigs that represent the complete
sequenced genome of an organism.
from_schema: https://github.com/kbase/cdm-schema
aliases:
- genome
- biological subject
- assembly
- contig collection
- contig set
mixins:
- HasNames
- HasIdentifiers
- HasHash
attributes:
asm_score:
name: asm_score
description: A composite score for comparing contig collection quality
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
checkm2_completeness:
name: checkm2_completeness
description: Estimate of the completeness of a contig collection (MAG or genome),
estimated by CheckM2 tool
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
checkm2_contamination:
name: checkm2_contamination
description: Estimate of the contamination of a contig collection (MAG or genome),
estimated by CheckM2 tool
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
contig_collection_id:
name: contig_collection_id
description: Internal (CDM) unique identifier.
from_schema: https://github.com/kbase/cdm-schema
identifier: true
domain_of:
- Contig_X_ContigCollection
- ContigCollection_X_Feature
- ContigCollection_X_Protein
- ContigCollection
range: UUID
required: true
contig_bp:
name: contig_bp
description: Total size in bp of all contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
contig_collection_type:
name: contig_collection_type
description: The type of contig collection.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: ContigCollectionType
ctg_L50:
name: ctg_L50
description: Given a set of contigs, the L50 is defined as the sequence length
of the shortest contig at 50% of the total contig collection length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
ctg_L90:
name: ctg_L90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all contigs of that length or longer
contains at least 90% of the sum of the lengths of all contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
ctg_N50:
name: ctg_N50
description: Given a set of contigs, each with its own length, the N50 count is
defined as the smallest number of contigs whose length sum makes up half of
contig collection size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
ctg_N90:
name: ctg_N90
description: Given a set of contigs, each with its own length, the N90 count is
defined as the smallest number of contigs whose length sum makes up 90% of contig
collection size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
ctg_logsum:
name: ctg_logsum
description: The sum of the (length*log(length)) of all contigs, times some constant.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
ctg_max:
name: ctg_max
description: Maximum contig length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
ctg_powsum:
name: ctg_powsum
description: Powersum of all contigs is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25)
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
gap_pct:
name: gap_pct
description: The gap size percentage of all scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
gc_avg:
name: gc_avg
description: The average GC content of the contig collection, expressed as a percentage
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
gc_std:
name: gc_std
description: The standard deviation of GC content across the contig collection
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
n_contigs:
name: n_contigs
description: Total number of contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
n_scaffolds:
name: n_scaffolds
description: Total number of scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_L50:
name: scaf_L50
description: Given a set of scaffolds, the L50 is defined as the sequence length
of the shortest scaffold at 50% of the total contig collection length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_L90:
name: scaf_L90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all scaffolds of that length or longer
contains at least 90% of the sum of the lengths of all scaffolds.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_N50:
name: scaf_N50
description: Given a set of scaffolds, each with its own length, the N50 count
is defined as the smallest number of scaffolds whose length sum makes up half
of contig collection size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_N90:
name: scaf_N90
description: Given a set of scaffolds, each with its own length, the N90 count
is defined as the smallest number of scaffolds whose length sum makes up 90%
of contig collection size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_bp:
name: scaf_bp
description: Total size in bp of all scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_l_gt50k:
name: scaf_l_gt50k
description: The total length of scaffolds longer than 50,000 base pairs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_logsum:
name: scaf_logsum
description: The sum of the (length*log(length)) of all scaffolds, times some
constant. The score increases as contiguity increases
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
scaf_max:
name: scaf_max
description: Maximum scaffold length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_n_gt50K:
name: scaf_n_gt50K
description: The number of scaffolds longer than 50,000 base pairs.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: integer
scaf_pct_gt50K:
name: scaf_pct_gt50K
description: The percentage of the total assembly length represented by scaffolds
longer than 50,000 base pairs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
scaf_powsum:
name: scaf_powsum
description: Powersum of all scaffolds is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25).
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
domain_of:
- ContigCollection
range: float
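The logsum and powsum descriptions leave the constant and the base of the logarithm unspecified. The following Python sketch therefore makes two explicit assumptions that are not part of the schema: a natural logarithm and a constant of 1.0 (the `scale` parameter). The function names are hypothetical; only the formulas `length*log(length)` and `length*(length^P)` with default `P=0.25` come from the descriptions.

```python
import math
from typing import Iterable


def logsum(lengths: Iterable[int], scale: float = 1.0) -> float:
    """Sum of length * log(length) over all sequences, times a constant.

    Assumptions (not stated in the schema): natural log, scale = 1.0.
    Zero-length entries are skipped because log(0) is undefined.
    """
    return scale * sum(n * math.log(n) for n in lengths if n > 0)


def powsum(lengths: Iterable[int], power: float = 0.25) -> float:
    """Sum of length * (length ** power) over all sequences.

    The default power of 0.25 is taken from the slot description.
    """
    return sum(n * (n ** power) for n in lengths if n > 0)


# The same 100 bp of sequence, as one piece versus two equal pieces.
joined = [100]
split = [50, 50]
print(logsum(joined) > logsum(split))
print(powsum(joined) > powsum(split))
```

Both comparisons print `True`: under these assumptions a more contiguous assembly of the same total length receives a higher score, which matches the note on `scaf_logsum` that the score grows with contiguity.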
Induced
name: ContigCollection
description: A set of individual, overlapping contigs that represent the complete
sequenced genome of an organism.
from_schema: https://github.com/kbase/cdm-schema
aliases:
- genome
- biological subject
- assembly
- contig collection
- contig set
mixins:
- HasNames
- HasIdentifiers
- HasHash
attributes:
asm_score:
name: asm_score
description: A composite score for comparing contig collection quality
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: asm_score
owner: ContigCollection
domain_of:
- ContigCollection
range: float
checkm2_completeness:
name: checkm2_completeness
description: Estimate of the completeness of a contig collection (MAG or genome),
estimated by CheckM2 tool
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: checkm2_completeness
owner: ContigCollection
domain_of:
- ContigCollection
range: float
checkm2_contamination:
name: checkm2_contamination
description: Estimate of the contamination of a contig collection (MAG or genome),
estimated by CheckM2 tool
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: checkm2_contamination
owner: ContigCollection
domain_of:
- ContigCollection
range: float
contig_collection_id:
name: contig_collection_id
description: Internal (CDM) unique identifier.
from_schema: https://github.com/kbase/cdm-schema
identifier: true
alias: contig_collection_id
owner: ContigCollection
domain_of:
- Contig_X_ContigCollection
- ContigCollection_X_Feature
- ContigCollection_X_Protein
- ContigCollection
range: UUID
required: true
contig_bp:
name: contig_bp
description: Total size in bp of all contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: contig_bp
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
contig_collection_type:
name: contig_collection_type
description: The type of contig collection.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: contig_collection_type
owner: ContigCollection
domain_of:
- ContigCollection
range: ContigCollectionType
ctg_L50:
name: ctg_L50
description: Given a set of contigs, the L50 is defined as the sequence length
of the shortest contig at 50% of the total contig collection length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_L50
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
ctg_L90:
name: ctg_L90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all contigs of that length or longer
contains at least 90% of the sum of the lengths of all contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_L90
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
ctg_N50:
name: ctg_N50
description: Given a set of contigs, each with its own length, the N50 count is
defined as the smallest number of contigs whose length sum makes up half of
contig collection size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_N50
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
ctg_N90:
name: ctg_N90
description: Given a set of contigs, each with its own length, the N90 count is
defined as the smallest number of contigs whose length sum makes up 90% of contig
collection size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_N90
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
ctg_logsum:
name: ctg_logsum
description: The sum of the (length*log(length)) of all contigs, times some constant.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_logsum
owner: ContigCollection
domain_of:
- ContigCollection
range: float
ctg_max:
name: ctg_max
description: Maximum contig length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_max
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
ctg_powsum:
name: ctg_powsum
description: Powersum of all contigs is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25)
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: ctg_powsum
owner: ContigCollection
domain_of:
- ContigCollection
range: float
gap_pct:
name: gap_pct
description: The gap size percentage of all scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: gap_pct
owner: ContigCollection
domain_of:
- ContigCollection
range: float
gc_avg:
name: gc_avg
description: The average GC content of the contig collection, expressed as a percentage
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: gc_avg
owner: ContigCollection
domain_of:
- ContigCollection
range: float
gc_std:
name: gc_std
description: The standard deviation of GC content across the contig collection
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: gc_std
owner: ContigCollection
domain_of:
- ContigCollection
range: float
n_contigs:
name: n_contigs
description: Total number of contigs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: n_contigs
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
n_scaffolds:
name: n_scaffolds
description: Total number of scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: n_scaffolds
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_L50:
name: scaf_L50
description: Given a set of scaffolds, the L50 is defined as the sequence length
of the shortest scaffold at 50% of the total contig collection length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_L50
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_L90:
name: scaf_L90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all scaffolds of that length or longer
contains at least 90% of the sum of the lengths of all scaffolds.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_L90
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_N50:
name: scaf_N50
description: Given a set of scaffolds, each with its own length, the N50 count
is defined as the smallest number of scaffolds whose length sum makes up half
of contig collection size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_N50
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_N90:
name: scaf_N90
description: Given a set of scaffolds, each with its own length, the N90 count
is defined as the smallest number of scaffolds whose length sum makes up 90%
of contig collection size
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_N90
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_bp:
name: scaf_bp
description: Total size in bp of all scaffolds
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_bp
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_l_gt50k:
name: scaf_l_gt50k
description: The total length of scaffolds longer than 50,000 base pairs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_l_gt50k
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_logsum:
name: scaf_logsum
description: The sum of the (length*log(length)) of all scaffolds, times some
constant. The score increases as contiguity increases
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_logsum
owner: ContigCollection
domain_of:
- ContigCollection
range: float
scaf_max:
name: scaf_max
description: Maximum scaffold length
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_max
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_n_gt50K:
name: scaf_n_gt50K
description: The number of scaffolds longer than 50,000 base pairs.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_n_gt50K
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaf_pct_gt50K:
name: scaf_pct_gt50K
description: The percentage of the total assembly length represented by scaffolds
longer than 50,000 base pairs
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_pct_gt50K
owner: ContigCollection
domain_of:
- ContigCollection
range: float
scaf_powsum:
name: scaf_powsum
description: Powersum of all scaffolds is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25).
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: scaf_powsum
owner: ContigCollection
domain_of:
- ContigCollection
range: float
names:
name: names
description: Names, alternative names, and synonyms for an entity.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: names
owner: ContigCollection
domain_of:
- HasNames
range: Name
multivalued: true
identifiers:
name: identifiers
description: URIs or CURIEs used to refer to this entity.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: identifiers
owner: ContigCollection
domain_of:
- HasIdentifiers
range: Identifier
multivalued: true
hash:
name: hash
description: A hash value generated from one or more object attributes that serves
to ensure the entity is unique.
from_schema: https://github.com/kbase/cdm-schema
rank: 1000
alias: hash
owner: ContigCollection
domain_of:
- HasHash
range: string
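As a rough illustration of how the induced definition might be used, the Python sketch below checks a plain dictionary against the cardinalities and ranges listed on this page. It is not LinkML's own validator: the function `check_record` is hypothetical, it assumes that the `UUID` range is serialised as a canonical UUID string, and it ignores the `names`, `identifiers`, `hash`, and `contig_collection_type` slots.

```python
import uuid
from typing import Any, Dict, List

# Slot names and ranges copied from the class definition on this page.
FLOAT_SLOTS = {
    "asm_score", "checkm2_completeness", "checkm2_contamination",
    "ctg_logsum", "ctg_powsum", "gap_pct", "gc_avg", "gc_std",
    "scaf_logsum", "scaf_pct_gt50K", "scaf_powsum",
}
INTEGER_SLOTS = {
    "contig_bp", "ctg_L50", "ctg_L90", "ctg_N50", "ctg_N90", "ctg_max",
    "n_contigs", "n_scaffolds", "scaf_L50", "scaf_L90", "scaf_N50",
    "scaf_N90", "scaf_bp", "scaf_l_gt50k", "scaf_max", "scaf_n_gt50K",
}


def check_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a ContigCollection-like dict."""
    problems: List[str] = []

    # contig_collection_id is the only required slot (identifier: true).
    cc_id = record.get("contig_collection_id")
    if cc_id is None:
        problems.append("contig_collection_id is required")
    else:
        try:
            uuid.UUID(str(cc_id))  # assumption: canonical UUID string
        except ValueError:
            problems.append("contig_collection_id is not a valid UUID")

    for slot, value in record.items():
        if value is None:
            continue  # every other slot is optional (0..1)
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if slot in INTEGER_SLOTS and not is_int:
            problems.append(f"{slot} should be an integer")
        elif slot in FLOAT_SLOTS and not (is_int or isinstance(value, float)):
            problems.append(f"{slot} should be a float")
    return problems


good = {
    "contig_collection_id": "123e4567-e89b-12d3-a456-426614174000",
    "n_contigs": 12,
    "gc_avg": 51.3,
}
bad = {"n_contigs": "12", "gc_avg": "high"}
print(check_record(good))
print(check_record(bad))
```

The first call prints an empty list. The second reports three problems: the missing identifier, a string where `n_contigs` expects an integer, and a string where `gc_avg` expects a float.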