Class: ContigCollection
A set of individual, overlapping contigs that represent the complete sequenced genome of an organism.
classDiagram
class ContigCollection
click ContigCollection href "../ContigCollection/"
Table <|-- ContigCollection
click Table href "../Table/"
ContigCollection : asm_score
ContigCollection : checkm_completeness
ContigCollection : checkm_contamination
ContigCollection : checkm_version
ContigCollection : contig_bp
ContigCollection : contig_collection_id
ContigCollection : contig_collection_type
ContigCollection --> "0..1" ContigCollectionType : contig_collection_type
click ContigCollectionType href "../ContigCollectionType/"
ContigCollection : contig_l50
ContigCollection : contig_l90
ContigCollection : contig_logsum
ContigCollection : contig_max
ContigCollection : contig_n50
ContigCollection : contig_n90
ContigCollection : contig_powersum
ContigCollection : gap_percent
ContigCollection : gc_average
ContigCollection : gc_std
ContigCollection : gtdb_taxon_id
ContigCollection : hash
ContigCollection : n_chromosomes
ContigCollection : n_contigs
ContigCollection : n_scaffolds
ContigCollection : ncbi_taxon_id
ContigCollection : scaffold_bp
ContigCollection : scaffold_l50
ContigCollection : scaffold_l90
ContigCollection : scaffold_logsum
ContigCollection : scaffold_maximum_length
ContigCollection : scaffold_n50
ContigCollection : scaffold_n90
ContigCollection : scaffold_powersum
ContigCollection : scaffolds_n_over_50K
ContigCollection : scaffolds_percent_over_50K
ContigCollection : scaffolds_total_length_over_50k
Inheritance
- Table
- ContigCollection
Slots
Name | Cardinality and Range | Description | Inheritance |
---|---|---|---|
contig_collection_id | 1 CdmContigCollectionId | Internal (CDM) unique identifier for a contig collection. From the Entity table: entity_id where entity_type == 'ContigCollection'. | direct |
hash | 0..1 String | A hash value generated from one or more object attributes that serves to ensure the entity is unique. | direct |
asm_score | 0..1 Float | A composite score for comparing contig collection quality. | direct |
checkm_completeness | 0..1 Float | Estimate of the completeness of a contig collection (MAG or genome), as estimated by the CheckM tool. Ensure that percentage values are converted to floats. | direct |
checkm_contamination | 0..1 Float | Estimate of the contamination of a contig collection (MAG or genome), as estimated by the CheckM tool. Ensure that percentage values are converted to floats. | direct |
checkm_version | 0..1 String | Version of the CheckM tool used. | direct |
contig_bp | 0..1 Integer | Total size in bp of all contigs. | direct |
contig_collection_type | 0..1 ContigCollectionType | The type of contig collection. | direct |
contig_l50 | 0..1 Integer | Given a set of contigs, the L50 is defined as the sequence length of the shortest contig at 50% of the total contig collection length. | direct |
contig_l90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the length for which the collection of all contigs of that length or longer contains at least 90% of the sum of the lengths of all contigs. | direct |
contig_n50 | 0..1 Integer | Given a set of contigs, each with its own length, the N50 count is defined as the smallest number of contigs whose length sum makes up half of the contig collection size. | direct |
contig_n90 | 0..1 Integer | Given a set of contigs, each with its own length, the N90 count is defined as the smallest number of contigs whose length sum makes up 90% of the contig collection size. | direct |
contig_logsum | 0..1 Float | The sum of the (length*log(length)) of all contigs, times some constant. | direct |
contig_max | 0..1 Integer | Maximum contig length. | direct |
contig_powersum | 0..1 Float | Powersum of all contigs is the same as logsum except that it uses the sum of (length*(length^P)) for some power P (default P=0.25). | direct |
gap_percent | 0..1 Float | The gap size percentage of all scaffolds. | direct |
gc_average | 0..1 Float | The average GC content of the contig collection, expressed as a percentage. | direct |
gc_std | 0..1 Float | The standard deviation of GC content across the contig collection. | direct |
gtdb_taxon_id | 0..1 Curie | The GTDB taxon ID for this contig collection. | direct |
n_chromosomes | 0..1 Integer | Total number of chromosomes. | direct |
n_contigs | 0..1 Integer | Total number of contigs. | direct |
n_scaffolds | 0..1 Integer | Total number of scaffolds. | direct |
ncbi_taxon_id | 0..1 Curie | The NCBI taxon ID for this contig collection. | direct |
scaffold_l50 | 0..1 Integer | Given a set of scaffolds, the L50 is defined as the sequence length of the shortest scaffold at 50% of the total contig collection length. | direct |
scaffold_l90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the length for which the collection of all scaffolds of that length or longer contains at least 90% of the sum of the lengths of all scaffolds. | direct |
scaffold_n50 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N50 count is defined as the smallest number of scaffolds whose length sum makes up half of the contig collection size. | direct |
scaffold_n90 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N90 count is defined as the smallest number of scaffolds whose length sum makes up 90% of the contig collection size. | direct |
scaffold_bp | 0..1 Integer | Total size in bp of all scaffolds. | direct |
scaffold_logsum | 0..1 Float | The sum of the (length*log(length)) of all scaffolds, times some constant. The score increases as contiguity increases. | direct |
scaffold_maximum_length | 0..1 Integer | Maximum scaffold length. | direct |
scaffold_powersum | 0..1 Float | Powersum of all scaffolds is the same as logsum except that it uses the sum of (length*(length^P)) for some power P (default P=0.25). | direct |
scaffolds_n_over_50K | 0..1 Integer | The number of scaffolds longer than 50,000 base pairs. | direct |
scaffolds_percent_over_50K | 0..1 Float | The percentage of the total assembly length represented by scaffolds longer than 50,000 base pairs. | direct |
scaffolds_total_length_over_50k | 0..1 Integer | The total length of scaffolds longer than 50,000 base pairs. | direct |
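The L50/L90 and N50/N90 slots above can be derived from a list of sequence lengths. The following is a minimal Python sketch, assuming the convention stated in the slot descriptions, where the L-values are lengths and the N-values are counts (some other tools use these names the other way around). The function name `length_and_count_at_percent` and the example lengths are hypothetical and not part of the schema.

```python
from typing import Sequence, Tuple


def length_and_count_at_percent(lengths: Sequence[int], percent: int) -> Tuple[int, int]:
    """Return (length, count) at `percent` of the total sequence length.

    Sequences are visited from longest to shortest. `count` is the smallest
    number of sequences whose summed length reaches `percent` of the total;
    `length` is the length of the shortest sequence in that group.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    total = sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= percent * total:
            return length, count
    raise AssertionError("unreachable: the full set always reaches the threshold")


# Hypothetical contig lengths, total 300 bp.
example_lengths = [80, 70, 50, 40, 30, 20, 10]
contig_l50, contig_n50 = length_and_count_at_percent(example_lengths, 50)
contig_l90, contig_n90 = length_and_count_at_percent(example_lengths, 90)
print(contig_l50, contig_n50, contig_l90, contig_n90)
```

With these example lengths the two longest contigs (80 + 70 = 150) reach half of the total, so the sketch yields an L50 of 70 and an N50 of 2, and the L90 (30) is less than or equal to the L50, as the slot description states.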
Usages
used by | used in | type | used |
---|---|---|---|
Schema | contig_collections | range | ContigCollection |
Aliases
- genome
- biological subject
- assembly
- contig collection
- contig set
Identifier and Mapping Information
Schema Source
- from schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
Mappings
Mapping Type | Mapped Value |
---|---|
self | kb_cdm:ContigCollection |
native | kb_cdm:ContigCollection |
LinkML Source
Direct
name: ContigCollection
description: A set of individual, overlapping contigs that represent the complete
sequenced genome of an organism.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
aliases:
- genome
- biological subject
- assembly
- contig collection
- contig set
is_a: Table
slots:
- contig_collection_id
- hash
slot_usage:
contig_collection_id:
name: contig_collection_id
identifier: true
attributes:
asm_score:
name: asm_score
description: A composite score for comparing contig collection quality.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: float
checkm_completeness:
name: checkm_completeness
description: Estimate of the completeness of a contig collection (MAG or genome),
estimated by CheckM tool. Ensure that percentage values are converted to floats.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: float
checkm_contamination:
name: checkm_contamination
description: Estimate of the contamination of a contig collection (MAG or genome),
estimated by CheckM tool. Ensure that percentage values are converted to floats.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: float
checkm_version:
name: checkm_version
description: Version of the CheckM tool used.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: string
contig_bp:
name: contig_bp
description: Total size in bp of all contigs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- sequence length
- total sequence length
rank: 1000
domain_of:
- ContigCollection
range: integer
contig_collection_type:
name: contig_collection_type
description: The type of contig collection.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: ContigCollectionType
contig_l50:
name: contig_l50
description: Given a set of contigs, the L50 is defined as the sequence length
of the shortest contig at 50% of the total contig collection length
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_L50
- contig_L50
rank: 1000
domain_of:
- ContigCollection
range: integer
contig_l90:
name: contig_l90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all contigs of that length or longer
contains at least 90% of the sum of the lengths of all contigs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_L90
- contig_L90
rank: 1000
domain_of:
- ContigCollection
range: integer
contig_n50:
name: contig_n50
description: Given a set of contigs, each with its own length, the N50 count is
defined as the smallest number of contigs whose length sum makes up half of
contig collection size
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_N50
- contig_N50
rank: 1000
domain_of:
- ContigCollection
range: integer
contig_n90:
name: contig_n90
description: Given a set of contigs, each with its own length, the N90 count is
defined as the smallest number of contigs whose length sum makes up 90% of contig
collection size
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_N90
- contig_N90
rank: 1000
domain_of:
- ContigCollection
range: integer
contig_logsum:
name: contig_logsum
description: The sum of the (length*log(length)) of all contigs, times some constant.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_logsum
rank: 1000
domain_of:
- ContigCollection
range: float
contig_max:
name: contig_max
description: Maximum contig length
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_max
rank: 1000
domain_of:
- ContigCollection
range: integer
contig_powersum:
name: contig_powersum
description: Powersum of all contigs is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25)
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_powersum
- ctg_powsum
- contig_powsum
rank: 1000
domain_of:
- ContigCollection
range: float
gap_percent:
name: gap_percent
description: The gap size percentage of all scaffolds
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- gap_pct
rank: 1000
domain_of:
- ContigCollection
range: float
gc_average:
name: gc_average
description: The average GC content of the contig collection, expressed as a percentage
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- gc_avg
rank: 1000
domain_of:
- ContigCollection
range: float
gc_std:
name: gc_std
description: The standard deviation of GC content across the contig collection
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- gc_stdev
rank: 1000
domain_of:
- ContigCollection
range: float
gtdb_taxon_id:
name: gtdb_taxon_id
description: The GTDB taxon ID for this contig collection.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: curie
n_chromosomes:
name: n_chromosomes
description: Total number of chromosomes
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: integer
n_contigs:
name: n_contigs
description: Total number of contigs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: integer
n_scaffolds:
name: n_scaffolds
description: Total number of scaffolds
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: integer
ncbi_taxon_id:
name: ncbi_taxon_id
description: The NCBI taxon ID for this contig collection.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
domain_of:
- ContigCollection
range: curie
scaffold_l50:
name: scaffold_l50
description: Given a set of scaffolds, the L50 is defined as the sequence length
of the shortest scaffold at 50% of the total contig collection length
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_L50
- scaffold_L50
rank: 1000
domain_of:
- ContigCollection
range: integer
scaffold_l90:
name: scaffold_l90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all scaffolds of that length or longer
contains at least 90% of the sum of the lengths of all scaffolds.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_L90
- scaffold_L90
rank: 1000
domain_of:
- ContigCollection
range: integer
scaffold_n50:
name: scaffold_n50
description: Given a set of scaffolds, each with its own length, the N50 count
is defined as the smallest number of scaffolds whose length sum makes up half
of contig collection size
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_N50
- scaffold_N50
rank: 1000
domain_of:
- ContigCollection
range: integer
scaffold_n90:
name: scaffold_n90
description: Given a set of scaffolds, each with its own length, the N90 count
is defined as the smallest number of scaffolds whose length sum makes up 90%
of contig collection size
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_N90
- scaffold_N90
rank: 1000
domain_of:
- ContigCollection
range: integer
scaffold_bp:
name: scaffold_bp
description: Total size in bp of all scaffolds
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_bp
rank: 1000
domain_of:
- ContigCollection
range: integer
scaffold_logsum:
name: scaffold_logsum
description: The sum of the (length*log(length)) of all scaffolds, times some
constant. The score increases as contiguity increases
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_logsum
rank: 1000
domain_of:
- ContigCollection
range: float
scaffold_maximum_length:
name: scaffold_maximum_length
description: Maximum scaffold length
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_max
- scaffold_max
rank: 1000
domain_of:
- ContigCollection
range: integer
scaffold_powersum:
name: scaffold_powersum
description: Powersum of all scaffolds is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25).
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_powsum
- scaffold_powsum
rank: 1000
domain_of:
- ContigCollection
range: float
scaffolds_n_over_50K:
name: scaffolds_n_over_50K
description: The number of scaffolds longer than 50,000 base pairs.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_n_gt50K
- scaffold_n_gt50K
rank: 1000
domain_of:
- ContigCollection
range: integer
scaffolds_percent_over_50K:
name: scaffolds_percent_over_50K
description: The percentage of the total assembly length represented by scaffolds
longer than 50,000 base pairs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_pct_gt50K
- scaffold_pct_gt50K
rank: 1000
domain_of:
- ContigCollection
range: float
scaffolds_total_length_over_50k:
name: scaffolds_total_length_over_50k
description: The total length of scaffolds longer than 50,000 base pairs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_l_gt50k
- scaffold_l_gt50k
rank: 1000
domain_of:
- ContigCollection
range: integer
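The three `scaffolds_*_over_50K` attributes above can be derived together from a list of scaffold lengths. This is a minimal Python sketch, not the schema's reference implementation. It assumes that "longer than" means strictly greater than 50,000 bp and that the total assembly length is the sum of all scaffold lengths; the function name `over_threshold_stats` and the example lengths are hypothetical.

```python
from typing import Dict, Sequence, Union


def over_threshold_stats(
    scaffold_lengths: Sequence[int], threshold: int = 50_000
) -> Dict[str, Union[int, float]]:
    """Summarise scaffolds strictly longer than `threshold` base pairs."""
    long_scaffolds = [n for n in scaffold_lengths if n > threshold]
    total_length = sum(scaffold_lengths)
    long_length = sum(long_scaffolds)
    # Assumption: report 0.0 for an empty assembly rather than dividing by zero.
    percent = 100.0 * long_length / total_length if total_length else 0.0
    return {
        "scaffolds_n_over_50K": len(long_scaffolds),
        "scaffolds_total_length_over_50k": long_length,
        "scaffolds_percent_over_50K": percent,
    }


# Hypothetical scaffold lengths; the 50,000 bp scaffold is not counted
# because it is not strictly longer than the threshold.
stats = over_threshold_stats([120_000, 60_000, 50_000, 20_000])
print(stats)
```

The percentage is returned as a float, matching the `float` range declared for `scaffolds_percent_over_50K`.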
Induced
name: ContigCollection
description: A set of individual, overlapping contigs that represent the complete
sequenced genome of an organism.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
aliases:
- genome
- biological subject
- assembly
- contig collection
- contig set
is_a: Table
slot_usage:
contig_collection_id:
name: contig_collection_id
identifier: true
attributes:
asm_score:
name: asm_score
description: A composite score for comparing contig collection quality.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: asm_score
owner: ContigCollection
domain_of:
- ContigCollection
range: float
checkm_completeness:
name: checkm_completeness
description: Estimate of the completeness of a contig collection (MAG or genome),
estimated by CheckM tool. Ensure that percentage values are converted to floats.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: checkm_completeness
owner: ContigCollection
domain_of:
- ContigCollection
range: float
checkm_contamination:
name: checkm_contamination
description: Estimate of the contamination of a contig collection (MAG or genome),
estimated by CheckM tool. Ensure that percentage values are converted to floats.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: checkm_contamination
owner: ContigCollection
domain_of:
- ContigCollection
range: float
checkm_version:
name: checkm_version
description: Version of the CheckM tool used.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: checkm_version
owner: ContigCollection
domain_of:
- ContigCollection
range: string
contig_bp:
name: contig_bp
description: Total size in bp of all contigs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- sequence length
- total sequence length
rank: 1000
alias: contig_bp
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
contig_collection_type:
name: contig_collection_type
description: The type of contig collection.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: contig_collection_type
owner: ContigCollection
domain_of:
- ContigCollection
range: ContigCollectionType
contig_l50:
name: contig_l50
description: Given a set of contigs, the L50 is defined as the sequence length
of the shortest contig at 50% of the total contig collection length
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_L50
- contig_L50
rank: 1000
alias: contig_l50
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
contig_l90:
name: contig_l90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all contigs of that length or longer
contains at least 90% of the sum of the lengths of all contigs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_L90
- contig_L90
rank: 1000
alias: contig_l90
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
contig_n50:
name: contig_n50
description: Given a set of contigs, each with its own length, the N50 count is
defined as the smallest number of contigs whose length sum makes up half of
contig collection size
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_N50
- contig_N50
rank: 1000
alias: contig_n50
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
contig_n90:
name: contig_n90
description: Given a set of contigs, each with its own length, the N90 count is
defined as the smallest number of contigs whose length sum makes up 90% of contig
collection size
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_N90
- contig_N90
rank: 1000
alias: contig_n90
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
contig_logsum:
name: contig_logsum
description: The sum of the (length*log(length)) of all contigs, times some constant.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_logsum
rank: 1000
alias: contig_logsum
owner: ContigCollection
domain_of:
- ContigCollection
range: float
contig_max:
name: contig_max
description: Maximum contig length
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_max
rank: 1000
alias: contig_max
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
contig_powersum:
name: contig_powersum
description: Powersum of all contigs is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25)
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- ctg_powersum
- ctg_powsum
- contig_powsum
rank: 1000
alias: contig_powersum
owner: ContigCollection
domain_of:
- ContigCollection
range: float
gap_percent:
name: gap_percent
description: The gap size percentage of all scaffolds
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- gap_pct
rank: 1000
alias: gap_percent
owner: ContigCollection
domain_of:
- ContigCollection
range: float
gc_average:
name: gc_average
description: The average GC content of the contig collection, expressed as a percentage
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- gc_avg
rank: 1000
alias: gc_average
owner: ContigCollection
domain_of:
- ContigCollection
range: float
gc_std:
name: gc_std
description: The standard deviation of GC content across the contig collection
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- gc_stdev
rank: 1000
alias: gc_std
owner: ContigCollection
domain_of:
- ContigCollection
range: float
gtdb_taxon_id:
name: gtdb_taxon_id
description: The GTDB taxon ID for this contig collection.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: gtdb_taxon_id
owner: ContigCollection
domain_of:
- ContigCollection
range: curie
n_chromosomes:
name: n_chromosomes
description: Total number of chromosomes
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: n_chromosomes
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
n_contigs:
name: n_contigs
description: Total number of contigs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: n_contigs
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
n_scaffolds:
name: n_scaffolds
description: Total number of scaffolds
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: n_scaffolds
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
ncbi_taxon_id:
name: ncbi_taxon_id
description: The NCBI taxon ID for this contig collection.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
rank: 1000
alias: ncbi_taxon_id
owner: ContigCollection
domain_of:
- ContigCollection
range: curie
scaffold_l50:
name: scaffold_l50
description: Given a set of scaffolds, the L50 is defined as the sequence length
of the shortest scaffold at 50% of the total contig collection length
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_L50
- scaffold_L50
rank: 1000
alias: scaffold_l50
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaffold_l90:
name: scaffold_l90
description: The L90 statistic is less than or equal to the L50 statistic; it
is the length for which the collection of all scaffolds of that length or longer
contains at least 90% of the sum of the lengths of all scaffolds.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_L90
- scaffold_L90
rank: 1000
alias: scaffold_l90
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaffold_n50:
name: scaffold_n50
description: Given a set of scaffolds, each with its own length, the N50 count
is defined as the smallest number of scaffolds whose length sum makes up half
of contig collection size
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_N50
- scaffold_N50
rank: 1000
alias: scaffold_n50
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaffold_n90:
name: scaffold_n90
description: Given a set of scaffolds, each with its own length, the N90 count
is defined as the smallest number of scaffolds whose length sum makes up 90%
of contig collection size
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_N90
- scaffold_N90
rank: 1000
alias: scaffold_n90
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaffold_bp:
name: scaffold_bp
description: Total size in bp of all scaffolds
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_bp
rank: 1000
alias: scaffold_bp
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaffold_logsum:
name: scaffold_logsum
description: The sum of the (length*log(length)) of all scaffolds, times some
constant. The score increases as contiguity increases
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_logsum
rank: 1000
alias: scaffold_logsum
owner: ContigCollection
domain_of:
- ContigCollection
range: float
scaffold_maximum_length:
name: scaffold_maximum_length
description: Maximum scaffold length
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_max
- scaffold_max
rank: 1000
alias: scaffold_maximum_length
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaffold_powersum:
name: scaffold_powersum
description: Powersum of all scaffolds is the same as logsum except that it uses
the sum of (length*(length^P)) for some power P (default P=0.25).
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_powsum
- scaffold_powsum
rank: 1000
alias: scaffold_powersum
owner: ContigCollection
domain_of:
- ContigCollection
range: float
scaffolds_n_over_50K:
name: scaffolds_n_over_50K
description: The number of scaffolds longer than 50,000 base pairs.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_n_gt50K
- scaffold_n_gt50K
rank: 1000
alias: scaffolds_n_over_50K
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
scaffolds_percent_over_50K:
name: scaffolds_percent_over_50K
description: The percentage of the total assembly length represented by scaffolds
longer than 50,000 base pairs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_pct_gt50K
- scaffold_pct_gt50K
rank: 1000
alias: scaffolds_percent_over_50K
owner: ContigCollection
domain_of:
- ContigCollection
range: float
scaffolds_total_length_over_50k:
name: scaffolds_total_length_over_50k
description: The total length of scaffolds longer than 50,000 base pairs
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
aliases:
- scaf_l_gt50k
- scaffold_l_gt50k
rank: 1000
alias: scaffolds_total_length_over_50k
owner: ContigCollection
domain_of:
- ContigCollection
range: integer
contig_collection_id:
name: contig_collection_id
description: 'Internal (CDM) unique identifier for a contig collection.
From the Entity table: entity_id where entity_type == ''ContigCollection''.
'
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
rank: 1000
identifier: true
alias: contig_collection_id
owner: ContigCollection
domain_of:
- ContigCollection
- Contig_x_ContigCollection
- ContigCollection_x_EncodedFeature
- ContigCollection_x_Feature
- ContigCollection_x_Protein
range: cdm_contig_collection_id
required: true
hash:
name: hash
description: A hash value generated from one or more object attributes that serves
to ensure the entity is unique.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
rank: 1000
alias: hash
owner: ContigCollection
domain_of:
- Contig
- ContigCollection
- EncodedFeature
- Feature
- Protein
range: string
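The logsum and powersum attributes described above are weighted sums over sequence lengths. The sketch below illustrates them in Python under explicit assumptions: the schema says only "some constant", so the `scale` parameter (default 1.0) and the use of the natural logarithm are placeholders, and the function names are hypothetical. Only the default power P=0.25 comes from the attribute descriptions.

```python
import math
from typing import Iterable


def logsum(lengths: Iterable[int], scale: float = 1.0) -> float:
    """Sum of length * log(length), times an assumed constant `scale`."""
    return scale * sum(n * math.log(n) for n in lengths)


def powersum(lengths: Iterable[int], power: float = 0.25, scale: float = 1.0) -> float:
    """Sum of length * length**power, times an assumed constant `scale`."""
    return scale * sum(n * n ** power for n in lengths)


# Two hypothetical assemblies with the same total length (200 bp):
fragmented = [100, 100]
contiguous = [200]

# Both scores are higher for the more contiguous assembly.
print(logsum(contiguous) > logsum(fragmented))
print(powersum(contiguous) > powersum(fragmented))
```

Because each length is weighted by a growing function of itself, joining two sequences into one raises both scores, which is consistent with the note that the logsum score increases with contiguity.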