Class: ContigCollection

A set of individual, overlapping contigs that represent the complete sequenced genome of an organism.

URI: kb_cdm:ContigCollection

classDiagram
    class ContigCollection
    click ContigCollection href "../ContigCollection/"
    Table <|-- ContigCollection
    click Table href "../Table/"
    ContigCollection : asm_score
    ContigCollection : checkm_completeness
    ContigCollection : checkm_contamination
    ContigCollection : checkm_version
    ContigCollection : contig_bp
    ContigCollection : contig_collection_id
    ContigCollection : contig_collection_type
    ContigCollection --> "0..1" ContigCollectionType : contig_collection_type
    click ContigCollectionType href "../ContigCollectionType/"
    ContigCollection : contig_l50
    ContigCollection : contig_l90
    ContigCollection : contig_logsum
    ContigCollection : contig_max
    ContigCollection : contig_n50
    ContigCollection : contig_n90
    ContigCollection : contig_powersum
    ContigCollection : gap_percent
    ContigCollection : gc_average
    ContigCollection : gc_std
    ContigCollection : gtdb_taxon_id
    ContigCollection : hash
    ContigCollection : n_chromosomes
    ContigCollection : n_contigs
    ContigCollection : n_scaffolds
    ContigCollection : ncbi_taxon_id
    ContigCollection : scaffold_bp
    ContigCollection : scaffold_l50
    ContigCollection : scaffold_l90
    ContigCollection : scaffold_logsum
    ContigCollection : scaffold_maximum_length
    ContigCollection : scaffold_n50
    ContigCollection : scaffold_n90
    ContigCollection : scaffold_powersum
    ContigCollection : scaffolds_n_over_50K
    ContigCollection : scaffolds_percent_over_50K
    ContigCollection : scaffolds_total_length_over_50k

Inheritance

  • Table
      • ContigCollection

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| contig_collection_id | 1 CdmContigCollectionId | Internal (CDM) unique identifier for a contig collection. From the Entity table: entity_id where entity_type == 'ContigCollection'. | direct |
| hash | 0..1 String | A hash value generated from one or more object attributes that serves to ensure the entity is unique. | direct |
| asm_score | 0..1 Float | A composite score for comparing contig collection quality. | direct |
| checkm_completeness | 0..1 Float | Estimate of the completeness of a contig collection (MAG or genome), estimated by the CheckM tool. Ensure that percentage values are converted to floats. | direct |
| checkm_contamination | 0..1 Float | Estimate of the contamination of a contig collection (MAG or genome), estimated by the CheckM tool. Ensure that percentage values are converted to floats. | direct |
| checkm_version | 0..1 String | Version of the CheckM tool used. | direct |
| contig_bp | 0..1 Integer | Total size in bp of all contigs. | direct |
| contig_collection_type | 0..1 ContigCollectionType | The type of contig collection. | direct |
| contig_l50 | 0..1 Integer | Given a set of contigs, the L50 is defined as the sequence length of the shortest contig at 50% of the total contig collection length. | direct |
| contig_l90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the length for which the collection of all contigs of that length or longer contains at least 90% of the sum of the lengths of all contigs. | direct |
| contig_n50 | 0..1 Integer | Given a set of contigs, each with its own length, the N50 count is defined as the smallest number of contigs whose length sum makes up half of the contig collection size. | direct |
| contig_n90 | 0..1 Integer | Given a set of contigs, each with its own length, the N90 count is defined as the smallest number of contigs whose length sum makes up 90% of the contig collection size. | direct |
| contig_logsum | 0..1 Float | The sum of the (length*log(length)) of all contigs, times some constant. | direct |
| contig_max | 0..1 Integer | Maximum contig length. | direct |
| contig_powersum | 0..1 Float | Powersum of all contigs is the same as logsum except that it uses the sum of (length*(length^P)) for some power P (default P=0.25). | direct |
| gap_percent | 0..1 Float | The gap size percentage of all scaffolds. | direct |
| gc_average | 0..1 Float | The average GC content of the contig collection, expressed as a percentage. | direct |
| gc_std | 0..1 Float | The standard deviation of GC content across the contig collection. | direct |
| gtdb_taxon_id | 0..1 Curie | The GTDB taxon ID for this contig collection. | direct |
| n_chromosomes | 0..1 Integer | Total number of chromosomes. | direct |
| n_contigs | 0..1 Integer | Total number of contigs. | direct |
| n_scaffolds | 0..1 Integer | Total number of scaffolds. | direct |
| ncbi_taxon_id | 0..1 Curie | The NCBI taxon ID for this contig collection. | direct |
| scaffold_l50 | 0..1 Integer | Given a set of scaffolds, the L50 is defined as the sequence length of the shortest scaffold at 50% of the total contig collection length. | direct |
| scaffold_l90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the length for which the collection of all scaffolds of that length or longer contains at least 90% of the sum of the lengths of all scaffolds. | direct |
| scaffold_n50 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N50 count is defined as the smallest number of scaffolds whose length sum makes up half of the contig collection size. | direct |
| scaffold_n90 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N90 count is defined as the smallest number of scaffolds whose length sum makes up 90% of the contig collection size. | direct |
| scaffold_bp | 0..1 Integer | Total size in bp of all scaffolds. | direct |
| scaffold_logsum | 0..1 Float | The sum of the (length*log(length)) of all scaffolds, times some constant. The score increases as contiguity increases. | direct |
| scaffold_maximum_length | 0..1 Integer | Maximum scaffold length. | direct |
| scaffold_powersum | 0..1 Float | Powersum of all scaffolds is the same as logsum except that it uses the sum of (length*(length^P)) for some power P (default P=0.25). | direct |
| scaffolds_n_over_50K | 0..1 Integer | The number of scaffolds longer than 50,000 base pairs. | direct |
| scaffolds_percent_over_50K | 0..1 Float | The percentage of the total assembly length represented by scaffolds longer than 50,000 base pairs. | direct |
| scaffolds_total_length_over_50k | 0..1 Integer | The total length of scaffolds longer than 50,000 base pairs. | direct |
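
The L50/L90 and N50/N90 slots above are easy to mix up, because in this schema the L statistics are lengths and the N statistics are counts, which is the reverse of the more common usage where N50 is a length. The following Python sketch shows one way to compute both from a list of sequence lengths. The function name `length_and_count_at` and the example lengths are made up for illustration and are not part of the schema or of any tool that populates it.

```python
def length_and_count_at(lengths, percent):
    """Return (length, count) at `percent` of the total sequence length.

    Sequences are visited from longest to shortest. The walk stops as soon
    as the running total reaches `percent` of the overall length:
      - `length` is the length of the last sequence visited
        (the L statistic as defined in this schema, e.g. contig_l50);
      - `count` is the number of sequences visited
        (the N statistic as defined in this schema, e.g. contig_n50).
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    if any(n <= 0 for n in lengths):
        raise ValueError("sequence lengths must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")

    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer arithmetic avoids rounding problems at the boundary.
        if running * 100 >= total * percent:
            return length, count
    # Not reachable: the full sum always satisfies the condition above.
    raise AssertionError("unreachable")


# Hypothetical contig lengths, total 300 bp.
contig_lengths = [80, 70, 50, 40, 30, 20, 10]

contig_l50, contig_n50 = length_and_count_at(contig_lengths, 50)
contig_l90, contig_n90 = length_and_count_at(contig_lengths, 90)

print(contig_l50, contig_n50)  # 70 2
print(contig_l90, contig_n90)  # 30 5
```

The same function applies unchanged to scaffold lengths for the scaffold_l50, scaffold_n50, scaffold_l90 and scaffold_n90 slots. As the L90 description states, the 90% length can never exceed the 50% length.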

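The logsum and powersum slots are described only as formulas, so a small Python sketch may help. It assumes a natural logarithm and a scaling constant of 1.0, since the slot descriptions specify neither the log base nor the constant; it also assumes the same constant applies to powersum. The function names are illustrative only.

```python
import math


def logsum(lengths, constant=1.0):
    """Sum of length * log(length) over all sequences, times a constant.

    Assumptions: natural log and constant=1.0 (the schema says only
    "times some constant").
    """
    return constant * sum(n * math.log(n) for n in lengths)


def powersum(lengths, power=0.25, constant=1.0):
    """Sum of length * length**power over all sequences, times a constant.

    power=0.25 is the default named in the slot description.
    """
    return constant * sum(n * n ** power for n in lengths)


# The same 100 bp, assembled as one piece versus two pieces of 50 bp.
one_piece = [100]
two_pieces = [50, 50]

# Both scores reward the more contiguous assembly.
print(logsum(one_piece) > logsum(two_pieces))      # True
print(powersum(one_piece) > powersum(two_pieces))  # True
```

Because each length is weighted by a growing function of itself, merging sequences always raises the score, which matches the note that the score increases with contiguity.
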
Usages

| used by | used in | type | used |
| --- | --- | --- | --- |
| Schema | contig_collections | range | ContigCollection |

Aliases

  • genome
  • biological subject
  • assembly
  • contig collection
  • contig set

Identifier and Mapping Information

Schema Source

  • from schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema

Mappings

| Mapping Type | Mapped Value |
| --- | --- |
| self | kb_cdm:ContigCollection |
| native | kb_cdm:ContigCollection |

LinkML Source

Direct

name: ContigCollection
description: A set of individual, overlapping contigs that represent the complete
  sequenced genome of an organism.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
aliases:
- genome
- biological subject
- assembly
- contig collection
- contig set
is_a: Table
slots:
- contig_collection_id
- hash
slot_usage:
  contig_collection_id:
    name: contig_collection_id
    identifier: true
attributes:
  asm_score:
    name: asm_score
    description: A composite score for comparing contig collection quality.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  checkm_completeness:
    name: checkm_completeness
    description: Estimate of the completeness of a contig collection (MAG or genome),
      estimated by CheckM tool. Ensure that percentage values are converted to floats.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  checkm_contamination:
    name: checkm_contamination
    description: Estimate of the contamination of a contig collection (MAG or genome),
      estimated by CheckM tool. Ensure that percentage values are converted to floats.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  checkm_version:
    name: checkm_version
    description: Version of the CheckM tool used.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: string
  contig_bp:
    name: contig_bp
    description: Total size in bp of all contigs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - sequence length
    - total sequence length
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  contig_collection_type:
    name: contig_collection_type
    description: The type of contig collection.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: ContigCollectionType
  contig_l50:
    name: contig_l50
    description: Given a set of contigs, the L50 is defined as the sequence length
      of the shortest contig at 50% of the total contig collection length
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_L50
    - contig_L50
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  contig_l90:
    name: contig_l90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all contigs of that length or longer
      contains at least 90% of the sum of the lengths of all contigs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_L90
    - contig_L90
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  contig_n50:
    name: contig_n50
    description: Given a set of contigs, each with its own length, the N50 count is
      defined as the smallest number of contigs whose length sum makes up half of
      contig collection size
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_N50
    - contig_N50
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  contig_n90:
    name: contig_n90
    description: Given a set of contigs, each with its own length, the N90 count is
      defined as the smallest number of contigs whose length sum makes up 90% of contig
      collection size
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_N90
    - contig_N90
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  contig_logsum:
    name: contig_logsum
    description: The sum of the (length*log(length)) of all contigs, times some constant.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_logsum
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  contig_max:
    name: contig_max
    description: Maximum contig length
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_max
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  contig_powersum:
    name: contig_powersum
    description: Powersum of all contigs is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25)
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_powersum
    - ctg_powsum
    - contig_powsum
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  gap_percent:
    name: gap_percent
    description: The gap size percentage of all scaffolds
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - gap_pct
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  gc_average:
    name: gc_average
    description: The average GC content of the contig collection, expressed as a percentage
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - gc_avg
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  gc_std:
    name: gc_std
    description: The standard deviation of GC content across the contig collection
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - gc_stdev
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  gtdb_taxon_id:
    name: gtdb_taxon_id
    description: The GTDB taxon ID for this contig collection.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: curie
  n_chromosomes:
    name: n_chromosomes
    description: Total number of chromosomes
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  n_contigs:
    name: n_contigs
    description: Total number of contigs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  n_scaffolds:
    name: n_scaffolds
    description: Total number of scaffolds
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  ncbi_taxon_id:
    name: ncbi_taxon_id
    description: The NCBI taxon ID for this contig collection.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    domain_of:
    - ContigCollection
    range: curie
  scaffold_l50:
    name: scaffold_l50
    description: Given a set of scaffolds, the L50 is defined as the sequence length
      of the shortest scaffold at 50% of the total contig collection length
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_L50
    - scaffold_L50
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  scaffold_l90:
    name: scaffold_l90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all scaffolds of that length or longer
      contains at least 90% of the sum of the lengths of all scaffolds.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_L90
    - scaffold_L90
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  scaffold_n50:
    name: scaffold_n50
    description: Given a set of scaffolds, each with its own length, the N50 count
      is defined as the smallest number of scaffolds whose length sum makes up half
      of contig collection size
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_N50
    - scaffold_N50
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  scaffold_n90:
    name: scaffold_n90
    description: Given a set of scaffolds, each with its own length, the N90 count
      is defined as the smallest number of scaffolds whose length sum makes up 90%
      of contig collection size
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_N90
    - scaffold_N90
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  scaffold_bp:
    name: scaffold_bp
    description: Total size in bp of all scaffolds
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_bp
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  scaffold_logsum:
    name: scaffold_logsum
    description: The sum of the (length*log(length)) of all scaffolds, times some
      constant. The score increases as contiguity increases.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_logsum
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  scaffold_maximum_length:
    name: scaffold_maximum_length
    description: Maximum scaffold length
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_max
    - scaffold_max
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  scaffold_powersum:
    name: scaffold_powersum
    description: Powersum of all scaffolds is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25).
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_powsum
    - scaffold_powsum
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  scaffolds_n_over_50K:
    name: scaffolds_n_over_50K
    description: The number of scaffolds longer than 50,000 base pairs.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_n_gt50K
    - scaffold_n_gt50K
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
  scaffolds_percent_over_50K:
    name: scaffolds_percent_over_50K
    description: The percentage of the total assembly length represented by scaffolds
      longer than 50,000 base pairs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_pct_gt50K
    - scaffold_pct_gt50K
    rank: 1000
    domain_of:
    - ContigCollection
    range: float
  scaffolds_total_length_over_50k:
    name: scaffolds_total_length_over_50k
    description: The total length of scaffolds longer than 50,000 base pairs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_l_gt50k
    - scaffold_l_gt50k
    rank: 1000
    domain_of:
    - ContigCollection
    range: integer
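
The three `scaffolds_*_over_50K` attributes above can be derived together from a list of scaffold lengths. The Python sketch below is one possible reading: it treats "longer than" as strictly greater than 50,000 bp and takes "total assembly length" to mean the sum of all scaffold lengths. Both readings are assumptions, and the function name is hypothetical.

```python
def over_threshold_stats(scaffold_lengths, threshold=50_000):
    """Summarise scaffolds strictly longer than `threshold` base pairs."""
    long_scaffolds = [n for n in scaffold_lengths if n > threshold]
    total_length = sum(scaffold_lengths)
    long_length = sum(long_scaffolds)
    # Guard against an empty assembly to avoid dividing by zero.
    percent = 100.0 * long_length / total_length if total_length else 0.0
    return {
        "scaffolds_n_over_50K": len(long_scaffolds),
        "scaffolds_total_length_over_50k": long_length,
        "scaffolds_percent_over_50K": percent,
    }


# Hypothetical scaffold lengths; the 50,000 bp scaffold is not "longer than" 50K.
stats = over_threshold_stats([120_000, 60_000, 50_000, 20_000])
print(stats["scaffolds_n_over_50K"])             # 2
print(stats["scaffolds_total_length_over_50k"])  # 180000
print(stats["scaffolds_percent_over_50K"])       # 72.0
```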

Induced

name: ContigCollection
description: A set of individual, overlapping contigs that represent the complete
  sequenced genome of an organism.
from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
aliases:
- genome
- biological subject
- assembly
- contig collection
- contig set
is_a: Table
slot_usage:
  contig_collection_id:
    name: contig_collection_id
    identifier: true
attributes:
  asm_score:
    name: asm_score
    description: A composite score for comparing contig collection quality.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: asm_score
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  checkm_completeness:
    name: checkm_completeness
    description: Estimate of the completeness of a contig collection (MAG or genome),
      estimated by CheckM tool. Ensure that percentage values are converted to floats.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: checkm_completeness
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  checkm_contamination:
    name: checkm_contamination
    description: Estimate of the contamination of a contig collection (MAG or genome),
      estimated by CheckM tool. Ensure that percentage values are converted to floats.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: checkm_contamination
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  checkm_version:
    name: checkm_version
    description: Version of the CheckM tool used.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: checkm_version
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: string
  contig_bp:
    name: contig_bp
    description: Total size in bp of all contigs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - sequence length
    - total sequence length
    rank: 1000
    alias: contig_bp
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  contig_collection_type:
    name: contig_collection_type
    description: The type of contig collection.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: contig_collection_type
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: ContigCollectionType
  contig_l50:
    name: contig_l50
    description: Given a set of contigs, the L50 is defined as the sequence length
      of the shortest contig at 50% of the total contig collection length
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_L50
    - contig_L50
    rank: 1000
    alias: contig_l50
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  contig_l90:
    name: contig_l90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all contigs of that length or longer
      contains at least 90% of the sum of the lengths of all contigs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_L90
    - contig_L90
    rank: 1000
    alias: contig_l90
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  contig_n50:
    name: contig_n50
    description: Given a set of contigs, each with its own length, the N50 count is
      defined as the smallest number of contigs whose length sum makes up half of
      contig collection size
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_N50
    - contig_N50
    rank: 1000
    alias: contig_n50
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  contig_n90:
    name: contig_n90
    description: Given a set of contigs, each with its own length, the N90 count is
      defined as the smallest number of contigs whose length sum makes up 90% of contig
      collection size
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_N90
    - contig_N90
    rank: 1000
    alias: contig_n90
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  contig_logsum:
    name: contig_logsum
    description: The sum of the (length*log(length)) of all contigs, times some constant.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_logsum
    rank: 1000
    alias: contig_logsum
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  contig_max:
    name: contig_max
    description: Maximum contig length
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_max
    rank: 1000
    alias: contig_max
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  contig_powersum:
    name: contig_powersum
    description: Powersum of all contigs is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25)
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - ctg_powersum
    - ctg_powsum
    - contig_powsum
    rank: 1000
    alias: contig_powersum
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  gap_percent:
    name: gap_percent
    description: The gap size percentage of all scaffolds
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - gap_pct
    rank: 1000
    alias: gap_percent
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  gc_average:
    name: gc_average
    description: The average GC content of the contig collection, expressed as a percentage
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - gc_avg
    rank: 1000
    alias: gc_average
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  gc_std:
    name: gc_std
    description: The standard deviation of GC content across the contig collection
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - gc_stdev
    rank: 1000
    alias: gc_std
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  gtdb_taxon_id:
    name: gtdb_taxon_id
    description: The GTDB taxon ID for this contig collection.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: gtdb_taxon_id
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: curie
  n_chromosomes:
    name: n_chromosomes
    description: Total number of chromosomes
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: n_chromosomes
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  n_contigs:
    name: n_contigs
    description: Total number of contigs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: n_contigs
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  n_scaffolds:
    name: n_scaffolds
    description: Total number of scaffolds
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: n_scaffolds
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  ncbi_taxon_id:
    name: ncbi_taxon_id
    description: The NCBI taxon ID for this contig collection.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    rank: 1000
    alias: ncbi_taxon_id
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: curie
  scaffold_l50:
    name: scaffold_l50
    description: Given a set of scaffolds, the L50 is defined as the sequence length
      of the shortest scaffold at 50% of the total contig collection length
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_L50
    - scaffold_L50
    rank: 1000
    alias: scaffold_l50
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  scaffold_l90:
    name: scaffold_l90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all scaffolds of that length or longer
      contains at least 90% of the sum of the lengths of all scaffolds.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_L90
    - scaffold_L90
    rank: 1000
    alias: scaffold_l90
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  scaffold_n50:
    name: scaffold_n50
    description: Given a set of scaffolds, each with its own length, the N50 count
      is defined as the smallest number of scaffolds whose length sum makes up half
      of contig collection size
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_N50
    - scaffold_N50
    rank: 1000
    alias: scaffold_n50
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  scaffold_n90:
    name: scaffold_n90
    description: Given a set of scaffolds, each with its own length, the N90 count
      is defined as the smallest number of scaffolds whose length sum makes up 90%
      of contig collection size
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_N90
    - scaffold_N90
    rank: 1000
    alias: scaffold_n90
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  scaffold_bp:
    name: scaffold_bp
    description: Total size in bp of all scaffolds
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_bp
    rank: 1000
    alias: scaffold_bp
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  scaffold_logsum:
    name: scaffold_logsum
    description: The sum of the (length*log(length)) of all scaffolds, times some
      constant. The score increases as contiguity increases.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_logsum
    rank: 1000
    alias: scaffold_logsum
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  scaffold_maximum_length:
    name: scaffold_maximum_length
    description: Maximum scaffold length
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_max
    - scaffold_max
    rank: 1000
    alias: scaffold_maximum_length
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  scaffold_powersum:
    name: scaffold_powersum
    description: Powersum of all scaffolds is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25).
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_powsum
    - scaffold_powsum
    rank: 1000
    alias: scaffold_powersum
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  scaffolds_n_over_50K:
    name: scaffolds_n_over_50K
    description: The number of scaffolds longer than 50,000 base pairs.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_n_gt50K
    - scaffold_n_gt50K
    rank: 1000
    alias: scaffolds_n_over_50K
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  scaffolds_percent_over_50K:
    name: scaffolds_percent_over_50K
    description: The percentage of the total assembly length represented by scaffolds
      longer than 50,000 base pairs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_pct_gt50K
    - scaffold_pct_gt50K
    rank: 1000
    alias: scaffolds_percent_over_50K
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: float
  scaffolds_total_length_over_50k:
    name: scaffolds_total_length_over_50k
    description: The total length of scaffolds longer than 50,000 base pairs
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_bioentity
    aliases:
    - scaf_l_gt50k
    - scaffold_l_gt50k
    rank: 1000
    alias: scaffolds_total_length_over_50k
    owner: ContigCollection
    domain_of:
    - ContigCollection
    range: integer
  contig_collection_id:
    name: contig_collection_id
    description: 'Internal (CDM) unique identifier for a contig collection.

      From the Entity table: entity_id where entity_type == ''ContigCollection''.

      '
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
    rank: 1000
    identifier: true
    alias: contig_collection_id
    owner: ContigCollection
    domain_of:
    - ContigCollection
    - Contig_x_ContigCollection
    - ContigCollection_x_EncodedFeature
    - ContigCollection_x_Feature
    - ContigCollection_x_Protein
    range: cdm_contig_collection_id
    required: true
  hash:
    name: hash
    description: A hash value generated from one or more object attributes that serves
      to ensure the entity is unique.
    from_schema: http://kbase.github.io/cdm-schema/linkml/cdm_schema
    rank: 1000
    alias: hash
    owner: ContigCollection
    domain_of:
    - Contig
    - ContigCollection
    - EncodedFeature
    - Feature
    - Protein
    range: string
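
The induced definition marks `contig_collection_id` as required, asks for CheckM percentages to be stored as floats, and describes `hash` as derived from one or more attributes. The Python sketch below shows how a loader might apply those three rules to a raw record. Everything specific in it is an assumption: the schema does not name a hash algorithm or the attributes that feed it (SHA-256 and `HASH_KEYS` are placeholders), percentages are assumed to stay on a 0 to 100 scale, and the identifier value is a made-up example.

```python
import hashlib
import json

# Hypothetical choice of attributes to hash; the schema does not specify them.
HASH_KEYS = ("contig_bp", "n_contigs", "gc_average")


def percent_to_float(value):
    """Turn '98.5%', '98.5' or 98.5 into the float 98.5 (0-100 scale assumed)."""
    if isinstance(value, str):
        value = value.strip().rstrip("%").strip()
    return float(value)


def attribute_hash(record, keys=HASH_KEYS):
    """Deterministic SHA-256 digest over the selected attributes (assumed scheme)."""
    payload = json.dumps(
        {key: record.get(key) for key in keys},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_record(raw):
    """Apply the required-id, float-conversion and hash rules to a raw dict."""
    if not raw.get("contig_collection_id"):
        raise ValueError("contig_collection_id is required")
    record = dict(raw)  # leave the caller's dict untouched
    for key in ("checkm_completeness", "checkm_contamination"):
        if record.get(key) is not None:
            record[key] = percent_to_float(record[key])
    record["hash"] = attribute_hash(record)
    return record


record = build_record(
    {
        "contig_collection_id": "example:cc-1",  # placeholder identifier
        "checkm_completeness": "98.5%",
        "checkm_contamination": 1.2,
        "contig_bp": 300,
        "n_contigs": 7,
    }
)
print(record["checkm_completeness"])  # 98.5
print(len(record["hash"]))            # 64
```

Serialising with sorted keys keeps the digest stable regardless of the order in which attributes were supplied.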