
Class: Genome

A contigset with a completeness score of greater than 90% and a contamination score of less than 5%.
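The two thresholds in this definition can be written as a small predicate. This is a minimal sketch, not part of the schema: the function name `qualifies_as_genome` and the constant names are hypothetical, the scores are assumed to be percentages on a 0 to 100 scale, and treating a missing score as "does not qualify" is an assumption (both CheckM2 slots are optional).

```python
from typing import Optional

# Thresholds taken from the class description; both comparisons are strict.
COMPLETENESS_MIN = 90.0   # percent: "greater than 90%"
CONTAMINATION_MAX = 5.0   # percent: "less than 5%"


def qualifies_as_genome(
    checkm2_completeness: Optional[float],
    checkm2_contamination: Optional[float],
) -> bool:
    """Return True if a contigset meets the Genome thresholds.

    Assumption: a contigset with a missing score cannot be classified
    as a Genome, so None yields False.
    """
    if checkm2_completeness is None or checkm2_contamination is None:
        return False
    return (
        checkm2_completeness > COMPLETENESS_MIN
        and checkm2_contamination < CONTAMINATION_MAX
    )
```

Because the definition says "greater than" and "less than", a contigset at exactly 90% completeness or exactly 5% contamination does not qualify under this reading.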

URI: kb_cdm:Genome

classDiagram
    class Genome
    click Genome href "../Genome"
    Contigset <|-- Genome
    click Contigset href "../Contigset"
    Genome : asm_score
    Genome : checkm2_completeness
    Genome : checkm2_contamination
    Genome : contig_bp
    Genome : contigset_id
    Genome : ctg_L50
    Genome : ctg_L90
    Genome : ctg_logsum
    Genome : ctg_max
    Genome : ctg_N50
    Genome : ctg_N90
    Genome : ctg_powsum
    Genome : description
    Genome : gap_pct
    Genome : gc_avg
    Genome : gc_std
    Genome : hash
    Genome : identifiers
    Genome --> "*" Identifier : identifiers
    click Identifier href "../Identifier"
    Genome : n_contigs
    Genome : n_scaffolds
    Genome : names
    Genome --> "*" Name : names
    click Name href "../Name"
    Genome : scaf_bp
    Genome : scaf_L50
    Genome : scaf_L90
    Genome : scaf_l_gt50k
    Genome : scaf_logsum
    Genome : scaf_max
    Genome : scaf_N50
    Genome : scaf_N90
    Genome : scaf_n_gt50K
    Genome : scaf_pct_gt50K
    Genome : scaf_powsum

Inheritance

Slots

| Name | Cardinality and Range | Description | Inheritance |
| --- | --- | --- | --- |
| asm_score | 0..1 Float | A composite score for comparing contigset quality | Contigset |
| checkm2_completeness | 0..1 Float | Estimate of the completeness of a contigset (MAG or genome), estimated by Che... | Contigset |
| checkm2_contamination | 0..1 Float | Estimate of the contamination of a contigset (MAG or genome), estimated by Ch... | Contigset |
| contigset_id | 1 UUID | Internal (CDM) unique identifier | Contigset |
| contig_bp | 0..1 Integer | Total size in bp of all contigs | Contigset |
| ctg_L50 | 0..1 Integer | Given a set of contigs, the L50 is defined as the sequence length of the shor... | Contigset |
| ctg_L90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | Contigset |
| ctg_N50 | 0..1 Integer | Given a set of contigs, each with its own length, the N50 count is defined as... | Contigset |
| ctg_N90 | 0..1 Integer | Given a set of contigs, each with its own length, the N90 count is defined as... | Contigset |
| ctg_logsum | 0..1 Float | The sum of the (length*log(length)) of all contigs, times some constant | Contigset |
| ctg_max | 0..1 Integer | Maximum contig length | Contigset |
| ctg_powsum | 0..1 Float | Powersum of all contigs is the same as logsum except that it uses the sum of ... | Contigset |
| gap_pct | 0..1 Float | The gap size percentage of all scaffolds | Contigset |
| gc_avg | 0..1 Float | The average GC content of the contigset, expressed as a percentage | Contigset |
| gc_std | 0..1 Float | The standard deviation of GC content across the contigset | Contigset |
| n_contigs | 0..1 Integer | Total number of contigs | Contigset |
| n_scaffolds | 0..1 Integer | Total number of scaffolds | Contigset |
| scaf_L50 | 0..1 Integer | Given a set of scaffolds, the L50 is defined as the sequence length of the sh... | Contigset |
| scaf_L90 | 0..1 Integer | The L90 statistic is less than or equal to the L50 statistic; it is the lengt... | Contigset |
| scaf_N50 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N50 count is defined ... | Contigset |
| scaf_N90 | 0..1 Integer | Given a set of scaffolds, each with its own length, the N90 count is defined ... | Contigset |
| scaf_bp | 0..1 Integer | Total size in bp of all scaffolds | Contigset |
| scaf_l_gt50k | 0..1 Integer | The total length of scaffolds longer than 50,000 base pairs | Contigset |
| scaf_logsum | 0..1 Float | The sum of the (length*log(length)) of all scaffolds, times some constant | Contigset |
| scaf_max | 0..1 Integer | Maximum scaffold length | Contigset |
| scaf_n_gt50K | 0..1 Integer | The number of scaffolds longer than 50,000 base pairs | Contigset |
| scaf_pct_gt50K | 0..1 Float | The percentage of the total assembly length represented by scaffolds longer t... | Contigset |
| scaf_powsum | 0..1 Float | Powersum of all scaffolds is the same as logsum except that it uses the sum o... | Contigset |
| hash | 0..1 String | A hash value generated from one or more object attributes that serves to ensu... | UniqueNamedThing |
| identifiers | * Identifier | URIs or CURIEs used to refer to this entity | NamedThingWithId |
| description | 0..1 String | Definition or description of the entity | NamedThing |
| names | * Name | Names, alternative names, and synonyms for an entity | NamedThing |
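As this page words them, the L50 and L90 slots hold a sequence length and the N50 and N90 slots hold a count of sequences. The sketch below computes both values for a given fraction of the total size, following that wording. It is an illustration only: the function name `length_and_count_at` is hypothetical, and returning `(0, 0)` for an empty input is an assumption the schema does not address.

```python
from typing import List, Tuple


def length_and_count_at(lengths: List[int], fraction: float) -> Tuple[int, int]:
    """Return (length, count) at `fraction` of the total size.

    Sequences are taken longest first until their summed length reaches
    `fraction` of the total. `count` is how many sequences were needed
    (the N-style slot on this page); `length` is the length of the
    shortest sequence among them (the L-style slot on this page).
    """
    if not lengths:
        return (0, 0)  # assumption: empty input has no meaningful statistic
    target = fraction * sum(lengths)
    running = 0
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return (length, count)
    # Only reachable through floating-point rounding when fraction is ~1.0.
    return (min(lengths), len(lengths))


contig_lengths = [100, 80, 60, 40, 20]  # total size: 300 bp
ctg_L50, ctg_N50 = length_and_count_at(contig_lengths, 0.5)
ctg_L90, ctg_N90 = length_and_count_at(contig_lengths, 0.9)
```

For these five contigs, two sequences (100 + 80 = 180 bp) are enough to cover half of the 300 bp total, so the length value is 80 and the count is 2. Four sequences are needed to reach 90%, giving a length value of 40, which is consistent with the statement that L90 is less than or equal to L50.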

Identifier and Mapping Information

Schema Source

  • from schema: https://github.com/kbase/cdm-schema

Mappings

| Mapping Type | Mapped Value |
| --- | --- |
| self | kb_cdm:Genome |
| native | kb_cdm:Genome |

LinkML Source

Direct

name: Genome
description: A contigset with a completeness score of greater than 90% and a contamination
  score of less than 5%.
from_schema: https://github.com/kbase/cdm-schema
is_a: Contigset

Induced

name: Genome
description: A contigset with a completeness score of greater than 90% and a contamination
  score of less than 5%.
from_schema: https://github.com/kbase/cdm-schema
is_a: Contigset
attributes:
  asm_score:
    name: asm_score
    description: A composite score for comparing contigset quality
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: asm_score
    owner: Genome
    domain_of:
    - Contigset
    range: float
  checkm2_completeness:
    name: checkm2_completeness
    description: Estimate of the completeness of a contigset (MAG or genome), estimated
      by CheckM2 tool
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: checkm2_completeness
    owner: Genome
    domain_of:
    - Contigset
    range: float
  checkm2_contamination:
    name: checkm2_contamination
    description: Estimate of the contamination of a contigset (MAG or genome), estimated
      by CheckM2 tool
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: checkm2_contamination
    owner: Genome
    domain_of:
    - Contigset
    range: float
  contigset_id:
    name: contigset_id
    description: Internal (CDM) unique identifier.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    identifier: true
    alias: contigset_id
    owner: Genome
    domain_of:
    - Contigset
    range: UUID
    required: true
  contig_bp:
    name: contig_bp
    description: Total size in bp of all contigs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: contig_bp
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  ctg_L50:
    name: ctg_L50
    description: Given a set of contigs, the L50 is defined as the sequence length
      of the shortest contig at 50% of the total contigset length
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_L50
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  ctg_L90:
    name: ctg_L90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all contigs of that length or longer
      contains at least 90% of the sum of the lengths of all contigs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_L90
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  ctg_N50:
    name: ctg_N50
    description: Given a set of contigs, each with its own length, the N50 count is
      defined as the smallest number of contigs whose length sum makes up half of
      contigset size
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_N50
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  ctg_N90:
    name: ctg_N90
    description: Given a set of contigs, each with its own length, the N90 count is
      defined as the smallest number of contigs whose length sum makes up 90% of contigset
      size
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_N90
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  ctg_logsum:
    name: ctg_logsum
    description: The sum of the (length*log(length)) of all contigs, times some constant.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_logsum
    owner: Genome
    domain_of:
    - Contigset
    range: float
  ctg_max:
    name: ctg_max
    description: Maximum contig length
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_max
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  ctg_powsum:
    name: ctg_powsum
    description: Powersum of all contigs is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25)
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: ctg_powsum
    owner: Genome
    domain_of:
    - Contigset
    range: float
  gap_pct:
    name: gap_pct
    description: The gap size percentage of all scaffolds
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: gap_pct
    owner: Genome
    domain_of:
    - Contigset
    range: float
  gc_avg:
    name: gc_avg
    description: The average GC content of the contigset, expressed as a percentage
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: gc_avg
    owner: Genome
    domain_of:
    - Contigset
    range: float
  gc_std:
    name: gc_std
    description: The standard deviation of GC content across the contigset
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: gc_std
    owner: Genome
    domain_of:
    - Contigset
    range: float
  n_contigs:
    name: n_contigs
    description: Total number of contigs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: n_contigs
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  n_scaffolds:
    name: n_scaffolds
    description: Total number of scaffolds
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: n_scaffolds
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_L50:
    name: scaf_L50
    description: Given a set of scaffolds, the L50 is defined as the sequence length
      of the shortest scaffold at 50% of the total contigset length
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_L50
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_L90:
    name: scaf_L90
    description: The L90 statistic is less than or equal to the L50 statistic; it
      is the length for which the collection of all scaffolds of that length or longer
      contains at least 90% of the sum of the lengths of all scaffolds.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_L90
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_N50:
    name: scaf_N50
    description: Given a set of scaffolds, each with its own length, the N50 count
      is defined as the smallest number of scaffolds whose length sum makes up half
      of contigset size
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_N50
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_N90:
    name: scaf_N90
    description: Given a set of scaffolds, each with its own length, the N90 count
      is defined as the smallest number of scaffolds whose length sum makes up 90%
      of contigset size
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_N90
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_bp:
    name: scaf_bp
    description: Total size in bp of all scaffolds
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_bp
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_l_gt50k:
    name: scaf_l_gt50k
    description: The total length of scaffolds longer than 50,000 base pairs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_l_gt50k
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_logsum:
    name: scaf_logsum
    description: The sum of the (length*log(length)) of all scaffolds, times some
      constant. The score increases as contiguity increases.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_logsum
    owner: Genome
    domain_of:
    - Contigset
    range: float
  scaf_max:
    name: scaf_max
    description: Maximum scaffold length
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_max
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_n_gt50K:
    name: scaf_n_gt50K
    description: The number of scaffolds longer than 50,000 base pairs.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_n_gt50K
    owner: Genome
    domain_of:
    - Contigset
    range: integer
  scaf_pct_gt50K:
    name: scaf_pct_gt50K
    description: The percentage of the total assembly length represented by scaffolds
      longer than 50,000 base pairs
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_pct_gt50K
    owner: Genome
    domain_of:
    - Contigset
    range: float
  scaf_powsum:
    name: scaf_powsum
    description: Powersum of all scaffolds is the same as logsum except that it uses
      the sum of (length*(length^P)) for some power P (default P=0.25).
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: scaf_powsum
    owner: Genome
    domain_of:
    - Contigset
    range: float
  hash:
    name: hash
    description: A hash value generated from one or more object attributes that serves
      to ensure the entity is unique.
    from_schema: https://github.com/kbase/cdm-schema
    alias: hash
    owner: Genome
    domain_of:
    - UniqueNamedThing
    range: string
  identifiers:
    name: identifiers
    description: URIs or CURIEs used to refer to this entity.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: identifiers
    owner: Genome
    domain_of:
    - NamedThingWithId
    range: Identifier
    multivalued: true
  description:
    name: description
    description: Definition or description of the entity.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: description
    owner: Genome
    domain_of:
    - NamedThing
    - Event
    - Experiment
    - Identifier
    - Name
    - Project
    - Protein
    - Sample
    range: string
  names:
    name: names
    description: Names, alternative names, and synonyms for an entity.
    from_schema: https://github.com/kbase/cdm-schema
    rank: 1000
    alias: names
    owner: Genome
    domain_of:
    - NamedThing
    range: Name
    multivalued: true
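The logsum, powsum, and 50,000 bp threshold slots described above can be sketched in a few lines. This is a hedged illustration, not the pipeline that populates these slots: the function name `scaffold_summary` is hypothetical, the unspecified "some constant" is exposed as a `constant` parameter defaulting to 1.0 and is assumed to apply to both sums, the natural logarithm is assumed because the descriptions do not name a base, and an empty input is assumed to give a percentage of 0.0.

```python
import math
from typing import Dict, List


def scaffold_summary(
    lengths: List[int],
    power: float = 0.25,      # "default P=0.25" per the powsum description
    constant: float = 1.0,    # assumption: the schema only says "some constant"
    threshold: int = 50_000,  # base pairs, per the gt50K slot descriptions
) -> Dict[str, float]:
    """Compute illustrative values for several scaffold slots."""
    total = sum(lengths)
    long_scaffolds = [n for n in lengths if n > threshold]
    long_total = sum(long_scaffolds)
    return {
        # Sum of length * log(length); natural log is an assumption.
        "scaf_logsum": constant * sum(n * math.log(n) for n in lengths if n > 0),
        # Sum of length * length**P.
        "scaf_powsum": constant * sum(n * n ** power for n in lengths),
        "scaf_n_gt50K": len(long_scaffolds),
        "scaf_l_gt50k": long_total,
        "scaf_pct_gt50K": 100.0 * long_total / total if total else 0.0,
    }


stats = scaffold_summary([100_000, 60_000, 40_000])
```

With these three scaffolds, two exceed 50,000 bp and together hold 160,000 of the 200,000 bp, so `scaf_n_gt50K` is 2, `scaf_l_gt50k` is 160000, and `scaf_pct_gt50K` is 80.0. Both sums weight each sequence by its own length, so fewer, longer scaffolds give a higher score than many short ones of the same total size.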