Anatomy of a Module
KBase modules consist of a set of code, configurations, and specifications that describe how a set of related apps are shown in the Narrative interface and define the implementation of a function that is run when the Narrative app is executed. Modules can also include components that define visualization widgets, data types, and data uploaders/downloaders/converters. This guide provides a brief overview of these key components and how they are related.
Overview
To explore the contents of a module yourself, you can use the kb-sdk tool to create an example module. (Please see these instructions on how to install kb-sdk.)
kb-sdk init --example -l python --user YOURUSERNAME MyModule
cd MyModule
make
The init command creates a new module named MyModule with a prebuilt implementation of an example app. Replace YOURUSERNAME with your own GitHub username. The generated directory structure will look something like this:
MyModule
├── README.md
├── LICENSE
├── kbase.yml
├── Dockerfile
├── Makefile
├── deploy.cfg
├── MyModule.spec
├── .travis.yml
├── data
│ └── README.md
├── lib
│ ├── MyModule
│ │ ├── MyModuleImpl.py
│ │ └── __init__.py
│ └── README.md
├── scripts
│ ├── entrypoint.sh
│ ├── prepare_deploy_cfg.py
│ ├── run_async.sh
│ └── start_server.sh
├── test
│ ├── MyModule_server_test.py
│ ├── README.md
│ └── run_tests.sh
├── test_local
│ ├── build_run_tests.sh
│ ├── readme.txt
│ ├── run_bash.sh
│ └── test.cfg
└── ui
    ├── README.md
    └── narrative
        └── methods
            └── count_contigs_in_set
                ├── display.yaml
                ├── img
                └── spec.json
Let’s break this down and examine each component.
The Basics
MyModule
├── README.md
├── LICENSE
├── kbase.yml
All KBase modules are required to have a kbase.yml file, which is a simple YAML file that includes basic information and documentation about your Module. This is where you define your Module Name, the set of users that have permission to register/edit this Module, and descriptions of what your Module does. This information is important for giving you credit for building this module, and is used throughout KBase for describing to users why they should use your Module.
README.md and LICENSE files are also required. README.md should contain basic information about your module, primarily targeted at other developers or users who want to browse your code. Your code will not be approved for release on KBase without a license that is compatible with KBase’s open source license.
Dockerfile
MyModule
├── Dockerfile
One of the central components of a KBase module is the Dockerfile. Nearly all KBase apps are executed within Docker containers so that you can precisely manage your system dependencies and ensure that code that you are testing locally will be run exactly the same way in the KBase system. Docker images also act like snapshots that allow KBase to maintain and run old versions of your module. To effectively develop modules in KBase that execute code, you should install Docker locally and familiarize yourself with Docker tools.
Because your code runs inside a container, the only thing you need to provide for dependency management is a Dockerfile that can be used to build a Docker image. In your Dockerfile, you define the commands that install any system or package dependencies beyond what is provided in the KBase base image.
KBase Interface Description Language (KIDL) Specification File
MyModule
├── MyModule.spec
The KIDL specification file, often referred to as your “KBase spec file”, defines the interface of your module. This interface is a set of function definitions defining what types of parameters they can accept and what type of data they can return. Using this interface, the KBase platform will know how to call any function in your module in a generic way and search the KBase Catalog for your apps.
The kb-sdk tool compiles your spec file into a set of implementation stubs in either Python, Perl, or Java that you will use to execute your code. Technical documentation should also be added to spec files, and can be used with kb-sdk to generate nice-looking HTML documentation for you.
In this simple example of a spec file, there is a single function defined for counting the number of contigs in a contig set. (Note that a “workspace” is like a directory that contains particular data objects.)
module MyModule {
    /*
        A string representing a ContigSet id.
    */
    typedef string contigset_id;
    /*
        A string representing a workspace name.
    */
    typedef string workspace_name;
    typedef structure {
        int contig_count;
    } CountContigsResults;
    /*
        Count contigs in a ContigSet
        contigset_id - the ContigSet to count.
    */
    funcdef count_contigs(workspace_name,contigset_id) returns (CountContigsResults)
        authentication required;
};
If you initialize a module without the -e (--example) flag, the KIDL file will contain a default spec that accepts any number of parameters and returns an HTML report.
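On the Python side, values of these KIDL types are plain Python values: a typedef string is a str, and a structure is a dict keyed by field name (as in the implementation stub shown later in this guide). The following is a minimal sketch of what the arguments and result of count_contigs look like. The helper check_count_contigs_results is hypothetical and is not generated by kb-sdk; it only illustrates the expected shape.

```python
def check_count_contigs_results(value):
    """Return True if value has the shape of a CountContigsResults structure.

    Hypothetical helper for illustration; kb-sdk does not generate it.
    """
    count = value.get("contig_count") if isinstance(value, dict) else None
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(count, int) and not isinstance(count, bool)


workspace_name = "my_workspace"  # typedef string workspace_name
contigset_id = "my_contigset"    # typedef string contigset_id
result = {"contig_count": 3}     # CountContigsResults

print(check_count_contigs_results(result))
```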
Files for building and testing the Module
MyModule
├── Makefile
├── deploy.cfg
├── .travis.yml
KBase modules are currently all built using make, with targets that can rebuild components of your module and start tests. You can explore the Makefile directly and add additional targets as needed, but you should not have to significantly edit the basic Makefile targets generated by kb-sdk.
Data
MyModule
├── data
│ └── README.md
Reference data that is smaller than 100 MB can be stored in this directory. Larger files and databases cannot be checked into GitHub directly and must instead use the versioned reference data system.
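To check a data directory against this size limit before committing, a small script along the following lines may help. This is only a sketch: the function name oversized_files is hypothetical, and it assumes "100 MB" means 100 MiB. The demonstration runs on a temporary directory with a tiny limit so that it does not depend on any real files.

```python
import os
import tempfile

LIMIT_BYTES = 100 * 1024 * 1024  # assumption: "100 MB" read as 100 MiB


def oversized_files(root, limit_bytes=LIMIT_BYTES):
    """Return sorted paths (relative to root) of files larger than limit_bytes."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) > limit_bytes:
                found.append(os.path.relpath(path, root))
    return sorted(found)


# Demonstration on a temporary directory, using a 10-byte limit.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "small.txt"), "w") as fh:
        fh.write("abc")
    with open(os.path.join(demo_dir, "big.txt"), "w") as fh:
        fh.write("x" * 50)
    demo_result = oversized_files(demo_dir, limit_bytes=10)

print(demo_result)
```

In a real module you would call oversized_files("data") from the module root and move anything it reports out of the repository.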
App (Method) Implementation
MyModule
├── lib
│ ├── MyModule
│ │ ├── MyModuleImpl.py
│ │ └── __init__.py
│ └── README.md
The lib directory is where the actual implementation code of your app is defined. In this example, your code consists of a single Python module with a kb-sdk generated Implementation file, which includes stubs that you can fill in. In this example there is a single count_contigs method. When you run make, this file is updated and recompiled using kb-sdk compile based on any changes in your spec file. For each function you define in the KIDL spec file, you will see a corresponding stub that you can fill in. For example:
def count_contigs(self, ctx, workspace_name, contigset_id):
    # ctx is the context object
    # return variables are: returnVal
    #BEGIN count_contigs
    token = ctx['token']
    wsClient = workspaceService(self.workspaceURL, token=token)
    contigSet = wsClient.get_objects([{'ref': workspace_name+'/'+contigset_id}])[0]['data']
    returnVal = {'contig_count': len(contigSet['contigs'])}
    #END count_contigs
    # At some point might do deeper type checking...
    if not isinstance(returnVal, object):
        raise ValueError('Method count_contigs return value ' +
                         'returnVal is not type object as required.')
    # return the results
    return [returnVal]
Note that your implementation code is defined between #BEGIN count_contigs and #END count_contigs. Any code written outside of these #BEGIN and #END directives will be overwritten when the implementation file is rebuilt. The exact code generated by kb-sdk compile and the structure of the lib directory will of course depend on the programming language you indicated when running kb-sdk init.
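To make the role of the markers concrete, the sketch below shows one way code between matching #BEGIN and #END lines could be located in a source file. It is an illustration of the idea only, not the way kb-sdk compile is actually implemented; the regular expression and the function extract_user_blocks are assumptions made for this example.

```python
import re

# Matches "#BEGIN name" ... "#END name" and captures the name and the body.
BLOCK_RE = re.compile(r"#BEGIN (\w+)\n(.*?)#END \1", re.DOTALL)


def extract_user_blocks(source):
    """Map each block name to the code found between its markers."""
    return {name: body for name, body in BLOCK_RE.findall(source)}


impl_source = """\
def count_contigs(self, ctx, workspace_name, contigset_id):
    #BEGIN count_contigs
    returnVal = {'contig_count': 0}
    #END count_contigs
    return [returnVal]
"""

blocks = extract_user_blocks(impl_source)
print(sorted(blocks))
```

Only the text captured for each name would survive a rebuild in this model; the final return line, which sits outside the markers, is not part of any captured block.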
It is good practice to limit the amount of code you place directly in the implementation files. Instead, create your own modules and packages that perform most of the logic, and only include calls to those libraries from within the generated Implementation file.
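As a minimal sketch of this practice, the counting logic from the example could live in its own function in a separate file, leaving only a one-line call inside the generated stub. The file name contig_utils.py and the function count_contigs_in_data are hypothetical names chosen for illustration.

```python
# Hypothetical helper module, e.g. lib/MyModule/contig_utils.py
def count_contigs_in_data(contig_set):
    """Return the number of contigs in a ContigSet data dictionary."""
    return len(contig_set.get("contigs", []))


# Inside the generated implementation file, the stub then stays short:
#
#     #BEGIN count_contigs
#     (fetch contigSet from the workspace as in the example above)
#     returnVal = {'contig_count': count_contigs_in_data(contigSet)}
#     #END count_contigs

example = {"contigs": [{"id": "c1"}, {"id": "c2"}]}
print(count_contigs_in_data(example))  # prints 2
```

Because the helper takes plain data and needs no context object or token, it can be exercised without any running KBase service.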
Scripts Directory for Utility/Docker Scripts
MyModule
├── scripts
│ ├── entrypoint.sh
│ ├── prepare_deploy_cfg.py
│ ├── run_async.sh
│ └── start_server.sh
Your module will include by default a few autogenerated scripts to aid in deployment and to define how your Docker container is run. For the most part, you can ignore these files. If you need additional utility scripts, for instance to aid in system dependency installations, fetch a reference data file that needs to be stored in the Docker image, or other methods for testing or validation, you should place them in the scripts directory.
Test Framework
MyModule
├── test
│ ├── MyModule_server_test.py
│ ├── README.md
│ └── run_tests.sh
├── test_local
│ ├── build_run_tests.sh
│ ├── readme.txt
│ ├── run_bash.sh
│ └── test.cfg
The test directory contains a basic template for performing unit tests of the code in your module implementation. This is useful for both debugging and ensuring your module is robust and operates well on a range of input data. The test_local directory is created by make to create a scratch space for running tests locally. It is important that you do not include any passwords in configuration files that you are committing to public git repositories.
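One way to test logic without putting any credentials in a file is to pass the workspace client in as an argument and substitute a stand-in during tests. The sketch below uses Python's unittest module. FakeWorkspaceClient and this standalone count_contigs function are hypothetical and are not part of the generated test template; the function mirrors the logic of the implementation stub shown earlier.

```python
import unittest


class FakeWorkspaceClient:
    """Stand-in for a workspace client; returns canned data, needs no token."""

    def __init__(self, objects):
        self._objects = objects

    def get_objects(self, refs):
        return [{"data": self._objects[r["ref"]]} for r in refs]


def count_contigs(ws_client, workspace_name, contigset_id):
    """Count the contigs in the referenced ContigSet."""
    ref = workspace_name + "/" + contigset_id
    contig_set = ws_client.get_objects([{"ref": ref}])[0]["data"]
    return {"contig_count": len(contig_set["contigs"])}


class CountContigsTest(unittest.TestCase):
    def test_counts_contigs(self):
        ws = FakeWorkspaceClient({"ws/set1": {"contigs": [{}, {}, {}]}})
        self.assertEqual(count_contigs(ws, "ws", "set1"), {"contig_count": 3})

    def test_empty_set(self):
        ws = FakeWorkspaceClient({"ws/empty": {"contigs": []}})
        self.assertEqual(count_contigs(ws, "ws", "empty"), {"contig_count": 0})


# Run the tests explicitly so the example does not exit the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountContigsTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```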
Narrative Method Specifications
MyModule
└── ui
    ├── README.md
    └── narrative
        └── methods
            └── count_contigs_in_set
                ├── display.yaml
                ├── img
                └── spec.json
Apps in the Narrative interface are defined by method specifications that consist of a JSON specification file and a YAML file for documentation and display labels. In this example, this module has only a single Narrative method defined in a folder named count_contigs_in_set. This folder name also serves as the method ID. Method IDs must therefore be unique within a module. You can add more apps by simply adding another directory in the methods folder.
These method specifications indicate which parameters are exposed to the user, how those parameters are selected (e.g., dropdown, text field, checkbox) and how those parameters map to your implementation. An optional img directory allows you to attach screenshots or other images that will automatically be included in the app detail page for your Narrative method.
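The idea of mapping user-facing parameters onto the arguments of an implementation function can be sketched as follows. The mapping structure here is a simplified illustration and should not be read as the actual spec.json schema; the key names and the function build_arguments are assumptions made for this example.

```python
# Simplified, illustrative mapping: UI parameter id -> argument position.
# This is NOT the real spec.json schema.
input_mapping = [
    {"input_parameter": "workspace_name", "target_argument_position": 0},
    {"input_parameter": "contigset_id", "target_argument_position": 1},
]


def build_arguments(mapping, ui_values):
    """Order the user's parameter values by their target argument position."""
    ordered = sorted(mapping, key=lambda m: m["target_argument_position"])
    return [ui_values[m["input_parameter"]] for m in ordered]


# Values as a user might enter them in the app's input form.
ui_values = {"contigset_id": "my_contigs", "workspace_name": "my_workspace"}

# The resulting list lines up with count_contigs(workspace_name, contigset_id).
print(build_arguments(input_mapping, ui_values))
```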