Stage Your Files on Request

After you've designated a file staging area and configured a Globus guest collection for your organization, it's time to set up a staging request endpoint that allows the Data Transfer System (DTS) to ask for specific files to be moved to this staging area in preparation for transfer.

Why is this endpoint needed? In a perfect world, storage would be infinite and free, and your organization could simply "keep all the files somewhere, ready to be transferred anytime." In practice, the amount of scientific data being continuously generated greatly exceeds available storage at any given moment, so it needs to be archived when it's not being used.

For example, the JGI Data Portal (JDP) stores most of its data on tape, unarchiving it to disk (the staging area used by the DTS, specifically) on request. Your organization probably has a similar scheme in place for storing its data.

It works like this: when a user searches for and selects files to transfer from your database to somewhere else, the DTS checks your staging area for the requested files via Globus. If those files are not in your staging area, the DTS makes a request to the staging request endpoint to ask your system to copy the files into place, whatever that entails.
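The flow just described can be sketched as a small function. This is only an illustration of the logic, not the DTS's actual code: `ensure_staged`, `is_staged`, and `request_staging` are hypothetical names, and the two callables stand in for the Globus check and the call to your staging request endpoint.

```python
from typing import Callable, Iterable, List, Optional


def ensure_staged(
    requested_ids: Iterable[str],
    is_staged: Callable[[str], bool],
    request_staging: Callable[[List[str]], str],
) -> Optional[str]:
    """Ask for staging only when some requested files are not yet in place.

    is_staged is a hypothetical stand-in for the check made via Globus.
    request_staging is a hypothetical stand-in for the POST to your
    staging request endpoint; it returns the staging operation's identifier.
    """
    missing = [file_id for file_id in requested_ids if not is_staged(file_id)]
    if not missing:
        # Everything is already in the staging area, so no request is needed.
        return None
    return request_staging(missing)
```

In this sketch only the missing files are sent to the endpoint; whether the DTS sends the missing subset or the full list is an assumption made for the example.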

The DTS understands that this process may take some time, so it also needs a way to request the status of your file staging operation. After all, there are only so many robot arms in a tape backup facility, and they can only swing around so quickly and move so many tapes at once. Accordingly, your system must provide a unique identifier for the file staging operation that allows the DTS to check for its completion.

Let's take a look at the staging request endpoint. We'll discuss how your system can report the status of a file staging operation in the next section.

Endpoint Recommendations

Create a REST endpoint that accepts an HTTP POST request whose body contains a set of unique identifiers for the files that should be moved to your file staging area. The endpoint validates the request, checks that the files exist in your system, and initiates a staging operation that copies the requested files into place within your staging area. The endpoint then responds with a body containing a universally unique identifier (UUID) that the DTS can use to request the status of the staging operation.

Status codes should be used in accordance with HTTP conventions:

  • A successful request returns a 201 Created status code.
  • An improperly formed request should result in a 400 Bad Request status code.
  • If one or more file IDs do not correspond to existing files in your organization's database, the endpoint can respond with a 404 Not Found status code.
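As a rough illustration of these recommendations, here is a minimal, framework-independent sketch of the endpoint's logic in Python. Everything named here is an assumption made for the example: `handle_staging_request`, the in-memory `KNOWN_FILE_IDS` and `STAGING_REQUESTS` stand-ins, and the shape of the error bodies. Only the `ids` and `request_id` fields and the status codes come from the recommendations above.

```python
import json
import uuid

# Hypothetical stand-in for your organization's file database.
KNOWN_FILE_IDS = {"615a383dcc4ff44f36ca5ba2", "61412246cc4ff44f36c8913f"}

# Hypothetical in-memory record of staging operations, keyed by request ID.
STAGING_REQUESTS = {}


def handle_staging_request(raw_body: bytes):
    """Return (status_code, response_body) for a POST to the staging endpoint."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return 400, {"error": "request body is not valid JSON"}

    ids = payload.get("ids") if isinstance(payload, dict) else None
    if not isinstance(ids, list) or not ids:
        return 400, {"error": "'ids' must be a non-empty list of strings"}
    if not all(isinstance(file_id, str) for file_id in ids):
        return 400, {"error": "'ids' must be a non-empty list of strings"}

    missing = [file_id for file_id in ids if file_id not in KNOWN_FILE_IDS]
    if missing:
        return 404, {"error": "unknown file IDs", "ids": missing}

    request_id = str(uuid.uuid4())
    # A real system would start copying files into the staging area here.
    STAGING_REQUESTS[request_id] = {"ids": ids, "status": "staging"}
    return 201, {"request_id": request_id}
```

Keeping the logic in a plain function like this makes it easy to wrap in whatever web framework your organization already uses, and to test without a running server.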

Example

Suppose we want to request that the JDP stage the files with the unique identifiers 615a383dcc4ff44f36ca5ba2 and 61412246cc4ff44f36c8913f (referred to by the DTS as JDP:615a383dcc4ff44f36ca5ba2 and JDP:61412246cc4ff44f36c8913f, respectively). If the JDP endpoint is hosted at example.com and is implemented according to our recommendations, we can send a POST request to

https://example.com/dts/staging

with the following body:

{
  "ids": [
    "615a383dcc4ff44f36ca5ba2",
    "61412246cc4ff44f36c8913f"
  ]
}

This results in a response with a 201 Created status code and a body containing a UUID:

{
  "request_id": "4b86e181-8c83-447e-aada-af9232af7da0"
}
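For testing your own endpoint, the exchange above could be reproduced with a few lines of Python using only the standard library. This is a sketch rather than DTS client code: the helper names `build_staging_request` and `parse_request_id` are made up for the example, and the `Content-Type` header is an assumption about what your endpoint expects.

```python
import json
import urllib.request
import uuid


def build_staging_request(url, file_ids):
    """Build (but do not send) the POST request for a staging operation."""
    body = json.dumps({"ids": list(file_ids)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        # Assumed header; adjust to whatever your endpoint requires.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_request_id(response_body):
    """Extract the staging operation's UUID from a response body."""
    payload = json.loads(response_body)
    # uuid.UUID raises ValueError if the value is not a well-formed UUID.
    return str(uuid.UUID(payload["request_id"]))
```

To actually send the request, pass the result of `build_staging_request` to `urllib.request.urlopen` and hand the bytes returned by the response's `read()` method to `parse_request_id`. Note that `urlopen` raises `urllib.error.HTTPError` for 4xx responses such as 400 and 404.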

We'll see how this UUID can be used to retrieve the status of the file staging operation in the next section.

Existing implementations

The JDP endpoint we mention above essentially conforms to what we've described, but contains some additional fields that determine, for example, whether an email is sent upon completion of the staging process. This is because we've repurposed an endpoint that was originally intended to allow users to download requested files directly from Globus. If your organization already has something in place that can serve as a file staging request endpoint, we can work with you to similarly leverage it to get your system hooked up to the DTS.