CLI Tutorial#
This tutorial is for CLI users. If you are using the library to write your own Python or R scripts, read Developers instead.
Installation#
We recommend using a conda environment dedicated to CTDFjorder. To set one up, open your terminal (macOS/Linux) or command prompt (Windows) and run the following commands:
$ conda deactivate
$ conda create --name ctdfjorder python=3.12
$ conda activate ctdfjorder
Then install CTDFjorder using pip:
(ctdfjorder)$ pip install ctdfjorder
Run ctdcli#
Now we will process our files.
Tip
To see what options you have to process the files, type ctdcli default -h or view the documentation for the CLI.
For the purposes of this demo, we assume that you have the following:
- Files with endings .rsk from an RBR instrument or .csv from a Castaway device.
- A master sheet which will be used to attach metadata to the CTD tables. This must be named mastersheet.csv and be located in the same folder as your CTD data. Additionally, it must have the following fields:
UNIQUE ID CODE
nominal longitude
nominal latitude
CTD cast file name
location
loc id
date/time (ISO)
sechhi depth
- Access to a public MapBox token.
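Before running the CLI, it can help to confirm that the master sheet really contains the fields listed above. The following is a minimal sketch, not part of CTDFjorder: the helper name missing_mastersheet_fields is hypothetical, and it assumes the field names sit in the first row of the CSV and must match exactly, including case and spelling.

```python
import csv
from pathlib import Path

# Field names copied verbatim from the list in this tutorial.
REQUIRED_FIELDS = [
    "UNIQUE ID CODE",
    "nominal longitude",
    "nominal latitude",
    "CTD cast file name",
    "location",
    "loc id",
    "date/time (ISO)",
    "sechhi depth",
]


def missing_mastersheet_fields(path):
    """Return the required fields that are absent from the CSV header row.

    Assumption: the header is the first row and names must match exactly.
    """
    # utf-8-sig strips a byte-order mark if a spreadsheet tool added one.
    with Path(path).open(newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])
    present = {name.strip() for name in header}
    return [field for field in REQUIRED_FIELDS if field not in present]
```

Calling missing_mastersheet_fields("mastersheet.csv") from the folder that holds your CTD data returns an empty list when every field is present.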
If you meet those conditions, make your terminal window fullscreen.
Then copy and paste the following into your terminal, and replace MY_TOKEN with your public MapBox token.
Members of FjordPhyto can use this token pk.eyJ1Ijoibmlrb3Rob21hcyIsImEiOiJjbHl2Z2JzbDQxZjEwMmpwd2c1cnJpYmRyIn0.j9l0EXWa2ik51AbAcIe5HQ
Tip
Add plotting by including -p in the command, like so: ctdcli default -r -p --token MY_TOKEN
(ctdfjorder) $ ctdcli default -r -m mastersheet.csv --token MY_TOKEN
Here we are telling CTDFjorder the following:
- -r
Reset our file environment (delete old plots and remake folders)
- -m
The location of our mastersheet
- --token
Our token to interact with MapBox and generate our map
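If you would rather launch the same command from a script than paste a token into the terminal, the argument list can be assembled in Python. This is a sketch under stated assumptions: build_ctdcli_command is a hypothetical helper, and MAPBOX_TOKEN is an environment variable name chosen for this example, not something CTDFjorder reads itself. Only the flags shown in this tutorial (-r, -m, -p, --token) are used.

```python
import os
import shlex


def build_ctdcli_command(token, mastersheet="mastersheet.csv", plot=False):
    """Assemble the argument list for the tutorial's `ctdcli default` call."""
    args = ["ctdcli", "default", "-r", "-m", mastersheet]
    if plot:
        # -p adds plotting, as described in the tip above.
        args.append("-p")
    args += ["--token", token]
    return args


# MAPBOX_TOKEN is a hypothetical variable name used only in this sketch;
# the placeholder MY_TOKEN is used when it is not set.
token = os.environ.get("MAPBOX_TOKEN", "MY_TOKEN")
print(shlex.join(build_ctdcli_command(token)))
```

The returned list can be handed to subprocess.run(args, check=True) from inside the activated ctdfjorder environment.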
Interpret output#
If you see a spinning globe, you did it! Once the files are done processing, a table will print with pipeline information for each file. Green means the file passed a step, and red means an error occurred that prevented the file from being processed further. Once all files are completed, a map will open as well. Each point on the map is an individual cast, and the map can be filtered.
If you used the -p option, then the plots are in the ctdplots folder next to our original data and were made with functions from the Visualize module. There you will also find a ctdfjorder_data.csv with our processed data.
To investigate files that did not pass the pipeline, open the ctdfjorder.log file.
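A quick way to check which casts made it into the processed output is to count rows per source file. The sketch below assumes that ctdfjorder_data.csv has a header row containing a filename column (the name appears among the --filter-columns choices, but its presence in your output is an assumption). The helper rows_per_file is hypothetical and uses only the standard library.

```python
import csv
from collections import Counter


def rows_per_file(csv_path, column="filename"):
    """Count how many processed rows each source file contributed.

    Assumption: the CSV has a header row that includes `column`.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"{column!r} not found in {csv_path}")
        return Counter(row[column] for row in reader)
```

Files that appear in your data folder but not in the returned counts are good candidates to look up in ctdfjorder.log.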
Steps#
These are the functions we ran through the CLI on each file in this tutorial:
data = CTD(file)
data.expand_date(day=False)
data.remove_upcasts()
data.remove_non_positive_samples()
data.filter_columns_by_range(column='salinity', upper_bound=None, lower_bound=10)
data.add_metadata(master_sheet_path='mastersheet.csv')
data.clean(method='clean_salinity_ai')
data.add_surface_salinity()
data.add_surface_temperature()
data.add_meltwater_fraction()
data.add_absolute_salinity()
data.add_density()
data.add_potential_density()
data.add_n_squared()
data.add_mld_bf()
data.add_profile_classification()
Congrats! You can now use CTDFjorder to investigate your CTD data. For more in-depth information on the processes executed here, read the API.
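To give a feel for one of the derived quantities in the steps above, the squared buoyancy (Brunt-Väisälä) frequency is conventionally defined as N² = (g / ρ) · ∂ρ/∂depth, with depth positive downward and ρ the potential density. The sketch below is a plain finite-difference illustration of that textbook definition. It is not CTDFjorder's implementation: add_n_squared may compute the quantity differently (for example through a seawater library), and the function name and constant here are assumptions made for this example.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed constant)


def n_squared(depth_m, density_kg_m3):
    """Finite-difference N^2 (s^-2) between consecutive samples.

    Depth is positive downward, so stable stratification (density
    increasing with depth) gives positive values.
    """
    if len(depth_m) != len(density_kg_m3):
        raise ValueError("depth and density must have the same length")
    pairs = list(zip(depth_m, density_kg_m3))
    values = []
    for (z0, rho0), (z1, rho1) in zip(pairs, pairs[1:]):
        if z1 == z0:
            raise ValueError("consecutive samples share the same depth")
        # Use the mean density of the two samples as the reference density.
        rho_mid = 0.5 * (rho0 + rho1)
        values.append(G / rho_mid * (rho1 - rho0) / (z1 - z0))
    return values
```

The result has one value fewer than the input, because each value describes the layer between two adjacent samples.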
CLI Commands#
CTDFjorder
usage: sample [-h] {default} ...
Positional Arguments#
- command
Possible choices: default
Sub-commands#
default#
Run the default processing pipeline
sample default [-h] [-p] [-v] [-q] [-r] [-s] [-d] [-m MASTERSHEET]
[-w [WORKERS]] [--token TOKEN] [-o OUTPUT]
[--filter-columns [{filename,unique_id,profile_id,site_id,site_name,timestamp,year,month,day,latitude,longitude,depth,pressure,sea_pressure,p_mid,temperature,conservative_temperature,salinity,salinity_abs,density,potential_density,surface_temperature,surface_salinity,surface_density,meltwater_fraction_eq_10,meltwater_fraction_eq_11,brunt_vaisala_frequency_squared,profile_type,conductivity,specific_conductivity,speed_of_sound,oxygen_concentration,oxygen_saturation,ph,alkalinity,nitrate,phosphate,silicate,ammonium,particulate_organic_carbon,total_organic_carbon,particulate_inorganic_carbon,dissolved_inorganic_carbon,secchi_depth,turbidity,chlorophyll,chlorophyll_fluorescence,par,orp} ...]]
[--filter-upper [FILTER_UPPER ...]]
[--filter-lower [FILTER_LOWER ...]]
Named Arguments#
- -p, --plot
Generate plots
Default: False
- -v, --verbose
Verbose logger output to ctdfjorder.log (repeat for increased verbosity)
Default: 3
- -q, --quiet
Quiet output (show errors only)
Default: 0
- -r, --reset
Reset file environment
Default: False
- -s, --show-status
Show processing status and pipeline status
Default: False
- -d, --debug-run
Run 20 files for testing
Default: False
- -m, --mastersheet
Path to mastersheet
- -w, --workers
Max workers
- --token
MapBox token to enable interactive map plot
- -o, --output
Output file path
Default: 'ctdfjorder_data.csv'
- --filter-columns
Possible choices: filename, unique_id, profile_id, site_id, site_name, timestamp, year, month, day, latitude, longitude, depth, pressure, sea_pressure, p_mid, temperature, conservative_temperature, salinity, salinity_abs, density, potential_density, surface_temperature, surface_salinity, surface_density, meltwater_fraction_eq_10, meltwater_fraction_eq_11, brunt_vaisala_frequency_squared, profile_type, conductivity, specific_conductivity, speed_of_sound, oxygen_concentration, oxygen_saturation, ph, alkalinity, nitrate, phosphate, silicate, ammonium, particulate_organic_carbon, total_organic_carbon, particulate_inorganic_carbon, dissolved_inorganic_carbon, secchi_depth, turbidity, chlorophyll, chlorophyll_fluorescence, par, orp
List of columns to filter
- --filter-upper
Upper bounds for the filtered columns
- --filter-lower
Lower bounds for the filtered columns
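The three filter options each accept a list of values, which suggests that columns and bounds are matched by position. The sketch below mirrors only those three options in a standalone argparse parser to show how such a pairing could work. It is an illustration, not the ctdcli source: the positional pairing, the float conversion, the shortened choices list, and the names build_filter_parser and pair_filters are all assumptions.

```python
import argparse


def build_filter_parser():
    """Standalone parser that mirrors the three filter options."""
    parser = argparse.ArgumentParser(prog="filter-sketch")
    # Only a few of the documented column choices are repeated here.
    parser.add_argument("--filter-columns", nargs="*", default=[],
                        choices=["salinity", "temperature", "depth"])
    parser.add_argument("--filter-upper", nargs="*", type=float, default=[])
    parser.add_argument("--filter-lower", nargs="*", type=float, default=[])
    return parser


def pair_filters(namespace):
    """Match each column with its lower and upper bound by position."""
    columns = namespace.filter_columns
    upper = namespace.filter_upper
    lower = namespace.filter_lower
    if not (len(columns) == len(upper) == len(lower)):
        raise ValueError("give one upper and one lower bound per column")
    return list(zip(columns, lower, upper))


args = build_filter_parser().parse_args([
    "--filter-columns", "salinity", "temperature",
    "--filter-upper", "40", "15",
    "--filter-lower", "10", "0",
])
print(pair_filters(args))
# → [('salinity', 10.0, 40.0), ('temperature', 0.0, 15.0)]
```

Requiring equal-length lists keeps a missing bound from silently shifting the remaining bounds onto the wrong column.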