CTD

The CTD class in the CTDFjorder library provides an interface for loading, processing, and analyzing CTD data. The sections below describe its constructor, its data-processing methods, its derived calculations, and its export methods.

Initialization

class ctdfjorder.CTD.CTD(ctd_file_path: str, cached_master_sheet: MasterSheet = None, master_sheet_path=None, plot=False)

Object representing a single profile.

Parameters:

ctd_file_path (str) – The file path to the RSK or Castaway file.
cached_master_sheet (MasterSheet, default None) – CTDFjorder’s internal representation of a master sheet.
master_sheet_path (str, default None) – Path to a master sheet.
plot (bool, default False) – If True, saves plots to the ‘plots’ folder in the working directory.

Examples

Castaway CTD profile with valid data

>>> from ctdfjorder import CTD
>>> ctd_data = CTD('CC1531002_20181225_114931.csv')
>>> output = ctd_data.get_df()
>>> print(output.head(3))
shape: (3, 13)
┌──────────────┬──────────┬─────────────┬──────────────┬───┬────────────┬───────────────────────────────┬────────────┬────────────┐
│ sea_pressure ┆ depth    ┆ temperature ┆ conductivity ┆ … ┆ profile_id ┆ filename                      ┆ latitude   ┆ longitude  │
│ ---          ┆ ---      ┆ ---         ┆ ---          ┆   ┆ ---        ┆ ---                           ┆ ---        ┆ ---        │
│ f64          ┆ f64      ┆ f64         ┆ f64          ┆   ┆ i32        ┆ str                           ┆ f64        ┆ f64        │
╞══════════════╪══════════╪═════════════╪══════════════╪═══╪════════════╪═══════════════════════════════╪════════════╪════════════╡
│ 0.15         ┆ 0.148676 ┆ 0.32895     ┆ 28413.735648 ┆ … ┆ 0          ┆ CC1531002_20181225_114931.csv ┆ -64.668455 ┆ -62.641775 │
│ 0.45         ┆ 0.446022 ┆ 0.316492    ┆ 28392.966662 ┆ … ┆ 0          ┆ CC1531002_20181225_114931.csv ┆ -64.668455 ┆ -62.641775 │
│ 0.75         ┆ 0.743371 ┆ 0.310613    ┆ 28386.78011  ┆ … ┆ 0          ┆ CC1531002_20181225_114931.csv ┆ -64.668455 ┆ -62.641775 │
└──────────────┴──────────┴─────────────┴──────────────┴───┴────────────┴───────────────────────────────┴────────────┴────────────┘

Castaway CTD profile with no data

>>> ctd_data = CTD('CC1627007_20191220_195931.csv') 
Traceback (most recent call last):
...
ctdfjorder.CTDError: CC1627007_20191220_195931.csv - No samples in file

Raises: CTDError – For ctdfjorder related errors

Data Processing

CTD.remove_upcasts() → None

Removes upcasts by dropping rows where pressure decreases from one sampling event to the next.

Notes

An upcast in CTD (Conductivity, Temperature, Depth) profiles occurs when the sensor package is moved upward through the water column, causing a decrease in pressure readings. This method identifies and removes such events by ensuring pressure monotonically increases within each profile.

The procedure is as follows:

For each unique profile identified by profile_id, the method extracts the profile’s data.
It then computes the difference in pressure between consecutive samples within the profile.
Rows where the pressure difference is not positive, meaning the pressure did not increase (an upcast or no movement), are removed.
The cleaned profile is then reintegrated into the main dataset, replacing the original data.

Let \(p_i\) be the pressure at the \(i\)-th sampling event. The condition for retaining a data point is given by:

\[\Delta p_i = p_{i} - p_{i-1} > 0 \quad \text{for} \quad i = 2, 3, \ldots, N\]

where \(N\) is the total number of sampling events in the profile. Rows not satisfying this condition are considered upcasts and are removed.
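As a minimal sketch of the retention rule above, assume a profile is a plain list of pressure readings and that the first sample, which has no predecessor, is kept. `drop_upcasts` is a hypothetical helper written for illustration and is not part of CTDFjorder:

```python
def drop_upcasts(pressures):
    """Keep samples whose pressure exceeds that of the preceding sample.

    Assumption: the first sample has no predecessor and is kept.
    """
    kept = []
    for i, p in enumerate(pressures):
        # Compare each sample with its immediate predecessor in the input.
        if i == 0 or p - pressures[i - 1] > 0:
            kept.append(p)
    return kept


profile = [0.15, 0.45, 0.75, 0.60, 0.60, 0.90]
print(drop_upcasts(profile))  # the two 0.60 readings are dropped
```

In this sketch each sample is compared with its immediate predecessor in the original sequence, so the reading of 0.90 is kept because it follows 0.60.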

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.remove_upcasts()
>>> # This will clean the dataset by removing upcasts, ensuring all profiles have monotonically
>>> # increasing pressure readings.

See also

CTD: To initialize an object

CTD.filter_columns_by_range(filters: zip = None, columns: list[str] = None, upper_bounds: list[float | int] = None, lower_bounds: list[float | int] = None)

Filters columns of the dataset based on specified upper and lower bounds.

This method allows filtering of a dataset by applying specified upper and lower bounds on the columns. It processes the data profile by profile and updates the dataset accordingly.

Parameters:

filters (zip, optional) – An iterable of tuples, where each tuple contains a column name, an upper bound, and a lower bound. If provided, this takes precedence over the individual columns, upper_bounds, and lower_bounds parameters.
columns (list of str, optional) – A list of column names to be filtered. Must be provided along with upper_bounds and lower_bounds.
upper_bounds (list of float or int, optional) – A list of upper bound values corresponding to each column in columns.
lower_bounds (list of float or int, optional) – A list of lower bound values corresponding to each column in columns.

Notes

The method performs the following steps for each unique profile identified by profile_id:

Extracts the data for the profile using the profile_id.
If filters is provided, it iterates over each filter tuple and applies the specified bounds to the relevant column, filtering out data that does not meet the criteria.
If columns, upper_bounds, and lower_bounds are provided, it iterates over each column and applies the corresponding upper and lower bounds to filter the data.
Updates the dataset by removing the original profile data and reintegrating the filtered profile data.
Checks if the dataset is empty after filtering using the _is_empty method.

The method is designed to be flexible, allowing filtering through either a comprehensive set of filters or by specifying individual columns and their bounds directly. This makes it adaptable to various dataset structures and filtering requirements.
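The filtering step can be sketched in plain Python. This is an illustration only: it assumes rows are dictionaries and that both bounds are inclusive, neither of which is confirmed by the description above, and `filter_by_range` is a hypothetical name. The tuple order (column, upper bound, lower bound) follows the `filters` parameter.

```python
def filter_by_range(rows, filters):
    """Keep rows whose values lie within [lower, upper] for every filtered column.

    Assumptions: rows are plain dicts and bounds are inclusive.
    """
    # A zip object can be consumed only once, so materialize it before
    # reusing the bounds for every row.
    bounds = list(filters)
    return [
        row
        for row in rows
        if all(lower <= row[column] <= upper for column, upper, lower in bounds)
    ]


rows = [
    {"temperature": 12.0, "salinity": 33.0},
    {"temperature": 25.0, "salinity": 33.0},  # temperature above the upper bound
    {"temperature": 15.0, "salinity": 29.0},  # salinity below the lower bound
]
filters = zip(["temperature", "salinity"], [20.0, 35.0], [10.0, 30.0])
print(filter_by_range(rows, filters))
```

Only the first row satisfies both ranges, so it is the only one kept.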

Examples

>>> ctd_data = CTD('example.csv')
>>> filters = zip(['temperature', 'salinity'], [20.0, 35.0], [10.0, 30.0])
>>> ctd_data.filter_columns_by_range(filters=filters)
>>> # This will filter the `temperature` column to be between 10.0 and 20.0, and `salinity` to be between 30.0 and 35.0.

>>> columns = ['temperature', 'salinity']
>>> upper_bounds = [20.0, 35.0]
>>> lower_bounds = [10.0, 30.0]
>>> ctd_data.filter_columns_by_range(columns=columns, upper_bounds=upper_bounds, lower_bounds=lower_bounds)
>>> # This will filter the `temperature` column to be between 10.0 and 20.0, and `salinity` to be between 30.0 and 35.0.

See also

remove_non_positive_samples: Method to remove rows with non-positive values for specific parameters.

CTD.remove_non_positive_samples() → None

Removes rows with non-positive values for depth, pressure, practical salinity, absolute salinity, or density.

Notes

This method cleans the CTD (Conductivity, Temperature, Depth) dataset by removing any samples that have non-positive values for key parameters. Non-positive values in these parameters are generally invalid and indicate erroneous measurements.

The procedure is as follows:

For each unique profile identified by profile_id, the method extracts the profile’s data.
It then iterates over a predefined set of key parameters: depth, pressure, practical salinity, absolute salinity, and density.
For each parameter present in the profile, it filters out rows where the parameter’s value is non-positive, null, or NaN.
The cleaned profile is then reintegrated into the main dataset, replacing the original data.

Let \(x_i\) represent the value of a parameter (depth, pressure, practical salinity, absolute salinity, or density) at the \(i\)-th sampling event. The condition for retaining a data point is given by:

\[x_i > 0 \quad \text{and} \quad x_i \neq \text{NaN} \quad \text{and} \quad x_i \neq \text{null}\]

A row is removed if any one of these parameters fails the condition.
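The retention condition can be sketched as follows. The column names in `KEY_PARAMETERS` are hypothetical placeholders for the five parameters listed above, and rows are assumed to be plain dictionaries; this is not the library's implementation.

```python
import math

# Hypothetical column names for the key parameters listed above.
KEY_PARAMETERS = ("depth", "pressure", "salinity", "salinity_abs", "density")


def is_valid(value):
    """True when the value is not null, not NaN, and strictly positive."""
    return value is not None and not math.isnan(value) and value > 0


def remove_non_positive(rows):
    """Drop any row in which a key parameter that is present fails the check."""
    return [
        row
        for row in rows
        if all(is_valid(row[name]) for name in KEY_PARAMETERS if name in row)
    ]


rows = [
    {"depth": 0.15, "pressure": 10.28, "salinity": 33.1},
    {"depth": -0.02, "pressure": 10.11, "salinity": 33.0},         # negative depth
    {"depth": 0.45, "pressure": 10.58, "salinity": float("nan")},  # NaN salinity
    {"depth": 0.75, "pressure": 10.88, "salinity": None},          # null salinity
]
print(remove_non_positive(rows))
```

Parameters that are absent from a row are skipped, mirroring the statement that only parameters present in the profile are checked.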

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.remove_non_positive_samples()
>>> # This will clean the dataset by removing samples with non-positive, null, or NaN values
>>> # for the specified key parameters.

See also

remove_invalid_salinity_values: method to remove salinity values < 10 PSU.

CTD.clean(method) → None

Applies the selected cleaning method to the data. Currently this supports cleaning practical salinity with the ‘clean_salinity_ai’ method.

Parameters: method (str, default 'clean_salinity_ai') – The cleaning method to apply. Currently, only ‘clean_salinity_ai’ is supported, which uses a GRU-based machine learning model to clean the salinity values.
Raises: CTDError – When the cleaning method is invalid.

Notes

The ‘clean_salinity_ai’ method uses a Gated Recurrent Unit (GRU) model to correct salinity measurements. This model is designed to smooth out unrealistic fluctuations in salinity with respect to pressure.

The procedure for ‘clean_salinity_ai’ is as follows:

For each unique profile identified by profile_id, extract the profile’s data.
Bin the data every 0.5 dbar of pressure.
Use a GRU model with a loss function that penalizes non-monotonic salinity increases with decreasing pressure.
Train the model on the binned data to predict clean salinity values.
Replace the original salinity values in the profile with the predicted clean values.
Reintegrate the cleaned profile into the main dataset.

The loss function \(L\) used in training is defined as:

\[L = \text{MAE}(y_{\text{true}}, y_{\text{pred}}) + \lambda \cdot \text{mean}(P)\]

where \(P\) are penalties for predicted salinity increases with decreasing pressure, and \(\lambda\) is a weighting factor.
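To make the loss concrete, here is a small sketch in plain Python. The exact form of the penalty and the value of the weighting factor are not given above, so both are assumptions: the penalty for each pair of consecutive samples (ordered by increasing pressure) is taken to be the amount by which the shallower predicted salinity exceeds the deeper one, and `lam=0.2` is an arbitrary illustrative weight.

```python
def salinity_loss(y_true, y_pred, lam=0.2):
    """MAE plus a weighted mean penalty.

    Samples are assumed to be ordered by increasing pressure.
    """
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    # Assumed penalty: how much the predicted salinity at the shallower sample
    # exceeds the predicted salinity at the next deeper sample.
    penalties = [max(0.0, shallow - deep) for shallow, deep in zip(y_pred, y_pred[1:])]
    mean_penalty = sum(penalties) / len(penalties) if penalties else 0.0
    return mae + lam * mean_penalty


y_true = [30.0, 31.0, 32.0]
print(salinity_loss(y_true, [30.0, 31.0, 32.0]))  # monotonic and exact: no loss
print(salinity_loss(y_true, [30.0, 32.0, 31.0]))  # inversion adds a penalty term
```

With this choice, a prediction whose salinity never decreases with pressure incurs only the MAE term.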

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.clean('clean_salinity_ai')
>>> # This will clean the salinity data using the AI-based method, correcting any unrealistic
>>> # values and ensuring smoother transitions with respect to pressure.

See also

_AI.clean_salinity_ai: Method used to clean salinity with ‘clean_salinity_ai’ option.

Derived Calculations

CTD.add_absolute_salinity() → None

Calculates and adds absolute salinity to the CTD data using the TEOS-10 salinity conversion formula.

Notes

The gsw.SA_from_SP function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

This method computes the absolute salinity from practical salinity for each profile in the dataset using the TEOS-10 standard. Absolute salinity provides a more accurate representation of salinity by accounting for the variations in seawater composition.

The procedure is as follows:

Initialize a new column for absolute salinity in the dataset.
For each unique profile identified by profile_id, extract the profile’s data.
Use the gsw.conversions.SA_from_SP function to compute absolute salinity from practical salinity.
Update the profile with the computed absolute salinity values.
Reintegrate the updated profile into the main dataset.

The TEOS-10 formula for converting practical salinity \(S_P\) to absolute salinity \(S_A\) is used:

\[S_A = f(S_P, p, \phi, \lambda)\]

where \(p\) is the sea pressure, \(\phi\) is the latitude, and \(\lambda\) is the longitude.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_absolute_salinity()
>>> # This will add a new column with absolute salinity values to the dataset, calculated using the
>>> # TEOS-10 formula.

See also

gsw.conversions.SA_from_SP: Function used for the conversion from practical salinity to absolute salinity.

CTD.add_density()

Calculates and adds density to CTD data using the TEOS-10 formula. If absolute salinity is not present, it is calculated first.

Notes

The gsw.rho_t_exact function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

This method computes the density of seawater from absolute salinity, in-situ temperature, and sea pressure using the TEOS-10 standard. The density is a critical parameter for understanding the physical properties of seawater and its buoyancy characteristics.

The procedure is as follows:

Check if absolute salinity is already present in the dataset. If not, calculate it using add_absolute_salinity().
Initialize a new column for density in the dataset.
For each unique profile identified by profile_id, extract the profile’s data.
Use the gsw.density.rho_t_exact function to compute density from absolute salinity, temperature, and pressure.
Update the profile with the computed density values.
Reintegrate the updated profile into the main dataset.

The TEOS-10 formula for calculating density \(\rho\) is used:

\[\rho = f(S_A, T, p)\]

where \(S_A\) is the absolute salinity, \(T\) is the in-situ temperature, and \(p\) is the sea pressure.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_density()
>>> # This will add a new column with density values to the dataset, calculated using the
>>> # TEOS-10 formula.

See also

gsw.density.rho_t_exact: Function used for the calculation of density.
add_absolute_salinity: Method to add absolute salinity if it is not already present in the dataset.

CTD.add_potential_density()

Calculates and adds potential density to the CTD data using the TEOS-10 formula. If absolute salinity is not present, it is calculated first.

Notes

The gsw.sigma0 function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

This method computes the potential density of seawater from absolute salinity and in-situ temperature using the TEOS-10 standard. Potential density is the density a parcel of seawater would have if it were adiabatically brought to the sea surface, which helps in understanding the stability and stratification of the water column.

The procedure is as follows:

Check if absolute salinity is already present in the dataset. If not, calculate it using add_absolute_salinity().
Initialize a new column for potential density in the dataset.
For each unique profile identified by profile_id, extract the profile’s data.
Use the gsw.sigma0 function to compute potential density from absolute salinity and temperature.
Update the profile with the computed potential density values.
Reintegrate the updated profile into the main dataset.

The TEOS-10 formula for calculating potential density \(\sigma_0\) is used:

\[\sigma_0 = f(S_A, T)\]

where \(S_A\) is the absolute salinity and \(T\) is the in-situ temperature.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_potential_density()
>>> # This will add a new column with potential density values to the dataset, calculated using the
>>> # TEOS-10 formula.

See also

gsw.sigma0: Function used for the calculation of potential density.
add_absolute_salinity: Method to add absolute salinity if it is not already present in the dataset.

CTD.add_surface_salinity_temp_meltwater(start=10.1325, end=12.1325)

Calculates the surface salinity, surface temperature, and meltwater fraction of a CTD profile. Adds these values to the CTD data.

Parameters:

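start (float, default 10.1325) – Upper (shallow) bound of the surface layer, given as a pressure. The default of 10.1325 dbar is one standard atmosphere, i.e. the pressure at the sea surface.
end (float, default 12.1325) – Lower (deep) bound of the surface layer, given as a pressure.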

Notes

This method adds three new columns to the dataset: surface salinity, surface temperature, and meltwater fraction. The values are calculated as follows:

Surface temperature is the mean temperature from pressure start to end.
Surface salinity is the salinity value at the lowest pressure within the range from start to end.
Meltwater fraction is calculated using the formula from Pan et al. (2019):

\[\text{Meltwater fraction} = (-0.021406 \cdot S_0 + 0.740392) \cdot 100\]

where \(S_0\) is the surface salinity.

The procedure is as follows:

Initialize new columns for surface salinity, surface temperature, and meltwater fraction in the dataset.
For each unique profile identified by profile_id, extract the profile’s data.
Filter the data to include only the samples within the specified pressure range (start to end).
Calculate the mean surface temperature, surface salinity, and meltwater fraction based on the filtered data.
Update the profile with the computed values.
Reintegrate the updated profile into the main dataset.
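The three surface values can be sketched for a single profile as follows. This is an illustration under stated assumptions: samples are `(pressure, temperature, salinity)` tuples, the pressure range is treated as inclusive, and a `ValueError` stands in for the library's CTDError. `surface_values` is a hypothetical helper.

```python
def surface_values(samples, start=10.1325, end=12.1325):
    """Return (surface salinity, surface temperature, meltwater fraction).

    samples: list of (pressure, temperature, salinity) tuples.
    Assumption: the range [start, end] is inclusive.
    """
    surface = [s for s in samples if start <= s[0] <= end]
    if not surface:
        # Stand-in for the CTDError raised by the library.
        raise ValueError("no measurements within the surface pressure range")
    # Mean temperature over the surface layer.
    surface_temperature = sum(t for _, t, _ in surface) / len(surface)
    # Salinity of the sample with the lowest pressure in the layer.
    surface_salinity = min(surface, key=lambda s: s[0])[2]
    # Meltwater fraction formula from Pan et al. (2019), as given above.
    meltwater_fraction = (-0.021406 * surface_salinity + 0.740392) * 100
    return surface_salinity, surface_temperature, meltwater_fraction


samples = [(10.2825, 0.4, 30.0), (10.5825, 0.2, 31.0), (13.0, 0.1, 33.0)]
print(surface_values(samples))
```

For a surface salinity of 30.0 the formula gives a meltwater fraction of roughly 9.8; the sample at 13.0 lies outside the default range and is ignored.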

Raises: CTDError – When there are no measurements within the specified pressure range for a profile.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_surface_salinity_temp_meltwater(start=10.1325, end=12.1325)
>>> # This will add new columns with surface salinity, surface temperature, and meltwater fraction
>>> # values to the dataset, calculated using the specified pressure range.

See also

add_density: method to calculate derived density.
add_potential_density: method to calculate derived potential density.

CTD.add_mean_surface_density(start=10.1325, end=12.1325)

Calculates the mean surface density from the density values and adds it as a new column to the CTD data table. Requires absolute salinity and absolute density to be calculated first.

Parameters:

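start (float, default 10.1325) – Upper (shallow) bound of the surface layer, given as a pressure. The default of 10.1325 dbar is one standard atmosphere, i.e. the pressure at the sea surface.
end (float, default 12.1325) – Lower (deep) bound of the surface layer, given as a pressure.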

Notes

This method adds a new column for the mean surface density to the dataset. The values are calculated as follows:

Mean surface density is computed as the mean of density values within the specified pressure range (start to end).

The procedure is as follows:

For each unique profile identified by profile_id, extract the profile’s data.
Filter the data to include only the samples within the specified pressure range (start to end).
Calculate the mean surface density based on the filtered data.
Update the profile with the computed mean surface density value.
Reintegrate the updated profile into the main dataset.

Raises: CTDError – When there are no measurements within the specified pressure range for a profile.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_mean_surface_density(start=10.1325, end=12.1325)
>>> # This will add a new column with mean surface density values to the dataset, calculated using the
>>> # specified pressure range.

See also

add_absolute_salinity: Method to add absolute salinity if it is not already present in the dataset.
add_density: Method to add density if it is not already present in the dataset.

CTD.add_mld(reference: int, method: str = 'potential_density_avg', delta: float = 0.05)

Calculates and adds the mixed layer depth (MLD) using the density threshold method.

Parameters:

reference (int) – The reference depth for MLD calculation.
method (str, default "potential_density_avg") – The MLD calculation method. Options are “abs_density_avg” or “potential_density_avg”.
delta (float, default 0.05) – The change in density or potential density from the reference that would define the MLD in units of \(\frac{kg}{m^3}\).

Notes

The mixed layer depth (MLD) is calculated using the density threshold method, defined as the depth at which the density increases by a specified amount (delta) from the reference density. The reference density is calculated as the mean density up to the reference depth.

The procedure is as follows:

Initialize a new column for MLD in the dataset.
For each unique profile identified by profile_id, extract the profile’s data.
Filter the data to include only the samples up to the reference depth.
Calculate the reference density based on the chosen method:
    “abs_density_avg”: mean absolute density up to the reference depth.
    “potential_density_avg”: mean potential density up to the reference depth.
Identify the MLD as the shallowest depth where the density exceeds the reference density by the specified delta.
Update the profile with the calculated MLD value.
Reintegrate the updated profile into the main dataset.

The MLD equation is given by:

\[\text{MLD} = \min \{ z_i : D_i > D_r + \Delta \}\]

where:

\(D_r\) is the reference density, defined as the mean density up to the reference depth.
\(D_i\) is the density of the \(i\)-th sample in the profile, measured at depth \(z_i\).
\(\Delta\) is the specified change in density (delta).
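The threshold rule described in the procedure (the shallowest depth at which density exceeds the reference density by delta) can be sketched as below. The sketch assumes parallel lists of depths and densities for one profile, treats "up to the reference depth" as inclusive, and returns `None` when no sample crosses the threshold; `mixed_layer_depth` is a hypothetical helper and these choices are not confirmed by the description above.

```python
def mixed_layer_depth(depths, densities, reference=10, delta=0.05):
    """Shallowest depth where density exceeds the reference density by delta."""
    # Reference density: mean density of samples at or above the reference depth.
    shallow = [rho for z, rho in zip(depths, densities) if z <= reference]
    if not shallow:
        return None
    reference_density = sum(shallow) / len(shallow)
    # Depths whose density exceeds the threshold.
    candidates = [
        z for z, rho in zip(depths, densities) if rho > reference_density + delta
    ]
    return min(candidates) if candidates else None


depths = [2, 6, 10, 14, 18, 22]
densities = [1025.00, 1025.01, 1025.02, 1025.04, 1025.10, 1025.30]
print(mixed_layer_depth(depths, densities, reference=10, delta=0.05))
```

Here the reference density is the mean of the three shallowest samples, and the first sample to exceed it by more than 0.05 is the one at 18.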

Raises: CTDError – When the specified method is invalid.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_mld(reference=10, method="potential_density_avg", delta=0.05)
>>> # This will add a new column with MLD values to the dataset, calculated using the specified method
>>> # and parameters.

See also

add_density: method to calculate derived density.
add_potential_density: method to calculate derived potential density.

CTD.add_brunt_vaisala_squared()

Calculates buoyancy frequency squared and adds it to the CTD data. Requires potential density to be calculated first.

This method computes the buoyancy frequency squared (also known as the Brunt-Väisälä frequency squared) for each profile in the dataset using the TEOS-10 standard. This parameter is essential for understanding the stability of the water column and its propensity to mix vertically.

The procedure is as follows:

Initialize new columns for buoyancy frequency squared and the mid-pressure values in the dataset.
For each unique profile identified by profile_id, extract the profile’s data.
Use the gsw.Nsquared function to compute buoyancy frequency squared and mid-pressure values from absolute salinity, conservative temperature, pressure, and latitude.
Update the profile with the computed buoyancy frequency squared and mid-pressure values.
Reintegrate the updated profile into the main dataset.

Notes

The gsw.Nsquared function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

The buoyancy frequency squared \(( N^2 )\) is calculated using the formula:

\[N^2 = g^2 \cdot \frac{\beta \cdot d(SA) - \alpha \cdot d(CT)}{\text{specvol\_local} \cdot dP}\]

Note: This routine uses the specific volume from “gsw_specvol”, the computationally efficient 75-term expression for specific volume in terms of SA, CT and p (Roquet et al., 2015).
Note also that the pressure increment, dP, in the above formula is in Pa, so that it is \(10^4\) times the pressure increment dp in dbar.
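As a worked illustration of the formula and of the unit conversion (1 dbar = 10^4 Pa), the sketch below evaluates the expression for a single layer. All input values are illustrative round numbers rather than measurements, a fixed gravitational acceleration of 9.81 m/s² is assumed, and `n_squared` is a hypothetical helper; in the library the calculation is delegated to gsw.Nsquared.

```python
def n_squared(beta, d_sa, alpha, d_ct, specvol_local, dp_dbar, g=9.81):
    """Buoyancy frequency squared (s^-2) for one layer.

    dp_dbar is the pressure increment in dbar; it is converted to Pa here.
    """
    d_p = dp_dbar * 1e4  # 1 dbar = 10^4 Pa
    return g**2 * (beta * d_sa - alpha * d_ct) / (specvol_local * d_p)


# Illustrative layer: salinity rises by 0.1 g/kg and temperature falls by
# 0.5 K over a 10 dbar pressure increment.
value = n_squared(
    beta=7.5e-4,               # haline contraction coefficient (kg/g)
    d_sa=0.1,                  # change in absolute salinity (g/kg)
    alpha=1.0e-4,              # thermal expansion coefficient (1/K)
    d_ct=-0.5,                 # change in conservative temperature (K)
    specvol_local=1 / 1027.0,  # specific volume (m^3/kg)
    dp_dbar=10.0,
)
print(value)
```

A positive result indicates a stably stratified layer: density increases with pressure because the water becomes both saltier and colder.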

Raises: CTDError – When buoyancy frequency could not be calculated due to a ValueError.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_brunt_vaisala_squared()
>>> # This will add new columns with buoyancy frequency squared values and mid-pressure values to the dataset,
>>> # calculated using the TEOS-10 formula.

See also

gsw.Nsquared: Function used for the calculation of buoyancy frequency squared.

CTD.add_potential_temperature(p_ref: float | ndarray = 0) → None

Calculates and adds potential temperature to the CTD data using the TEOS-10 formula.

This method computes the potential temperature of seawater, which is the temperature a parcel of water would have if moved adiabatically to the reference pressure p_ref (by default 0 dbar, the sea surface).

Notes

The gsw.pt_from_t function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

This method adds a new column for potential temperature in the dataset.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_potential_temperature()
>>> # This will add a new column with potential temperature values to the dataset, calculated using the TEOS-10 formula.

CTD.add_conservative_temperature() → None

Calculates and adds conservative temperature to the CTD data using the TEOS-10 formula.

This method computes the conservative temperature, which is a more accurate measure of heat content in seawater compared to potential temperature.

Notes

The gsw.CT_from_t function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

This method adds a new column for conservative temperature in the dataset.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_conservative_temperature()
>>> # This will add a new column with conservative temperature values to the dataset, calculated using the TEOS-10 formula.

CTD.add_dynamic_height(p_ref: float | ndarray = 0) → None

Calculates and adds dynamic height anomaly to the CTD data using the TEOS-10 formula.

This method computes the dynamic height anomaly, the geostrophic streamfunction for the difference between the horizontal velocity at the measurement pressure (p) and the horizontal velocity at a reference pressure (p_ref).

Parameters: p_ref (Union[float, np.ndarray], optional) – Reference pressure, in dbar. The default is 0 dbar, corresponding to the sea surface.

Notes

The gsw.geo_strf_dyn_height function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_dynamic_height()
>>> # This will add a new column with dynamic height values to the dataset, calculated using the TEOS-10 formula.

CTD.add_thermal_expansion_coefficient() → None

Calculates and adds thermal expansion coefficient to the CTD data using the TEOS-10 formula.

The thermal expansion coefficient, \(\alpha\), is important for understanding how the volume of seawater changes with temperature at constant pressure. It is derived from absolute salinity, conservative temperature, and sea pressure.

Notes

The gsw.alpha function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

The thermal expansion coefficient is calculated using the following equation from TEOS-10:

\[\alpha^\theta = -\frac{1}{\rho} \frac{\partial \rho}{\partial \theta} \bigg|_{S_A, P}\]

where:

\(\rho\) is the in-situ density of seawater,
\(\theta\) is the conservative temperature,
\(S_{A}\) is the absolute salinity,
\(P\) is the sea pressure.

This method adds a new column for the thermal expansion coefficient in the dataset. It ensures that absolute salinity and conservative temperature are present in the data, calculating them if necessary, before computing the thermal expansion coefficient.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_thermal_expansion_coefficient()
>>> # This will add a new column with thermal expansion coefficient values to the dataset, calculated using the TEOS-10 formula.

CTD.add_haline_contraction_coefficient() → None

Calculates and adds haline contraction coefficient to the CTD data using the TEOS-10 formula.

The haline contraction coefficient, \(\beta\), is important for understanding how the volume of seawater changes with salinity at constant temperature. It is derived from absolute salinity, conservative temperature, and sea pressure.

Notes

The gsw.beta function from the Gibbs SeaWater (GSW) Oceanographic Toolbox is utilized for this calculation. More information about this function can be found at the TEOS-10 website.

The haline contraction coefficient is calculated using the following equation from TEOS-10:

\[\beta^\theta = \frac{1}{\rho} \frac{\partial \rho}{\partial S_A} \bigg|_{\theta, P}\]

where:

\(\rho\) is the in-situ density of seawater,
\(S_A\) is the absolute salinity,
\(\theta\) is the conservative temperature,
\(P\) is the sea pressure.

This method adds a new column for the haline contraction coefficient in the dataset. It ensures that absolute salinity and conservative temperature are present in the data, calculating them if necessary, before computing the haline contraction coefficient.
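The definitions of the thermal expansion and haline contraction coefficients can be checked numerically with central differences. The sketch below deliberately uses a toy linear equation of state with made-up coefficients, not the TEOS-10 expression, so that the expected answers are known in advance; every name in it is hypothetical, and pressure is held fixed by construction.

```python
RHO0, SA0, CT0 = 1027.0, 35.0, 10.0  # reference state (illustrative values)
A, B = 2.0e-4, 7.5e-4                # toy expansion and contraction coefficients


def toy_density(sa, ct):
    """Toy linear equation of state, NOT the TEOS-10 expression."""
    return RHO0 * (1.0 - A * (ct - CT0) + B * (sa - SA0))


def thermal_expansion(sa, ct, h=1e-3):
    """alpha = -(1/rho) * d(rho)/d(theta) at constant salinity."""
    drho_dct = (toy_density(sa, ct + h) - toy_density(sa, ct - h)) / (2 * h)
    return -drho_dct / toy_density(sa, ct)


def haline_contraction(sa, ct, h=1e-3):
    """beta = (1/rho) * d(rho)/d(S_A) at constant temperature."""
    drho_dsa = (toy_density(sa + h, ct) - toy_density(sa - h, ct)) / (2 * h)
    return drho_dsa / toy_density(sa, ct)


alpha = thermal_expansion(SA0, CT0)
beta = haline_contraction(SA0, CT0)
print(alpha, beta)  # close to A and B at the reference state
```

The opposite signs in the two definitions reflect that warming lowers density while added salt raises it, so both coefficients come out positive here.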

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.add_haline_contraction_coefficient()
>>> # This will add a new column with haline contraction coefficient values to the dataset, calculated using the TEOS-10 formula.

Data Export

CTD.get_df(pandas=False) → DataFrame | Any

Returns the dataframe of the CTD object for integration with custom pipelines.

Parameters: pandas (bool, default False) – If True, returns a pandas DataFrame; if False, returns a polars DataFrame.

Examples

Accessing CTD data as a polars dataframe

>>> from ctdfjorder import CTD
>>> ctd_data = CTD('CC1531002_20181225_114931.csv')
>>> ctd_data.remove_non_positive_samples()
>>> output = ctd_data.get_df()
>>> print(output.head(3))
shape: (3, 13)
┌──────────────┬──────────┬─────────────┬──────────────┬───┬────────────┬───────────────────────────────┬────────────┬────────────┐
│ sea_pressure ┆ depth    ┆ temperature ┆ conductivity ┆ … ┆ profile_id ┆ filename                      ┆ latitude   ┆ longitude  │
│ ---          ┆ ---      ┆ ---         ┆ ---          ┆   ┆ ---        ┆ ---                           ┆ ---        ┆ ---        │
│ f64          ┆ f64      ┆ f64         ┆ f64          ┆   ┆ i32        ┆ str                           ┆ f64        ┆ f64        │
╞══════════════╪══════════╪═════════════╪══════════════╪═══╪════════════╪═══════════════════════════════╪════════════╪════════════╡
│ 0.15         ┆ 0.148676 ┆ 0.32895     ┆ 28413.735648 ┆ … ┆ 0          ┆ CC1531002_20181225_114931.csv ┆ -64.668455 ┆ -62.641775 │
│ 0.45         ┆ 0.446022 ┆ 0.316492    ┆ 28392.966662 ┆ … ┆ 0          ┆ CC1531002_20181225_114931.csv ┆ -64.668455 ┆ -62.641775 │
│ 0.75         ┆ 0.743371 ┆ 0.310613    ┆ 28386.78011  ┆ … ┆ 0          ┆ CC1531002_20181225_114931.csv ┆ -64.668455 ┆ -62.641775 │
└──────────────┴──────────┴─────────────┴──────────────┴───┴────────────┴───────────────────────────────┴────────────┴────────────┘

Accessing CTD data as a pandas dataframe

>>> from ctdfjorder import CTD
>>> ctd_data = CTD('CC1531002_20181225_114931.csv')
>>> ctd_data.remove_non_positive_samples()
>>> output = ctd_data.get_df(pandas=True)
>>> print(output.head(3))
   sea_pressure     depth  temperature  conductivity  specific_conductivity  ...  pressure  profile_id                       filename   latitude  longitude
0          0.15  0.148676      0.32895  28413.735648           56089.447456  ...   10.2825           0  CC1531002_20181225_114931.csv -64.668455 -62.641775
1          0.45  0.446022     0.316492  28392.966662           56076.028991  ...   10.5825           0  CC1531002_20181225_114931.csv -64.668455 -62.641775
2          0.75  0.743371     0.310613   28386.78011           56076.832208  ...   10.8825           0  CC1531002_20181225_114931.csv -64.668455 -62.641775
[3 rows x 13 columns]

Returns: CTD data as a pandas DataFrame when pandas=True, or as a polars DataFrame when pandas=False.
Return type: pl.DataFrame | pd.DataFrame

Notes

There is no supported method to reinsert the dataframe into the CTD object. Any changes made to this dataframe will not be reflected in the CTD object’s internal data.

CTD.save_to_csv(output_file: str, null_value: str | None)

Renames the columns of the CTD data table based on a predefined mapping and saves the data to the specified CSV file.

Parameters:

output_file (str) – The output CSV file path.
null_value (str or None) – The value to represent null cells in the saved file.

Notes

This method will rename the columns of the CTD dataset based on a predefined mapping. After renaming, the dataset is saved to the specified CSV file. If a file with the same name already exists at the specified path, it will be overwritten.

The procedure is as follows:

Rename the columns of the CTD data table using a predefined mapping.
Save the modified data table to the specified CSV file path.

The predefined column mapping ensures that the column names in the output CSV file adhere to a specific naming convention or format required for further analysis or sharing.
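The rename-then-save procedure can be sketched with the standard library alone. The mapping below is invented for illustration (the library's actual predefined mapping is not shown in this document), rows are assumed to be dictionaries, and `write_renamed_csv` is a hypothetical helper that writes to any file-like object.

```python
import csv
import io

# Hypothetical mapping; the library's predefined mapping is not reproduced here.
COLUMN_MAPPING = {
    "sea_pressure": "Sea Pressure (dbar)",
    "temperature": "Temperature (C)",
}


def write_renamed_csv(rows, handle, null_value=""):
    """Write rows with renamed headers, replacing None cells with null_value."""
    columns = list(rows[0].keys())
    writer = csv.writer(handle)
    # Columns without an entry in the mapping keep their original name.
    writer.writerow([COLUMN_MAPPING.get(name, name) for name in columns])
    for row in rows:
        writer.writerow(
            [null_value if row[name] is None else row[name] for name in columns]
        )


buffer = io.StringIO()
write_renamed_csv(
    [{"sea_pressure": 0.15, "temperature": None}], buffer, null_value="NA"
)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the example free of side effects; passing a handle opened with `open(path, "w", newline="")` would overwrite an existing file in the way the notes above describe.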

Raises: IOError – If there is an error in writing to the specified file path.

Examples

>>> ctd_data = CTD('example.csv')
>>> ctd_data.save_to_csv(output_file='path/to/output.csv', null_value=None)
>>> # This will rename the columns of the CTD dataset and save it to 'path/to/output.csv'.
>>> # Any existing file with the same name at that location will be overwritten.

See also

utils.save_to_csv: Utility function used to save the data to a CSV file.