AI

class ctdfjorder.ai.ai.AI

Do not import this class directly; use its methods through ctdfjorder.CTD.CTD.clean().

clean_salinity_ai(profile: pl.DataFrame, profile_id: int) → pl.DataFrame

Cleans salinity using a GRU (Gated Recurrent Unit) machine learning model.

Parameters:
  • profile (pl.DataFrame) – Single profile of CTD (Conductivity, Temperature, Depth) data.

  • profile_id (int) – Profile number, used for identification and logging.

Returns:

CTD data with cleaned salinity values.

Return type:

pl.DataFrame

Notes

  • This process bins the data every 0.5 dbar of pressure.

  • The GRU model uses a 16-unit GRU layer and a loss function designed to penalize salinity values that decrease as pressure increases.

  • The penalization term is weighted to emphasize the importance of non-decreasing salinity with increasing pressure.
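The 0.5 dbar binning step can be illustrated with a minimal pure-Python sketch. The function name `bin_by_pressure`, the use of a mean within each bin, and the alignment of bin edges to multiples of 0.5 dbar are assumptions made for illustration; the library itself works on pl.DataFrame objects and may aggregate differently.

```python
from collections import defaultdict
from math import floor

BIN_WIDTH_DBAR = 0.5  # bin size stated in the notes above


def bin_by_pressure(pressure, salinity, width=BIN_WIDTH_DBAR):
    """Average salinity within fixed-width pressure bins (hypothetical helper).

    Returns a list of (bin_start_pressure, mean_salinity) tuples,
    sorted by increasing pressure.
    """
    bins = defaultdict(list)
    for p, s in zip(pressure, salinity):
        # Bin index: number of whole `width` intervals below this pressure
        bins[floor(p / width)].append(s)
    return [(idx * width, sum(vals) / len(vals)) for idx, vals in sorted(bins.items())]


# Five samples that fall into three 0.5 dbar bins
pressure = [0.1, 0.3, 0.6, 0.9, 1.2]
salinity = [30.0, 30.2, 30.4, 30.6, 31.0]
binned = bin_by_pressure(pressure, salinity)
```

Binning onto a regular pressure grid gives the model evenly spaced input steps, whatever the original sampling rate of the cast.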

GRU Model

The GRU model consists of:

  • An input layer that takes the binned CTD data.

  • A GRU layer with 16 units.

  • An output layer that predicts the salinity values.

Loss Function

The custom loss function used in this model is the Mean Absolute Error (MAE) with an additional penalty term for salinity predictions that decrease with pressure. This is given by:

\[L = \text{MAE}(y_{true}, y_{pred}) + \lambda \cdot \text{mean}(\text{penalties})\]

where penalties are calculated as:

\[\text{penalties} = \text{where}(\Delta s_{pred} < 0, -\min(\Delta s_{pred}, 0), 0)\]

Here, \(\Delta s_{pred}\) is the difference between consecutive predicted salinity values, ordered by increasing pressure, and \(\lambda\) is the weighting factor for the penalty term.
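As a rough illustration of how such a loss behaves, the sketch below computes it for plain Python lists. It is not the library's implementation: the name `monotonic_mae_loss` and the default value of `lam` are hypothetical, and the penalty is taken as the magnitude of each decrease (an assumption about the sign convention) so that decreasing salinity raises the loss.

```python
def monotonic_mae_loss(y_true, y_pred, lam=1.0):
    """MAE plus a weighted penalty for salinity that decreases with pressure.

    `y_true` and `y_pred` are salinity sequences ordered by increasing pressure.
    `lam` is the penalty weight (lambda); the default is a placeholder value.
    """
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    # Differences between consecutive predicted values
    deltas = [b - a for a, b in zip(y_pred, y_pred[1:])]
    # Magnitude of each decrease; zero where salinity is flat or increasing
    penalties = [-min(d, 0.0) for d in deltas]
    penalty = sum(penalties) / len(penalties) if penalties else 0.0
    return mae + lam * penalty


y_true = [30.0, 31.0, 32.0]
monotonic_pred = [30.0, 31.0, 32.0]  # never decreases, so no penalty
dipping_pred = [30.0, 29.0, 32.0]    # drops by 1.0 between the first two bins

loss_monotonic = monotonic_mae_loss(y_true, monotonic_pred, lam=2.0)
loss_dipping = monotonic_mae_loss(y_true, dipping_pred, lam=2.0)
```

With these inputs, the dipping prediction pays both the ordinary MAE for its error and an extra term proportional to the size of the drop, so a larger `lam` pushes the model more strongly toward non-decreasing salinity profiles.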

Examples

>>> cleaned_data = AI.clean_salinity_ai(profile, profile_id=1)
>>> cleaned_data.head()