AI
- class ctdfjorder.ai.ai.AI
This class should not be imported directly; call ctdfjorder.CTD.CTD.clean() to use the methods defined here.
- static clean_salinity_ai(profile: DataFrame, profile_id: int) → DataFrame
Cleans salinity using a GRU (Gated Recurrent Unit) machine learning model.
- Parameters:
profile (pl.DataFrame) – Single profile of CTD (Conductivity, Temperature, Depth) data.
profile_id (int) – Profile number, used for identification and logging.
- Returns:
CTD data with cleaned salinity values.
- Return type:
pl.DataFrame
Notes
The data is binned into 0.5 dbar pressure intervals before being passed to the model.
The GRU model uses a single 16-unit layer with a loss function designed to penalize salinity values that decrease with increasing pressure.
The penalty term is weighted to emphasize that salinity should be non-decreasing as pressure increases.
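As an illustration of the 0.5 dbar binning step, the sketch below groups (pressure, salinity) pairs into fixed-width pressure bins using only the standard library. The function name bin_by_pressure, the floor-based bin edges, and the use of the mean as the aggregate are assumptions made for this example; the source does not state how ctdfjorder assigns or aggregates bins.

```python
import math
from collections import defaultdict


def bin_by_pressure(rows, bin_width=0.5):
    """Average salinity within fixed-width pressure bins.

    rows: iterable of (pressure_dbar, salinity) pairs.
    Returns a list of (bin_start_dbar, mean_salinity), sorted by pressure.
    Hypothetical helper: bin edges and mean aggregation are assumptions.
    """
    bins = defaultdict(list)
    for pressure, salinity in rows:
        # Lower edge of the bin that contains this pressure reading
        bin_start = math.floor(pressure / bin_width) * bin_width
        bins[bin_start].append(salinity)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(bins.items())]


samples = [(0.1, 30.0), (0.4, 32.0), (0.6, 33.0), (1.2, 34.0)]
binned = bin_by_pressure(samples)
```

Here the two readings at 0.1 and 0.4 dbar share the first bin and are averaged, while the remaining readings each fall into their own bin.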
GRU Model
The GRU model consists of:
An input layer that takes the binned CTD data.
A GRU layer with 16 units.
An output layer that predicts the salinity values.
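To make the three-part structure above concrete, here is a dependency-free sketch of a forward pass: each binned sample is fed through a 16-unit GRU cell, and a linear output maps the hidden state to one salinity prediction per pressure bin. This is not the library's implementation. The class name TinyGRU, the random untrained weights, the omission of bias terms, the update convention h' = (1 - z) * h + z * candidate, and the three-feature input are all assumptions made for illustration.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyGRU:
    """Hypothetical, untrained GRU with a linear output (biases omitted)."""

    def __init__(self, n_features, units=16, seed=0):
        rng = random.Random(seed)
        self.units = units
        # Input (W) and recurrent (U) weights for the update (z),
        # reset (r), and candidate (h) computations
        self.W = {g: random_matrix(units, n_features, rng) for g in "zrh"}
        self.U = {g: random_matrix(units, units, rng) for g in "zrh"}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(units)]

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(self.W["h"], x), matvec(self.U["h"], gated))]
        # Blend the previous state with the candidate state
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def predict(self, sequence):
        h = [0.0] * self.units
        outputs = []
        for x in sequence:  # one step per pressure bin
            h = self.step(x, h)
            outputs.append(sum(w * hi for w, hi in zip(self.w_out, h)))
        return outputs


# Three bins with made-up (pressure, temperature, salinity) features
model = TinyGRU(n_features=3)
sequence = [[0.5, 8.0, 30.0], [1.0, 7.5, 30.5], [1.5, 7.0, 31.0]]
predictions = model.predict(sequence)
```

The point of the sketch is the data flow: a profile of N bins goes in, a 16-element hidden state is carried from bin to bin, and N salinity values come out.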
Loss Function
The custom loss function used in this model is the Mean Absolute Error (MAE) with an additional penalty term for salinity predictions that decrease with pressure. This is given by:
\[L = \text{MAE}(y_{true}, y_{pred}) + \lambda \cdot \text{mean}(\text{penalties})\]
where penalties are calculated as:
\[\text{penalties} = \text{where}(\Delta s_{pred} < 0, \min(\Delta s_{pred}, 0), 0)\]
Here, \(\Delta s_{pred}\) represents the change in predicted salinity values, and \(\lambda\) is the weighting factor for the penalty term.
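The loss can be sketched in plain Python for a single profile, as below. The function name monotonic_mae_loss and the default weight of 1.0 are hypothetical. One further assumption needs flagging: as written, \(\min(\Delta s_{pred}, 0)\) is never positive, so the sketch negates it to make each decrease in predicted salinity raise the loss, matching the stated intent of penalizing decreases. Whether the library applies the sign this way is not confirmed by the text above.

```python
def monotonic_mae_loss(y_true, y_pred, penalty_weight=1.0):
    """MAE plus a weighted penalty for salinity that decreases with pressure.

    y_true, y_pred: equal-length salinity sequences ordered by increasing
    pressure. Hypothetical helper; penalty_weight plays the role of lambda.
    """
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

    # Change in predicted salinity between consecutive pressure bins
    deltas = [later - earlier for earlier, later in zip(y_pred, y_pred[1:])]

    # Assumption: use the magnitude of each decrease, i.e. -min(delta, 0),
    # so that a decrease adds to the loss; increases contribute 0
    penalties = [-min(d, 0.0) for d in deltas]
    mean_penalty = sum(penalties) / len(penalties) if penalties else 0.0

    return mae + penalty_weight * mean_penalty


truth = [30.0, 31.0, 32.0]
loss_monotonic = monotonic_mae_loss(truth, [30.0, 31.0, 32.0])
loss_dip = monotonic_mae_loss([30.0, 29.0, 32.0], [30.0, 29.0, 32.0],
                              penalty_weight=2.0)
```

In the second call the prediction matches the target exactly, so the MAE is zero and the whole loss comes from the single 1.0 drop between the first two bins.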
Examples
>>> cleaned_data = AI.clean_salinity_ai(profile, profile_id=1)
>>> cleaned_data.head()