TRAIN MODEL
Purpose
Use the TRAIN MODEL statement to build a model by training a modeltype on the columns in a given table.
Syntax
Diagram
trainModel
trainDataClause
columnNameList
trainDataConditionClause
trainSampleClause
trainModelOptionsClause
optionKeyValue
Keywords and Parameters
modelName
This is an identifier that specifies the name of the model to be built.
modeltypeName
This is an identifier that specifies the name of the modeltype to be used for model training.
UPDATE
Use the UPDATE clause if you want to update an existing model by training it on additional data.
LIKE
Use the LIKE clause if you want to train a model on the same columns as an existing model.
exModelName
This is an identifier that specifies the name of the existing model.
trainDataClause
Specify the target data for model training. To train a model on columns from multiple tables, specify them using the JOIN clause.
schemaName
This is an identifier that specifies the name of the schema that contains the training target table. If not specified, the default (current) schema is used.
tableName
This is an identifier that specifies the name of the training target table.
columnNameList
Specify the target columns for model training. Multiple columns can be specified as a comma-separated list.
trainDataConditionClause
Specify the conditions for retrieving target data for model training. This clause is used to specify join conditions for training a model on multiple tables, or to filter target data for updating an existing model.
trainSampleClause
Use the SAMPLE clause if you want to use only a part of the original table as training data.
trainModelOptionsClause
Specify the model training options, including hyperparameters such as epochs. The options that can be specified depend on the modeltype.
'optionKey'
This is a string literal that specifies the key of the option.
optionValue
This is a string literal or a numeric value that specifies the value of the option.
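The key and value rules above can be illustrated with a small client-side helper that renders an OPTIONS clause from a dictionary. This is only a sketch: the function names sql_literal and render_options are hypothetical, and both the comma separator between multiple options and the doubling of embedded single quotes are assumptions based on common SQL conventions, not something this page confirms.

```python
def sql_literal(value):
    """Render a Python value as a string literal or a numeric value."""
    if isinstance(value, bool):
        # Booleans are not mentioned as option values, so they are rejected here.
        raise TypeError("boolean option values are not covered by this sketch")
    if isinstance(value, (int, float)):
        return str(value)
    # Assumption: a single quote inside a string literal is escaped by doubling it.
    return "'" + str(value).replace("'", "''") + "'"


def render_options(options):
    """Render a dict as OPTIONS ( 'key' = value, ... ).

    Assumption: multiple key/value pairs are separated by commas.
    """
    pairs = ", ".join(
        f"{sql_literal(str(key))} = {sql_literal(value)}"
        for key, value in options.items()
    )
    return f"OPTIONS ( {pairs} )"


print(render_options({"epochs": 100}))
```

Keys are always rendered as quoted string literals, while values are quoted only when they are not numeric, which mirrors the descriptions of optionKey and optionValue.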
Examples
Training a Model
The following statement trains a model tgan of the tablegan modeltype on the columns reordered and add_to_cart_order of the order_products table in the instacart schema.
TRAIN MODEL tgan MODELTYPE tablegan
FROM instacart.order_products(reordered, add_to_cart_order);
Adding the OPTIONS clause also lets you specify the epochs hyperparameter.
TRAIN MODEL tgan MODELTYPE tablegan
FROM instacart.order_products(reordered, add_to_cart_order)
OPTIONS ( 'epochs' = 100 );
It is possible to train a model with data from multiple tables, as shown below.
TRAIN MODEL tgan_multi_tables MODELTYPE tablegan
FROM instacart.order_products(reordered, add_to_cart_order, order_id)
JOIN instacart.orders(order_id, order_dow)
ON orders.order_id = order_products.order_id;
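When such statements are generated by a client program, the clause order used in the multi-table statement above (FROM, then JOIN, then ON) can be captured in a small builder. The following is a minimal sketch, not part of the product: the names table_ref and train_model_statement are hypothetical, and support for more than one JOIN is an assumption that this page does not confirm.

```python
def table_ref(table, columns):
    """Render one training target as table(col1, col2, ...)."""
    return f"{table}({', '.join(columns)})"


def train_model_statement(model, modeltype, tables, condition=None):
    """Assemble a TRAIN MODEL statement as text.

    tables is a list of (qualified_table_name, [column, ...]) pairs.
    The first pair follows FROM; each remaining pair follows JOIN
    (assumption: additional tables are added with repeated JOINs).
    """
    if not tables:
        raise ValueError("at least one training table is required")
    first, *rest = tables
    lines = [
        f"TRAIN MODEL {model} MODELTYPE {modeltype}",
        f"FROM {table_ref(*first)}",
    ]
    lines += [f"JOIN {table_ref(*other)}" for other in rest]
    if condition is not None:
        # The condition is passed through unchanged, e.g. a join condition.
        lines.append(f"ON {condition}")
    return "\n".join(lines) + ";"


statement = train_model_statement(
    "tgan_multi_tables",
    "tablegan",
    [
        ("instacart.order_products", ["reordered", "add_to_cart_order", "order_id"]),
        ("instacart.orders", ["order_id", "order_dow"]),
    ],
    condition="orders.order_id = order_products.order_id",
)
print(statement)
```

The call shown reproduces the multi-table statement above. The builder does no validation or quoting of identifiers, so it should only be given trusted names.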
Updating a Model
The following statements first train a model rspn_op of the rspn modeltype on the columns reordered and add_to_cart_order of the order_products table in the instacart schema, and then train a new model rspn_op_update by updating rspn_op with additional data, here the rows whose order_id is greater than 3000000.
TRAIN MODEL rspn_op MODELTYPE rspn
FROM instacart.order_products(reordered, add_to_cart_order);
TRAIN MODEL rspn_op_update UPDATE rspn_op
ON order_products.order_id > 3000000;