Formula
Add new features to your dataset.
Inputs
- Data: input dataset
Outputs
- Data: dataset with additional features
Formula allows computing new columns by combining the existing ones with a user-defined expression. The resulting column can be categorical, numerical or textual.
For numeric variables, it suffices to provide a name and an expression.
 
- List of constructed variables
- Add or remove variables
- New feature name
- Expression in Python
- Select a feature
- Select a function
- Produce a report
- Press Send to communicate changes
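As a rough illustration of what a numeric expression computes, the sketch below evaluates one in plain Python, outside the widget. The feature names sepal_length and sepal_width and the new feature name sepal_area are assumptions chosen for the example; in the widget, only the right-hand side would be typed into the expression field.

```python
# Hypothetical values for one row; inside the widget, a feature name
# stands for the value of that feature in the current row.
sepal_length = 5.1
sepal_width = 3.5

# Expression for a new numeric feature (the name "sepal_area" is made up
# for this example).
sepal_area = sepal_length * sepal_width
print(sepal_area)
```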
The following example shows the construction of a categorical variable: its value is "lower" if "sepal length" is below 6, "mid" if it is at least 6 but below 7, and "higher" otherwise. Note that spaces in feature names need to be replaced by underscores in the expression (sepal_length).
 
- List of variable definitions
- Add or remove variables
- New feature name
- Expression in Python
- If checked, the feature is put among meta attributes
- Select a feature to use in expression
- Select a function to use in expression
- Optional list of values, used to define their order
- Press Send to compute and output data
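The categorical example described above could be written as a chained conditional expression. The sketch below wraps it in a helper function (size_class is a hypothetical name used only for testing here) so it can be tried in plain Python; in the widget, only the expression after return would be entered.

```python
def size_class(sepal_length):
    # Chained if-else: the first condition that holds decides the value.
    return "lower" if sepal_length < 6 else "mid" if sepal_length < 7 else "higher"

# Try the expression on a few hypothetical measurements.
for value in (5.1, 6.0, 6.9, 7.2):
    print(value, size_class(value))
```

Listing the values as lower, mid, higher in the optional list of values would presumably fix their order in that sequence.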
Hints
If you are unfamiliar with Python's syntax for mathematical expressions, here is a quick introduction.
Expressions can use the following operators:
- +, -, *, /: addition, subtraction, multiplication, division
- //: integer division
- %: remainder after integer division
- **: exponentiation (for a square root, raise to the power of 0.5)
- <, >, <=, >=: less than, greater than, less or equal, greater or equal
- ==: equal
- !=: not equal
- if-else: value if condition else other-value (see the above example)
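The operators listed above behave as in ordinary Python. The following sketch runs them on two made-up values (x and y are placeholder names, not features from any dataset):

```python
x = 7
y = 2

quotient = x / y          # division: 3.5
floor_quotient = x // y   # integer division: 3
remainder = x % y         # remainder after integer division: 1
power = x ** y            # exponentiation: 49
root = 16 ** 0.5          # square root via the power 0.5: 4.0

is_equal = x == y         # False
is_different = x != y     # True
at_least = x >= y         # True

# Conditional expression: value if condition else other-value
label = "big" if x >= 5 else "small"
print(quotient, floor_quotient, remainder, power, root, label)
```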
See more here.