The most substantial part of creating a Dataflow is unquestionably the Transform & Map step, and while this of course is covered in the general guide, Introduction to the Data Manager: How to set up a Dataflow, a more detailed look at the options that this feature offers in terms of converting and transforming columns may be necessary. Thus, this list.
1: Convert
The most straightforward way to adjust your data is to simply convert it from one form or format to another. In most cases, it is possible to select a fallback value that will be employed as a default if a line is empty or invalid.
When selecting Convert, you will have the following options:
- Boolean
A format used for purely digital datapoints of the type true/false. In some cases, the data will only contain a 'true' where relevant, and leave other fields blank - making it particularly important to use the Fallback Value field in this case, by defining it as 'false'. Also note that ONLY values of 'true' and 'false' can be read, so if any other format is used - such as 'yes/no' or '0/1', a Replace-transformation will be required in order to make the column readable.
- Currency
Used to convert financial data from one currency to another, based on current exchange-rates. Available currencies include Danish, Swedish and Norwegian Kroner, US Dollars, British Pounds, Euros, Swiss Francs and Polish Zloty. You must also make sure to select which kind of decimal point is used by the column - with en being a period and da being a comma. The output number is also automatically converted to the Decimal format.
- DateTime
Used for timestamps of various kinds, this frequently-needed conversion allows you to choose from a broad selection of common date and date-time formats, or enter a custom one that fits the data. Since this provides an often-crucial timestamp, DateTime-format fields are frequently mandatory.
- Decimal
Used for decimal-numbers, such as cash values. You must select whether the numbers in question use a period as a decimal point (as in most anglophone countries, including the US and the UK) or a comma, as in many other cultures including Denmark and Norway. Be careful in selecting the correct one, since major errors can otherwise occur!
Used for converting E-Mail lists into a format readable by the system. Typos in the e-mails can cause issues, with common ones including anything that ends with .con or .vom instead of .com - hence why the 'Clear out invalid emails' option is toggled on by default. If there is a large number of these, keep in mind that the Transform option makes it possible to do a Replace while converting the data into the Email format.
- Enum
This advanced conversion allows you to set one or more specific values, be they numbers or text-strings. Only these known values will be accepted, and anything outside of them will generate an error. Useful for cases where you want a very rigid output and no margin of error, such as output from radio-buttons or checkboxes.
- Guid
A specific format used for personal ID's, often customer IDs. The most common example of this is likely the Cookie ID that may be used in cases when you have nothing more solid, such as e-mails or similar, to fall back on.
- Integer
This simply converts a datastring into the Integer-format, used for clean numbers without decimals. Often used for non-monetary counts such as inventory, sales-numbers, visits, and much, much more.
- Regex
This conversion allows you to apply a Regular Expression - a 'Regex' term - to the column. Only values that match this term will be retained, all other data will be discarded. For example, if you have a column featuring the full name of customers, you could use ^\w+ to isolate the first name only, or \w+$ to isolate the last name only. The output is formatted as a regular String.
- String
A String can be virtually anything, and is the 'default' format that the system imports all columns in. In the case of text-strings, such as product descriptions or lists of names, the column will generally remain in this format. In the case of a pure Convert, there is little reason to use this option, but it is there for a reason - having utility when using Transform instead since it allows you to run Transform-functions on a String without changing its format.
2: Transform
Transform is a more complex and powerful way to convert your data into the desired form. Not only does it allow for multiple columns to be combined into a single one, it can also apply a number of unique Transformations to a selected column prior to Converting them as usual.
When transforming a column, you have the following options:
- Add Default Value
Similar to the Fallback Value setting available in many Conversions, this sets a value that will be applied to any empty line in the column.
- AES Decryption
Used for AES Encrypted columns - enter the correct Decryption Key to return the data to a readable format, which can then be Converted as normal.
- Format
Mostly used for text-strings. You can type in whatever you like, using {0} to represent the actual column content. For example, if the column is a list of brands, and you wish to output a text-string that states "Produced by [Brand]", you would simply type 'Produced by {0}' in the field.
- Regex Replace
Use a Regular Expression to identify values. Those that match the expression will then be replaced by the set Replace Value. Essentially a more complex version of the regular Replace Transformation.
- Regex Select
Effectively identical to the Regex Conversion-option, allowing you to select values based on a Regular Expression before applying a different Conversion to a given column. For example, if one has a list of payments featuring decimals and currency-type, but you want to convert it into an Integer-format showing just the whole numbers, you could use the expression \d+ to isolate just the first whole number on each line - at which point it can be easily Converted into an Integer.
- Replace
This transformation allows you to effectively run a search-and-replace on the column. Whatever is selected as the Current Value will be transformed into whatever is selected as the New Value. This can apply to text-strings or numbers as needed. One possible use for this is to reduce the number of invalid addresses in an e-mail list by replacing common typos such as .con and .vom with .com.
- Substring
Allows you to select a specific segment of a column to be used, discarding the rest. Startindex denotes where to begin, while Length determines how far beyond that point to continue. For example, selecting '5' and '10' as the values will cause the first 4 digits to be discarded, the 10 that follows to be retained, and anything after that to be similarly ignored. Can, for example, be used to pick a relevant value out of an otherwise regular URL.
When combining columns, you have the following options:
- Fallback
By selecting a specific value, with 'empty' being an obvious option, you can set the system to use the next column in the order instead. For example, if Column 1 is empty, the corresponding line from Column 2 will be used instead, if both are empty then Column 3 is used... A 'backup' fallback can also still be set, which will then be utilized if all of the included columns are empty.
- Format
Similar to the Format Transformation, this allows you to similarly enter a desired string, inserting all the selected columns wherever desired by using {0}, {1}, {2}, etc. to denote column 1, 2, 3 and so forth. Keep in mind that it always counts from 0. Example: You are selling clothes, and column 1 contains the type of item, while column 2 is the color. You could thus enter 'A {0} {1}' to get output such as 'A red dress' or 'A blue shirt'.
- Join
The values are of the columns are simply combined in a single field, using the selected symbol - such as a comma or dash - to denote the point where they are are joined.
Comments
0 comments
Please sign in to leave a comment.