
© 2014 TIMi S.A.S. – TIMi: Faster predictions, better decisions.

Training objectives
What will you be able to do at the end of the training?

• Build Anatella scripts to transform data.


• Extract and load data to and from Anatella.
• Become proficient at using Anatella.

Presentation Exercises


Training Agenda
• What is Anatella

• The Anatella Environment

• Basic Operations of Anatella

• Anatella boxes you cannot live without

• Practical exercise

What is Anatella?
Anatella is an ETL tool: it extracts, transforms, and loads data.

Anatella is a data transformation tool
• known as an “ETL tool”, an acronym for “Extract, Transform and Load”.

[Diagram: Data file → Extract → Transformations → Load → Transformed results]

Anatella is user-friendly:
• Most data transformations are metadata-free: you don’t need to care about the data type of a column. In this regard, Anatella is like MS-Excel: in MS-Excel, you don’t need to specify the data type of your columns/cells, and neither do you in Anatella. Anatella is only slightly more complex than MS-Excel.
• Most data transformations are code-free: you only need to connect "boxes":

[Diagram: three linked boxes: Filter (where) → Group by → Order by]

The equivalent SQL query:

SELECT education, COUNT(education) AS count
FROM table
WHERE sex = 'Female'
GROUP BY education
ORDER BY count
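
For readers who think in code rather than SQL, the same filter, group-by and order-by chain can be sketched in plain Python. This is only an illustration of the three steps: the sample rows and the function name `count_education` are hypothetical, and nothing here reflects how Anatella works internally.

```python
from collections import Counter

# Hypothetical sample rows; the column names follow the SQL query above.
rows = [
    {"sex": "Female", "education": "Masters"},
    {"sex": "Male", "education": "Masters"},
    {"sex": "Female", "education": "Bachelors"},
    {"sex": "Female", "education": "Masters"},
]


def count_education(rows):
    # Filter step (WHERE sex = 'Female')
    females = [r for r in rows if r["sex"] == "Female"]
    # Group-by step: count the rows for each education value
    counts = Counter(r["education"] for r in females)
    # Order-by step: sort the groups by their count, ascending
    return sorted(counts.items(), key=lambda item: item[1])


result = count_education(rows)
```

Each line of the function corresponds to one box in the graph, which is the point of the slide: the steps are chained one after the other.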
The advantages of Anatella

Fast to execute and debug, easier to understand


Fast to execute
• It has been optimized to the maximum
Fast to debug
• Data can be viewed after each step (not only at the final output)
Easy to understand: a graphical interface accessible to non-programmers
• Step-by-step logic; no coding required.

[Figure: Anatella compared with SQL]



The Anatella environment

Anatella has a user-friendly interface with no coding required.

[Screenshot of the Anatella interface, with these areas labelled:]
• Menu: list of menus for quick actions.
• Data Table: the results of an action are displayed here.
• Action Properties: where the properties of an action box are modified.
• Log: the log file of actions and transformations is kept here.

The Anatella environment

Data is extracted, transformed and loaded in an easy, intuitive way

• Extraction: from scratch, from text files, from gel files, from databases.
• Transformations on the data: sorts, aggregations, calculations, graph analysis, …
• Load: into flat files, gel files, databases.



Building and running Anatella scripts

Anatella transformation graphs are composed of boxes linked by arrows

• Boxes indicate operations on the underlying data.
• Arrows indicate that the data at the output pin of one box is used as input for the following box.
• The flag is used to show the termination of the graph.

Building and running Anatella scripts

Example of how to build a transformation graph

1. Select the “connect” mode to build arrows. In this mode, click on the output pin of the outgoing box, then on the input pin of the incoming box, to create an arrow.
2. Drag and drop boxes from the right panel to the middle frame to add them to the graph.
3. Double-click on a box to edit its properties in the lower left frame.
4. Right-click on the flag and select the green arrow to run the complete graph.

Building and running Anatella scripts

Testing scripts for intermediate results

• Click on “run” to switch to run mode.
• In run mode, when clicking on an output pin, the graph runs from the last saved result up to this output pin.
• The status bar shows the overall progress of the calculation.
• This icon (the rotating cube) shows that the box is currently running.

Building and running Anatella scripts
Running a transformation graph
Box Description

Run to finish line:
Click on this flag to run the graph from the last cached point to the flag. This is a very useful way to test scripts quickly and efficiently; however, it is not advised in production situations.

Delete all caches and run to finish line:
Click on this flag to run the graph from the beginning until the flag. This method deletes all saved caches on the graph. It is best practice to use this method in production.



Operations of Standard Boxes

It is important to understand the most commonly used boxes. They are grouped as follows:
• Extraction and Loading
• Transformation
• Automation


Extraction boxes
Boxes Description

Extraction types:
These boxes are used for extracting data from flat files or from Gel files. Gel files are a highly optimized data file format that is unique to the Anatella software.

Example of box parameters (Read .csv):
• File name
• Column delimiter
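
To illustrate what the two parameters control, here is a minimal Python sketch that reads delimited text with the standard `csv` module. It is not the Anatella box itself; the helper name `read_delimited`, the `;` delimiter and the sample data are assumptions made for the example.

```python
import csv
import io


def read_delimited(file_obj, delimiter=";"):
    # Each data row becomes a dict keyed by the names on the header line.
    return list(csv.DictReader(file_obj, delimiter=delimiter))


# In-memory stand-in for a file on disk (hypothetical content).
sample = io.StringIO("Office;Agent;Amount\nDBN;Adam;400\nJHB;Paul;450\n")
rows = read_delimited(sample)
```

With a real file, `open("some_file.csv", newline="")` would replace the `io.StringIO` object; the file name plays the role of the first parameter and `delimiter` the role of the second.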


Automation boxes
Box Description

Global runner:
Scripts often need to run automatically. The global runner box is used as an “end” box for a script. Anatella’s top menu has a global runner icon; when it is clicked, all boxes linked to a global runner box will run.


Automation boxes
Box Description

Parallel run:
The parallel run box is used to run a series of Anatella scripts from one script. Scripts are often built in isolation, each doing a certain transformation; the parallel run box lets users create a list of scripts and runs all of them in the specified order.


Transformation boxes
Box Description

Append box:
This box is used to append one table to another. It is equivalent to a “UNION” statement in SQL.

Input table 1:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300

Input table 2:
Office  Agent  Amount
PTA     Dela   R 320
ELN     Chris  R 470
CPT     Adam   R 800
JHB     John   R 120

Result (Union):
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
PTA     Dela   R 320
ELN     Chris  R 470
CPT     Adam   R 800
JHB     John   R 120
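
The append operation can be sketched in a few lines of Python. The function name `append_tables` is hypothetical, and the sketch assumes that rows are simply stacked with no de-duplication (which is closer to SQL's UNION ALL); whether the Anatella box removes duplicates is not stated in these slides.

```python
# The two input tables from the example, as (Office, Agent, Amount) tuples.
table_1 = [
    ("DBN", "Adam", 400),
    ("JHB", "Paul", 450),
    ("CPT", "Lilly", 620),
    ("GMR", "Jenny", 300),
]
table_2 = [
    ("PTA", "Dela", 320),
    ("ELN", "Chris", 470),
    ("CPT", "Adam", 800),
    ("JHB", "John", 120),
]


def append_tables(*tables):
    # Stack the rows of every table, in order, into one new list.
    result = []
    for table in tables:
        result.extend(table)
    return result


appended = append_tables(table_1, table_2)
```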


Transformation boxes
Box Description

Single Join box:
Used to join two tables on a specific key value. The data must be sorted on the key beforehand.

Input table 1:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300

Input table 2:
Office  Area
JHB     63
DBN     21
PTA     83
CPT     112

Result (Left Join):
Office  Agent  Amount  Area
DBN     Adam   R 400   21
JHB     Paul   R 450   63
CPT     Lilly  R 620   112
GMR     Jenny  R 300


Transformation boxes
Box Description

Single Join box:
Used to join two tables on a specific key value. The data must be sorted on the key beforehand.

Input table 1:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300

Input table 2:
Office  Area
JHB     63
DBN     21
PTA     83
CPT     112

Result (Inner Join):
Office  Agent  Amount  Area
DBN     Adam   R 400   21
JHB     Paul   R 450   63
CPT     Lilly  R 620   112


Transformation boxes
Box Description

Single Join box:
Used to join two tables on a specific key value. The data must be sorted on the key beforehand.

Input table 1:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300

Input table 2:
Office  Area
JHB     63
DBN     21
PTA     83
CPT     112

Result (Full Outer Join):
Office  Agent  Amount  Area
DBN     Adam   R 400   21
JHB     Paul   R 450   63
CPT     Lilly  R 620   112
GMR     Jenny  R 300
PTA                    83
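
The difference between the left, inner and full outer joins can be sketched in Python as follows. This is an illustration only: the function `join` and its `how` parameter are hypothetical, it uses an in-memory dictionary lookup rather than the sorted-input approach of the Single Join box, and it assumes each key appears at most once in the second table. Missing values are represented by absent dictionary keys.

```python
sales = [
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
]
areas = [
    {"Office": "JHB", "Area": 63},
    {"Office": "DBN", "Area": 21},
    {"Office": "PTA", "Area": 83},
    {"Office": "CPT", "Area": 112},
]


def join(left, right, key, how="left"):
    """Join two lists of dicts on `key`; `how` is 'left', 'inner' or 'full'."""
    lookup = {row[key]: row for row in right}
    matched = set()
    result = []
    for row in left:
        match = lookup.get(row[key])
        if match is not None:
            matched.add(row[key])
            result.append({**row, **match})
        elif how in ("left", "full"):
            # Keep the unmatched left row; its right-hand columns stay missing.
            result.append(dict(row))
    if how == "full":
        # Add the right rows that never found a partner on the left.
        for value, row in lookup.items():
            if value not in matched:
                result.append(dict(row))
    return result


left_result = join(sales, areas, "Office", how="left")
inner_result = join(sales, areas, "Office", how="inner")
full_result = join(sales, areas, "Office", how="full")
```

The three results have 4, 3 and 5 rows respectively, matching the three example tables on the join slides.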


Transformation boxes
Box Description
Multi Join box:
Similar to the Single Join box; however, tables can be joined on multiple join keys specified by the user. The keys do not have to be sorted, as the complete slave tables are stored in memory.

Master table:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300

Slave table 1:
Office  Area
JHB     63
DBN     21
PTA     83
CPT     112

Slave table 2:
Agent   Target
Adam    R 750
Lilly   R 750
Michel  R 650

Result (Multiple Left Join):
Office  Agent  Amount  Area  Target
DBN     Adam   R 400   21    R 750
JHB     Paul   R 450   63
CPT     Lilly  R 620   112   R 750
GMR     Jenny  R 300
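
The idea of holding the slave tables in memory can be sketched in Python with one dictionary per slave table. The function `multi_left_join` is a hypothetical name, and the sketch assumes each key is unique within its slave table; it is not a description of Anatella's implementation.

```python
master = [
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
]
areas = [
    {"Office": "JHB", "Area": 63},
    {"Office": "DBN", "Area": 21},
    {"Office": "PTA", "Area": 83},
    {"Office": "CPT", "Area": 112},
]
targets = [
    {"Agent": "Adam", "Target": 750},
    {"Agent": "Lilly", "Target": 750},
    {"Agent": "Michel", "Target": 650},
]


def multi_left_join(master, slaves):
    """Left-join `master` with several (key, rows) slave tables."""
    result = [dict(row) for row in master]
    for key, rows in slaves:
        # The whole slave table is loaded into a dict, so no sorting is needed.
        lookup = {row[key]: row for row in rows}
        for row in result:
            row.update(lookup.get(row[key], {}))
    return result


joined = multi_left_join(master, [("Office", areas), ("Agent", targets)])
```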


Transformation boxes
Box Description

Sort box:
This box is used to sort data. It is very commonly used in Anatella, because sorting is mandatory before some other tasks can be completed. For example, data has to be sorted on the join key before joining.

Input:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300

Result (sorted by descending Amount):
Office  Agent  Amount
CPT     Lilly  R 620
JHB     Paul   R 450
DBN     Adam   R 400
GMR     Jenny  R 300
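
As a point of comparison, the same sort can be sketched in Python. The sketch assumes the example is ordered by descending Amount, which is what the result table shows.

```python
from operator import itemgetter

rows = [
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
]

# Sort on the Amount column, largest value first.
sorted_rows = sorted(rows, key=itemgetter("Amount"), reverse=True)
offices = [row["Office"] for row in sorted_rows]
```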


Transformation boxes
Box Description

Aggregation box:
All aggregation processes are done with this box. It is equivalent to a “GROUP BY” statement in SQL.

Input:
Office  Agent  Amount
DBN     Adam   R 400
JHB     Paul   R 450
CPT     Lilly  R 620
GMR     Jenny  R 300
PTA     Dela   R 320
ELN     Chris  R 470
CPT     Adam   R 800
JHB     John   R 120

Result:
Office  Amount_sum
DBN     R 400
JHB     R 570
CPT     R 1420
GMR     R 300
PTA     R 320
ELN     R 470
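
The sum-per-office aggregation in this example can be sketched in Python as below. The helper name `sum_by` is hypothetical; the sketch only covers the sum, whereas the slides say the box handles all aggregation processes.

```python
rows = [
    {"Office": "DBN", "Agent": "Adam", "Amount": 400},
    {"Office": "JHB", "Agent": "Paul", "Amount": 450},
    {"Office": "CPT", "Agent": "Lilly", "Amount": 620},
    {"Office": "GMR", "Agent": "Jenny", "Amount": 300},
    {"Office": "PTA", "Agent": "Dela", "Amount": 320},
    {"Office": "ELN", "Agent": "Chris", "Amount": 470},
    {"Office": "CPT", "Agent": "Adam", "Amount": 800},
    {"Office": "JHB", "Agent": "John", "Amount": 120},
]


def sum_by(rows, key, value):
    # Accumulate `value` for each distinct `key`, like GROUP BY key with SUM(value).
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row[value]
    return totals


amount_sum = sum_by(rows, "Office", "Amount")
```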


Transformation boxes
Box Description

Data type wizard box:
Used to convert data types. For example, change integers to float values or to string values.

String manipulation box:
This box is used to treat/clean strings (text). E.g.: remove brackets, convert all letters to capitals, or replace words with other words.

Column rename box:
This box is used to rename columns in your data. It is often useful before loading the data into Excel/Tableau/QlikView.

Column selection box:
This box is used to choose certain columns in the data. It is equivalent to a “SELECT” statement in SQL.

Date formatter box:
This box is used to convert dates to a specific format.
For example, “2012-02-02” to “12/02/02”
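
The date conversion in this example can be sketched in Python with `datetime`. The function name `reformat_date` is hypothetical, and since month and day are both 02 in the example, the target layout is assumed to be two-digit year/month/day.

```python
from datetime import datetime


def reformat_date(text, source_format="%Y-%m-%d", target_format="%y/%m/%d"):
    # Parse the text with the source layout, then write it in the target layout.
    return datetime.strptime(text, source_format).strftime(target_format)


converted = reformat_date("2012-02-02")
```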


Transformation boxes
Box Description

Calculator box:
Used to perform calculations based on several columns and various data types.

With this box, you can create and/or update columns. Below are examples of how to use this box. Use the help tab for more information about available functions:

Calculating profit:
Qty * (Price_per_unit - Cost_per_unit)

Concatenate name and surname:
name//"-"//surname

Return “yes” if x is > 10:
x > 10 ? "yes" : "no"

Calculator box, profit example:
Qty * (Price_per_unit - Cost_per_unit)

Office          Price per unit  Cost per unit  Quantity  Profit
Yokohama Tires  R 40            R 30           1200      R 12 000
Dunlop Tires    R 57            R 45           1400      R 16 800
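
The arithmetic in the table can be checked with a short Python sketch of the same formula. The function name `profit` is hypothetical; the parameter names mirror the column names used in the expression.

```python
def profit(qty, price_per_unit, cost_per_unit):
    # Same formula as the Calculator expression: Qty * (Price_per_unit - Cost_per_unit)
    return qty * (price_per_unit - cost_per_unit)


yokohama = profit(1200, 40, 30)
dunlop = profit(1400, 57, 45)
```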

Calculator box, concatenation example (name and surname):
name//"-"//surname

Agent  Surname  Concatenation
Jacob  Zuma     Jacob-Zuma
Helen  Zille    Helen-Zille
Tony   Stark    Tony-Stark

Calculator box, conditional example (return “yes” if x is > 10):
x > 10 ? "yes" : "no"

Agent  Millions  Rich
Jacob  210       Yes
Helen  70        Yes
Tony   6         No
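
The conditional expression has a direct counterpart in Python, sketched here with a hypothetical function `label` and the threshold of 10 taken from the example.

```python
def label(x, threshold=10):
    # Python's equivalent of the expression  x > 10 ? "yes" : "no"
    return "yes" if x > threshold else "no"


millions = {"Jacob": 210, "Helen": 70, "Tony": 6}
rich = {agent: label(value) for agent, value in millions.items()}
```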


Transformation boxes
Box Description

Filter box:
This box is used to keep only the rows of data that meet a certain criterion. It is equivalent to a “WHERE” clause in SQL.

Below are examples of how to use this box. Use the help tab for more information about available functions:

Keep only waybills from CPT with a weight greater than 6:
Loading == "CPT" && weight > 6

Remove null from File Reference:
FileRef != "NULL"

Keep only names whose first three letters are “Dav”:
left(name,3) == "Dav"

Filter box, example: keep only waybills from CPT with a weight greater than 6
Loading == "CPT" && weight > 6

Input:
Waybill  Loading  Weight
1234     CPT      5
1235     CPT      12
1236     JHB      14
1237     JHB      7
1238     JHB      5
1239     DBN      14

Result:
Waybill  Loading  Weight
1235     CPT      12
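
The same row filter can be sketched in Python with a list comprehension. The data is taken from the example; the variable names are arbitrary.

```python
waybills = [
    {"Waybill": 1234, "Loading": "CPT", "Weight": 5},
    {"Waybill": 1235, "Loading": "CPT", "Weight": 12},
    {"Waybill": 1236, "Loading": "JHB", "Weight": 14},
    {"Waybill": 1237, "Loading": "JHB", "Weight": 7},
    {"Waybill": 1238, "Loading": "JHB", "Weight": 5},
    {"Waybill": 1239, "Loading": "DBN", "Weight": 14},
]

# Keep a row only when both conditions hold (the && in the expression).
selected = [w for w in waybills if w["Loading"] == "CPT" and w["Weight"] > 6]
```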
Filter box, example: remove rows with a null File Reference
not(isNull(FileRef))

Input:
FileRef  Loading  Weight
HANLB2   CPT      5
D4355    CPT      12
C3452    JHB      14
(null)   JHB      7
(null)   JHB      5
23NULL3  DBN      14

Result:
FileRef  Loading  Weight
HANLB2   CPT      5
D4355    CPT      12
C3452    JHB      14
23NULL3  DBN      14
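
A Python sketch of the same null filter follows. It assumes that a null cell is represented by `None`; note that the text value "23NULL3" is an ordinary string and is kept, as in the example.

```python
rows = [
    {"FileRef": "HANLB2", "Loading": "CPT", "Weight": 5},
    {"FileRef": "D4355", "Loading": "CPT", "Weight": 12},
    {"FileRef": "C3452", "Loading": "JHB", "Weight": 14},
    {"FileRef": None, "Loading": "JHB", "Weight": 7},
    {"FileRef": None, "Loading": "JHB", "Weight": 5},
    {"FileRef": "23NULL3", "Loading": "DBN", "Weight": 14},
]

# Keep only the rows whose FileRef is present (not null).
kept = [row for row in rows if row["FileRef"] is not None]
kept_refs = [row["FileRef"] for row in kept]
```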

Filter box, example: keep only names whose first three letters are “Dav”
left(name,3) == "Dav"


Transformation boxes: calculator & filterRows


Below is a short summary of the main functions available:
Operators: + - * / ^
Comparison: ==, >, <, <=, >=, !=
Logical: &&, ||
Condition: (x > a ? "True" : "False")
Format: ftoa, atof, itoa
Math: abs, floor, ceil, round, sum, max, min, sqrt…
Char: right, left, substr, strlen, toupper, tolower, indexof…
Special: isNull, nDaysInMonth, nvl
Constants: _pi, _e, _n, _null



Thank you for your attention

For more information, please visit our website:
http://www.business-insight.com
Backup Slides

The following slides are not part of the presentation. They are used occasionally to answer specific technical questions.



Is it possible to comment code?

Yes, it is.
You can put comments everywhere:
• Directly on the graph.
• In the JavaScript.
• In the SQL (put "--" at the beginning of a line).



Can we edit .anatella files directly?
Yes, for a technician.
• This file is a simple XML file (a text file) that is formatted so that a human can directly and easily understand and change it.
• For example, you can directly and easily edit the SQL statements inside the .anatella file.

You can use any Unicode text editor to edit .anatella files. For example, you can use the free editor “EditPadLite7”.