Scenarios

[1]:
import pandas as pd

dataframe = pd.read_csv("data/example.csv")
dataframe
[1]:
ID Date Continent Country City Color Quantity Price
0 1 2019-01-01 Europe France Paris red 1000.0 500.0
1 2 2019-01-02 Europe France Lyon red 2000.0 400.0
2 3 2019-01-05 Europe France Paris blue 3000.0 420.0
3 4 2018-01-01 Europe France Bordeaux blue 1500.0 480.0
4 5 2019-01-01 Europe UK London green 3000.0 460.0
5 6 2019-01-01 Europe UK London red 2500.0 500.0
6 7 2019-01-02 Asia China Beijing blue 2000.0 410.0
7 8 2019-01-05 Asia China HongKong green 4000.0 350.0
8 9 2018-01-01 Asia India Dehli red 2200.0 360.0
9 10 2019-01-01 Asia India Mumbai blue 1500.0 400.0
[2]:
import activeviam as av

session = av.create_session()
store = session.read_pandas(dataframe, keys=["ID"], store_name="First store")
cube = session.create_cube(store, "FirstCube")
cube.query()
[2]:
Price.AVG Price.SUM Quantity.AVG Quantity.SUM contributors.COUNT
0 428.0 4280.0 2270.0 22700.0 10

Scenarios from modified sources

Load a modified version of your data into the store to compare it to the original version.

First, let’s modify the source dataset with pandas:

[3]:
dataframe.loc[(dataframe["City"] == "Paris"), "Color"] = "purple"
dataframe
[3]:
ID Date Continent Country City Color Quantity Price
0 1 2019-01-01 Europe France Paris purple 1000.0 500.0
1 2 2019-01-02 Europe France Lyon red 2000.0 400.0
2 3 2019-01-05 Europe France Paris purple 3000.0 420.0
3 4 2018-01-01 Europe France Bordeaux blue 1500.0 480.0
4 5 2019-01-01 Europe UK London green 3000.0 460.0
5 6 2019-01-01 Europe UK London red 2500.0 500.0
6 7 2019-01-02 Asia China Beijing blue 2000.0 410.0
7 8 2019-01-05 Asia China HongKong green 4000.0 350.0
8 9 2018-01-01 Asia India Dehli red 2200.0 360.0
9 10 2019-01-01 Asia India Mumbai blue 1500.0 400.0

We can now load this dataset into a new scenario:

[4]:
store.scenarios["Purple Paris"].load_pandas(dataframe)
store.scenarios["Purple Paris"]
[4]:
  • First store
    • ID
      • key : True
      • nullable : False
      • type : long
    • Date
      • key : False
      • nullable : True
      • type : LocalDate[yyyy-MM-dd]
    • Continent
      • key : False
      • nullable : True
      • type : string
    • Country
      • key : False
      • nullable : True
      • type : string
    • City
      • key : False
      • nullable : True
      • type : string
    • Color
      • key : False
      • nullable : True
      • type : string
    • Quantity
      • key : False
      • nullable : False
      • type : double
    • Price
      • key : False
      • nullable : False
      • type : double
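
To make the effect of the "Purple Paris" scenario concrete, here is a minimal sketch in plain Python that sums Quantity per Color for the original rows and for the modified rows. It does not call activeviam. The `quantity_by_color` helper and the row tuples are illustrative names introduced here, and the values are copied from the tables above.

```python
from collections import defaultdict

# (City, Color, Quantity) copied from the original dataset shown above.
base_rows = [
    ("Paris", "red", 1000.0),
    ("Lyon", "red", 2000.0),
    ("Paris", "blue", 3000.0),
    ("Bordeaux", "blue", 1500.0),
    ("London", "green", 3000.0),
    ("London", "red", 2500.0),
    ("Beijing", "blue", 2000.0),
    ("HongKong", "green", 4000.0),
    ("Dehli", "red", 2200.0),
    ("Mumbai", "blue", 1500.0),
]

# Same modification as the pandas cell: every Paris row becomes purple.
scenario_rows = [
    (city, "purple" if city == "Paris" else color, quantity)
    for city, color, quantity in base_rows
]


def quantity_by_color(rows):
    """Sum Quantity per Color (hypothetical helper, for illustration only)."""
    totals = defaultdict(float)
    for _city, color, quantity in rows:
        totals[color] += quantity
    return dict(totals)


base_totals = quantity_by_color(base_rows)
scenario_totals = quantity_by_color(scenario_rows)

# Print base and scenario totals side by side for every color.
for color in sorted(set(base_totals) | set(scenario_totals)):
    print(color, base_totals.get(color, 0.0), scenario_totals.get(color, 0.0))
```

The grand total is the same in both versions (22700.0); only its split across colors changes, which is the kind of difference a source scenario lets you compare.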

If you want the store to appear in the application’s Source Simulation widget, enable this parameter on the store:

[5]:
store.source_simulation_enabled = True

Scenarios measure scaling

You can also perform simulations by modifying the value of certain measures. There are three available simulation methods:

  • multiply: multiplies the measure’s value by the provided weight

  • replace: replaces the measure’s value with the given value

  • add: increments the measure’s value by the provided value
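
The three methods above can be sketched as plain arithmetic. This is only an illustration of what each method does to a single value; `apply_method` is a hypothetical helper, not part of the activeviam API.

```python
def apply_method(method, base_value, parameter):
    """Return the simulated value for one measure (illustrative only)."""
    if method == "multiply":
        return base_value * parameter
    if method == "replace":
        # The base value is ignored entirely.
        return parameter
    if method == "add":
        return base_value + parameter
    raise ValueError(f"unknown simulation method: {method}")


base_price = 500.0
print(apply_method("multiply", base_price, 2.0))  # 1000.0
print(apply_method("replace", base_price, 20.0))  # 20.0
print(apply_method("add", base_price, 50.0))      # 550.0
```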

[6]:
help(cube.setup_simulation)
Help on method setup_simulation in module activeviam.cube:

setup_simulation(name: 'str', multiply: 'Optional[Collection[Measure]]' = None, replace: 'Optional[Collection[Measure]]' = None, add: 'Optional[Collection[Measure]]' = None, per: 'Optional[Sequence[Level]]' = None, base_scenario_name: 'str' = 'BASE') -> 'Simulation' method of activeviam.cube.Cube instance
    Create a simulation for the given measures.

    This method creates a store to configure the simulation. You cannot use the same measure
    in several methods.

    You can create as many scenarios as you want for each simulation you create.

    Args:
        name: The name of the simulations family
        multiply: Collection of measures whose values we want to multiply
        replace: Collection of measures whose values we want to replace
        add: Collection of measures wose values we want to increment
        per: Sequence as levels to simulate on
        base_scenario_name: The name of the base scenario, defaults to BASE
    Returns:
        The simulation used to configure the different scenarios

[7]:
lvl = cube.levels
m = cube.measures

Calling the cube.setup_simulation method creates a special type of store that controls the simulation. This store’s key fields are the levels you provide, plus a column named after the simulation, which contains the scenario names.

[8]:
simulation = cube.setup_simulation(
    "First simulation",
    per=[lvl["Continent"]],
    multiply=[m["Quantity.SUM"], m["Quantity.AVG"]],
    replace=[m["Price.AVG"]],
)
simulation
[8]:
  • First simulation
    • Levels
      1. Continent
    • Scenarios
      1. BASE
    • multiply
      1. Quantity.SUM
      2. Quantity.AVG
    • replace
      1. Price.AVG
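
One way to picture the simulation store is as a table keyed by the scenario name plus the levels passed in `per`. The sketch below models that with a plain dict. It is not the library's internal representation: the scenario names, the parameter values, and the `upsert` helper are all hypothetical, and it assumes that inserting a row with an existing key replaces the previous row.

```python
# Illustrative model only: rows keyed by (scenario name, Continent).
# Each value holds one parameter per simulated measure, in the order
# (Quantity.SUM factor, Quantity.AVG factor, Price.AVG replacement).
simulation_store = {}


def upsert(scenario, continent, quantity_sum, quantity_avg, price_avg):
    """Insert a row, replacing any existing row with the same key."""
    simulation_store[(scenario, continent)] = (quantity_sum, quantity_avg, price_avg)


upsert("Scenario A", "Asia", 1.05, 1.05, 20)
upsert("Scenario A", "Europe", 0.9, 0.9, 0.85)
upsert("Scenario B", "Asia", 1.1, 1.1, 22)
# Same key as the first row, so it replaces that row instead of adding one.
upsert("Scenario A", "Asia", 1.10, 1.10, 25)

print(len(simulation_store))  # 3
```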

To populate this simulation with scenarios, you can either load data into the simulation the same way you would into a regular store, from a CSV file or a DataFrame, or create a scenario for this simulation and then insert rows into it, either manually or from a source file.

When filling a scenario (rather than the simulation), several things are done automatically. For instance, you don’t need to specify the priority.

[9]:
asian_growth = simulation.scenarios["Asian growth"]
asian_growth += ("Asia", 1.05, 1.05, 20)
asian_growth += ("Europe", 0.9, 0.9, 0.85)
asian_growth
[9]:
First simulation_Quantity.SUM_value First simulation_Quantity.AVG_value First simulation_Price.AVG_value Priority
Continent
Asia 1.05 1.05 20.00 1.0
Europe 0.90 0.90 0.85 1.0

Calling head on the scenario displays the rows of the simulation handled by this scenario.

[10]:
cube.query(
    m["Quantity.SUM"],
    m["Quantity.AVG"],
    m["Price.AVG"],
    levels=[lvl["Continent"], lvl["First simulation"]],
)
[10]:
Quantity.SUM Quantity.AVG Price.AVG
Continent First simulation
Asia BASE 9700.0 2425.000000 380.00
Asian growth 10185.0 2546.250000 20.00
Europe BASE 13000.0 2166.666667 460.00
Asian growth 11700.0 1950.000000 0.85
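
The "Asian growth" figures in the output above can be reproduced by hand. The sketch below applies the scenario's parameters to the BASE values in plain Python, assuming that multiply scales the base value and replace discards it; the `simulate` helper is a hypothetical name used only for this check.

```python
# BASE values per continent, derived from the original dataset.
base = {
    "Asia": {"Quantity.SUM": 9700.0, "Quantity.AVG": 9700.0 / 4, "Price.AVG": 380.0},
    "Europe": {"Quantity.SUM": 13000.0, "Quantity.AVG": 13000.0 / 6, "Price.AVG": 460.0},
}

# Parameters of the "Asian growth" scenario, in insertion order:
# (Quantity.SUM factor, Quantity.AVG factor, Price.AVG replacement).
asian_growth = {
    "Asia": (1.05, 1.05, 20),
    "Europe": (0.9, 0.9, 0.85),
}


def simulate(continent):
    """Apply the scenario parameters to one continent (illustrative only)."""
    sum_factor, avg_factor, price_avg = asian_growth[continent]
    values = base[continent]
    return {
        "Quantity.SUM": values["Quantity.SUM"] * sum_factor,  # multiply
        "Quantity.AVG": values["Quantity.AVG"] * avg_factor,  # multiply
        "Price.AVG": price_avg,  # replace ignores the base value
    }


for continent in base:
    rounded = {name: round(value, 2) for name, value in simulate(continent).items()}
    print(continent, rounded)
```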

You can create as many scenarios as you want. However, every scenario of a simulation uses the same methods. If you want to use a different method you must create a new simulation.

When loading a DataFrame into the simulation, you can get the required headers from the simulation. The same is true for a scenario. For the priority, you can use either the provided av.Priority values or any numeric value you want. If you use the provided values, the API converts them into integer values. When loading a DataFrame into a scenario, the priority column is optional.

[11]:
growth = simulation.scenarios["Growth"]
rows = [
    ["Asia", av.Priority.NORMAL, 1.1, 1.1, 22],
    ["Europe", av.Priority.NORMAL, 1.2, 1.3, 15],
]
df = pd.DataFrame(rows, columns=growth.columns(True))
df
[11]:
Continent Priority First simulation_Quantity.SUM_value First simulation_Quantity.AVG_value First simulation_Price.AVG_value
0 Asia Priority.NORMAL 1.1 1.1 22
1 Europe Priority.NORMAL 1.2 1.3 15
[12]:
growth = simulation.scenarios["Growth"].load_pandas(df)
growth
[12]:
First simulation_Quantity.SUM_value First simulation_Quantity.AVG_value First simulation_Price.AVG_value Priority
Continent
Asia 1.1 1.1 22.0 1.0
Europe 1.2 1.3 15.0 1.0

You can delete a simulation from the cube’s simulations dict:

[13]:
del cube.simulations["First simulation"]

You can create simulations without any levels. These simulations can only have one value per scenario. For simulations without any fields and acting on only one measure, you can assign values to scenarios instead of inserting tuples.

[14]:
no_field_sim = cube.setup_simulation("No Field Sim", multiply=[m["Price.SUM"]])
no_field_sim.scenarios["- 10 %"] = 0.9
no_field_sim.scenarios["+ 20%"] = 1.2
no_field_sim.head()
[14]:
No Field Sim_Price.SUM_value Priority
0 0.9 1.0
1 1.0 1.0
2 1.2 1.0
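
As a worked example, a simulation without levels boils down to one multiplier per scenario applied to the whole measure. The sketch below uses the Price.SUM value from the first cube.query() output (4280.0) and assumes that the 1.0 row shown above belongs to the BASE scenario; it is plain Python, not an activeviam call.

```python
base_price_sum = 4280.0  # Price.SUM from the first cube.query() output

# With no levels, each scenario holds exactly one multiplier.
# Assumption: the BASE scenario corresponds to a factor of 1.0.
scenarios = {"BASE": 1.0, "- 10 %": 0.9, "+ 20%": 1.2}

simulated = {name: base_price_sum * factor for name, factor in scenarios.items()}
for name, value in simulated.items():
    print(f"{name}: {value:.1f}")
```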

Priorities are used to supersede the default behaviour when rules in the simulation conflict. By default, the more wildcard fields a rule has, and the further to the left these wildcards are, the lower its priority. So in the following example, the rule on Europe overrides the one on France.

[15]:
priority_sim = cube.setup_simulation(
    "Priority Simulation",
    per=[lvl["Continent"], lvl["Country"]],
    multiply=[m["Quantity.AVG"]],
)
priority_scenario = priority_sim.scenarios["growth"]
priority_scenario += ("Europe", "*", 1.1)
priority_scenario += ("*", "France", 1.2)
priority_scenario.head()
[15]:
Priority Simulation_Quantity.AVG_value Priority
Continent Country
Europe * 1.1 1.0
* France 1.2 1.0
[16]:
cube.query(
    m["Quantity.SUM"],
    levels=[lvl["Continent"], lvl["Country"], lvl["Priority Simulation"]],
)
[16]:
Quantity.SUM
Continent Country Priority Simulation
Asia China BASE 6000.0
growth 6000.0
India BASE 3700.0
growth 3700.0
Europe France BASE 7500.0
growth 7500.0
UK BASE 5500.0
growth 5500.0

Let’s imagine we want our rule on France to override the one on Europe. By default, the av.Priority.NORMAL value is assigned to any row for which the priority is not specified. If we give the rule on France a higher priority than that, it will override the other rules impacting France.

[17]:
priority_scenario += ("*", "France", 120, av.Priority.IMPORTANT)
priority_scenario
[17]:
Priority Simulation_Quantity.AVG_value Priority
Continent Country
Europe * 1.1 1.0
* France 120.0 2.0
[18]:
cube.query(
    m["Quantity.SUM"],
    levels=[lvl["Continent"], lvl["Country"], lvl["Priority Simulation"]],
)
[18]:
Quantity.SUM
Continent Country Priority Simulation
Asia China BASE 6000.0
growth 6000.0
India BASE 3700.0
growth 3700.0
Europe France BASE 7500.0
growth 7500.0
UK BASE 5500.0
growth 5500.0
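
The conflict resolution described in this section can be sketched in plain Python. This is one reading of the rule stated in the text (explicit priority first, then fewer wildcards, then wildcards further to the right), not the library's actual algorithm. The `matches`, `rank`, and `winning_rule` helpers are hypothetical, and the numeric priorities 1 and 2 are taken from the Priority column in the outputs above.

```python
WILDCARD = "*"


def matches(rule_key, location):
    """A rule applies when every field is a wildcard or equals the location."""
    return all(r == WILDCARD or r == v for r, v in zip(rule_key, location))


def rank(rule):
    """Sort key: a higher tuple means the rule wins the conflict."""
    key, _value, priority = rule
    concrete = tuple(field != WILDCARD for field in key)
    # 1. explicit priority, 2. number of non-wildcard fields,
    # 3. non-wildcard fields further to the left rank higher.
    return (priority, sum(concrete), concrete)


def winning_rule(rules, location):
    """Return the applicable rule with the highest rank, or None."""
    candidates = [rule for rule in rules if matches(rule[0], location)]
    return max(candidates, key=rank) if candidates else None


# Rules as (key, value, priority); 1 stands for NORMAL, 2 for IMPORTANT.
default_rules = [(("Europe", "*"), 1.1, 1), (("*", "France"), 1.2, 1)]
boosted_rules = [(("Europe", "*"), 1.1, 1), (("*", "France"), 120, 2)]

print(winning_rule(default_rules, ("Europe", "France")))  # (('Europe', '*'), 1.1, 1)
print(winning_rule(boosted_rules, ("Europe", "France")))  # (('*', 'France'), 120, 2)
print(winning_rule(default_rules, ("Asia", "China")))     # None
```

With equal priorities, the rule on Europe wins for France because its wildcard sits further to the right; raising the priority of the rule on France reverses the outcome.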