Advanced Store Manipulations

[1]:
import activeviam as av

session = av.create_session()
store = session.read_csv("data/example.csv", keys=["ID"], store_name="MyStore")
cube = session.create_cube(store, "FirstCube")
cube.query()
[1]:
Price.AVG Price.SUM Quantity.AVG Quantity.SUM contributors.COUNT
0 428.0 4280.0 2270.0 22700.0 10

Store description

The lengh of a store is the number of rows in the store

[2]:
len(store)
[2]:
10

You can get the shape of the store, which is the number of columns and rows of the store

[3]:
store.shape
[3]:
{'rows': 10, 'columns': 8}

The description of the columns and types of the store

[4]:
store
[4]:
  • MyStore
    • ID
      • key : True
      • nullable : True
      • type : int
    • Date
      • key : False
      • nullable : True
      • type : LocalDate[yyyy-MM-dd]
    • Continent
      • key : False
      • nullable : True
      • type : string
    • Country
      • key : False
      • nullable : True
      • type : string
    • City
      • key : False
      • nullable : True
      • type : string
    • Color
      • key : False
      • nullable : True
      • type : string
    • Quantity
      • key : False
      • nullable : True
      • type : double
    • Price
      • key : False
      • nullable : True
      • type : double

Specify other types for the columns

Activeviam automatically detect the type of the columns. It is possible to bypass this behaviour and specify the types of some columns manually.

[5]:
types = {
    "ID": av.types.DOUBLE,
    "City": av.types.STRING,
    "Quantity": av.types.STRING,
    "Price": av.types.FLOAT,
}
[6]:
custom_store = session.read_csv(
    "data/example.csv", keys=["ID"], store_name="Custom", types=types
)
custom_store
[6]:
  • Custom
    • ID
      • key : True
      • nullable : False
      • type : double
    • Date
      • key : False
      • nullable : True
      • type : LocalDate[yyyy-MM-dd]
    • Continent
      • key : False
      • nullable : True
      • type : string
    • Country
      • key : False
      • nullable : True
      • type : string
    • City
      • key : False
      • nullable : True
      • type : string
    • Color
      • key : False
      • nullable : True
      • type : string
    • Quantity
      • key : False
      • nullable : True
      • type : string
    • Price
      • key : False
      • nullable : False
      • type : float

Insert new rows

New records can be inserted into the store. If a record has the same key columns as an existing record, the previous record will be overriden.

[7]:
# New key
store.insert_rows(
    (11, "2019-03-01", "Europe", "Germany", "Berlin", "yellow", 1000, 400)
)

# Existing key
store += (1, "2019-03-01", "Europe", "France", "Paris", "red", 2000, 600)

store.head()
[7]:
Date Continent Country City Color Quantity Price
ID
1 2019-03-01 Europe France Paris red 2000.0 600.0
2 2019-01-02 Europe France Lyon red 2000.0 400.0
3 2019-01-05 Europe France Paris blue 3000.0 420.0
4 2018-01-01 Europe France Bordeaux blue 1500.0 480.0
5 2019-01-01 Europe UK London green 3000.0 460.0

Append data to a store

A CSV file with the same structure as the initial source can be appended into an existing store:

[8]:
store.load_csv("data/additional_example.csv")
store.head()
[8]:
Date Continent Country City Color Quantity Price
ID
1 2019-03-01 Europe France Paris red 2000.0 600.0
2 2019-01-02 Europe France Lyon red 2000.0 400.0
3 2019-01-05 Europe France Paris blue 3000.0 420.0
4 2018-01-01 Europe France Bordeaux blue 1500.0 480.0
5 2019-01-01 Europe UK London green 3000.0 460.0
[9]:
len(store)
[9]:
14

Join stores

Stores can be joined together using a mapping between columns. The mapping columns of the target store must be included in its key columns. If no mapping is specified, the columns with the same names are used.

[10]:
capital_store = session.read_csv(
    "data/capitals.csv", keys=["Country name"], store_name="Capital"
)
capital_store.head()
[10]:
Capital
Country name
France Paris
UK London
China Beijing
India Dehli
[11]:
store.join(capital_store, mapping={"Country": "Country name"})
[12]:
cube = session.create_cube(store, "Cube")

The joined columns can be displayed when printing the store

[13]:
store.head(n=10, joined_columns=True)
[13]:
Date Continent Country City Color Quantity Price MyStore_to_Capital/Country name MyStore_to_Capital/Capital
ID
1 2019-03-01 Europe France Paris red 2000.0 600.0 France Paris
2 2019-01-02 Europe France Lyon red 2000.0 400.0 France Paris
3 2019-01-05 Europe France Paris blue 3000.0 420.0 France Paris
4 2018-01-01 Europe France Bordeaux blue 1500.0 480.0 France Paris
5 2019-01-01 Europe UK London green 3000.0 460.0 UK London
6 2019-01-01 Europe UK London red 2500.0 500.0 UK London
7 2019-01-02 Asia China Beijing blue 2000.0 410.0 China Beijing
8 2019-01-05 Asia China HongKong green 4000.0 350.0 China Beijing
9 2018-01-01 Asia India Dehli red 2200.0 360.0 India Dehli
10 2019-01-01 Asia India Mumbai blue 1500.0 400.0 India Dehli