CoPPer: Post-Processors made easy

Straightforward implementation of business logic in ActivePivot

Post-processors are the pieces of code that allow customers to insert their own business logic into ActivePivot. Because business cases are often complex and nuanced, post-processors are generally fairly long and complex to write.

To ease the work of ActivePivot server developers, we have created CoPPer, which stands for "Core Post-Processors". It is an API that performs calculations in ActivePivot by generating the post-processors for developers, without requiring in-depth ActivePivot training. It uses a syntax inspired by Apache Spark and defines measures that are calculated when a user executes a query.

In practice, this means it can achieve the exact same results as "traditional" post-processors with dramatically fewer lines of code and zero impact on performance. As a result, it significantly speeds up the implementation of projects in several ways:

Efficiency

CoPPer allows the integration of business logic into ActivePivot without prior knowledge of post-processor writing, and makes writing measures in the cube substantially easier. Complicated concepts around cube locations are abstracted away, and what used to require a full Java class (i.e. at least 50 lines of code) can now be done in just a few short lines.

Accessibility

With CoPPer, users need only relatively basic coding knowledge to create powerful new post-processors. Data scientists, for instance, can learn CoPPer fairly easily and become empowered to conduct deeper, freer analyses.

Iteration

Because CoPPer code is so short and quick to write, post-processors written with it can be tested right away, and potential errors in the code are easier to find.

Optimization

CoPPer leverages all types of ActivePivot optimizations from the core product, such as prefetching and real-time impact calculations.

Let’s have a look at the various use cases for CoPPer:

Finance

  • Dynamic Bucketing

Allows grouping of data on-the-fly.

context
  .createDatasetFromFacts()
  .withColumn("Bucket", col("Date").map(date -> findBucket(date, bucketer)))
  .asHierarchy("TimeBucketDynamic")
  .inDimension("Time");

Bucketing is done when the query is made, so there is no need to wait for the data to be loaded. This is especially useful for grouping trades by maturity, for example, and CoPPer makes it quick and easy.
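The `findBucket` helper in the snippet above is project-specific and not part of CoPPer. As a purely hypothetical illustration (the bucket labels and boundaries are assumptions, and the real helper would also take the `bucketer` configuration object), a maturity bucketer might look like this:

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical maturity bucketer: maps a trade's maturity date to a label
// relative to a reference date. The real findBucket/bucketer pair depends
// on project conventions.
public class MaturityBucketer {
    public static String findBucket(LocalDate maturity, LocalDate today) {
        long days = ChronoUnit.DAYS.between(today, maturity);
        if (days <= 30) return "1M";
        if (days <= 90) return "3M";
        if (days <= 365) return "1Y";
        return "1Y+";
    }
}
```

Because the bucket is computed inside the query, changing these boundaries does not require reloading any data.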

  • Value at Risk

Value at Risk (VaR) quantifies the level of financial risk within a firm, portfolio or position over a specific time frame. It is used to determine the extent and likelihood of potential losses in a portfolio, and requires fairly complex calculations. Real-time calculation of VaR at various levels of aggregation is one of the most common tasks performed in ActivePivot.

context.createDatasetFromFacts()
  .agg(sum("PnlVector"))
  .withColumn("VaR95", col("PnlVector.SUM").map((IVector v) -> v.quantileDouble(0.95d)))

With CoPPer, setting up this calculation and analysis takes a mere three lines of code, as exemplified above.
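To see what the quantile step computes, here is a plain-Java sketch of a nearest-rank quantile over an aggregated P&L vector. It is an illustration only, not the ActivePivot `IVector` API; the actual `quantileDouble` implementation may use a different interpolation method:

```java
import java.util.Arrays;

// Plain-Java illustration of a nearest-rank quantile over a P&L vector.
public class QuantileSketch {
    public static double quantile(double[] pnl, double q) {
        double[] sorted = pnl.clone();
        Arrays.sort(sorted);
        // nearest-rank index: smallest index covering fraction q of the sample
        int idx = (int) Math.ceil(q * sorted.length) - 1;
        return sorted[Math.max(idx, 0)];
    }
}
```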

  • Intraday Liquidity

Here we show how to have ActivePivot calculate the daily maximum liquidity usage of a bank within the framework of the Basel Committee on Banking Supervision's requirements, in this case the BCBS 248 regulation.

public static Datasets.Dataset buildCumulativeNetAmount(Datasets.Dataset dataset) {
    return dataset
             .agg(sum(PAYMENT__AMOUNT))
             .select(sum(PAYMENT__AMOUNT + ".SUM").over(Window.orderBy(PAYMENT__TIME)).as("Net Cumulative Amount"))
             .agg(max("Net Cumulative Amount"), min("Net Cumulative Amount"));
}
public static Datasets.Dataset buildCumulativeAmountSentAndReceived(Datasets.Dataset dataset) {
    return dataset
             .agg(
                      longSum(PAYMENT__AMOUNT).as("Amount.SUM.IN"),
                      customAgg(PAYMENT__AMOUNT, "SHORT").as("Amount.SUM.OUT")) // missing shortSum for now!
             .withColumn("Cumulative Received", sum("Amount.SUM.IN").over(Window.orderBy(PAYMENT__TIME)))
             .withColumn("Cumulative Sent", sum("Amount.SUM.OUT").over(Window.orderBy(PAYMENT__TIME)));
}
public static Datasets.Dataset buildPercentSentAndReceived(Datasets.Dataset dataset) {
    return buildCumulativeAmountSentAndReceived(dataset)
             .agg(
                      max("Cumulative Received"),
                      min("Cumulative Sent"))
             .withColumn("% Received",
                      col("Cumulative Received.MAX").divide(col("Cumulative Received.MAX").drillUp(PAYMENT__TIME))
             .withFormatter("DOUBLE[0.00%]"))
             .withColumn("% Sent",
                      col("Cumulative Sent.MIN").divide(col("Cumulative Sent.MIN").drillUp(PAYMENT__TIME))
                              .withFormatter("DOUBLE[0.00%]"));
}

And now add everything to our cube:

public static IActivePivotInstanceDescription createCubeDescription() {
    return StartBuilding.cube(PIVOT_ID)
             .withSingleLevelDimension(PAYMENT__ID)
             .withSingleLevelDimension(PAYMENT__TIME)
             .withSingleLevelDimension(PAYMENT__DIRECTION)


             // this is where we add the CoPPeR calculations to the cube
             .withDescriptionPostProcessor(
                      com.activeviam.builders.StartBuilding.copperCalculations()
                              .withDefinition(LiquidityTrainingUtil::calculations)
                              .build())


             .build();
}


public static void calculations(BuildingContext context) {
    Datasets.Dataset dataset = context.createDatasetFromFacts();
    buildCumulativeNetAmount(dataset).publish();
    buildCumulativeAmountSentAndReceived(dataset).doNotAggregateAbove().publish();
    buildPercentSentAndReceived(dataset).publish();
}

Before we introduced CoPPer, a typical intraday liquidity project might contain thousands of lines of post-processor code to set up these calculations. With CoPPer, we have reduced that to fewer than 100 lines.
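The windowed sum in `buildCumulativeNetAmount` can be pictured in plain Java: walk the payments in time order, keep a running net amount, and track its maximum and minimum. This sketch illustrates the logic only; it is not CoPPer code:

```java
// Plain-Java illustration of the "Net Cumulative Amount" logic:
// payments ordered by time, running sum, then max and min of that sum.
public class CumulativeNet {
    // returns {max, min} of the running sum of time-ordered amounts
    public static double[] maxMinCumulative(double[] amountsByTime) {
        double running = 0;
        double max = Double.NEGATIVE_INFINITY;
        double min = Double.POSITIVE_INFINITY;
        for (double a : amountsByTime) {
            running += a;
            max = Math.max(max, running);
            min = Math.min(min, running);
        }
        return new double[]{max, min};
    }
}
```

The maximum of the running net is the day's peak liquidity usage; the minimum is the deepest net outflow, which is exactly what the BCBS 248 monitoring tools ask for.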

Pricing

Some retail KPI calculations are quite complex, such as the evolution of sales for a given sub-shelf from one year to the next. They would require lengthy development with regular post-processors; not so with CoPPer.

  • Turnover

Computing turnover may appear easy, but it requires drilling down to the receipt level.

// Computing the turnover incl. tax before discounts ("CA TTC avant remises") = StoreSellingPrice * Quantity (computation has to be done at sale level)
dss = context.createDatasetFromFacts();


dss = dss
    .groupBy(Columns.col(FIELD_SALE_ID), Columns.col(FIELD_LIGNE_TICKET))
    .agg(
             Columns.avg(FIELD_STORE_SELLING_PRICE).as(FIELD_STORE_SELLING_PRICE),
             Columns.sum(FIELD_QUANTITY).as(FIELD_QUANTITY)
)
    .withColumn("Turnover before discounts",
             Columns.col(FIELD_STORE_SELLING_PRICE).multiply(Columns.col(FIELD_QUANTITY)))
    .agg(Columns.sum("Turnover before discounts").as("Turnover before discounts").withinFolder("Turnover"));


dss.publish();
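The grouping above can be pictured in plain Java: for each receipt line, multiply the unit price by the total quantity, then sum across lines. This is an illustration of the logic only, not the CoPPer API (the single-key grouping is a simplification of the two-column `groupBy`):

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration of the turnover logic: group sales by receipt line,
// take unit price times total quantity per line, then sum over all lines.
public class TurnoverSketch {
    public static double turnover(String[] lineIds, double[] unitPrices, long[] quantities) {
        Map<String, double[]> perLine = new HashMap<>(); // value = {unitPrice, totalQuantity}
        for (int i = 0; i < lineIds.length; i++) {
            double[] acc = perLine.get(lineIds[i]);
            if (acc == null) {
                acc = new double[]{unitPrices[i], 0};
                perLine.put(lineIds[i], acc);
            }
            acc[1] += quantities[i];
        }
        double total = 0;
        for (double[] acc : perLine.values()) {
            total += acc[0] * acc[1];
        }
        return total;
    }
}
```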

  • Average basket

Average basket calculation also requires drilldown to the receipt level, aggregation of global amounts such as sums, and then averaging aggregation above the receipt level.

// Computing the average customer basket ("panier client")
dss = context.createDatasetFromFacts();


dss = dss
    .groupBy(
          Columns.col(FIELD_SALE_ID))
    .agg(
          Columns.sum(FIELD_AMOUNT).as("Amount"))
    .select(
          Columns.col("Amount"))
    .agg(
          Columns.avg("Amount")
          .as("Basket")
          .withinFolder("Sales"));


dss.publish();
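In plain Java, the same two-step aggregation (sum the amounts per sale, then average the per-sale totals) could be sketched as follows. It is an illustration of the logic only, not CoPPer code:

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration of the average-basket logic: total each receipt,
// then average those receipt totals.
public class AverageBasket {
    public static double averageBasket(String[] saleIds, double[] amounts) {
        Map<String, Double> perSale = new HashMap<>();
        for (int i = 0; i < saleIds.length; i++) {
            perSale.merge(saleIds[i], amounts[i], Double::sum);
        }
        return perSale.values().stream()
                .mapToDouble(Double::doubleValue)
                .average()
                .orElse(0.0);
    }
}
```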
  • Number of checkouts

// Computing the number of checkouts ("passage caisse")
dss = context.createDatasetFromFacts();


dss = dss
    .agg(
          Columns.distinctCount(FIELD_SALE_ID).as("Checkout").withinFolder("Sales"));


dss.publish();

Conclusion

CoPPer enables users to manipulate and assemble bricks of optimized post-processors written in the core product, allowing developers to focus on their business logic.
It doesn't replace post-processors completely, but their logic can now be implemented more quickly, more efficiently and with a more straightforward syntax. It's as easy as writing an Excel macro, but with the full processing power, performance and optimization of ActivePivot.
