PERSONALISED MARKETING

 

We develop new methods, strategies and algorithms for individualised marketing, customer retention, optimised communication with users, personalised pricing and personalised recommendations, and for maximising the probability that users purchase a product or perform other actions. We exploit users’ behavioural measurements in addition to their more standard characteristics and external data (including competitors’ activity, market indicators, financial information and geographic information). We also exploit network topologies, informative missingness and temporal relations. A key point is to identify the actionable causes of customer behaviour.

Illustration: Ellen Hegtun, Kunst i Skolen

Bayesian methodology for recommender systems

BigInsight has over several years developed two new Bayesian approaches to recommendations. The first is based on the Bayesian Mallows Model and has been shown to perform as well as the state of the art while providing a higher level of diversity: users potentially get more interesting, less obvious recommendations, and the catalogue of offered items is explored more fully. However, the original methodology converged slowly, so making it scalable has been essential for it to be production ready.

Our newly proposed variational Bayesian approximation algorithm speeds up convergence drastically. In 2022, we submitted the paper entitled PseudoMallows for Efficient Probabilistic Preference Learning for publication and made it openly available on arXiv.
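As background, the Bayesian Mallows Model scores a user's ranking r against a consensus ranking rho via exp(-alpha/n * d(r, rho)) for some ranking distance d. Below is a minimal, self-contained sketch of that kernel with the Kendall distance and a toy Metropolis sampler for the consensus ranking. It is illustrative only: it is not the PseudoMallows variational algorithm, and the function names and parameter values are our own assumptions.

```python
# Minimal sketch of the Mallows ranking kernel with Kendall distance and a toy
# Metropolis sampler for the consensus ranking rho. NOT the PseudoMallows
# variational algorithm; names and settings are illustrative assumptions.
import numpy as np

def kendall_distance(r1, r2):
    """Number of pairwise disagreements between two rankings (items listed by rank)."""
    n = len(r1)
    pos2 = np.argsort(r2)            # position of each item in r2
    mapped = pos2[r1]                # positions of r1's items within r2
    return sum(1 for i in range(n) for j in range(i + 1, n) if mapped[i] > mapped[j])

def log_mallows_kernel(rankings, rho, alpha):
    """Unnormalised log-likelihood of observed rankings under a Mallows model."""
    n = len(rho)
    return -alpha / n * sum(kendall_distance(r, rho) for r in rankings)

def sample_consensus(rankings, alpha=3.0, n_iter=2000, seed=0):
    """Toy Metropolis sampler over the consensus ranking using random transpositions."""
    rng = np.random.default_rng(seed)
    n = len(rankings[0])
    rho = np.arange(n)
    ll = log_mallows_kernel(rankings, rho, alpha)
    for _ in range(n_iter):
        prop = rho.copy()
        i, j = rng.choice(n, size=2, replace=False)   # symmetric transposition proposal
        prop[i], prop[j] = prop[j], prop[i]
        ll_prop = log_mallows_kernel(rankings, prop, alpha)
        if np.log(rng.random()) < ll_prop - ll:       # uniform prior over permutations
            rho, ll = prop, ll_prop
    return rho

if __name__ == "__main__":
    # Three users ranking five items; arrays list item ids by rank.
    data = [np.array([0, 1, 2, 3, 4]),
            np.array([1, 0, 2, 3, 4]),
            np.array([0, 1, 3, 2, 4])]
    print("estimated consensus ranking:", sample_consensus(data))
```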

Stochastic models for early prediction of viral customer behavior on networks

BigInsight has over many years developed methodology for early prediction of the adoption of new products with viral potential on social networks. The proposed stochastic model is agent-based at the individual level and allows for influence both from viral word-of-mouth effects and from external factors, such as marketing campaigns. Inference is based on fitting the model by maximum likelihood to data on the early history of adoptions of the product on the customers’ social network. Prediction of the future is performed by simulation with a computationally very efficient algorithm. The method is validated on simulated data as well as on a real telecom dataset provided by Telenor. In 2022 the first paper on this work, entitled An agent-based model with social interactions for scalable probabilistic prediction of performance of a new product, was accepted and published in the International Journal of Information Management Data Insights. Two follow-up papers are in progress.
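The following is a minimal sketch of an agent-based adoption simulation of the kind described above: each susceptible customer may adopt through an external (marketing) rate or through word-of-mouth from adopting neighbours. The parameter names (p_ext, q_wom), the hazard form and the random-graph setup are illustrative assumptions, not the published model.

```python
# Toy agent-based adoption simulation with external and word-of-mouth influence.
# Illustrative sketch only; parameters and network are assumptions.
import numpy as np

def simulate_adoption(adj, p_ext=0.01, q_wom=0.05, n_steps=30, seed=0):
    """Simulate new adoptions per time step on a network given by adjacency matrix adj."""
    rng = np.random.default_rng(seed)
    n = adj.shape[0]
    adopted = np.zeros(n, dtype=bool)
    counts = []
    for _ in range(n_steps):
        exposure = adj @ adopted                        # adopting neighbours per customer
        hazard = p_ext + q_wom * exposure               # per-step adoption hazard
        p_adopt = 1.0 - np.exp(-hazard)                 # convert rate to probability
        new = (~adopted) & (rng.random(n) < p_adopt)
        adopted |= new
        counts.append(int(new.sum()))
    return np.array(counts), adopted

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n = 500
    adj = np.triu((rng.random((n, n)) < 0.01).astype(float), 1)   # random sparse network
    adj = adj + adj.T                                             # symmetric, no self-links
    new_per_step, final = simulate_adoption(adj)
    print("new adopters per step:", new_per_step)
    print("total adopters:", int(final.sum()))
```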

The first, revised in 2022 for resubmission, studies the effect of a partially observed network on the results and conclusions. In the paper we show that if less than half of the links are missing, the results remain qualitatively reasonable. This is important because in practice the full social network cannot be observed; recording every social link is simply not feasible.
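Below is a small, self-contained sketch of this kind of robustness check: randomly hide a fraction of the social links and inspect how a simple word-of-mouth exposure statistic changes. The exposure proxy, the 1% link density and the seed-adopter fraction are illustrative assumptions only, not the experiments in the paper.

```python
# Sketch of a partially-observed-network experiment: hide a fraction of links
# and compare a word-of-mouth exposure statistic. Illustrative assumptions only.
import numpy as np

def drop_links(adj, frac_missing, rng):
    """Return a copy of the symmetric adjacency matrix with a fraction of links removed."""
    adj = adj.copy()
    i, j = np.triu_indices_from(adj, k=1)
    links = np.flatnonzero(adj[i, j] > 0)
    hidden = rng.choice(links, size=int(frac_missing * links.size), replace=False)
    adj[i[hidden], j[hidden]] = 0.0
    adj[j[hidden], i[hidden]] = 0.0
    return adj

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    n = 500
    full = np.triu((rng.random((n, n)) < 0.01).astype(float), 1)
    full = full + full.T
    adopters = rng.random(n) < 0.05                        # a seed set of early adopters
    for frac in (0.0, 0.25, 0.5, 0.75):
        exposure = drop_links(full, frac, rng) @ adopters  # adopting neighbours per customer
        print(f"{int(frac * 100):>3}% links hidden -> mean exposure {exposure.mean():.3f}")
```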

The second follow-up paper, for which all the theory was developed and the experiments and simulations were conducted in 2022, extends the initial model with a preliminary tattling stage. The extension additionally assumes that each vertex connected to an infectious neighbour may spread influence to its own susceptible neighbours, representing tattling. When a vertex turns infectious, it may thus exert influence both directly and indirectly, the latter through the tattlers. The model is shown to be stable and identifiable, and to accurately detect tattling when it is indeed present.
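A minimal sketch of the tattling idea is given below: besides direct word-of-mouth from adopters, susceptible "tattlers" (non-adopters with at least one adopting neighbour) also pass on influence. The parameter names and values are illustrative assumptions, not the identified model.

```python
# One time step of a toy adoption model with a tattling stage. Illustrative only.
import numpy as np

def step_with_tattling(adj, adopted, p_ext, q_direct, q_tattle, rng):
    """Return the updated adoption indicator after one time step."""
    exposed = adj @ adopted                               # adopting neighbours per node
    tattlers = (~adopted) & (exposed > 0)                 # susceptible nodes that tattle
    hazard = p_ext + q_direct * exposed + q_tattle * (adj @ tattlers)
    new = (~adopted) & (rng.random(adj.shape[0]) < 1.0 - np.exp(-hazard))
    return adopted | new

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    n = 500
    adj = np.triu((rng.random((n, n)) < 0.01).astype(float), 1)
    adj = adj + adj.T
    adopted = np.zeros(n, dtype=bool)
    for _ in range(20):
        adopted = step_with_tattling(adj, adopted, 0.005, 0.05, 0.02, rng)
    print("adopters after 20 steps:", int(adopted.sum()))
```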

Sales prediction

DNB Bedrift (formerly DNB Puls) is an app for people running small or medium-sized businesses. The app is very popular, currently having 65,000 users. Among other things, it produces forecasts of future income based on time series of previous values. The aim of this project is to improve the current income predictions. This is a difficult problem because the historical time series are very noisy, with irregular patterns and many missing values. We use a new combination of traditional time series methods and more recent machine learning methods. The developed code has been transferred to DNB and is implemented in the app.
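As an illustration of such a hybrid, the sketch below combines a seasonal-naive baseline with a gradient-boosting model of the residuals on lag features, where the boosting step tolerates missing lags. This is an assumption about the general setup, not the code delivered to DNB.

```python
# Toy hybrid forecaster: seasonal-naive baseline + boosted residual model.
# Illustrative sketch only; not DNB's production code.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor  # handles NaN features natively

def make_lag_features(y, lags=(1, 2, 3, 12)):
    """Build a lag-feature matrix; missing history simply becomes NaN."""
    return np.column_stack([np.r_[[np.nan] * lag, y[:-lag]] for lag in lags])

def fit_hybrid(y, season=12):
    """Seasonal-naive baseline plus a boosted model of the residuals."""
    baseline = np.r_[[np.nan] * season, y[:-season]]       # value one season ago
    resid = y - baseline
    X = make_lag_features(y)
    ok = ~np.isnan(resid)                                   # rows with a defined residual
    model = HistGradientBoostingRegressor(max_depth=3).fit(X[ok], resid[ok])
    return model, baseline

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(120)
    y = 100 + 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, size=t.size)  # toy monthly income
    y[rng.random(t.size) < 0.1] = np.nan                    # irregular missing values
    model, baseline = fit_hybrid(y)
    fitted = baseline + model.predict(make_lag_features(y))
    print("in-sample MAE on observed months:", np.nanmean(np.abs(fitted - y)))
```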

Explanation of predictions from black-box models

In some applications, complex, hard-to-interpret machine learning models such as deep neural networks currently outperform traditional regression models. Interpretability is crucial when a complex machine learning model is to be applied in areas where trust in the algorithm is required, for example in clinical applications, fraud detection or credit scoring. There has also been great interest in explainable AI in the context of personalised marketing. In 2022 we worked on a class of methods known as counterfactual explanations. More specifically, we have developed MCCE (Monte Carlo sampling of valid and realistic Counterfactual Explanations), a counterfactual explanation method that generates actionable and valid counterfactuals by modelling the joint distribution of the features with an autoregressive generative model in which the conditionals are estimated using decision trees. A paper describing the method has been submitted.
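The sketch below illustrates the idea in miniature: model the mutable features autoregressively with decision trees, sample candidate rows conditional on the individual's fixed features, keep only candidates the classifier assigns the desired outcome, and return the closest one. The toy data, the feature split and the classifier are illustrative assumptions, not the MCCE implementation or its evaluation.

```python
# Toy illustration of the MCCE idea: tree-based autoregressive sampling of
# candidate counterfactuals, then filtering for validity and closeness.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestClassifier

def sample_from_tree(tree, X_cond, y_train, leaves_train, rng):
    """Sample responses by drawing from the training values in the predicted leaf."""
    out = np.empty(len(X_cond))
    for i, leaf in enumerate(tree.apply(X_cond)):
        out[i] = rng.choice(y_train[leaves_train == leaf])
    return out

def mcce_counterfactual(x, clf, X_train, fixed_idx, mutable_idx, K=500, seed=0):
    rng = np.random.default_rng(seed)
    cand = np.tile(x, (K, 1)).astype(float)
    cond_idx = list(fixed_idx)
    for j in mutable_idx:                                   # autoregressive pass over mutable features
        tree = DecisionTreeRegressor(min_samples_leaf=20).fit(X_train[:, cond_idx], X_train[:, j])
        leaves_train = tree.apply(X_train[:, cond_idx])
        cand[:, j] = sample_from_tree(tree, cand[:, cond_idx], X_train[:, j], leaves_train, rng)
        cond_idx.append(j)
    valid = clf.predict(cand) == 1                          # keep candidates with the desired outcome
    if not valid.any():
        return None
    dist = np.abs(cand[valid] - x).sum(axis=1)              # pick the closest valid candidate
    return cand[valid][np.argmin(dist)]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(2000, 4))
    y = (X[:, 1] + X[:, 2] + X[:, 3] > 0).astype(int)       # toy "purchase" outcome
    clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    x = X[np.flatnonzero(clf.predict(X) == 0)[0]]           # an individual with the undesired outcome
    cf = mcce_counterfactual(x, clf, X, fixed_idx=[0], mutable_idx=[1, 2, 3])
    print("original:      ", np.round(x, 2))
    print("counterfactual:", np.round(cf, 2))
```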

Risk impact of weather conditions on car crash counts

In this project we have studied the effect of weather conditions on claim frequencies for motor vehicles. Gjensidige wants to know whether any of the regional differences and/or yearly changes observed in claim frequencies can be attributed to differences in weather conditions. To investigate this, we model the claims with a generalised additive model (GAM). The main conclusion is that including meteorological variables slightly improves predictive performance. The yearly changes are, however, almost the same before and after including meteorological variables. For the regional differences the picture is somewhat different: when temperature and rainfall are taken into account, the eastern part of Norway gets a slightly higher risk, while the western and northern parts get a slightly lower risk. Lastly, we have evaluated the effect of climate change on the number of car insurance claims. The results indicate an average decline in claim frequencies in the future, mainly due to the projected increase in temperature and the estimated negative relationship between temperature and the number of claims.
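For illustration, the sketch below fits a Poisson GAM with smooth effects of temperature and rainfall on simulated claim counts. The simulated data and the choice of the pygam library are assumptions on our part; they are not Gjensidige's data or the project's actual model specification.

```python
# Toy Poisson GAM for claim counts with smooth weather effects.
# Simulated data and library choice (pygam) are illustrative assumptions.
import numpy as np
from pygam import PoissonGAM, s

rng = np.random.default_rng(0)
n = 5000
temperature = rng.uniform(-15, 30, n)                  # daily mean temperature (deg C)
rainfall = rng.gamma(shape=1.5, scale=3.0, size=n)     # daily rainfall (mm)

# Toy truth: claims decrease with temperature and increase with heavy rain.
rate = np.exp(0.5 - 0.02 * temperature + 0.03 * rainfall)
claims = rng.poisson(rate)

X = np.column_stack([temperature, rainfall])
gam = PoissonGAM(s(0) + s(1)).fit(X, claims)           # smooth term per weather variable
gam.summary()                                          # effective degrees of freedom, significance
```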

Synthetic data generation

The demand for, and volume of, data containing sensitive information on persons or enterprises has increased significantly over the last years. At the same time, privacy protection principles and regulations impose restrictions on access to and use of individual data. Recently, there have been several initiatives to generate synthetic data. If the resulting synthetic data closely resemble the original data, it becomes easier for institutions in the private and public sectors to share realistic individual-level micro data while minimizing the risk of disclosing confidential and sensitive information. In this project we have identified the most promising methods for generating synthetic data suggested in the literature, both by investigating the statistical properties and utility of the methods and by applying a “motivated intruder” test framework. As far as the latter is concerned, we have focused on what is called differential privacy, investigating how generative models such as probabilistic graphical models, diffusion models and normalizing flows may be modified to fulfil the differential privacy criterion.
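A minimal, self-contained sketch of one standard way to obtain differential privacy is shown below: perturb the contingency table of a few categorical variables with Laplace noise (sensitivity 1 per individual) and sample synthetic records from the noisy table. This illustrates the principle only; it is not any of the graphical-model, diffusion or normalizing-flow generators evaluated in the project, and the variables are hypothetical.

```python
# Differentially private synthetic data via the Laplace mechanism on a
# contingency table. Principle illustration only.
import numpy as np

def dp_synthetic(codes, n_levels, epsilon, n_synth, seed=0):
    """codes: (n, d) integer-coded categorical data; returns synthetic integer codes."""
    rng = np.random.default_rng(seed)
    counts, _ = np.histogramdd(codes, bins=[np.arange(k + 1) for k in n_levels])
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)  # Laplace mechanism
    noisy = np.clip(noisy, 0, None)
    probs = noisy.ravel() / noisy.sum()
    cells = rng.choice(probs.size, size=n_synth, p=probs)                 # sample table cells
    return np.column_stack(np.unravel_index(cells, counts.shape))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = np.column_stack([rng.integers(0, 3, 1000),       # e.g. region
                            rng.integers(0, 2, 1000),       # e.g. gender
                            rng.integers(0, 4, 1000)])      # e.g. income band
    synth = dp_synthetic(data, n_levels=(3, 2, 4), epsilon=1.0, n_synth=1000)
    print("original marginal :", np.bincount(data[:, 0]) / 1000)
    print("synthetic marginal:", np.bincount(synth[:, 0], minlength=3) / 1000)
```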

Automatic index generation

Many of Statistics Norway’s traditional surveys are time-consuming and labour-intensive. There is therefore a need for more efficient, semi-automatic and continuously updated statistics that utilise detailed data streams and machine learning. To enable this exciting development, we have in 2022 developed new machine learning methods suited to handling and imputing missing values when estimating nutritional values from consumption data. Our work was presented at the 29th Nordic Statistical Meeting. Our methods were intended to be applied in the forthcoming Survey of consumer expenditure from Statistics Norway, but this has been delayed due to a legal dispute between the Norwegian Data Protection Authority and Statistics Norway. We have also contributed related methods to Statistics Norway’s forthcoming publication of the updated time use survey.
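The sketch below shows model-based imputation of this general flavour: each variable with missing entries is modelled iteratively as a function of the others before aggregating to nutrient estimates. The simulated consumption data, the energy factors and the use of scikit-learn's IterativeImputer are illustrative assumptions, not Statistics Norway's production pipeline.

```python
# Toy model-based imputation before nutrient aggregation. Illustrative only.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (activates the estimator)
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
n = 1000
# Toy purchases (kg per household) of three correlated food groups.
base = rng.gamma(2.0, 1.0, size=(n, 1))
purchases = base * np.array([[1.0, 0.6, 0.3]]) + rng.normal(0, 0.1, size=(n, 3))
observed = purchases.copy()
observed[rng.random((n, 3)) < 0.2] = np.nan                 # 20% missing at random

imputed = IterativeImputer(max_iter=10, random_state=0).fit_transform(observed)

# Hypothetical energy content (kcal per kg) per food group, then household totals.
kcal_per_kg = np.array([3500.0, 2500.0, 900.0])
print("true mean energy   :", (purchases @ kcal_per_kg).mean().round(1))
print("imputed mean energy:", (imputed @ kcal_per_kg).mean().round(1))
```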

“Artificial intelligence holds the key to
delivering more human and relevant
marketing experiences at scale”
— WPP May 2019

Principal Investigator
Kjersti Aas

co-Principal Investigator
Ida Scheel