Apr. 23, 14:15

Seminar series in Statistics and Data Science: Knut Rand

We are very pleased to invite you to our next Tuesday seminar in the Seminar series in Statistics and Data Science.

Speaker:  Knut Dagestad Rand, Researcher at HISP, Department of Informatics, University of Oslo

Title: Bayesian Time Series Modelling of Climate-Health Data

When? Tuesday 23.04.2024, 14:15-15:15

Where?  Erling Sverdrups plass and Zoom
https://uio.zoom.us/j/66618660796?pwd=bVVOenVLeDFPT25LZ3RLdnRQaC9udz09 

Abstract
Climate and weather can affect disease prevalence in different ways. For instance, humidity and temperature affect the life cycles of mosquitoes, which can greatly influence the prevalence of vector-borne diseases like malaria and dengue. Modelling this relationship is very important, both in the short term for outbreak preparedness and in the long term for health systems to adapt to the changing climate. However, this modelling is difficult because of the scarcity of high-quality health data, the complexity of spatio-temporal modelling, and the many different domains involved (vector biology, climate, epidemiology).
In this talk I will present our work on building a framework both for developing modularized and adaptable climate-health models, and for rigorously evaluating the utility of these models.

Welcome!
Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

Apr. 9, 14:15

Seminar series in Statistics and Data Science: Steffen Grønneberg

We are extremely pleased to invite you to our postponed Tuesday seminar in the Seminar series in Statistics and Data Science.

When? Tuesday 09.04.2024, 14:15-15:15 

Where?  Erling Sverdrups plass and Zoom https://uio.zoom.us/j/62093083083?pwd=MmxrWGEyU0NNMWVrdDJmakx0ckhYdz09

Speaker: Steffen Grønneberg, Professor, BI Norwegian Business School

Title: Non-parametric regression among factor scores: Motivation and diagnostics for nonlinear structural equation models

Abstract: Structural equation models are simultaneous equation regression models, whose variables are latent, and measured via a confirmatory factor model (that is, with measurement error and repeated measurements). When the functional form of the simultaneous equation system is unknown, it has previously been observed in simulations that factor scores inputted into non-parametric regression methods approximate the true functional form. Factor scores estimate the latent variables (per person), and several types exist. We provide a theoretical (though population-based) analysis of this procedure, and provide assumptions under which it is theoretically justified when Bartlett factor scores, which are simple linear transformations of the data, are used. In simulations, we compare this suggestion to an already available though understudied non-linear and computationally heavy procedure, and observe that the simple Bartlett approach appears to work better.

Welcome!
Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

Steffen Grønneberg is Professor at the Department of Economics at BI (Norwegian Business School), Oslo.

Apr. 4, 14:15

Seminar series in Statistics and Data Science: Ian McKeague

We are very pleased to invite you to our next seminar in the Seminar series in Statistics and Data Science, which will be held on a Thursday this time.

Speaker:  Ian McKeague, Professor of biostatistics at Mailman School of Public Health, Columbia University

Title: Empirical likelihood based inference for concurrent functional linear regression with applications to wearable device data

When? Thursday 04.04.2024, 14:15-15:15

Where?  Erling Sverdrups plass and Zoom https://uio.zoom.us/j/67486742080?pwd=VHJPam5BSEc2Wkl6SjlhN2p5WjA2Zz09

Abstract
This talk discusses a nonparametric inference framework for occupation time curves derived from wearable device data. Such curves provide the total time a subject maintains activity above a given level as a function of that level. Taking advantage of the monotonicity and smoothness properties of these curves, we develop a likelihood ratio approach to construct confidence bands for mean occupation time curves.  An extension to fitting concurrent functional regression models is also developed. Application to wearable device data from an ongoing study of an experimental gene therapy for mitochondrial DNA depletion syndrome will be discussed. Based on joint work with Hsin-Wen Chang (Academia Sinica).

Welcome!
Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

Mar. 12, 14:15

Seminar series in Statistics and Data Science: Guilherme Clarindo Marcos

We are very pleased to invite you to our next Tuesday seminar in the Seminar series in Statistics and Data Science.

Speaker:  Guilherme Clarindo Marcos, Researcher and Ph.D. candidate in the Marine Environment Group at the Centre for Marine Technology and Ocean Engineering (CENTEC), University of Lisbon, Portugal

Title: Robust estimation and representation of climatic wave spectrum

When? Tuesday 12.03.2024, 14:15-15:15

Where?  Erling Sverdrups plass and Zoom

https://uio.zoom.us/j/68412228703?pwd=Y2FFZDlCSzBZbDZ4Rkw0S2NQWHpTQT09  

Abstract

The climatic ocean wave spectrum is a key tool for understanding the long-term characteristics and variations of wave patterns across different regions of the world's oceans. The presentation explores the methodologies employed to derive wave spectra from observational data. These are essentially statistical approaches that provide a quantitative understanding of the variability and extremes of wave conditions. In essence, an ocean wave spectrum is a representation of the distribution of energy among different wave frequencies and wavelengths. Engineers rely on this information to mitigate risks and design solutions that can withstand the dynamic forces of ocean waves. However, the information must be presented in a robust and practical form for the variations to be understood. To this end, a robust and resistant approach will be presented for characterizing this variability, thus reducing uncertainties and representing the climatic wave spectrum in a compact and informative way.

Welcome!
Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin


Feb. 13, 14:15

Seminar series in Statistics and Data Science: Michael Scheuerer

After a short festive break, we are extremely pleased to invite you again to our Tuesday seminar in the Seminar series in Statistics and Data Science.

Speaker:  Michael Scheuerer, Senior Research Scientist, Norwegian Computing Center

Title: Decadal inflow projections for catchments in Brazil

When? Tuesday 13.02.2024, 14:15-15:15

Where?  Erling Sverdrups plass and Zoom 

https://uio.zoom.us/j/69724784636?pwd=Z2cyakFuMjJBVFQwcGtZaEJGdHFTQT09  

Abstract
Our project partner Statkraft owns and operates several hydropower plants in Brazil and requires information about the future potential for hydropower production in this region. To provide inflow projections for the next several decades, we use climate model output in combination with a regression model that links meteorological variables such as precipitation and temperature to inflow over various catchments in the region. The relatively short time period for which observation data are available raises concerns about overfitting. We therefore explore an alternative model fitting approach that retains the original, easily interpretable regression model but estimates the regression coefficients within an artificial neural network (ANN) framework which permits spatial and temporal regularization and thus prevents overfitting. We show some examples of the inflow projections obtained with that methodology and discuss some caveats and limitations.

Welcome!
Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

Feb. 8, 14:30

Biostatistics Seminar: Martin Bladt

The next OCBE Biostatistics Seminar will be held on Thursday, February 8th, at 14:30-15:30 in Auditorium 13 (Domus Medica, Oslo).

Martin Bladt, Associate Professor, Department of Mathematical Sciences, University of Copenhagen, will talk about

Conditional Aalen–Johansen estimation

Abstract

Aalen–Johansen estimation targets transition probabilities in multi-state Markov models subject to right-censoring. In particular, it belongs to the standard toolkit of statisticians specializing in health and disability. We introduce for the first time the conditional Aalen-Johansen estimator, a kernel-based estimator that allows for the inclusion of covariates and, importantly, is also applicable in non-Markov models. We establish uniform strong consistency and asymptotic normality under lax regularity conditions; here, the theory of empirical processes plays a central role and leads to a transparent treatment. We also illustrate the practical implications and strength of the estimation methodology.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/ for updates or for subscribing to the events. If you'd like to be added to the "biostat-ext" mailing list, just let us know.

Best wishes,
Jon Michael Gran
OCBE

Feb. 1, 14:30

OCBE Biostatistics Seminar: Veronica Vinciotti

The next OCBE Biostatistics Seminar will be held on Thursday, February 1st, at 14:30-15:30 in Auditorium 13 (Domus Medica, Oslo).

Veronica Vinciotti, Associate Professor at the Department of Mathematics, University of Trento (Italy), will talk about

Random graphical model of microbiome interactions in related environments

Abstract: The microbiome constitutes a complex microbial ecology of interacting components that regulates important pathways in the host. Measurements of microbial abundances are key to learning the intricate network of interactions amongst microbes. Microbial communities at various body sites tend to share some overall common structure, while also showing diversity related to the needs of the local environment. In this talk, I will describe a computational approach for the joint inference of microbiota systems from (count) metagenomic data for a number of body sites. The random graphical model (RGM) allows for heterogeneity across the different body sites via environment-specific copula graphical models, while quantifying their relatedness at the structural level via a joint generative model of the graphs. In addition, the model allows for the inclusion of external covariates at both the microbial and interaction levels, further adapting to the richness and complexity of microbiome data. In the last part of the talk, I will show how a similar methodology has been used to study cross-country cultural heterogeneity from (ordinal) survey data.

[Reference: V. Vinciotti, E. Wit, F. Richter. Random graphical model of microbiome interactions in related environments, 2023, https://arxiv.org/abs/2304.01956 ]

The OCBE biostatistics seminar series is organized by the Oslo Centre for Biostatistics and Epidemiology and is an arena for presenting new interesting work in biostatistics, both on the methodological and applied side, from researchers in Oslo, from Norway and abroad.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/ for updates or for subscribing to the events. If you'd like to be added to the "biostat-ext" mailing list, just let us know.

Other upcoming seminars this semester:
February 8th, 2024: Martin Bladt, University of Copenhagen, Denmark

Welcome!

Best wishes,
Valeria Vitelli
Associate Professor
Oslo Centre for Biostatistics and Epidemiology,
Department of Biostatistics, University of Oslo, Norway
email: valeria.vitelli@medisin.uio.no

webpage: http://www.med.uio.no/imb/english/people/aca/valeriv/

Jan. 10, 14:00

Explaining AI seminar

Speaker: Jiachen (Tianhao) Wang (Princeton University)

Location: Microsoft Teams

Title: Threshold KNN-Shapley: A Linear-Time and Privacy-Friendly Approach to Data Valuation

Abstract: Data valuation aims to quantify the usefulness of individual data sources in training machine learning (ML) models, and is a critical aspect of data-centric ML research. However, data valuation faces significant yet frequently overlooked privacy challenges despite its importance. This paper studies these challenges with a focus on KNN-Shapley, one of the most practical data valuation methods currently available. We first emphasize the inherent privacy risks of KNN-Shapley, and demonstrate the significant technical difficulties in adapting KNN-Shapley to accommodate differential privacy (DP). To overcome these challenges, we introduce TKNN-Shapley, a refined variant of KNN-Shapley that is privacy-friendly, allowing for straightforward modifications to incorporate a DP guarantee (DP-TKNN-Shapley). We show that DP-TKNN-Shapley has several advantages and offers a superior privacy-utility tradeoff compared to naively privatized KNN-Shapley in discerning data quality. Moreover, even non-private TKNN-Shapley achieves performance comparable to KNN-Shapley. Overall, our findings suggest that TKNN-Shapley is a promising alternative to KNN-Shapley, particularly for real-world applications involving sensitive data.

Dec. 7, 14:30

OCBE Biostatistics Seminar: Geir Kjetil Sandve

Dear all, 

The next OCBE biostatistics seminar will be held on Thursday, December 7th, at 14:30-15:30 in Lille Auditorium (Domus Medica, Oslo).

Geir Kjetil Sandve, Professor at the Biomedical Informatics Research Group (BMI), Section of Machine Learning, Department of Informatics (UiO), will talk about

Approaching intriguing problems with machine learning: the full picture from data availability to methodology development and assessment, with the adaptive immune system as case

Abstract: As statisticians and machine learners, we often talk quite exclusively about the methodology we develop, although we typically agree that appropriate data and method assessment is equally important. 

I will here present how we through a broad interdisciplinary collaboration have tried to approach the full spectrum of aspects that influence our success in approaching an intriguing problem with machine learning. At the core is the development of a novel deep learning architecture whose components are motivated by (tailored according to) insights from the application domain. Looking ahead, we have also critically analysed how large language models might improve on the more classic deep learning approaches to the problem. To support the methodology development, we have initiated separate projects to generate both experimental and synthetic data. And to support interoperability, reproducibility and rigorous assessment of the developed methodology, we have developed a software platform for machine learning in the domain, as well as initiating an international competition to benchmark competing methods in the field. 
The case (the machine learning problem) that is underlying the above developments is the question of how the adaptive immune system recognises foreign threats - e.g. viruses, bacteria or cancer. This is essentially a DNA sequence classification problem, known to be driven by complex, higher-order interactions. There is a strong interest in better solutions to this computational problem, as it could accelerate drug development and allow early diagnosis of disease.

The OCBE biostatistics seminar series is organized by the Oslo Centre for Biostatistics and Epidemiology and is an arena for presenting new interesting work in biostatistics, both on the methodological and applied side, from researchers in Oslo, from Norway and abroad.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/ for updates or for subscribing to the events. If you'd like to be added to the "biostat-ext" mailing list, just let us know.

Other upcoming seminars in the Spring semester:

February 1st, 2024: Veronica Vinciotti, Department of Mathematics, University of Trento (Italy)

February 8th, 2024: Martin Bladt, Department of Mathematical Sciences, University of Copenhagen (Denmark)

Welcome!
Best wishes,
Valeria Vitelli
Associate Professor
Oslo Centre for Biostatistics and Epidemiology,
Department of Biostatistics, University of Oslo, Norway

Nov. 21, 09:15

Seminar in Statistics and Data Science: Adam Lee

Dear all, 

We are extremely pleased to invite you to our next seminar in the Seminar series in Statistics and Data Science.

Speaker: Adam Lee, Assistant Professor, Department of Data Science & Analytics, BI Norwegian Business School

Title: Locally robust and efficient tests for non-regular semiparametric models

When? Tuesday 21.11.2023, 09:15-10:15

Where?  Erling Sverdrups plass and Zoom https://uio.zoom.us/j/64061448393?pwd=L21KWjZxbFpWZDY4N3RBTUZHLzVvQT09

Abstract

This paper considers hypothesis testing in semiparametric models which may be non-regular for certain values of a (potentially infinite-dimensional) nuisance parameter. In such models no (locally) regular estimator of the parameter of interest exists. The situation for testing is somewhat different: I establish that $C(\alpha)$-style test statistics achieve their limiting distributions in a (locally) regular manner under mild conditions, leading to tests with correct size in situations where standard tests fail to control size. Additionally, I characterise the appropriate limit experiment in which to study local (asymptotic) optimality of tests in the case where the efficient information matrix is singular. This permits the generalisation of classical power bounds to the non-regular case. I provide appropriate statements of these bounds and give conditions under which they are attained by the proposed $C(\alpha)$-style tests.

Welcome!

Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

Nov. 9, 14:30

OCBE Biostatistics Seminar: Antonio Canale

Biostatistical seminar with Antonio Canale, Assoc. Professor, Department of Statistical Sciences, University of Padova, Italy.

Time and place: Nov. 9, 2023 2:30 PM – 3:30 PM, Domus Medica, Auditorium 13

Title: Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering

Abstract: Bayesian mixture models are widely used for clustering of high-dimensional data with appropriate uncertainty quantification. However, as the dimension of the observations increases, posterior inference often tends to favor too many or too few clusters. This article explains this behavior by studying the random partition posterior in a non-standard setting with a fixed sample size and increasing data dimensionality. We provide conditions under which the finite sample posterior tends to either assign every observation to a different cluster or all observations to the same cluster as the dimension grows. Interestingly, the conditions do not depend on the choice of clustering prior, as long as all possible partitions of observations into clusters have positive prior probabilities, and hold irrespective of the true data-generating model. We then propose a class of latent mixtures for Bayesian clustering (Lamb) on a set of low-dimensional latent variables inducing a partition on the observed data. The model is amenable to scalable posterior inference and we show that it can avoid the pitfalls of high-dimensionality under mild assumptions. The proposed approach is shown to have good performance in simulation studies and an application to inferring cell types based on scRNAseq.

Joint work with Noirrit Kiran Chandra & David B. Dunson

Oct. 31, 14:15

Seminar Series in Statistics and Data Science: Valeria Vitelli

After a break, we are extremely pleased to invite you again to our Tuesday seminar in the Seminar series in Statistics and Data Science.

Speaker:  Valeria Vitelli, Associate Professor, Department of Biostatistics, University of Oslo

Title: Rank-based covariate-informed clustering of high-dimensional data with variable selection

When? Tuesday 31.10.2023, 14:15-15:15

Where?  Erling Sverdrups plass and Zoom

https://uio.zoom.us/j/66849037258?pwd=NkNiR0lkbm5VK0VyMytVZW4vV0hNQT09

Abstract

Rank-based models can be used to estimate individual behaviours and preferences in several areas, such as marketing and politics. Often, combining the expressed preferences with additional user-related information (covariates) can lead to better accuracy in individual predictions, by enhancing the understanding of the users' personal profiles. The Mallows model is a popular model for rankings, as it flexibly adapts to different types of preference data, and the previously proposed Bayesian Mallows Model (BMM) offers a computationally efficient framework for Bayesian inference that also captures the users' heterogeneity via a finite mixture. However, the Mallows model does not seem realistic when the pool of items is large, and furthermore BMM does not currently allow the use of covariates. In this talk, I will introduce a recent extension of BMM that embeds covariate information in a joint rank-based clustering framework. The proposed method is based on a similarity function that a priori favours the aggregation of people into a cluster when their covariates are similar. A lower-dimensional version of BMM (lowBMM) that scales to large datasets has also been proposed and used in the context of cancer genomics; however, lowBMM does not perform clustering. We now propose to combine the Bayesian mixture of Mallows models with item selection, to jointly perform variable selection and clustering. The performance of both methods is investigated via simulation studies, and real-data examples in genomics and preference learning are also shown. This is joint work with Emilie Eliseussen, Arnoldo Frigessi, Haakon Muggerud, Ida Scheel.

Welcome!

Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

Oct. 24, 10:00

Explaining AI seminar

Dear all BigInsighters,

You are hereby invited to a new Explaining AI seminar. The seminar will be a webinar, since the speaker will be joining from Germany.

Speaker: Julia Herbinger (Ludwig-Maximilians-Universität München)

Location: Microsoft Teams

Title: Decomposing Global Feature Effects Based on Feature Interactions

Abstract: Global feature effect methods, such as partial dependence (PD) plots, provide an intelligible visualization of the expected marginal feature effect. However, such global feature effect methods can be misleading, as they do not represent local feature effects of single observations well when feature interactions are present. In this talk, I will introduce a new framework called generalized additive decomposition of global effects (GADGET), which is based on recursive partitioning to find interpretable regions in the feature space such that the interaction-related heterogeneity of local feature effects is minimized. I will demonstrate its applicability to the most popular methods to visualize marginal feature effects, namely PD, accumulated local effects (ALE), and Shapley additive explanations (SHAP) dependence. Additionally, I will show that different measures to quantify and analyze feature interactions can be derived when GADGET is applied. To define the interacting feature subset for GADGET, I will introduce PINT, a novel permutation-based significance test to detect global feature interactions that is applicable to any feature effect method used within GADGET. I will demonstrate the applicability of the proposed methods based on simulation and real-world examples.

Welcome!

Oct. 19, 14:30

OCBE Biostatistics Seminar: Thomas Matcham

Biostatistical seminar with Thomas Matcham, Imperial College London, UK.

Time and place: Oct. 19, 2023 2:30 PM – 3:30 PM, Domus Medica, Auditorium 13

Title: A Proper Concordance Index for Time-Varying Relative Risk

Abstract: Harrell's concordance index is a commonly used discrimination metric for survival models, particularly for models where the relative ordering of the risk of individuals is time-independent, such as the proportional hazards model. There are several suggestions, but no consensus, on how it could be extended to models where relative risk can vary over time, e.g. in case of crossing hazard rates. We show that these concordance indices are not proper, where a proper index is one that is maximised in the limit by the true data-generating model. Furthermore, we show that a concordance index is proper if and only if the risk score used is concordant with the hazard rate at the first event time for each comparable pair of events. Thus, we suggest using the hazard rate as the time-varying risk score when calculating concordance. Through simulations, we demonstrate situations in which other concordance indices can lead to incorrect models being selected over a true model, justifying the use of our suggested risk prediction in both model selection and in loss functions in, e.g., deep learning models.

Oct. 12, 14:30

OCBE Biostatistics Seminar: Jesse Hemerik

Biostatistical seminar with Jesse Hemerik, Assistant Professor, Department of Econometrics, Erasmus University Rotterdam.

Time and place: Oct. 12, 2023 2:30 PM – 3:30 PM, Auditorium 13, Domus Medica

Title: Robust testing in generalized linear models with many responses

Abstract: Generalized linear models (GLMs) are widely used in biostatistics, e.g. to model binary responses or counts. For example, when analyzing RNA-Seq data, it is common to fit many GLMs simultaneously. GLMs are often misspecified due to overdispersion and heteroscedasticity. Existing quasi-likelihood methods for testing in misspecified GLMs often do not provide satisfactory type I error rate control. We provide a novel semi-parametric test, based on a permutation-type approach. Our test often provides better type I error control than its competitors. Further, we consider the common scenario that there are multiple response variables. Think for example about RNA-Seq or neuroimaging data. For each of the responses, association with the predictor of interest is tested. The challenge is then to deal with the multiple testing problem in a powerful and reliable way. To achieve this, we combine our approach with powerful permutation-based multiple testing methods.

Sep. 14, 14:30

OCBE Biostatistics Seminar: Stein Emil Vollset

Our next OCBE biostatistics seminar takes place on Thursday next week, September 14, at 14:30-15:30, in Auditorium 13 (Domus Medica, Oslo).

Stein Emil Vollset, Professor, Department of Health Metrics Sciences, University of Washington, USA, will talk about

Forecasting the Global Burden of Disease Study

Abstract: I will present the Global Burden of Disease Study (GBD), which estimates disease burden for 204 countries and territories, and at the first administrative level for a subset of 22 countries. GBD also produces estimates of disease burden attributable to close to 70 risk factors. The main measures of disease burden are deaths, years of life lost, years lived with disability, DALYs (disability-adjusted life years), prevalence and incidence for more than 350 diseases and injuries, in 23 age groups and for males and females back to 1990. The main focus of the talk will be to present the ongoing effort to produce forecasts of disease burden and population to 2050 or 2100. I will also present some computational and methodological challenges encountered by the project.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/ for updates or for subscribing to the events. If you'd like to be added to the "biostat-ext" mailing list, let us know.

Welcome!

Jon Michael Gran
Oslo Centre for Biostatistics and Epidemiology

Jun. 15, 14:30

OCBE Biostatistics Seminar: Susanne Strohmaier

Remember our OCBE biostatistics seminar tomorrow, Thursday June 15, at 14:30-15:30, in Auditorium 13 (Domus Medica, Oslo).

Susanne Strohmaier, Medical University of Vienna, Austria, will talk about

Survival benefit of kidney transplantation compared to remaining waitlisted on dialysis - Results from an Austrian national registry using target trial emulation

Abstract: For certain medical research questions, randomization is unethical or infeasible so causal effects have to be estimated from observational data. If confounding and exposure status are time-dependent, this requires sophisticated methodology such as target trial emulation involving longitudinal matching methods. Here we present an example from nephrology aiming to quantify the survival benefit of first kidney transplantation compared to remaining on dialysis and never receiving an organ, across ages and across times since waitlisting.

We analyzed data from the Austrian Dialysis and Transplant Registry comprising patients on dialysis and waitlisted for a kidney transplant with repeated updates on patient characteristics and waitlisting status. As often with registry data, a tricky task was data management, i.e. dealing with inconsistencies and incompleteness. Data availability also had to be taken into consideration when deciding on the most relevant causal effect that could be identified and estimated. We adapted the approaches of Gran et al. (2010) and Schaubel et al. (2006) by constructing a series of auxiliary trials, where each trial was initiated at the time of a transplantation (relative to time of first/second waitlisting). Transplanted patients contributed to the treatment group, while patients with current active waitlisting status were assigned to the control group. Controls were artificially censored if they were transplanted at a later time, and their transplantation then initiated a further trial of the series. We applied pooled logistic regression adjusted for time-varying patient characteristics to estimate inverse probability of treatment weights (IPTWs) to achieve exchangeability, and trial-specific Cox proportional hazards models to compute yearly updated inverse probability of censoring weights (IPCWs) to account for non-adherence to the assigned treatment. The marginal effect and the effect conditional on age and duration of waitlisting, expressed on different scales (hazard ratios, survival probabilities and restricted mean survival times), were obtained from Cox models weighted by the product of IPTWs and IPCWs fitted to the stacked data set of all trials. A bootstrap approach was used to obtain confidence intervals.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/ for updates or for subscribing to the events. If you'd like to be added to the "biostat-ext" mailing list, let us know.

Welcome!
Best wishes,
Valeria Vitelli, Manuela Zucknick and Jon Michael Gran

Jun. 6, 14:15

Seminar Series in Statistics and Data Science: Edward Austin

We are very pleased to invite you to our next seminar in the Seminar series in Statistics and Data Science.

Speaker:  Edward Austin, Senior research associate, Lancaster University

Title: Detecting Emergent Anomalies in Functional Data 

When? Tuesday 06.06.2023, 14:15-15:15 

Where?  Erling Sverdrups plass and Zoom https://uio.zoom.us/j/61696001256?pwd=ZTVSUXJSUjNLYW5VR3AxRnI1MU5Edz09

Abstract
This talk will focus on recent work about the sequential detection of anomalies within partially observed functional data, motivated by a problem encountered by an industrial collaborator. Classical sequential changepoint detection approaches look for changes in the parameters, or structure, of a data sequence and are not equipped to handle the complex non-stationarity and dependency structure of functional data. Conversely, existing functional data approaches require the full observation of the curve before anomaly detection can take place. We propose a new method, FAST, that performs sequential detection of anomalies in partially observed functional data. This talk will introduce the approach, and some associated theoretical results, and highlight its application on telecommunications data.

Welcome!

Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

View event →
Jun.
1.
2:30 p.m. (14:30)

OCBE biostatistics & Sven Furberg Seminar: Jotun Hein

The next OCBE Biostatistics seminar will fall within the series of Sven Furberg Seminars in Bioinformatics and Statistical Genomics, and will be held on Thursday, June 1st, at 14:30-15:30 in Auditorium 13 (Domus Medica, Oslo).

Jotun Hein, Professor of Bioinformatics, Department of Statistics, University of Oxford, UK, will give a talk on
Algorithms for Recombination Detection With an Application to SARS-CoV-2

Abstract:  The evolutionary process of genetic recombination has the potential to rapidly change the properties of a viral pathogen, and its presence is a crucial factor to consider in the development of treatments and vaccines. It can also significantly affect the results of phylogenetic analyses and the inference of evolutionary rates. The detection of recombination from samples of sequencing data is a very challenging problem and is further complicated for SARS-CoV-2 by its relatively slow accumulation of genetic diversity. The extent to which recombination is ongoing for SARS-CoV-2 is not yet resolved. To address this, we use a parsimony-based method to reconstruct possible genealogical histories for samples of SARS-CoV-2 sequences, which enables us to pinpoint specific recombination events that could have generated the data. We propose a statistical framework for disentangling the effects of recurrent mutation from recombination in the history of a sample, and hence provide a way of estimating the probability that ongoing recombination is present. We apply this to samples of sequencing data collected in England and South Africa and find evidence of ongoing recombination.

We investigate the probability of recovering the true topology of ancestral recombination graphs (ARGs) under the coalescent with recombination and gene conversion. We explore how sample size and mutation rate affect the inherent uncertainty in reconstructed ARGs; this sheds light on the theoretical limitations of ARG reconstruction methods.

The OCBE Biostatistics seminar series is organized by the Oslo Centre for Biostatistics and Epidemiology and is an arena for presenting new interesting work in biostatistics, both on the methodological and applied side, from researchers in Oslo, from Norway and abroad.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/
for updates or to subscribe to the events. If you'd like to be added to the "biostat-ext" mailing list, just let us know.

Last upcoming seminar this semester:

June 15th, 2023: Susanne Strohmaier, Medical University of Vienna, Austria

Welcome!
Best wishes,
Jon Michael Gran, Valeria Vitelli and Manuela Zucknick

View event →
May
31.
2:15 p.m. (14:15)

Seminar Series in Statistics and Data Science: Herman van Dijk

Where?  Erling Svedrups plass and Zoom (link to follow)

Herman van Dijk: A Flexible Predictive Density Combination for Large Financial Data Sets in Regular and Crisis Periods.

A flexible predictive density combination is introduced for large financial data sets which allows for model set incompleteness. Dimension reduction procedures that include learning allocate the large sets of predictive densities and combination weights to relatively small subsets.  Given the representation of the probability model in extended nonlinear state-space form, efficient simulation-based Bayesian inference is proposed using parallel dynamic clustering as well as nonlinear filtering, implemented on graphics processing units. The approach is applied to combine predictive densities based on a large number of individual US stock returns of daily observations over a period that includes the Covid-19 crisis period.  Evidence on dynamic cluster composition, weight patterns and model set incompleteness gives valuable signals for improved modelling. This enables higher predictive accuracy and better assessment of uncertainty and risk for investment fund management.

View event →
May
25.
2:30 p.m. (14:30)

OCBE Biostatistics Seminar: Francesco Stingo

The next OCBE biostatistics seminar will be held on Thursday, May 25th, at 14:30-15:30 in Lille Auditorium (Domus Medica, Oslo)

Francesco Stingo, Associate Professor at the Department of Statistics, Computer Science and Applications, University of Florence, Italy, will give a talk on
Recent Advances in Bayesian Graphical Models with Varying Structure

Abstract: We first focus on recent inferential and computational techniques for multiple graphical models, where the sub-group assignment depends on the value of an external observed covariate. We then introduce Bayesian Gaussian graphical models with covariates (GGMx), a class of multivariate Gaussian distributions with covariate-dependent sparse precision matrix. We propose a general construction of a functional mapping from the covariate space to the cone of sparse positive definite matrices, which encompasses many existing graphical models for heterogeneous settings. The flexible formulation of GGMx allows both the strength and the sparsity pattern of the precision matrix (hence the graph structure) to change with the covariates. Extensive simulations and a case study in cancer genomics demonstrate the utility of the proposed models. Joint work with Yang Ni, Veerabhadran Baladandayuthapani, and Claudio Busatto.

The OCBE biostatistics seminar series is organized by the Oslo Centre for Biostatistics and Epidemiology and is an arena for presenting new interesting work in biostatistics, both on the methodological and applied side, from researchers in Oslo, from Norway and abroad.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/
for updates or to subscribe to the events. If you'd like to be added to the "biostat-ext" mailing list, just let us know.

Other upcoming seminars this semester:

June 1st, 2023:  Jotun Hein, Department of Statistics, University of Oxford, UK
June 15th, 2023: Susanne Strohmaier, Medical University of Vienna, Austria

Welcome!
Best wishes,
Jon Michael Gran, Valeria Vitelli and Manuela Zucknick

View event →
May
24.
2:15 p.m. (14:15)

Seminar Series in Statistics and Data Science: Kes Ward

We are very pleased to invite you to our next seminar within our traditional
Seminar series in Statistics and Data Science

Speaker:  Kes Ward, Senior Research Associate, Lancaster University

Title:  Constructing A Constant-per-Iteration Likelihood Ratio Test for Online Changepoint Detection

When? WEDNESDAY 24.05.2023, 14:15-15:15

Where?  Erling Svedrups plass and Zoom 
https://uio.zoom.us/j/63835179388?pwd=bUJEcmpRVWZuRHp5bG5OS0pYVENDUT09 

Abstract
Online changepoint detection algorithms based on likelihood-ratio tests have excellent statistical properties. However, a simple exact online implementation is computationally infeasible as, at time T, it involves considering O(T) possible locations for the change. To improve on this, we use functional pruning ideas to reduce the set of changepoint locations that need to be stored at time T to approximately log T. Furthermore, we show how we need only maximise the likelihood-ratio test statistic over a small subset of these possible locations. Empirical results show that the resulting exact online algorithm, which can detect changes under a wide range of models, has a constant-per-iteration cost on average. We consider applications of this algorithm in the context of detecting increases in radiation count that represent astronomical or nuclear events of interest.
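As a rough illustration of the O(T) cost mentioned in the abstract, the following is a minimal sketch of the naive online likelihood-ratio test, not the speaker's algorithm. It assumes Gaussian data with unit variance, a known pre-change mean of 0 and an unknown post-change mean; the function name and the threshold value are hypothetical.

```python
def naive_online_lr(xs, threshold):
    """Naive online likelihood-ratio test for a change in mean.

    Assumes unit-variance Gaussian observations with known pre-change
    mean 0 and unknown post-change mean. Returns the first time T
    (1-indexed) at which the statistic exceeds `threshold`, else None.
    """
    cumsums = [0.0]  # cumsums[t] = x_1 + ... + x_t
    for t, x in enumerate(xs, start=1):
        cumsums.append(cumsums[-1] + x)
        total = cumsums[t]
        # Every candidate change location tau in 0..t-1 is checked,
        # so the work at time t grows linearly with t.
        stat = max(
            (total - cumsums[tau]) ** 2 / (2.0 * (t - tau))
            for tau in range(t)
        )
        if stat > threshold:
            return t
    return None


# Mean shifts from 0 to 3 after the 20th observation.
data = [0.0] * 20 + [3.0] * 10
detected_at = naive_online_lr(data, threshold=10.0)
```

The inner maximisation over all candidate locations is the step that the pruning ideas described in the abstract aim to avoid.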

Welcome!
Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

View event →
May
10.
2:15 p.m. (14:15)

Seminar series in Statistics and Data Science: Idris Eckley

We are very pleased to invite you to our next seminar within our traditional

Seminar series in Statistics and Data Science

Speaker:  Idris Eckley, Professor, Lancaster University 

Title:  Anomalies, the internet and a duck-billed platypus 

When? WEDNESDAY 10.05.2023, 14:15-15:15

Where?  Erling Svedrups plass and Zoom 
 https://uio.zoom.us/j/67779874759?pwd=NllPeTdwL2tzckRoSVp6STJPNWE4Zz09  

Abstract
This talk will introduce a recent suite of research focussed on the statistical detection of anomalous structure in online data settings. The challenge of efficiently identifying anomalies in data sequences is an important statistical problem that now arises in many applications. Whilst there has been substantial work aimed at making statistical analyses robust to outliers, or point anomalies, there has been much less work on detecting anomalous segments, or collective anomalies, particularly in those settings where point anomalies might also occur. This is the challenge we seek to address, demonstrating theoretical results in both the offline and online settings as well as introducing some applied case studies.

Welcome!

Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

View event →
May
4.
2:30 p.m. (14:30)

Sven Furberg Webinar in Bioinformatics and Statistical Genomics: Dr. Julien Gagneur

We are pleased to announce that Dr. Julien Gagneur will be the guest speaker of our upcoming Sven Furberg Webinar in Bioinformatics and Statistical Genomics on Thursday May 4th at 14:30 on Zoom (see details below and at https://www.mn.uio.no/sbi/english/furberg-seminars/dr.-julien-gagneur.html).

Dr. Julien Gagneur, Professor in Computational Molecular Medicine at the Technical University of Munich, Germany, will present his research on "Detecting and predicting aberrant splicing in human and Species-aware DNA language modeling."

Abstract
The first part of my talk covers a series of studies concerned with the detection of aberrant splicing from RNA-seq data [1-2] and its prediction from genomic sequence [3-4]. This work has translational relevance for pinpointing the genetic cause in individuals affected by genetically undiagnosed rare disorders [5]. The algorithms developed could also help in interpreting mutations found in tumours.

In the second part of my talk, I will present recent results using language models to identify conserved genomic sequences in an alignment-free fashion [6]. We train a masked language model on 3’ regions of ORFs across more than 800 fungal species spanning over 500 million years of evolution. We show that explicitly modeling species is instrumental in capturing conserved yet evolving regulatory elements. We demonstrate the utility of the resulting sequence embedding for a range of regulatory genomics prediction tasks.

References:
1. Mertes, Scheller, et al. Detection of aberrant splicing events in RNA-Seq data with FRASER. Nature Communications, 2021.
2. Scheller et al. Improved detection of aberrant splicing using the Intron Jaccard Index. medRxiv, 2023.
3. Cheng et al. MMSplice: Modular modeling improves the predictions of genetic variant effects on splicing. Genome Biology, 2019.
4. Celik, Wagner, et al. Aberrant splicing prediction across human tissues. bioRxiv, 2022 (now in press, Nat. Genet.).
5. Yépez, Gusic, et al. Clinical implementation of RNA sequencing for Mendelian disease diagnostics. Genome Medicine, 2022.
6. Gankin, Karollus, et al. Species-aware DNA language modeling. bioRxiv, 2023.

Looking forward to seeing you all at the webinar.

Zoom info:
Join Zoom Meeting at  https://uio.zoom.us/j/65895596084?pwd=b29ubjQ2VEtCNE1vM0NPRkluMGFKdz09.

Meeting ID:  658 9559 6084; Passcode: 505310
-- 
Anthony Mathelier, PhD
Associate Director - Centre for Molecular Medicine Norway (NCMM)
Group Leader - Computational Biology & Gene Regulation Group, NCMM
Adjunct Professor - Centre for Bioinformatics, University of Oslo
Adjunct Researcher - Dept. of Medical Genetics, Oslo University Hospital
http://mathelierlab.com
anthony.mathelier@ncmm.uio.no
(+47) 22840561

View event →
Apr.
27.
2:30 p.m. (14:30)

OCBE Biostatistics Seminar: William Denault

Dear all,
The next OCBE biostatistics seminar will be held on Thursday, April 27th, at 14:30-15:30 in Auditorium 13 (Domus Medica, Oslo). 

William Denault, Researcher, Oslo Centre for Biostatistics and Epidemiology, Oslo University Hospital, will give a talk on

Introduction to the Sum of Single Effects (SuSiE) model and its recent extensions

Abstract: In this presentation, we present in detail the seminal "Sum of Single Effects" (SuSiE) model of Wang et al. (JRSSB 2021), which introduces a simple new approach to variable selection in linear regression, with a particular focus on quantifying uncertainty in which variables should be selected. We also introduce a corresponding new fitting procedure, Iterative Bayesian Stepwise Selection (IBSS), which is a Bayesian analogue of stepwise selection methods. IBSS shares the computational simplicity and speed of traditional stepwise methods, but instead of selecting a single variable at each step, IBSS computes a distribution on variables that captures uncertainty in which variable to select. We provide a formal justification of this intuitive algorithm by showing that it optimizes a variational approximation to the posterior distribution under the SuSiE model. Finally, we will discuss extensions of the SuSiE model to summary statistics regression and functional phenotypes.

The OCBE biostatistics seminar series is organized by the Oslo Centre for Biostatistics and Epidemiology and is an arena for presenting new interesting work in biostatistics, both on the methodological and applied side, from researchers in Oslo, from Norway and abroad.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/
for updates or to subscribe to the events. If you'd like to be added to the "biostat-ext" mailing list, just let us know.

Other upcoming seminars this semester:

May 25th, 2023:  Francesco Stingo, University of Florence, Italy
June 15th, 2023: Susanne Strohmaier, Medical University of Vienna, Austria

Welcome!
Best wishes,
Jon Michael Gran, Valeria Vitelli and Manuela Zucknick

View event →
Apr.
19.
2:15 p.m. (14:15)

Seminar Series in Statistics and Data Science: Douglas Wiens

Dear all, 
We are very pleased to invite you to our next seminar within our traditional
Seminar series in Statistics and Data Science

Speaker:  Douglas Wiens, Professor, University of Alberta

Title:  Robustness of Design: A Survey

When? WEDNESDAY 19.04.2023, 14:15-15:15

Where?  Erling Svedrups plass and Zoom 

https://uio.zoom.us/j/67994824483?pwd=UkY3WUUzV1BuTzdXRXZBUGVsT2tPZz09

Abstract
I will discuss techniques of robustness of design, for experiments meant to give reliable predictions and inferences in the face of model discrepancies. Particular attention will be paid to (i) robustness against a misspecified linear response function; and (ii) robustness against violations of the assumed independence of the observations. The solutions to these problems involve a blend of mathematical and numerical optimization methods, which will be outlined. The methods are extended in a number of directions which will be discussed: (i) sampling schemes for a stochastic process whose autocorrelation function is possibly misspecified; (ii) active learning, as a further extension of the techniques from experimental design to sampling design; and (iii) discrimination problems in nonlinear regression, in which the competing models are only partially and perhaps incorrectly specified.

Welcome!
Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

View event →
Apr.
11.
2:15 p.m. (14:15)

Seminar Series in Statistics and Data Science: Florian Frommlet

We are very pleased to invite you to our next seminar within our traditional

Seminar series in Statistics and Data Science

Speaker:  Florian Frommlet, Associate Professor, Medical University Vienna

Title:  A neutral comparison of algorithms to minimize L0 penalties for high-dimensional variable selection

When? TUESDAY, 11.04.2023, 14:15-15:15

Where?  Erling Svedrups plass and Zoom 

https://uio.zoom.us/j/69667330348?pwd=emZkRGNndXgwVkFFVmlzWXhmS01Cdz09 

Abstract
Variable selection methods based on L0 penalties have excellent theoretical properties for selecting sparse models in a high-dimensional setting. There exist modifications of BIC which control either the family-wise error rate (mBIC) or the false discovery rate (mBIC2) in terms of which regressors are selected to enter a model. However, the minimization of L0 penalties is a mixed integer problem which is known to be NP-hard and therefore becomes computationally challenging with increasing numbers of regressor variables. This is one reason why alternatives like the LASSO, which involve convex optimization problems that are easier to solve, have become so popular. The last few years have seen some real progress in developing new algorithms to minimize L0 penalties. We will compare the performance of these algorithms in terms of minimizing L0-based selection criteria.
Simulation studies covering a wide range of scenarios inspired by genetic association studies are used to compare the values of selection criteria obtained with different algorithms. Additionally, some statistical characteristics of the selected models and the runtime of the algorithms are compared.

Welcome!

Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

View event →
Mar.
23.
2:30 p.m. (14:30)

OCBE Biostatistics Seminar: Johan Pensar

We are very happy to announce a series of new OCBE biostatistics seminars, with the first to be held on Thursday, March 23rd, at 14:30-15:30 in Auditorium 13 (Domus Medica, Oslo). 

Speaker: Johan Pensar, Associate Professor at the Department of Mathematics, University of Oslo

Title: Causal Inference with Graphical Models

Abstract: Understanding the behaviour of a system under the influence of interventions is the ultimate goal of many scientific studies. While such causal relationships are ideally inferred from interventional data (obtained through controlled experiments), in many applications one only has access to data that has been obtained by passively observing the system. Hence, there has lately been a growing interest in machine learning methods for inferring causal relationships from observational data given certain assumptions. In this talk, I will present the idea of using causal graphical models as the underlying framework for approaching this problem. More specifically, I will focus on a particular type of method that combines structure learning and causal calculus in order to produce causal effect estimates under an unknown causal structure. In particular, I will present some of our recent work where we adapt this approach to a Bayesian setting in order to better account for the uncertainty in the inference procedure.

The OCBE biostatistics seminar series is organized by the Oslo Centre for Biostatistics and Epidemiology and is an arena for presenting new interesting work in biostatistics, both on the methodological and applied side, from researchers in Oslo, from Norway and abroad.

See also: https://www.med.uio.no/imb/english/research/centres/ocbe/events/biostat-seminar/
for updates or to subscribe to the events. If you'd like to be added to the "biostat-ext" mailing list, just let us know.

Other upcoming seminars:
April 27th, 2023: William Denault, OCBE
May 25th, 2023:  TBD
June 15th, 2023: Susanne Strohmaier, Medical University of Vienna, Austria

Welcome!
Best wishes,
Jon Michael Gran, Valeria Vitelli and Manuela Zucknick
Oslo Centre for Biostatistics and Epidemiology,
Department of Biostatistics, University of Oslo, Norway

View event →
Mar.
23.
9:00 a.m. (09:00)

Explaining AI seminar: Rich Caruana (Microsoft Research)

Speaker: Rich Caruana (Microsoft Research)

Location: Online meeting (Microsoft Teams)

Title: Friends Don’t Let Friends Deploy Black-Box Models: The Importance of Intelligibility in Machine Learning

Abstract:  In machine learning, tradeoffs must often be made between accuracy and intelligibility: the most accurate models usually are not very intelligible, and the most intelligible models usually are less accurate. This can limit the accuracy of models that can safely be deployed in mission-critical applications such as healthcare, where being able to understand, validate, edit, and trust models is important. EBMs (Explainable Boosting Machines) are a learning method based on generalized additive models (GAMs) that are as accurate as full complexity models, more intelligible than linear models, and which can be made differentially private with little loss in accuracy. EBMs make it easy to understand what a model has learned and to edit the model when it learns inappropriate things. In the talk I’ll present several case studies where EBMs discover surprising patterns in data that would have made deploying black-box models risky.

About the speaker: Rich Caruana is a senior principal researcher at Microsoft Research. Before joining Microsoft, Rich was on the faculty in the Computer Science Department at Cornell University, at UCLA’s Medical School, and at CMU’s Center for Learning and Discovery.  Rich’s Ph.D. is from Carnegie Mellon University where he worked with Tom Mitchell and Herb Simon.  His thesis on Multi-Task Learning helped create interest in a new subfield of machine learning called Transfer Learning.  Rich received an NSF CAREER Award in 2004, best paper awards in 2005 (with Alex Niculescu-Mizil), in 2007 (with Daria Sorokina), and 2014 (with Todd Kulesza, Saleema Amershi, Danyel Fisher, and Denis Charles), and co-chaired KDD in 2007.  His current research focus is on learning for medical decision making, transparent modeling, and deep learning with LLMs.

View event →
Mar.
7.
2:15 p.m. (14:15)

Seminar Series in Statistics and Data Science: Thordis L. Thorarinsdottir

We are very pleased to invite you to our next seminar within our traditional

Seminar series in Statistics and Data Science

Speaker:  Thordis L. Thorarinsdottir, Associate Professor, University of Oslo

Title:  Statistical modelling of environmental extremes

When? TUESDAY,  07.03.2023, 14:15-15:15

Where?  Erling Svedrups plass and Zoom 
https://uio.zoom.us/j/67530026986?pwd=Ly9nd3ZaOVl6TllOU3A2ZGJEUUpWUT09

Abstract:  Estimates of environmental extremes are needed for a multitude of applications. For example, buildings, roads, bridges and dams must be designed to withstand extreme precipitation and flooding events of a certain size. Obtaining such estimates requires a combination of statistical theory and environmental process understanding to overcome data deficiencies: data on extremes are by definition sparse and regulations often require estimates for events that have yet to be observed. We will present approaches to obtain consistent estimates across spatial locations and accumulation periods, and discuss a few open questions on this topic. 

Welcome!
Best regards,
Sven Ove Samuelsen & Aliaksandr Hubin

View event →