Blog

June lecture: Testing Kaplan-Meier curves against an alternative of concave or convex non-proportional hazards (31 May 2019)

We show the optimality of the log-rank test for comparing two groups when the hazards are proportional. When they are not proportional, this optimality is lost. Many authors have considered alternatives to the log-rank test in situations of non-proportionality. We recall this work before considering two sub-classes of non-proportional hazards: concave and convex non-proportional hazards. Although less broad than non-proportional hazards in general, the two classes cover a very wide range of possibilities. We will see that situations such as delayed effects (immunotherapy trials), non-responders and group attribution errors all belong to these classes. For these two classes, optimal tests can be identified theoretically although, in practice, they are not typically available. Nonetheless, we describe tests that are close to optimal for the concave and convex classes. Some practical concerns are described, including efficiency losses and sample size calculation for concave and convex alternatives to the null.
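As a point of reference for the abstract above, here is a minimal sketch of the standard two-group log-rank statistic, the test whose optimality under proportional hazards is discussed. It is not the speaker's code and does not implement the near-optimal tests for concave or convex alternatives; the function name `logrank_test` and its argument layout are assumptions made for illustration.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Standard two-group log-rank test (illustrative sketch).

    times_*  : follow-up times for each subject
    events_* : 1 if the event was observed, 0 if the subject was censored
    Returns (chi-square statistic with 1 df, p-value).
    """
    # Pool both groups; the third field is the group label (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]

    obs_minus_exp = 0.0  # sum over event times of (observed - expected) in A
    variance = 0.0       # sum of hypergeometric variances
    for t in sorted({u for u, e, _ in data if e}):
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)          # events at t, both groups
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("log-rank variance is zero; the test is undefined")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value
```

Calling `logrank_test(times_a, events_a, times_b, events_b)` on two lists of follow-up times with matching event indicators returns the statistic and its p-value. Each event time receives equal weight here, which is the weighting that suits proportional hazards.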

Lecture title: Testing Kaplan-Meier curves against an alternative of concave or convex non-proportional hazards
Date: Friday, 31 May 2019
Location: IBMI
Speaker: John O’Quigley, University of Paris-Sorbonne

May 24, 2019

May lecture: Artificial intelligence methods for polypharmacy and drug prescribing (21 May 2019)

Data-rich interaction networks pervade medical research. The main challenge is how to extract knowledge from these networks, which range from interactions between molecules in our bodies all the way to interactions between patients in society.

The lecture will present artificial intelligence methods that learn how to embed the nodes of a network into a low-dimensional space whose geometry is optimized to reflect the structure of interactions between the nodes. Network embeddings are the technical core of Decagon, the first approach for predicting adverse side effects of drug combinations. Decagon first builds a large network that describes how proteins in our bodies interact with one another and how different drugs affect these proteins. Decagon then uses this network to recognize patterns in adverse side effects, which enables the algorithm to automatically build predictions about which drug combinations are safe for patients. The lecture will also touch on how artificial intelligence predictions are evaluated in practice, in collaboration with Stanford School of Medicine, Harvard Medical School and Massachusetts General Hospital.

Lecture: Artificial intelligence methods for polypharmacy and drug prescribing
Date: Tuesday, 21 May 2019, 12:00
Lecture location: IBMI
Speaker: Asst. Prof. Dr. Marinka Žitnik, Department of Computer Science, Stanford University

May 14, 2019
April lecture: How to achieve change for the better through investigative journalism (18 April 2019)

The editorial team of the portal Pod črto is this year's recipient of the Statistical Society of Slovenia award for excellence in statistical reporting in the media. The award citation emphasized that the team deserves the award for its high-quality, in-depth journalistic work, which demonstrates that numbers can be used to identify many a social problem and, at the same time, to point towards its solution. The editor-in-chief will present their work in a lecture here next Thursday.

Lecture: How to achieve change for the better through investigative journalism
Date: Thursday, 18 April 2019, 12:30
Lecture location: IBMI
Speaker: Anže Voh Boštic, editor-in-chief of Pod črto
Content:
At Pod črto, a media outlet for investigative journalism, we have been striving since 2014 to achieve change for the better in our society through our journalistic work. We expose cases of corruption and economic crime, as well as the inadequate response of institutions to this type of crime. We investigate problems in Slovenian healthcare, particularly in the areas of healthcare quality and the treatment of occupational diseases. We draw attention to the consequences of climate change. We expose controversial practices of the Slovenian police and investigate how to improve the situation of the most vulnerable groups in our society.
The lecture will explain how we operate, how we investigate and gather data for our stories, how we collect, use and analyse data in our work, and how we process them statistically. We will also present the most important changes for the better that we have achieved through our work.

April 11, 2019
16th Applied Statistics, International Conference, 22-25 Sept 2019
The international conference Applied Statistics 2019, the 16th in the series, is organized by the Statistical Society of Slovenia and the University of Ljubljana.

Aims and Scope

Following the very successful series of previous AS conferences, the 16th conference, AS2019, will be held at Hotel Ribno, in the vicinity of the magnificent Lake Bled.

The main goal of the Applied Statistics 2019 conference is to provide an opportunity for researchers in statistics, data analysts, and other professionals from various statistical and related fields to come together, present their research, and learn from each other. The four-day program starts with a workshop and consists of invited paper presentations and contributed paper sessions on diverse topics. Cross-discipline and applied paper submissions are especially welcome.

Important Dates
- Abstract Submission: June 1
- Decision of acceptance: July 1
- Conference Registration and hotel reservation: July 15
- Reduced fee payment: July 15
- Payments deadline for presenting authors: July 31
More info: Conference website …
April 3, 2019
European Survey Research Association Conference (ESRA 2019)

The 8th European Survey Research Association Conference (ESRA 2019) will be held in Zagreb, Croatia, from 15 to 19 July 2019.

The scientific committee is now inviting researchers who are active in the field of survey research, survey methodology and data analysis to submit proposals for individual paper and poster presentations.

More at the conference website.

March 28, 2019
We are preparing the society's new website
Welcome to the draft of the new website of the Statistical Society of Slovenia. From now on, you will find all important information about the society's activities here. You are invited to take part!

March 28, 2019
Lecture “Music segmentation: from the preschool period to adolescence”, 14 March 2019

Event date: Thursday, 14 March 2019, 13:00
Lecture location: IBMI
Speaker: Dr. Lorena Mihelač, ŠC Novo mesto, Jožef Stefan International Postgraduate School

The lecture addresses the perception of melody and its segmentation, from the preschool period to adolescence. It will present results obtained in two experiments conducted between January and April 2018. The first experiment tested the hypothesis that children and teenagers use the musical phrases in a melody as a strategy for grouping smaller musical structural units into larger ones when memorizing the melody. The second experiment tested the hypothesis that breathing while singing a melody is closely correlated with the segmentation of that melody.

Results on the similarities and differences between human and computer segmentation of the same test melodies will also be presented.

March 7, 2019
ESRA 2019 conference, Zagreb, 15-19 July 2019

The 8th conference of the European Survey Research Association (ESRA 2019) will take place in Zagreb from 15 to 19 July 2019. You are cordially invited to attend. More at https://www.europeansurveyresearch.org/conferences/overview

February 7, 2019
Statistical Day 2019: Getting to Know Digitalisation

The traditional Statistical Day will take place on Tuesday, 12 February 2019, at the Brdo Congress Centre in Brdo pri Kranju.

The theme of this year's event is GETTING TO KNOW DIGITALISATION, and the event will run in three parts:

Presentations: Digitalisation is changing the world
Round table: Challenges of digitalisation
Presentation of the Statistical Society of Slovenia awards

Participation is free of charge.

Registrations are accepted until 5 February 2019 via the registration form.

You can read more about the theme of the event on the website of the Statistical Office of the Republic of Slovenia.

The event will run from approximately 10:00 to 16:00.

For further questions, the organizers are available at stat-d.surs@gov.si.

January 21, 2019
Metodološki zvezki, Vol. 15, No. 1 & 2, 2018

Advances in Methodology and Statistics

Comparing Two Partitions of Non-Equal Sets of Units

2018    Marjan Cugmas and Anuška Ferligoj; 15(1):1-21

Rand (1971) proposed what has since become a well-known index for comparing two partitions obtained on the same set of units. The index takes a value on the interval between 0 and 1, where a higher value indicates more similar partitions. Sometimes, e.g. when the units are observed in two time periods, the splitting and merging of clusters should be considered differently, according to the operationalization of the stability of clusters. The Rand Index is symmetric in the sense that both the splitting and merging of clusters lower the value of the index. In such a non-symmetric case, one of the Wallace indexes (Wallace, 1983) can be used. Further, there are several cases when one wants to compare two partitions obtained on different sets of units, where the intersection of these sets of units is a non-empty set of units. In this instance, the new units and units which leave the clusters from the first partition can be considered as a factor lowering the value of the index. Therefore, a modified Rand index is presented. Because the splitting and merging of clusters have to be considered differently in some situations, an asymmetric modified Wallace Index is also proposed. For all presented indices, the correction for chance is described, which allows different values of a selected index to be compared.
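To make the symmetry argument in the abstract concrete, the following sketch computes the classical Rand index and one Wallace index by counting pairs of units. It covers only the original indices for two partitions of the same set of units, without correction for chance; the modified indices proposed in the paper are not reproduced here, and the function names are assumptions made for illustration.

```python
from itertools import combinations


def _pair_counts(labels_a, labels_b):
    """Count unit pairs by whether they share a cluster in each partition."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same units")
    both = only_a = only_b = neither = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a and same_b:
            both += 1
        elif same_a:
            only_a += 1
        elif same_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither


def rand_index(labels_a, labels_b):
    """Share of unit pairs on which the two partitions agree (Rand, 1971)."""
    both, only_a, only_b, neither = _pair_counts(labels_a, labels_b)
    total = both + only_a + only_b + neither
    if total == 0:
        raise ValueError("at least two units are required")
    return (both + neither) / total


def wallace_index(labels_a, labels_b):
    """Share of pairs joined in partition A that are also joined in B."""
    both, only_a, _, _ = _pair_counts(labels_a, labels_b)
    if both + only_a == 0:
        raise ValueError("partition A has no pair of units in a common cluster")
    return both / (both + only_a)
```

For example, with `a = [0, 0, 1, 1]` and `b = [0, 0, 1, 2]` the second cluster of `a` is split in `b`. The Rand index penalizes the split regardless of the direction of comparison, while `wallace_index(a, b)` drops and `wallace_index(b, a)` does not, which is the asymmetry the abstract refers to.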

Download the paper

Web Survey Paradata on Response Time Outliers: A Systematic Literature Review

2018    Miha Matjašič, Vasja Vehovar and Katja Lozar Manfreda; 15(1):23-41

In the last two decades, survey researchers have intensively used computerised methods for the collection of different types of paradata, such as keystrokes, mouse clicks and response times, to evaluate and improve survey instruments as well as to understand the survey response process. With the growing popularity of web surveys, the importance of paradata has further increased. Within this context, response time measurement is the prevailing paradata approach. Papers typically analyse the time (measured in milliseconds or seconds) a respondent needs to answer a certain item, question, page or questionnaire. One of the key challenges when analysing the response time is to identify and separate units that are answering too quickly or too slowly. These units can have a poor response quality and are typically labelled as response time outliers. This paper focuses on approaches for identifying and processing response time outliers. It presents a systematic overview of scientific papers on response time outliers in web surveys. The key observed characteristics of the papers are the approaches used, the level of time measurement, the processing of response time outliers and the relationship between response time and response quality. The results show that knowledge on response time outliers is scattered, inconsistent and lacking systematic comparisons of approaches. Consequently, there is a need to improve and upgrade the knowledge on this issue and to develop new approaches that will overcome existing deficiencies and inconsistencies in identifying and dealing with response time outliers.

Download the paper

Download the supplementary information (Appendix)

Behind the Curve and Beyond: Calculating Representative Predicted Probability Changes and Treatment Effects for Non-Linear Models

2018    Bastian Becker; p. 15(1):43-58

Parameter coefficients from non-linear models are inherently difficult to interpret, and scholars frequently opt for computing and comparing predicted probabilities for variables of interest. In an influential article, Hanmer and Ozan Kalkan (2013) discuss the two most common approaches, the average case approach and the observed values approach, and make a strong case for the latter. In this paper, I propose a further refinement of the observed values approach for the purpose of computing predicted probability changes. This refinement concerns the use of counterfactual values for the independent variable of interest. I demonstrate that accounting for non-linearities with regard to the variable of interest is important to avoid estimation biases. I also discuss the implications of this insight for estimating average treatment effects from observational data.
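The difference between the two approaches mentioned in the abstract can be sketched for a logit model as follows. This illustrates only the basic average case and observed values computations, not the refinement proposed in the paper; the coefficient layout (intercept first) and all function names are assumptions made for illustration.

```python
from math import exp


def predicted_probability(beta, row):
    """P(y = 1 | row) under a logit model; beta[0] is the intercept."""
    z = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
    return 1.0 / (1.0 + exp(-z))


def _with_value(row, k, value):
    """Copy of row with covariate k set to a counterfactual value."""
    new_row = list(row)
    new_row[k] = value
    return new_row


def average_case_change(beta, rows, k, low, high):
    """Change in predicted probability for one 'average' observation.

    All covariates are fixed at their sample means, then covariate k is
    moved from low to high.
    """
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return (predicted_probability(beta, _with_value(means, k, high))
            - predicted_probability(beta, _with_value(means, k, low)))


def observed_values_change(beta, rows, k, low, high):
    """Average change in predicted probability over the observed data.

    Each observation keeps its own covariate values; only covariate k is
    moved from low to high, and the individual changes are averaged.
    """
    changes = [
        predicted_probability(beta, _with_value(row, k, high))
        - predicted_probability(beta, _with_value(row, k, low))
        for row in rows
    ]
    return sum(changes) / len(changes)
```

Because the logistic function is non-linear, the two functions generally return different values for the same data: the average case approach evaluates the change at a single point, whereas the observed values approach averages it over the whole sample.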

Download the paper

Download the supplementary information (Computer code)

Gumbel GARCH Model with Stock Application

2018    Mehrnaz Mohammadpour and Fatemeh Ziaeenejad; p. 15(1):59-72

The paper proposes a new GARCH model with a Gumbel conditional distribution. Several statistical properties of the model are established, such as the autocorrelation function and stationarity. We consider two methods for estimating the unknown parameters of the model and investigate the properties of the estimators. The performance of the estimators is checked in a simulation study. We investigate the application of the process using real stock data.

Download the paper

Internal Evaluation Criteria for Categorical Data in Hierarchical Clustering: Optimal Number of Clusters Determination

2018    Zdeněk Šulc, Jana Cibulková, Jiří Procházka and Hana Řezanková; p. 15(2):1-20

The paper compares 11 internal evaluation criteria for hierarchical clustering of categorical data with respect to determining the correct number of clusters. The criteria are divided into three groups based on how they treat cluster quality. The variability-based criteria use the within-cluster variability, the likelihood-based criteria maximize the likelihood function, and the distance-based criteria use distances within and between clusters. The aim is to determine which evaluation criteria perform well and under what conditions. Different analysis settings, such as the hierarchical clustering method used, and various dataset properties, such as the number of variables or the minimal between-cluster distances, are examined. The experiment is conducted on 810 generated datasets, where the evaluation criteria are assessed in terms of determining the optimal number of clusters and mean absolute errors. The results indicate that the likelihood-based BIC1 and variability-based BK criteria perform relatively well in determining the optimal number of clusters and that some criteria, usually the distance-based ones, should be avoided.

Download the paper
Download the supplementary information (Zip archive)

Mode Effects on Socially Desirable Responding in Web Surveys Compared to Face-to-Face and Telephone Surveys

2018    Nejc Berzelak and Vasja Vehovar; p. 15(2):21-43

This paper elaborates upon differences in socially desirable responding as being the result of mode effects between web, telephone, and face-to-face survey modes. Social desirability is one of the main threats to comparability of data between different modes. The paper conceptualises socially desirable responding as a specific type of mode effect, which is not only a result of inherent characteristics of a survey mode, but is also mediated and moderated by complex interdependencies of specific survey implementations, contextual factors, and characteristics and behaviours of respondents. While web surveys are generally less prone to socially desirable responding, it is essential to be wary of circumstances that may reduce the perceived privacy of the survey situation and lead to biased reporting. The presented empirical study analyses the answers to a large number of items used in a pilot implementation of the Generations and Gender Survey across the three modes to gain insights into the incidence of socially desirable responding and its role in the observed differences in estimates. The comparison of means, distributions, and proportions of extreme responses to scale questions is performed across 89 survey items. The results are in line with the previous findings on lower susceptibility of web surveys to social desirability bias. More importantly, the findings suggest that the problem of socially desirable responding is likely to be a major contributor to the differences in mean estimates, response distributions, and the level of extreme responding between the studied modes.

Download the paper
Download the supplementary information (PDF file)

Estimation of Power Function Distribution Based on Selective Order Statistic

2018    Mohd T. Alodat, Mohammad Y. Al-Rawwash and Sameer A. Al-Subh; p. 15(2):45-56

In this article, we present the selective order statistic sampling scheme as a promising approach to estimate the parameter of the univariate power function distribution. We derive the maximum likelihood estimator and the method of moments estimator of the power function distribution parameter as well as the reliability parameter and the ratio of two means. Moreover, we derive the asymptotic properties of the proposed estimators. Finally, we conduct simulation studies to investigate the performance of the selective order statistic scheme and conclude that it suits the power function distribution. We also find that the maximum likelihood estimator performs better than the method of moments estimator under the selective order statistic sampling scheme.
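For orientation, the two estimators named in the abstract can be sketched under ordinary simple random sampling, which is an assumption of this example and not the selective order statistic scheme studied in the paper. The sketch also assumes the one-parameter form of the power function distribution with density f(x) = θx^(θ-1) on (0, 1); the paper's parametrization may differ, and the function names are hypothetical.

```python
import random
from math import log


def sample_power_function(theta, n, rng):
    """Draw n values with CDF F(x) = x**theta on (0, 1] by inversion."""
    # 1 - rng.random() lies in (0, 1], which avoids log(0) later on.
    return [(1.0 - rng.random()) ** (1.0 / theta) for _ in range(n)]


def mle_theta(xs):
    """Maximum likelihood estimate: theta_hat = -n / sum(log x_i)."""
    return -len(xs) / sum(log(x) for x in xs)


def mom_theta(xs):
    """Method of moments estimate from E[X] = theta / (theta + 1)."""
    mean = sum(xs) / len(xs)
    return mean / (1.0 - mean)
```

A small simulation in the spirit of the abstract would draw repeated samples with `sample_power_function` using a seeded `random.Random` instance and compare the spread of `mle_theta` and `mom_theta` around the true parameter.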

Download the paper
Download the supplementary information (PDF file)

January 19, 2019
