This work presents an intuitive visual map from a binary vector into a clock-like three-dimensional space to reveal the underlying temporal pattern of public transit users.

This is the extended version of the ICPRAM paper, with an added discussion of the computational complexity of our online outlier detection algorithm.

How do we predict odors? It seems easy at first sight, since odor and chemicals are closely related, but
a mixture of several chemicals changes the odor perception. Sensor data are often unreliable. Here we describe an automatic method to validate the sensor data.

A dendrogram is a visualization tool that shows the evolution of data groupings. We generalize the dendrogram to the forestogram.

We develop a new version of the random forest to predict flight crew absence for airlines.

Data envelopment analysis is a widely used technique for computing the relative efficiency of decision-making units.
It turns out that the frontier of the envelope
is composed of two sorts of facets, strong facets and weak facets. We developed a theory for characterizing the weak facets of the envelope and provided
a mixed-integer program that computes all of them.

We show how uncertainty can be quantified using the Monte Carlo method in life cycle data analysis.
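
As a minimal sketch of the general idea, and not the procedure of the paper itself, Monte Carlo uncertainty quantification draws the uncertain inputs many times, evaluates the model on each draw, and summarizes the spread of the outputs. The function name `monte_carlo_interval`, the toy emissions model, and the input distributions below are all assumptions made for illustration.

```python
import random
import statistics

def monte_carlo_interval(model, samplers, n_draws=10000, seed=0):
    """Propagate input uncertainty through `model` by repeated sampling.

    Returns the mean output and an approximate 95% interval
    taken from the empirical 2.5% and 97.5% quantiles.
    """
    rng = random.Random(seed)
    outputs = sorted(
        model(*(draw(rng) for draw in samplers)) for _ in range(n_draws)
    )
    lower = outputs[int(0.025 * n_draws)]
    upper = outputs[int(0.975 * n_draws) - 1]
    return statistics.mean(outputs), lower, upper

# Hypothetical inventory: emissions = energy use (kWh) * emission factor (kg CO2/kWh)
samplers = [
    lambda rng: rng.gauss(100.0, 5.0),   # assumed uncertainty on energy use
    lambda rng: rng.uniform(0.4, 0.6),   # assumed uncertainty on the emission factor
]
mean, lower, upper = monte_carlo_interval(lambda energy, factor: energy * factor, samplers)
print(f"mean={mean:.1f}, 95% interval=({lower:.1f}, {upper:.1f})")
```

The interval width, rather than the point estimate alone, is what expresses the uncertainty in the result.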

We show that a simple Bayesian variable selection method adapted for classification and clustering
is comparable with many sophisticated existing variable selection methods.

We discuss complex challenges in public transport data mining and resolve some of the issues with simple tricks.

We discuss different methods for analyzing temporal data in public transport.

We reviewed algorithms that could be applied as tools in virtual metrology.
Our simulation shows that neural network regression is a strong candidate.

The elastic net penalty is an effective tool for variable selection
in linear regression. This work generalizes the elastic net regularization penalty. We suggest a Bayesian perspective for estimating the
regularization constant.
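
For reference, the standard elastic net estimator (the starting point, not the generalization proposed in this work) can be written as follows, where the symbols $y$, $X$, $\beta$, $\lambda_1$ and $\lambda_2$ are the usual response, design matrix, coefficient vector and two regularization constants, introduced here only for illustration:

```latex
\hat{\beta} = \arg\min_{\beta} \; \| y - X\beta \|_2^2
  + \lambda_1 \| \beta \|_1
  + \lambda_2 \| \beta \|_2^2
```

Setting $\lambda_2 = 0$ recovers the lasso and setting $\lambda_1 = 0$ recovers ridge regression.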

Repeatability and Reproducibility, called R&R, are minimum requirements of a measurement system.
A test of R&R is often required as an important part of statistical process control. Here we show that the Pearson statistic can be used to build a
formal statistical test of repeatability and reproducibility for a pass-fail inspection measurement.
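
To illustrate the kind of quantity involved, the sketch below computes the classical Pearson chi-square statistic on a table of pass-fail counts. It is a generic textbook computation under assumed data, not the specific test construction of this work; the function name `pearson_statistic` and the counts are hypothetical.

```python
def pearson_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    Expected counts are computed under independence of rows and columns.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: one row per operator, columns are (pass, fail)
counts = [[45, 5], [40, 10], [42, 8]]
stat, dof = pearson_statistic(counts)
print(f"statistic={stat:.3f}, degrees of freedom={dof}")
```

A large statistic relative to a chi-square distribution with `dof` degrees of freedom suggests that the operators do not pass and fail parts at the same rate.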

We discuss various challenges in public transport data analysis. We show various handy tricks to analyze complex public transport data.

Testing multiple variance components is a non-standard, important, and difficult problem with many applications in biology, medicine, ecology, and other fields.
Here we propose a permutation test based on a simple test statistic. This work introduces a methodology for testing whether even a subset of the variance components is zero.
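
The general mechanics of a permutation test can be sketched as below for the simplest case of a single grouping factor. The test statistic used here (the variance of the group means) is an assumption chosen for illustration and is not necessarily the statistic of this work; the function names and the data are hypothetical.

```python
import random
import statistics

def between_group_variance(values, labels):
    """Variance of the group means, used here as an illustrative test statistic."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return statistics.pvariance([statistics.mean(vs) for vs in groups.values()])

def permutation_p_value(values, labels, n_perm=2000, seed=0):
    """P-value for 'no group effect', obtained by shuffling the group labels."""
    rng = random.Random(seed)
    observed = between_group_variance(values, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if between_group_variance(values, shuffled) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero
    return (exceed + 1) / (n_perm + 1)

# Hypothetical measurements from three groups with clearly different levels
values = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0, 5.1, 4.8, 5.0]
labels = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
p = permutation_p_value(values, labels)
print(f"permutation p-value: {p:.4f}")
```

Shuffling the labels breaks any link between group and response, so the shuffled statistics approximate the null distribution without distributional assumptions.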

We show how Bayesian clustering with variable selection can be done using our package bclust.
The bclust package implements a new Bayesian framework for variable selection in high-dimensional clustering.

We show how a hierarchical tree can be extracted from random samples of groupings. We also explain the ideas using one of our R packages published on CRAN.

We show that metabolite fingerprinting
is an effective method for classifying forward genetic mutants of Arabidopsis thaliana.