Small binary-classification dataset for the Binomial section of the vignette.
Source:R/data.R
BinomialExample.RdA synthetic logistic-regression dataset used to illustrate pic() with
the Binomial family. It contains \(n = 300\) observations of
\(p = 50\) predictors, of which \(s = 5\) carry signal; the
remaining 45 are noise.
Format
A list with two components:
XNumeric matrix of dimension \(300 \times 50\) with column names
gene_*andnoise_*.yInteger vector of length \(300\) containing \(0/1\) class labels.
Details
Column names follow the same convention as
QuickStartExample: active variables are labeled
gene_1, ..., gene_5 and inactive ones noise_1, ...,
noise_45, interleaved in random order.
The binary response is generated as $$Y_i \sim \mathrm{Bernoulli}\!\left(\frac{1}{1 + e^{-X_i\beta}}\right),$$ with non-zero coefficients drawn uniformly in \([1.5,\, 3]\) with random sign. The class balance is roughly \(45\%\) of positives.