Three penalties are supported, identified by lowercase name to match
the C++ registry. Each penalty enters the pic() objective as
$$\mathrm{pen}(\beta) = \sum_{j=1}^p p_\lambda(|\beta_j|),$$
where \(p_\lambda(\cdot)\) depends on the penalty.
Details
"lasso"L1 (soft-thresholding) penalty: $$p_\lambda(|t|) = \lambda |t|.$$ Convex, gives the strongest shrinkage on large coefficients bias does not vanish as \(|t| \to \infty\).
"scad"(Smoothly Clipped Absolute Deviation, Fan & Li 2001)Non-convex penalty with concavity parameter
scad_a > 2(default 3.7): $$p_\lambda'(|t|) = \lambda\!\left\{ \mathbf{1}\{|t| \le \lambda\} + \frac{(a\lambda - |t|)_+}{(a - 1)\lambda} \mathbf{1}\{|t| > \lambda\}\right\}.$$ Behaves like the lasso for small \(|t|\), then tapers off so large coefficients are barely penalized - yields nearly unbiased estimates on strong signals."mcp"(Minimax Concave Penalty, Zhang 2010)Non-convex penalty with concavity parameter
mcp_gamma > 1(default 3.0): $$p_\lambda'(|t|) = \left(\lambda - \frac{|t|}{\gamma}\right)_+.$$ Similar motivation as SCAD but a smoother transition: starts at the lasso derivative for small \(|t|\) and tapers linearly to zero at \(|t| = \gamma\lambda\).
The actual evaluation and proximal operators live in C++
(src/penalty_*.cpp). Larger scad_a / mcp_gamma make the
penalty closer to the lasso; smaller values amplify the
non-convexity (and the bias reduction on strong signals).