
Create a summary table of all the feature data, optionally grouped by a metadata variable. Default summary statistics include:

  • min

  • median

  • mean

  • sd (standard deviation)

  • mad (median absolute deviation)

  • max

Usage

create_summ_tbl(
  data,
  group_var,
  .funs = c("min", "median", "mean", "sd", "mad", "max")
)

Arguments

data

A data.frame or tibble object containing the data to summarize.

group_var

character(1). An unquoted (or quoted) string naming the column of data that defines the groups over which the statistics are computed, e.g. Group. If missing, ungrouped statistics are returned.

.funs

character(n). String(s) of the functions used to summarize the data. Each function must take a vector of data as input and return a summary scalar, e.g. mean().
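
The contract described for .funs (a vector in, a single summary value out) can be checked in base R before a function name is passed along. The following is a minimal sketch, not the package's internals: `cv` is a hypothetical user-defined statistic, and the lookup via `get()` is only an assumption about how a name might be resolved to a function.

```r
# Hypothetical custom summary: coefficient of variation (not part of the package)
cv <- function(x) stats::sd(x) / mean(x)

funs <- c("mean", "sd", "cv")
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Resolve each name to a function and apply it to the vector.
# vapply() with numeric(1) errors if any function returns anything
# other than a single numeric value.
res <- vapply(funs, function(fn) get(fn, mode = "function")(x), numeric(1))
res
```

A function such as range(), which returns two values, would fail this check and so does not meet the stated requirement.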

Value

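A tibble object with one row per feature and one column per summary statistic listed in .funs. When group_var is supplied, each statistic has one column per group level, named statistic_level, e.g. min_F and min_M.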
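
To illustrate the grouped layout seen in the examples (one row per feature, columns named by statistic and group level), here is a base-R sketch on toy data. It only mimics the shape of the output; the data frame `toy`, its column names, and the approach itself are assumptions for illustration and are not taken from the package source.

```r
# Toy data: two numeric features and a grouping column (all names hypothetical)
toy <- data.frame(
  feat_a = c(1, 2, 3, 4),
  feat_b = c(10, 20, 30, 40),
  grp    = c("F", "F", "M", "M")
)
funs  <- c("min", "max")
feats <- c("feat_a", "feat_b")

# Build one row per feature with one column per statistic/level pair
rows <- lapply(feats, function(ft) {
  parts <- split(toy[[ft]], toy$grp)          # values of this feature by group
  vals <- unlist(lapply(funs, function(fn) {
    out <- vapply(parts, get(fn, mode = "function"), numeric(1))
    names(out) <- paste(fn, names(parts), sep = "_")
    out
  }))
  data.frame(Feature = ft, as.list(vals))
})
summ <- do.call(rbind, rows)
summ
```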

Author

Stu Field

Examples

create_summ_tbl(simdata)
#> # A tibble: 40 × 7
#>    Feature       min median  mean    sd   mad   max
#>    <chr>       <dbl>  <dbl> <dbl> <dbl> <dbl> <dbl>
#>  1 seq.2802.68 1393.  2805. 2798.  515.  495. 4325.
#>  2 seq.9251.29 1598.  2801. 2807.  546.  624. 3986.
#>  3 seq.1942.70 1543.  2668. 2710.  556.  603. 3882.
#>  4 seq.5751.80 1496   2727. 2739.  561.  576. 3948.
#>  5 seq.9608.12 1056.  2708. 2753.  642.  553. 4905.
#>  6 seq.3459.49 1232.  2496. 2500.  504.  528. 3800.
#>  7 seq.3865.56  865.  2487. 2500   501.  526. 3631.
#>  8 seq.3363.21 1179.  2489. 2500.  493.  477. 3999.
#>  9 seq.4487.88 1304.  2501. 2500   509.  465. 3752.
#> 10 seq.5994.84 1297   2508. 2500.  509.  498. 4231.
#> # ℹ 30 more rows
create_summ_tbl(simdata, gender)
#> # A tibble: 40 × 13
#>    Feature   min_F min_M median_F median_M mean_F mean_M  sd_F  sd_M mad_F
#>    <chr>     <dbl> <dbl>    <dbl>    <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>
#>  1 seq.2802… 1393. 1765.    2710.    2891.  2730.  2869.  568.  449.  433.
#>  2 seq.9251… 1738. 1598.    2758     2824.  2736.  2881.  535.  553.  598.
#>  3 seq.1942… 1543. 1668.    2594.    2714.  2658.  2765.  580.  529.  576.
#>  4 seq.5751… 1766. 1496     2651.    2783.  2697.  2782.  546.  580.  618.
#>  5 seq.9608… 1056. 1357.    2854.    2501.  2914.  2585.  671.  571.  598.
#>  6 seq.3459… 1461. 1232.    2434.    2610   2446.  2556.  529.  475.  486.
#>  7 seq.3865… 1607.  865.    2497.    2486   2545.  2453.  506.  498.  486.
#>  8 seq.3363… 1179. 1536.    2493.    2484.  2511.  2489.  543.  441.  490.
#>  9 seq.4487… 1304. 1455.    2514.    2487.  2501.  2499.  549.  469.  479.
#> 10 seq.5994… 1527. 1297     2524.    2468.  2509.  2490.  549.  470.  514.
#> # ℹ 30 more rows
#> # ℹ 3 more variables: mad_M <dbl>, max_F <dbl>, max_M <dbl>

# Three arbitrary groups, assigned at random
simdata$group <- sample(1:3, nrow(simdata), replace = TRUE)
create_summ_tbl(simdata, "group")
#> # A tibble: 40 × 19
#>    Feature     min_1 min_2 min_3 median_1 median_2 median_3 mean_1 mean_2
#>    <chr>       <dbl> <dbl> <dbl>    <dbl>    <dbl>    <dbl>  <dbl>  <dbl>
#>  1 seq.2802.68 1393. 1798. 1505.    2665.    2941.    2805.  2680.  2952.
#>  2 seq.9251.29 1738. 1598. 1862.    2828.    2758     2805.  2833.  2824.
#>  3 seq.1942.70 1543. 1680. 1668.    2715.    2594.    2785   2716.  2682.
#>  4 seq.5751.80 1756. 1788. 1496     2908.    2698.    2645.  2836.  2656.
#>  5 seq.9608.12 1056. 1633. 1500.    2802.    2741.    2451.  2816.  2773.
#>  6 seq.3459.49 1232. 1533. 1325.    2461.    2592.    2513.  2451.  2529.
#>  7 seq.3865.56 1344.  865. 2081.    2513.    2419.    2774.  2509.  2386.
#>  8 seq.3363.21 1179. 1627. 1370.    2458.    2574.    2628.  2486.  2545.
#>  9 seq.4487.88 1304. 1448  1702.    2459.    2487.    2572.  2449.  2490.
#> 10 seq.5994.84 1297  1566. 1843.    2442.    2545.    2555.  2420.  2561.
#> # ℹ 30 more rows
#> # ℹ 10 more variables: mean_3 <dbl>, sd_1 <dbl>, sd_2 <dbl>, sd_3 <dbl>,
#> #   mad_1 <dbl>, mad_2 <dbl>, mad_3 <dbl>, max_1 <dbl>, max_2 <dbl>,
#> #   max_3 <dbl>