AVR-LibC  2.3.0
Standard C library for AVR-GCC
 

Benchmarks

The results below give only a rough estimate of the resources needed to use certain library functions. A number of factors can increase or reduce the effort required:

  • The cost of preparing the operands, including the stack space they occupy, is not considered.
  • The sizes in the table include all dependent functions.
  • The execution time of some functions depends strongly on the arguments of the call. For example, qsort() is recursive, and sprintf() receives its arguments on the stack.
  • Different compiler versions can produce significantly different code sizes and execution times. For example, the dtostre() function required 930 bytes when compiled with avr-gcc 3.4.6. After switching to avr-gcc 4.2.3, the size became 1088 bytes.

A Few libc Functions

avr-gcc version is 15.2.1

The size given for a function includes all functions that it pulls in. By default, AVR-LibC is compiled with -mcall-prologues. The size in parentheses excludes the code for the prologue and epilogue routines. Both sizes coincide when no prologue and epilogue routines are present; in that case only one code size is shown.

  • qsort sorts an array of char with 100 elements.
  • The size of double used here is 4 bytes, which matters for the promotion of float arguments passed to varargs functions like sprintf.
  • For an overview of the different AVR architectures, see avr-gcc: Command Line Options.

Function                             Units        avr2         avr25        avr4         avr6
atoi ("12345")                       Flash bytes  82           78           74           76
                                     Stack bytes  4            4            4            6
                                     Cycles       155          149          149          167
atol ("12345")                       Flash bytes  126          118          106          110
                                     Stack bytes  5            5            4            6
                                     Cycles       221          209          205          223
ftostre (1.2345f, s, 6, 0)           Flash bytes  1128 (1016)  1058 (948)   1058 (948)   1078 (968)
                                     Stack bytes  20           20           20           22
                                     Cycles       1339         1178         1178         1191
ftostrf (1.2345f, 15, 6, s)          Flash bytes  1648 (1536)  1524 (1414)  1524 (1414)  1548 (1438)
                                     Stack bytes  40           40           40           43
                                     Cycles       1645         1469         1469         1486
ktoa (123.45k, s, 2)                 Flash bytes  316          306          298          306
                                     Stack bytes  8            8            8            10
                                     Cycles       479          466          466          488
itoa (12345, s, 10)                  Flash bytes  110          102          102          106
                                     Stack bytes  2            2            2            3
                                     Cycles       879          875          875          880
ltoa (12345678L, s, 10)              Flash bytes  138          130          130          136
                                     Stack bytes  2            2            2            3
                                     Cycles       2766         2762         2762         2767
lltoa (12345678LL, s, 10)            Flash bytes  206          194          194          202
                                     Stack bytes  3            3            3            4
                                     Cycles       2239         2209         2209         2214
ulltoa_base10 (12345678ULL, s)       Flash bytes  182          172          168          168
                                     Stack bytes  11           11           11           12
                                     Cycles       1908         1852         1458         1461
malloc (1)                           Flash bytes  668          606          606          610
                                     Stack bytes  6            6            6            7
                                     Cycles       101          97           97           100
realloc ((void*) 0, 1)               Flash bytes  1242 (1130)  1114 (1004)  1114 (1004)  1096
                                     Stack bytes  6            6            6            7
                                     Cycles       101          97           97           100
qsort (s, sizeof(s), 1, cmp)         Flash bytes  1130 (1018)  952 (842)    938 (828)    984 (874)
                                     Stack bytes  146          146          146          153
                                     Cycles       54683        50635        45705        47930
rand ()                              Flash bytes  132          120          120          124
                                     Stack bytes  2            2            2            3
                                     Cycles       95           90           90           93
random ()                            Flash bytes  678 (566)    634 (524)    630 (520)    584
                                     Stack bytes  16           16           18           22
                                     Cycles       1410         1394         805          820
sprintf_min (s, "%d", 12345)         Flash bytes  1218 (1106)  1086 (976)   1082 (972)   1102
                                     Stack bytes  59           59           59           62
                                     Cycles       2077         1940         1935         1887
sprintf (s, "%d", 12345)             Flash bytes  1636 (1524)  1494 (1384)  1478 (1368)  1514
                                     Stack bytes  59           59           61           64
                                     Cycles       1885         1795         1796         1734
sprintf_flt (s, "%e", 1.2345)        Flash bytes  3250 (3138)  3000 (2890)  2976 (2866)  3088 (2978)
                                     Stack bytes  68           68           69           72
                                     Cycles       3089         2866         2873         2695
sscanf_min ("12345", "%d", &i)       Flash bytes  2804 (2692)  2502 (2392)  2498 (2388)  2728
                                     Stack bytes  57           57           57           61
                                     Cycles       1380         1294         1294         1311
sscanf ("12345", "%d", &i)           Flash bytes  1798 (1686)  1614 (1504)  1614 (1504)  1716
                                     Stack bytes  57           57           57           61
                                     Cycles       1380         1294         1294         1311
sscanf ("point,color", "%[a-z]", s)  Flash bytes  1798 (1686)  1614 (1504)  1614 (1504)  1716
                                     Stack bytes  90           90           90           94
                                     Cycles       2748         2601         2601         2502
sscanf_flt ("1.2345", "%e", &x)      Flash bytes  4866 (4754)  4438 (4328)  4414 (4304)  4714 (4604)
                                     Stack bytes  40           40           40           43
                                     Cycles       410          364          364          341
strtof ("1.2345", &end)              Flash bytes  1648 (1536)  1490 (1380)  1462 (1352)  1698 (1588)
                                     Stack bytes  24           24           24           37
                                     Cycles       1289         1172         969          1282
strtol ("12345", &end, 0)            Flash bytes  390          368          350          364
                                     Stack bytes  14           14           12           15
                                     Cycles       606          583          351          373
strtoll ("12345", &end, 0)           Flash bytes  578 (466)    540 (430)    516 (406)    540 (430)
                                     Stack bytes  20           20           18           21
                                     Cycles       832          785          488          513
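
To make the qsort entry more concrete, the following is a minimal sketch of a call of that shape: a 100-element char array sorted with element size 1. The fill pattern, the helper name sort_example and the body of cmp are assumptions made for illustration; the source does not show the actual benchmark code.

```c
#include <stdlib.h>

#define N_ELEMS 100

/* Comparison callback for qsort: orders two char elements ascending. */
static int cmp(const void *a, const void *b)
{
    char x = *(const char *) a;
    char y = *(const char *) b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: fill s with a permutation of 0..99 and sort it.
   37 and 100 are coprime, so (i * 37) % 100 visits every value once. */
static void sort_example(char s[N_ELEMS])
{
    for (int i = 0; i < N_ELEMS; i++)
        s[i] = (char) ((i * 37) % N_ELEMS);

    /* Same shape as the benchmarked call: 100 elements of size 1. */
    qsort(s, N_ELEMS, 1, cmp);
}
```

The stack and cycle figures for qsort depend on the input order and on the comparison function, so a different fill pattern would be expected to give different numbers.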

Math Functions from libm

The following tables contain benchmark values for some floating-point functions over the indicated range(s) of input values.

Notice that the values for the relative error and for the worst-case execution time Cyclesmax are only lower bounds. The best achievable accuracy for IEEE single with its 23 fractional bits in the mantissa is log10(2^−24) ≈ log10(6·10^−8) ≈ −7.22.

The comparatively large relative errors of sinf, cosf and tanf occur for arguments that are close to the poles (if any) or close to the non-zero zeros of the function.

libm benchmarks for ATmega128 (avr51, with MUL)
Function Size x0 x1 Cyclesavr Cyclesmax log10(Errmax)
acosf 1102 -1 1 1957 2464 -6.66
asinf 1092 -1 1 1896 2454 -6.47
atanf 1058 -10 10 2879 3073 -6.92
cbrtf 514 -1e+06 1e+06 2573 2665 -6.92
ceilf 258 -1e+05 1e+05 108 177 -7.22
cosf 904 -1.57 1.58 1775 2126 -2.86
coshf 1366 -20 20 3053 3439 -5.78
expf 1320 -20 20 2588 3247 -5.78
floorf 258 -1e+05 1e+05 108 180 -7.22
frexpf 154 -1e+05 1e+05 40 40 -7.22
logf 1076 0 100 2392 2866 -6.69
log10f 1076 0 100 2397 2866 -6.62
log2f 1052 0 100 2252 2723 -6.83
modff 484 -1e+05 1e+05 365 456 -7.22
roundf 236 -1e+05 1e+05 111 156 -7.22
sinf 910 0 3.15 1744 2146 -3.67
sincosf 976 -1.57 3.15 3482 3883 -3.43
sinhf 1466 -20 20 3043 3461 -5.78
sqrtf 256 0 1e+06 474 510 -7.22
tanf 1178 0 3.15 2178 2946 -2.86
tanhf 1494 -20 20 3148 3620 -6.37
truncf 234 -1000 1000 140 178 -7.22

 

libm benchmarks for ATmega128 (avr51, with MUL)
Function Size x0 x1 y0 y1 Cyclesavr Cyclesmax log10(Errmax)
+ 380 -1e+10 1e+10 -1e+10 1e+10 102 256 -7.22
* 380 -1e+10 1e+10 -1e+10 1e+10 129 139 -7.22
/ 390 -1e+10 1e+10 -1e+10 1e+10 469 501 -7.22
atan2f 1206 -10 10 -10 10 2882 3455 -6.82
fdimf 446 -1e+10 1e+10 -1e+10 1e+10 75 218 -7.22
fmaxf 62 -1e+10 1e+10 -1e+10 1e+10 30 34 -∞
fminf 62 -1e+10 1e+10 -1e+10 1e+10 30 34 -∞
fmodf 312 -1e+10 1e+10 -1e+10 1e+10 88 324 -7.22
hypotf 1092 -1e+10 1e+10 -1e+10 1e+10 850 927 -6.92
ldexpf 238 -1e+10 1e+10 -10 10 40 40 -7.22
powf 1858 0 1e+04 -10 10 5182 5833 -4.65
__builtin_powif 732 0 1e+04 -10 10 648 1223 -6.39

For devices without a MUL instruction, the following applies:

  • The execution times for multiplication and for the transcendental functions are roughly twice the time for devices that have MUL.
  • The execution times for the remaining functions are roughly the same.
  • The maximal relative errors are the same, i.e. independent of MUL.
libm benchmarks for AT90S8515 (avr2, no MUL)
Function Size x0 x1 Cyclesavr Cyclesmax log10(Errmax)
acosf 1104 -1 1 3513 3888 -6.66
asinf 1096 -1 1 3452 3879 -6.47
atanf 1054 -10 10 5280 5541 -6.92
cbrtf 538 -1e+06 1e+06 2702 2795 -6.92
ceilf 250 -1e+05 1e+05 105 174 -7.22
cosf 906 -1.57 1.58 3441 3798 -2.86
coshf 1348 -20 20 4966 5346 -5.78
expf 1312 -20 20 4512 5140 -5.78
floorf 250 -1e+05 1e+05 105 177 -7.22
frexpf 150 -1e+05 1e+05 39 39 -7.22
logf 1076 0 100 4562 5023 -6.69
log10f 1076 0 100 4568 5035 -6.62
log2f 1060 0 100 4205 4632 -6.83
modff 490 -1e+05 1e+05 365 456 -7.22
roundf 230 -1e+05 1e+05 109 154 -7.22
sinf 912 0 3.15 3408 3818 -3.67
sincosf 978 -1.57 3.15 6813 7225 -3.43
sinhf 1434 -20 20 4941 5358 -5.78
sqrtf 252 0 1e+06 474 510 -7.22
tanf 1164 0 3.15 4080 4785 -2.86
tanhf 1462 -20 20 5055 5544 -6.37
truncf 226 -1000 1000 137 175 -7.22

 

libm benchmarks for AT90S8515 (avr2, no MUL)
Function Size x0 x1 y0 y1 Cyclesavr Cyclesmax log10(Errmax)
+ 376 -1e+10 1e+10 -1e+10 1e+10 102 253 -7.22
* 378 -1e+10 1e+10 -1e+10 1e+10 346 386 -7.22
/ 374 -1e+10 1e+10 -1e+10 1e+10 467 499 -7.22
atan2f 1192 -10 10 -10 10 5287 5844 -6.82
fdimf 436 -1e+10 1e+10 -1e+10 1e+10 74 219 -7.22
fmaxf 66 -1e+10 1e+10 -1e+10 1e+10 31 36 -∞
fminf 66 -1e+10 1e+10 -1e+10 1e+10 30 36 -∞
fmodf 302 -1e+10 1e+10 -1e+10 1e+10 86 322 -7.22
hypotf 1068 -1e+10 1e+10 -1e+10 1e+10 1297 1387 -6.92
ldexpf 232 -1e+10 1e+10 -10 10 39 39 -7.22
powf 1820 0 1e+04 -10 10 9490 10180 -4.65
__builtin_powif 734 0 1e+04 -10 10 1143 2140 -6.39
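
The statement about devices without MUL can be checked against the average cycle counts in the tables above. The following sketch computes the slowdown for a few rows; the struct and function names are hypothetical, and the numbers are copied from the Cyclesavr columns.

```c
/* Average cycle counts copied from the libm tables above. */
struct cycles_row {
    const char *name;
    unsigned with_mul;      /* ATmega128 (avr51, with MUL) */
    unsigned without_mul;   /* AT90S8515 (avr2, no MUL)    */
};

static const struct cycles_row rows[] = {
    { "*",     129,  346  },
    { "sinf",  1744, 3408 },
    { "expf",  2588, 4512 },
    { "powf",  5182, 9490 },
    { "+",     102,  102  },
    { "sqrtf", 474,  474  },
};

/* Ratio of the no-MUL cycle count to the MUL cycle count. */
static double slowdown(const struct cycles_row *r)
{
    return (double) r->without_mul / (double) r->with_mul;
}
```

For these rows the ratio is about 2.7 for multiplication, between about 1.7 and 2.0 for sinf, expf and powf, and 1.0 for addition and sqrtf.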

Math Functions for IEEE double from LibF7

The following tables contain benchmark values for some IEEE double floating-point functions over the indicated range(s) of input values. LibF7 is an IEEE double implementation hosted by libgcc since GCC v10.

The code sizes include all dependencies with the exception of potential prologue and epilogue routines (__prologue_saves__, __epilogue_restores__).

The sizes of individual functions cannot simply be added up, because each listed size includes dependencies that several functions may share. For example, sinl, cosl, asinl, acosl and sqrtl together occupy only 4744 bytes of code including the prologue and epilogue routines. With -mrelax the code size reduces further to around 4400 bytes.

Notice that the values for the relative error and for the worst-case execution time Cyclesmax are only lower bounds. The best achievable accuracy for IEEE double with its 52 fractional bits in the mantissa is log10(2^−53) ≈ log10(1.1·10^−16) ≈ −15.95.

LibF7 Benchmarks for ATmega128 (avr51, with MUL)
Function Size x0 x1 Cyclesavr Cyclesmax log10(Errmax)
acosl 3342 -1 1 16235 17997 -15.65
asinl 3342 -1 1 16124 18008 -15.65
atanl 2820 -10 10 20363 21255 -15.65
cbrtl 4102 -1e+06 1e+06 32368 33415 -15.33
ceill 1772 -1e+05 1e+05 1430 1768 -15.95
cosl 3900 -1.57 1.58 11241 13649 -15.66
coshl 3654 -20 20 20036 21466 -15.65
expl 3562 -20 20 16965 18417 -15.65
floorl 1710 -1e+05 1e+05 1354 1690 -15.95
frexpl 1256 -1e+05 1e+05 748 765 -15.95
logl 2944 0 100 15099 15825 -15.65
log10l 2954 0 100 15746 16437 -15.59
log2l 2954 0 100 15751 16418 -15.65
roundl 1796 -1e+05 1e+05 1664 1760 -15.95
sinl 3898 0 3.15 11976 14028 -15.65
sincosl 3880 -1.57 3.15 19913 22412 -15.65
sinhl 3862 -20 20 19842 21536 -15.51
sqrtl 1410 0 1e+06 3009 3087 -15.65
tanl 3946 0 3.15 22448 24870 -15.65
tanhl 3678 -20 20 20646 22347 -14.70
truncl 1710 -1000 1000 1118 1173 -15.95

 

LibF7 Benchmarks for ATmega128 (avr51, with MUL)
Function Size x0 x1 y0 y1 Cyclesavr Cyclesmax log10(Errmax)
+ 1470 -1e+10 1e+10 -1e+10 1e+10 1531 1667 -15.95
* 1600 -1e+10 1e+10 -1e+10 1e+10 1620 1661 -15.65
/ 1572 -1e+10 1e+10 -1e+10 1e+10 3563 3756 -15.65
atan2l 3208 -10 10 -10 10 21115 22010 -15.65
fdiml 1730 -1e+10 1e+10 -1e+10 1e+10 1371 1802 -15.95
fmaxl 234 -1e+10 1e+10 -1e+10 1e+10 169 188 -∞
fminl 234 -1e+10 1e+10 -1e+10 1e+10 170 189 -∞
fmodl 2808 -1e+10 1e+10 -1e+10 1e+10 5175 5947 -15.95
hypotl 2312 -1e+10 1e+10 -1e+10 1e+10 5006 5166 -15.66
ldexpl 1254 -1e+10 1e+10 -10 10 738 756 -15.95
powl 4080 0 1e+04 -10 10 32440 34066 -14.46
__builtin_powil 2224 0 1e+04 -10 10 3396 6173 -15.12
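
Cycle counts translate into wall-clock time once the CPU clock is known. A minimal sketch of the conversion follows; the function name and the 16 MHz clock used in the note below are assumptions, since the source does not name a clock frequency.

```c
/* Convert a cycle count to microseconds for a CPU clock given in Hz. */
static double cycles_to_us(unsigned long cycles, unsigned long f_cpu_hz)
{
    return (double) cycles * 1e6 / (double) f_cpu_hz;
}
```

At an assumed clock of 16 MHz, the average of 32440 cycles listed for powl corresponds to roughly 2 ms, and the 3009 cycles listed for sqrtl to about 188 µs.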

Fixed-Point Functions from <stdfix.h>

The following tables contain benchmark values for some fixed-point functions over the indicated range of input values.

  • V+ denotes the smallest value that is larger than V for the considered fixed-point type. Similarly, V- denotes the largest value that is smaller than V for the considered type.
  • The code sizes include all dependencies.
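
The V+ and V- notation can be illustrated with plain integers. The sketch below emulates a fixed-point format with 8 integral and 8 fractional bits, which is assumed here to be the layout of unsigned short _Accum (suffix uhk); that layout is not stated in this section. All names are hypothetical, and a 32-bit raw value is used so that 256 itself can be written down.

```c
#include <stdint.h>

#define FRAC_BITS 8   /* assumed: 8 fractional bits (8.8 format) */

/* Raw bit pattern of an integer value v in the emulated format. */
static uint32_t raw_from_int(uint32_t v) { return v << FRAC_BITS; }

/* Real value represented by a raw bit pattern. */
static double raw_to_double(uint32_t raw)
{
    return (double) raw / (double) (1u << FRAC_BITS);
}

/* V+: the smallest representable value that is larger than V. */
static uint32_t next_above(uint32_t raw) { return raw + 1; }

/* V-: the largest representable value that is smaller than V. */
static uint32_t next_below(uint32_t raw) { return raw - 1; }
```

Under this assumption, 0+ is 2^−8 = 0.00390625 and 256- is 256 − 2^−8 = 255.99609375, the largest value that fits into 16 bits.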

Notice that the values for absolute error Errmax, and the Worst Case Execution Times Cyclesmax are only lower bounds.

Fixed-Point Benchmarks for ATmega128 (avr51, with MUL)
Function Size x0 x1 Cyclesavr Cyclesmax Errmax
log2uhk 78 0+ 10 52 75 1.25e-02
log21puhr 32 0 1- 22 22 4.28e-03
sinuhk_deg 272 0 256- 53 55 6.44e-05
cosuhk_deg 318 0 256- 73 77 6.44e-05
sqrthk 92 0 256- 293 309 3.91e-03
sqrtuhk 70 0 256- 277 293 3.91e-03
sqrthr 42 0 1- 100 100 3.84e-03
sqrtuhr 38 0 1- 98 98 3.90e-03
acosk 404 -1 1 416 572 5.39e-05
acosuk 328 0 1 385 526 4.51e-05
asink 386 -1 1 414 589 4.95e-05
asinuk 328 0 1 381 539 4.43e-05
atank 368 -1 1 242 264 4.04e-05
atank 368 1 10 888 913 4.65e-05
atanuk 298 0 1 203 206 2.52e-05
atanur 152 0 1- 188 191 2.46e-05
exp2k 214 -10 10 255 317 1.09e-02
exp2uk 164 0 10 245 293 1.06e-02
exp2m1ur 112 0 1- 177 180 2.13e-05
log2uk 184 0+ 10 257 305 6.03e-05
log21pur 114 0 1- 212 215 2.87e-05
cospi2k 182 -4 4 249 258 4.52e-05
sinpi2k 182 -4 4 247 256 4.54e-05
sinpi2ur 120 0 1- 215 219 2.80e-05
sqrtk 128 0 65536- 572 635 1.53e-05
sqrtuk 94 0 65536- 550 613 1.53e-05
sqrtr 88 0 1- 290 313 1.53e-05
sqrtur 66 0 1- 274 297 1.53e-05

 

Fixed-Point Benchmarks for ATtiny88 (avr25, no MUL)
Function Size x0 x1 Cyclesavr Cyclesmax Errmax
log2uhk 114 0+ 10 339 380 1.25e-02
log21puhr 70 0 1- 312 332 4.28e-03
sqrthk 88 0 256- 291 307 3.91e-03
sqrtuhk 68 0 256- 276 292 3.91e-03
sqrthr 54 0 1- 102 105 3.84e-03
sqrtuhr 50 0 1- 100 104 3.90e-03
acosk 446 -1 1 1097 1233 5.39e-05
acosuk 378 0 1 1070 1197 4.51e-05
asink 428 -1 1 1095 1249 4.95e-05
asinuk 378 0 1 1066 1210 4.43e-05
atank 412 -1 1 877 973 4.04e-05
atank 412 1 10 1538 1630 4.65e-05
atanuk 352 0 1 844 924 2.52e-05
atanur 212 0 1- 830 910 2.46e-05
exp2k 234 -10 10 993 1074 1.09e-02
exp2uk 184 0 10 982 1050 1.06e-02
exp2m1ur 134 0 1- 916 939 2.13e-05
log2uk 214 0+ 10 1222 1294 6.03e-05
log21pur 146 0 1- 1187 1228 2.87e-05
cospi2k 204 -4 4 1235 1285 4.52e-05
sinpi2k 204 -4 4 1233 1283 4.54e-05
sinpi2ur 146 0 1- 1203 1248 2.80e-05
sqrtk 124 0 65536- 570 633 1.53e-05
sqrtuk 92 0 65536- 549 612 1.53e-05
sqrtr 84 0 1- 288 311 1.53e-05
sqrtur 64 0 1- 273 296 1.53e-05