What Is the Longest Glibc Function Name?

nm, awk, sort

I was working on exposing the posix_spawn*-related C functions in the Rust nix crate for use with a job control shell when I had a sudden thought: “What is the longest glibc function name?”

I do! But regardless, the idea has arrived and It Must Be Done. At first, I wondered if I could retrieve a list of symbols from the excellent Elixir Cross Referencer project, but a cursory browse through the site and source code revealed nothing trivial. Then I remembered the nm utility, and I have a copy of glibc handy on my machine:

$ fd "libc.so" /usr/lib
/usr/lib/libc.so
/usr/lib/libc.so.6

Let’s check it out!

$ nm "/usr/lib/libc.so.6"
nm: /usr/lib/libc.so.6: no symbols

Apparently, according to this Stack Overflow post:

By default, nm reads the .symtab section in ELF objects, which is optional in non-relocatable objects. With the -D/--dynamic option, you can instruct nm to read the dynamic symbol table (which are the symbols actually used at run time). You may also want to use --with-symbol-versions because glibc uses symbol versioning extensively.

Let’s check it out! Since I want to eventually measure function names by length, I’ll use --without-symbol-versions instead.

$ nm -D --without-symbol-versions "/usr/lib/libc.so.6"
000000000003a020 T a64l
0000000000022466 T abort
...
0000000000108c00 T __xstat
0000000000108c00 T __xstat64

$ nm -D --without-symbol-versions "/usr/lib/libc.so.6" | wc -l
3038

Holy moly! Let’s clean this up a bit. Scrolling through the output, I see certain symbols I would rather not consider, like internal functions (e.g. __fork) and glibc-related constants (e.g. GLIBC_2.10).

$ nm -SNIP- | awk '$NF ~ /^[^_]/ && $(NF - 1) ~ /^[^A]/'

000000000003a020 T a64l
0000000000022466 T abort
...
00000000001453d0 T xprt_register
0000000000145510 T xprt_unregister

Here, I am retrieving lines where the last column does not start with an underscore and where the second-to-last column does not start with the letter A.
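For readers less fluent in awk, here is a rough Python sketch of the same filter. The sample lines are copied from the nm output in this post, except for the GLIBC_2.10 line, whose address is made up for illustration; public_names is a hypothetical helper name, not part of any library.

```python
# A few lines in the shape of `nm -D --without-symbol-versions` output.
# The address on the GLIBC_2.10 line is invented for this example.
SAMPLE = """\
000000000003a020 T a64l
0000000000022466 T abort
0000000000000000 A GLIBC_2.10
0000000000108c00 T __xstat
00000000001453d0 T xprt_register
"""


def public_names(nm_output):
    """Keep names not starting with '_' whose symbol type does not start with 'A'."""
    names = []
    for line in nm_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, sym_type = fields[-1], fields[-2]
        if not name.startswith("_") and not sym_type.startswith("A"):
            names.append(name)
    return names


print(public_names(SAMPLE))
```

Like the awk version, this looks only at the last two whitespace-separated columns, so it does not care whether the address column is present.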

Since I’m already using awk, might as well use it to limit output to the last column as well as its length:

$ nm -SNIP- | awk '$NF ~ /^[^_]/ && $(NF - 1) ~ /^[^A]/ { print $NF, length($NF) }'

a64l 4
abort 5
...
xprt_register 13
xprt_unregister 15

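All that’s left is to sort by the second column, which we can conveniently do with… sort! Let’s also filter the output to remove duplicates, which show up in the output of nm, most likely because glibc exports several versions of some symbols and --without-symbol-versions strips the suffix that tells them apart.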

$ nm -SNIP- | awk -SNIP- | sort --key=2 --general-numeric-sort | uniq

abs 3
brk 3
...
posix_spawn_file_actions_addclosefrom_np 40
posix_spawn_file_actions_addtcsetpgrp_np 40
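The sort-and-deduplicate step can be sketched in Python too. This is only an approximation of the pipeline above: the input list is a handful of names from this post (with one deliberate duplicate), by_length is a hypothetical helper, and ties are broken alphabetically, which may not match sort's ordering in every locale.

```python
# A small sample of names from this post; "abs" is repeated on purpose.
names = [
    "abort",
    "abs",
    "posix_spawn_file_actions_addclosefrom_np",
    "brk",
    "abs",
    "xprt_register",
]


def by_length(names):
    """Drop duplicates, then sort by length with ties broken alphabetically."""
    return sorted(set(names), key=lambda n: (len(n), n))


for name in by_length(names):
    print(name, len(name))
```

Deduplicating through a set first means the order of the input does not matter, whereas uniq relies on sort having already placed identical lines next to each other.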

There we have it! The longest function name in glibc is posix_spawn_file_actions_addtcsetpgrp_np. What functions do we have with length greater than or equal to 25?

argp_program_version_hook 25
posix_spawnattr_getpgroup 25
posix_spawnattr_setpgroup 25
pthread_attr_getguardsize 25
pthread_attr_getstackaddr 25
pthread_attr_getstacksize 25
pthread_attr_setguardsize 25
pthread_attr_setstackaddr 25
pthread_attr_setstacksize 25
pthread_condattr_getclock 25
pthread_condattr_setclock 25
pthread_mutexattr_destroy 25
pthread_mutexattr_gettype 25
pthread_mutexattr_settype 25
register_printf_specifier 25
posix_spawnattr_getsigmask 26
posix_spawnattr_setsigmask 26
pthread_attr_getschedparam 26
pthread_attr_getsigmask_np 26
pthread_attr_setschedparam 26
pthread_attr_setsigmask_np 26
pthread_getattr_default_np 26
pthread_rwlockattr_destroy 26
pthread_rwlock_clockrdlock 26
pthread_rwlock_clockwrlock 26
pthread_rwlock_timedrdlock 26
pthread_rwlock_timedwrlock 26
pthread_setattr_default_np 26
pthread_attr_getaffinity_np 27
pthread_attr_getdetachstate 27
pthread_attr_getschedpolicy 27
pthread_attr_setaffinity_np 27
pthread_attr_setdetachstate 27
pthread_attr_setschedpolicy 27
pthread_barrierattr_destroy 27
pthread_condattr_getpshared 27
pthread_condattr_setpshared 27
pthread_mutexattr_getrobust 27
pthread_mutexattr_setrobust 27
pthread_mutex_consistent_np 27
obstack_alloc_failed_handler 28
pthread_attr_getinheritsched 28
pthread_attr_setinheritsched 28
pthread_mutexattr_getkind_np 28
pthread_mutexattr_getpshared 28
pthread_mutexattr_setkind_np 28
pthread_mutexattr_setpshared 28
pthread_mutex_getprioceiling 28
pthread_mutex_setprioceiling 28
posix_spawnattr_getschedparam 29
posix_spawnattr_getsigdefault 29
posix_spawnattr_setschedparam 29
posix_spawnattr_setsigdefault 29
posix_spawn_file_actions_init 29
program_invocation_short_name 29
pthread_kill_other_threads_np 29
pthread_mutexattr_getprotocol 29
pthread_mutexattr_setprotocol 29
pthread_rwlockattr_getkind_np 29
pthread_rwlockattr_getpshared 29
pthread_rwlockattr_setkind_np 29
pthread_rwlockattr_setpshared 29
posix_spawnattr_getschedpolicy 30
posix_spawnattr_setschedpolicy 30
pthread_barrierattr_getpshared 30
pthread_barrierattr_setpshared 30
pthread_mutexattr_getrobust_np 30
pthread_mutexattr_setrobust_np 30
posix_spawn_file_actions_adddup2 32
posix_spawn_file_actions_addopen 32
posix_spawn_file_actions_destroy 32
pthread_mutexattr_getprioceiling 32
pthread_mutexattr_setprioceiling 32
posix_spawn_file_actions_addclose 33
posix_spawn_file_actions_addchdir_np 36
posix_spawn_file_actions_addfchdir_np 37
posix_spawn_file_actions_addclosefrom_np 40
posix_spawn_file_actions_addtcsetpgrp_np 40

Hmm, posix_spawn / pthread dominance.

Not really, unfortunately. For people looking to reproduce the results, I have version 2.37 of glibc. Interestingly enough, the glibc functions with the longest names are non-portable GNU extensions to posix_spawn. For people running older machines, the longest function name is likely to be posix_spawn_file_actions_addclose, which is part of the POSIX standard.

We can also look at the shortest functions:

tee 3
ftw 3
ffs 3
err 3
dup 3
div 3
brk 3
abs 3

Ping me if you manage to come up with a sentence using as many of these function names as you can.

data, data

Let’s look at some interesting (at least to me) data! Here’s a plot of function name length against count:

[Figure: plot of function name length against count]

Mean: 10.47
Median: 9
Mode: 7
Variance: 30.21
Standard Deviation: 5.50
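Summary statistics like these can be computed with Python's statistics module. The sketch below runs on six names from this post rather than the full list, so its numbers will not match the ones above, and it assumes population variance; the post does not say whether the population or sample formula was used.

```python
import statistics

# Six names taken from this post, not the full glibc list.
sample_names = ["abs", "brk", "a64l", "abort", "xprt_register", "xprt_unregister"]
lengths = [len(name) for name in sample_names]

summary = {
    "mean": statistics.mean(lengths),
    "median": statistics.median(lengths),
    "mode": statistics.mode(lengths),
    # Assumption: population variance/stdev; swap in variance()/stdev()
    # for the sample versions.
    "variance": statistics.pvariance(lengths),
    "stdev": statistics.pstdev(lengths),
}

for key, value in summary.items():
    print(f"{key}: {value:.2f}")
```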

Speaking of Python, let’s do further analysis on the list of function names!

>>> import collections
>>> unique_letters = collections.defaultdict(list)
>>> for word in words:
...     letters = "".join(sorted(set(word)))
...     pair = (word, letters)
...     unique_letters[len(letters)].append(pair)
...
>>> max_unique_letters = unique_letters[max(unique_letters)]
>>> len(max_unique_letters[0][1])
18
>>> print("\n".join(str(t) for t in max_unique_letters))
('pthread_mutexattr_setprioceiling', '_acdeghilmnoprstux')
('posix_spawnattr_getschedpolicy', '_acdeghilnoprstwxy')
('pthread_mutex_setprioceiling', '_acdeghilmnoprstux')

With 18 unique characters (17 letters plus the underscore), these three functions share the prize for Function Name With The Most Unique Letters. Which does make sense! pthread, mutex, ceiling, policy all cover many different letters.
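One detail worth spelling out: the set of characters in each name includes the underscore. A small sketch that separates the two counts, where distinct_counts is a hypothetical helper name:

```python
import string


def distinct_counts(name):
    """Return (distinct characters, distinct lowercase ASCII letters) in name."""
    chars = set(name)
    letters = chars & set(string.ascii_lowercase)
    return len(chars), len(letters)


winners = (
    "pthread_mutexattr_setprioceiling",
    "posix_spawnattr_getschedpolicy",
    "pthread_mutex_setprioceiling",
)
for name in winners:
    print(name, distinct_counts(name))
```

Since every multi-word name here contains an underscore, leaving it out would most likely shift the counts down by one without changing the winners.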

Wonder no more!

>>> import textwrap
>>> with open("/usr/share/dict/words") as f:
...     english_words = f.read().splitlines()
...
>>> valid_words = sorted(set(words).intersection(set(english_words)))
>>> print(textwrap.fill(" ".join(valid_words)))
abort abs accept access acct advance alarm atoll bind clock clone
close connect daemon daylight div err error exit finite flock fork
free gets glob index ioctl kill labs link listen login logout mount
nice open pause personality pipe poll puts raise rand random read
reboot remove rename revoke rewind select send shutdown signal sleep
socket splice stat step swab sync system tee time times timezone
truncate wait warn write
>>> len(valid_words)
70

That’s a lot of English words!

I… don’t think so? Going down the rabbit hole: I got the list of words from the Arch Linux words package, which links the Aspell wordlist as upstream, which led me to Spell Checker Oriented Word Lists (SCOWL).

SCOWL (Spell Checker Oriented Word Lists) and Friends is a database of information on English words useful for creating high-quality word lists suitable for use in spell checkers of most dialects of English.

They also have a simple web frontend to look up words in the list. Searching for ioctl:

[Figure: Looking up “ioctl” in the Aspell web page.]

It’s the nerd “hacker” list. Explains a lot!

Conclusion

Not quite sure what to make of my brief excursion into glibc symbol names. Go on your own trip, maybe?