SWI-Prolog string handling has evolved over time. The functions that
create atoms or strings using char*
or wchar_t*
are "old school"; similarly with functions that get the string as
char*
or wchar_t*
. The PL_get_unify_put_[nw]chars()
family is more friendly when it comes to different input, output,
encoding and exception handling.
Roughly, the modern API is PL_get_nchars(), PL_unify_chars() and PL_put_chars() on terms. There is only half of the API for atoms as PL_new_atom_mbchars() and PL-atom_mbchars(), which take an encoding, length and char*.
For return values, char*
is dangerous because it can
point to local or stack memory. For this reason, wherever possible, the
C++ API returns a std::string
, which contains a copy of the
the string. This can be slightly less efficient that returning a
char*
, but it avoids some subtle and pervasive bugs that
even address sanitizers can't detect.16If
we wish to minimize the overhead of passing strings, this can be done by
passing in a pointer to a string rather than returning a string value;
but this is more cumbersome and modern compilers can often optimize the
code to avoid copying the return value.
Some functions require allocating string space using PL_STRINGS_MARK().
The PlStringBuffers
class provides a RAII wrapper
that ensures the matching PL_STRINGS_RELEASE() is done. The PlAtom
or PlTerm
member functions that need the string buffer use PlStringBuffers
,
and then copy the resulting string to a std::string
value.