Fixing the transcoding error in SAS
Here is why I am still not a 10x developer despite using AI … but first, the answer to your problem, which is probably how you found this blog in the first place. This answer came NOT from any generative AI program but from reading the SAS documentation. Or, as we used to say on the SAS listserv back in the day, RTFM.
Here was the error:
ERROR: Some character data was lost during transcoding in the dataset IN.PRE_POST. Either the data contains characters that are not representable in the new encoding or truncation occurred during transcoding.
And here is the code that fixes it:
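/* Read everything in the library without transcoding */
Libname in "~/data_analysis_examples/" inencoding='any';
/* Write the working copy without transcoding as well */
Data pre_post (encoding='any') ;
set in.pre_post ;
run;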
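If you want to see what you are dealing with before reaching for 'any', a quick check along these lines should show the mismatch. This is a minimal sketch, not something from my original session: it assumes the same libref and dataset as above, and it assumes PROC CONTENTS can read the file header without tripping the same error.

```sas
/* Sketch only: libref IN and dataset PRE_POST are the ones used above. */
libname in "~/data_analysis_examples/";

/* Session encoding: what SAS tries to transcode incoming data into. */
proc options option=encoding;
run;
%put &=sysencoding;

/* Dataset encoding: look for the "Encoding" row in the attributes table. */
proc contents data=in.pre_post;
run;
```

If the Encoding value reported by PROC CONTENTS differs from the session encoding, that mismatch is the likely trigger for the transcoding error. Keep in mind that 'any' tells SAS to skip transcoding rather than convert, so characters outside the session encoding may still display oddly.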
Why, sadly, AI has not made me a 10x coder
I used Copilot to try to solve this problem initially. The above working code was generated NOT by Copilot but by me, reading the SAS documentation. Copilot did get the idea of using encoding = any, but:
1. It told me to include it after a / and without parentheses, which is wrong. I’m not showing that here because I don’t want another LLM picking up incorrect code.
2. After I responded with the correct statement (and also, it still didn’t fix the problem), it told me to put (inencoding = ‘any’) on the set statement. Also wrong syntax and wrong place to put it.
Because this is not my first rodeo, after the incorrect solution from Copilot, I searched ‘encoding = any’ and inencoding= ‘any’ in the SAS documentation and that provided me with the correct code ASAP. That, by the way, is my general recommendation for becoming a 1.15x developer with AI. Use it to come up with a solution in 30 seconds. Try that solution. If it doesn’t work, drop AI and search the documentation for the language, starting with the first AI recommendation. Some day, when I have time, I might write a post on why I think this is the best solution. The TL;DR is that since the LLM is finding a high-probability solution, the answer is likely to be something in the code it generated, just not an exact match. Like, in this example, encoding and inencoding were what I needed to use.
The distinction between inencoding= on the LIBNAME statement and encoding= on the DATA step looks minor syntactically, but the two options operate at different layers of how SAS reads and writes data.
What stands out to me is your workflow. Using AI to surface high probability keywords like encoding and inencoding is efficient. But then, shifting immediately to the SAS documentation to confirm exact syntax and scope is what turns a guess into a fix.