Using a SHFL K8 instead of the MUL K100 will accomplish the same thing, and use less CPU time doing it. It's not much time, but if you've ever been in a position where you needed to minimize the PLC scan, you'll take every few ms anywhere you can get them.
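For anyone wondering why a shift and a multiply land on the same result, here's a quick sketch in Python (not PLC code). It assumes the accumulator holds a BCD value, where each decimal digit takes 4 bits, so multiplying by decimal 100 moves everything up two digits, which is 8 bits. The helper names `to_bcd` and `from_bcd` are made up for illustration. If your values are plain binary instead, an 8-bit left shift is a multiply by 256 (hex 100), which the last line checks.

```python
def to_bcd(n):
    """Encode a non-negative decimal integer as packed BCD (4 bits per digit)."""
    result, shift = 0, 0
    while True:
        n, digit = divmod(n, 10)
        result |= digit << shift
        shift += 4
        if n == 0:
            return result


def from_bcd(b):
    """Decode a packed BCD value back to an ordinary integer."""
    result, place = 0, 1
    while b:
        result += (b & 0xF) * place
        place *= 10
        b >>= 4
    return result


value = 1234

# BCD case (assumed): shifting left 8 bits moves the value up two decimal digits.
shifted = to_bcd(value) << 8
multiplied = to_bcd(value * 100)
assert shifted == multiplied

# Plain binary case: shifting left 8 bits is a multiply by 256 (hex 100).
assert (value << 8) == value * 0x100
```

Either way the shift only moves bits, so it never has to run a multiply routine. Keep in mind that on real hardware the top bits fall off the end of the accumulator once the value gets wide enough, which this sketch doesn't model.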
I'm not sure why you didn't want to use shift left or shift right to pack the bits and/or bytes into words, but hey, to each his own.
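In case it helps anyone reading later, this is roughly what I mean by packing with shifts, sketched in Python rather than ladder or mnemonic code. The function names and the 16-bit word size are my assumptions for the example, not anything specific to your PLC.

```python
def pack_bytes(high, low):
    """Pack two bytes into one 16-bit word: high byte in bits 8-15, low byte in bits 0-7."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def unpack_bytes(word):
    """Split a 16-bit word back into (high byte, low byte) using a right shift."""
    return (word >> 8) & 0xFF, word & 0xFF


def pack_bits(bits):
    """Pack up to 16 booleans into a word; bits[0] becomes bit 0 (the LSB)."""
    word = 0
    for position, bit in enumerate(bits[:16]):
        if bit:
            word |= 1 << position
    return word


word = pack_bytes(0x12, 0x34)
assert word == 0x1234
assert unpack_bytes(word) == (0x12, 0x34)
assert pack_bits([True, False, True]) == 0b101
```

The masks (`& 0xFF`) just make sure stray upper bits in either input don't bleed into the other half of the word.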