Be careful: the source code you read is not necessarily the code that will be executed. Security researchers at the University of Cambridge have just revealed an attack called “Trojan Source”, which allows malicious code to be hidden in source code that, to the naked eye, appears completely normal, regardless of the underlying language.

The researchers successfully tested their technique on C, C++, C#, JavaScript, Java, Rust, Go and Python. Trojan Source is also a formidable supply chain attack. A backdoor hidden in an open source module ends up, unchanged, in every piece of software that integrates it. And some open source components are used in a great deal of software, so the potential for dissemination is very high.

So how is this possible? Trojan Source relies on the fact that text encoded in the Unicode standard can mix several writing directions. This makes it possible, for example, to display an Arabic or Hebrew quotation correctly within a French text. These changes of direction are made using invisible control characters known as “Bidi” (bidirectional) characters. The characters “LRI” and “RLI”, for instance, indicate that the text that follows should be displayed from left to right or from right to left, respectively.
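
To make these invisible characters concrete, the short Python sketch below lists the explicit directional formatting characters using the standard `unicodedata` module. The code point ranges U+202A to U+202E and U+2066 to U+2069 come from the Unicode standard; the helper name `describe` is only an illustration.

```python
import unicodedata

# Code points of the explicit directional formatting characters
# (U+202A..U+202E and U+2066..U+2069 in the Unicode standard).
BIDI_CODE_POINTS = list(range(0x202A, 0x202F)) + list(range(0x2066, 0x206A))


def describe(code_point: int) -> str:
    """Return the code point, its Bidi class and its Unicode name."""
    ch = chr(code_point)
    return f"U+{code_point:04X} {unicodedata.bidirectional(ch)} {unicodedata.name(ch)}"


for cp in BIDI_CODE_POINTS:
    print(describe(cp))
```

Among the nine lines printed are `U+2066 LRI LEFT-TO-RIGHT ISOLATE` and `U+2067 RLI RIGHT-TO-LEFT ISOLATE`, the two characters mentioned above.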

However, Unicode is also used to encode source code. Nothing therefore prevents an attacker from using these invisible characters to manipulate it. By placing them cleverly, an attacker can make a function return unexpectedly or turn part of a comment into executable code. String literals can also hold a value different from the one shown on screen. In short, the possibilities for manipulation are enormous and, at present, difficult to detect.
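
The point about string values can be illustrated with a minimal Python sketch. The literal `tampered` is a hypothetical example, not one taken from the researchers' work: it wraps the same letters in RLI (U+2067) and PDI (U+2069), which many renderers would be expected to display exactly like the plain string. Escapes are used here so that the control characters stay visible in the listing.

```python
# What a reviewer believes the literal is.
expected = "admin"

# Hypothetical tampered literal: the same letters wrapped in RLI (U+2067)
# and PDI (U+2069). Written with escapes so the controls stay visible.
tampered = "\u2067admin\u2069"

print(expected == tampered)          # False
print(len(expected), len(tampered))  # 5 7
print(ascii(tampered))               # '\u2067admin\u2069'
```

A comparison against `tampered` therefore fails for the value a reader thinks they see, which is the kind of mismatch described above.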

Indeed, Bidi characters do not trigger any particular warning in compilers or development tools. They slip completely under the radar.

Image (all rights reserved): on the left, the code as it will be executed; on the right, the code as it will be displayed.

As far as is known, this technique has not yet been used by attackers. The researchers scanned more than 7,000 open source software repositories and found nothing malicious, only some use of these characters for code obfuscation.

The ball is now in the court of the vendors of compilers and development tools. The good news is that the problem is fairly simple to fix: it is enough to disallow Bidi characters except in well-defined cases. Unfortunately, the ecosystem is not very responsive.
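
As a rough idea of what such a check could look like, here is a minimal Python sketch that rejects any source text containing a Bidi control character. It is deliberately stricter than the fix described above, since it makes no exception for legitimate uses, and the function names `find_bidi_controls` and `check_source` are hypothetical, not part of any real compiler or tool.

```python
import unicodedata

# Bidi classes of the explicit directional formatting characters.
BIDI_CLASSES = frozenset(
    {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}
)


def find_bidi_controls(source: str):
    """Return (line, column, code point) for each Bidi control in source."""
    findings = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.bidirectional(ch) in BIDI_CLASSES:
                findings.append((line_no, col, f"U+{ord(ch):04X}"))
    return findings


def check_source(source: str) -> None:
    """Raise ValueError if the source contains any Bidi control character."""
    findings = find_bidi_controls(source)
    if findings:
        raise ValueError(f"Bidi control characters found: {findings}")
```

A real tool would probably need an allowlist for the well-defined cases, for example right-to-left text inside string literals or comments whose directional controls are properly terminated.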

The researchers alerted nineteen organizations, giving them 99 days to deliver a patch. Only half of them did; the rest are dragging their feet. Developers therefore have every interest in carefully inspecting the third-party code they integrate. One way is to use the “vim” editor, which displays Bidi characters explicitly without changing the directionality of the text.
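
For those who prefer a script to an editor, the same idea can be sketched in a few lines of Python: replace each Bidi control character with a visible marker before reviewing the code. The `<U+XXXX>` marker format and the function name `reveal_bidi` are assumptions made for this illustration and do not reproduce how vim itself renders these characters.

```python
import re

# Matches the explicit directional formatting characters:
# U+202A..U+202E (LRE, RLE, PDF, LRO, RLO) and U+2066..U+2069 (LRI, RLI, FSI, PDI).
BIDI_PATTERN = re.compile("[\u202A-\u202E\u2066-\u2069]")


def reveal_bidi(text: str) -> str:
    """Replace each Bidi control character with a visible <U+XXXX> marker."""
    return BIDI_PATTERN.sub(lambda m: f"<U+{ord(m.group()):04X}>", text)


print(reveal_bidi("if level != 'user\u202E':"))
# if level != 'user<U+202E>':
```

Running a file's contents through `reveal_bidi` before reading it leaves ordinary text untouched and makes any hidden directional control stand out.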

Source: Trojan Source