CAE - Encoding (Lesson)
Encoding
Uses of Encoding
In this lesson, you will learn about encoding in cybersecurity -- the process of transforming data into a different format using a method that is publicly available. This transformation is done using a set of algorithmic rules, which means that anyone who knows the method or algorithm can easily decode and understand the original information. The primary purpose of encoding is not to keep the information secret, but to ensure its compatibility and proper formatting for reliable transmission, storage, or processing within different systems or platforms.
In computing, character encoding represents characters as numbers according to an agreed conversion table. For example, ASCII assigns each letter, digit, and symbol on the keyboard a number, which can be written in decimal, hexadecimal, octal, or binary.
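As a small illustration of the ASCII idea above, the sketch below shows one character written in the four number systems. The helper name ascii_views is made up for this example; ord and format are standard Python built-ins.

```python
def ascii_views(ch: str) -> dict:
    """Return the ASCII code of one character in four notations."""
    code = ord(ch)  # the decimal code for the character
    return {
        "decimal": code,
        "hex": format(code, "x"),       # base 16
        "octal": format(code, "o"),     # base 8
        "binary": format(code, "08b"),  # base 2, padded to 8 bits
    }

print(ascii_views("A"))
print(ascii_views("a"))
```

The same character always maps to the same number; only the way the number is written changes.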
Encoding vs Encryption
It is important to differentiate between encryption (which is meant to keep a secret) and encoding (which does NOT provide confidentiality). Encoding is used primarily as a method of making some digital communication or transfer of data easier for computers.
The computer represents the characters in a different form than our usual alphabet, so humans don't recognize them easily. No key is needed to convert the encoded characters back into human-readable language, which means this is NOT cryptography!
Encoding Types
Binary
As we learned in a previous lesson, binary uses a base 2 number system. Its only digits are 0 and 1. Binary encoding uses one byte (8 bits) for each encoded character. Many versions of the ASCII chart do not include binary, so it is useful to convert the binary number to decimal, then look up the decimal value in the ASCII chart.
- Example: 01000001 in binary (base 2) is equivalent to 65 in decimal (base 10), and 65 corresponds to the letter A in the ASCII table. Remember, A is encoded differently from a, which is 01100001 in binary and 97 in decimal. Case matters!
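The conversion described above (binary to decimal, then decimal to an ASCII character) can be sketched in a few lines of Python. The function names decode_binary and encode_binary are hypothetical, and the sketch assumes each character is one 8-bit group separated by spaces.

```python
def decode_binary(bits: str) -> str:
    """Turn space-separated 8-bit groups into ASCII text."""
    # int(group, 2) reads the group as a base 2 number (the decimal lookup step);
    # chr() then finds the character with that code.
    return "".join(chr(int(group, 2)) for group in bits.split())

def encode_binary(text: str) -> str:
    """Turn ASCII text into space-separated 8-bit groups."""
    return " ".join(format(ord(ch), "08b") for ch in text)

print(decode_binary("01000001 01100001"))
print(encode_binary("Hi"))
```

Running the first call shows that the upper-case and lower-case letters come from different bit patterns.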
Translate Binary Code Activity
Hexadecimal
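Hexadecimal uses the base 16 number system. The digits are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F. Each byte is written as a group of two hexadecimal digits, so the letter A (65 in decimal) is 41 in hexadecimal.
You are probably wondering about the use of hexadecimal encoding. It is used in digital devices and code in many ways, often as a way of naming or addressing, for example in memory addresses, MAC addresses, and color codes.
Hexadecimal values are often marked with a 0x or \x prefix. However, the prefix only signals that the digits that follow are hexadecimal; it is not included as part of the number! For example, 0x41 is the hexadecimal number 41.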
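To make the two-digit grouping and the prefixes concrete, here is a minimal Python sketch. The helper names to_hex and from_hex are made up for this example; bytes.hex and bytes.fromhex are standard Python methods.

```python
def to_hex(text: str) -> str:
    """Encode ASCII text as hexadecimal, two digits per character."""
    return text.encode("ascii").hex()

def from_hex(hex_digits: str) -> str:
    """Decode a string of hexadecimal digit pairs back into ASCII text."""
    return bytes.fromhex(hex_digits).decode("ascii")

print(to_hex("Cyber"))         # five characters become ten hex digits
print(from_hex("4379626572"))  # and decode back to the same text

# The 0x and \x prefixes only mark the digits as hexadecimal.
print(int("0x41", 16))  # the number 41 in base 16, shown in decimal
print("\x41")           # the character whose code is hexadecimal 41
```

No key is involved at any step, so anyone who recognizes the format can reverse it.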
Hexadecimal Code Activity
Base 64
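Base 64 is not a number system in the way binary and hexadecimal are. It is a way of turning binary data into text: the data is split into groups of 6 bits, and each group is replaced by one of 64 printable characters. The following characters are used: A-Z, a-z, 0-9, +, and /. Every 3 bytes of input become 4 characters of output. When the input length is not a multiple of 3, = or == is appended as padding, so you will often (but not always) see these characters at the end of the encoding.
Base 64 is used to encode binary data such as image and sound files for embedding into HTML, CSS, etc. Because encoding by hand is tedious, we typically use an online or command line Base 64 encoder/decoder program.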
For example, the word Cybersecurity encoded in Base64 is Q3liZXJzZWN1cml0eQ==
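The example above can be reproduced with Python's standard base64 module. This is only a sketch: the helper names b64_encode and b64_decode are made up here, and the text is assumed to be plain ASCII.

```python
import base64

def b64_encode(text: str) -> str:
    """Encode ASCII text as Base64 text."""
    return base64.b64encode(text.encode("ascii")).decode("ascii")

def b64_decode(encoded: str) -> str:
    """Decode Base64 text back into ASCII text. No key is needed."""
    return base64.b64decode(encoded).decode("ascii")

print(b64_encode("Cybersecurity"))
print(b64_decode("Q3liZXJzZWN1cml0eQ=="))

# Padding depends on the input length: 3 bytes need none,
# 4 bytes need "==", and 5 bytes need "=".
for word in ["Cyb", "Cybe", "Cyber"]:
    print(word, b64_encode(word))
```

The loop shows why the = characters appear only some of the time.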
Hashing
Hashing means applying an algorithm to data input to convert data of any size into a fixed-size value, called a hash or digest. The hash represents the original data in a more compact form. Hashing is widely used in various applications, including data retrieval, file integrity checking, password storage, and digital signatures, among others.
No key is used in hashing, but unlike the encodings above, a hash is one-way: it cannot be decoded back into the original data. Anyone who applies the same hashing algorithm to the same input gets the same hash, which is how a hash is checked. Hashing algorithms always produce the same size output, regardless of the input size.
There are many different hashing algorithms, including MD5, SHA1, SHA256, and RIPEMD. We use an online or command line tool to compute a hash.
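For example, the MD5 hash of the word Cybersecurity is given as 06244f27e66bff1f199cc32bf37e27. A complete MD5 hash is always 32 hexadecimal characters and this value has only 30, so check it with a hashing tool before relying on it.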
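The fixed output size can be checked with Python's standard hashlib module. The sketch below uses made-up helper names md5_hex and sha256_hex and assumes the text is encoded as UTF-8 before hashing. It prints lengths rather than hash values, so run it yourself to see the actual hashes.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 hash of the text as hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def sha256_hex(text: str) -> str:
    """Return the SHA256 hash of the text as hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Inputs of very different sizes give outputs of the same size.
for sample in ["a", "Cybersecurity", "Cybersecurity" * 1000]:
    print(len(sample), len(md5_hex(sample)), len(sha256_hex(sample)))

# The same input always gives the same hash; a small change gives a different one.
print(md5_hex("Cybersecurity") == md5_hex("Cybersecurity"))
print(md5_hex("Cybersecurity") == md5_hex("cybersecurity"))
```

hashlib offers no function for turning a hash back into its input.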
Threat Purposes for Encoding
As you already learned earlier in this lesson, encoding is NOT encryption, meaning that it can't be used to protect confidentiality. However, encoding can be used by threat actors to hide information from casual inspection. Threat actors commonly use encoding for two purposes:
- Obfuscation means hiding the intended meaning in communication, making communication confusing, willfully ambiguous, and harder to interpret.
- Example: Malware may obfuscate portions of the code so that users or anti-virus software cannot recognize its function.
- Exfiltration means unauthorized transfer of data from a computer, to smuggle information out of an organization.
- Example: Edward Snowden used a flash drive to exfiltrate data off the NSA servers.
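To see why encoding works as obfuscation but not as protection, consider the harmless sketch below. The sample text and the helper name contains_keyword are made up for illustration. A simple keyword search misses the encoded text, yet anyone can decode it without a key.

```python
import base64

def contains_keyword(text: str, keyword: str) -> bool:
    """A very simple scanner: look for a keyword in plain sight."""
    return keyword in text

# Hypothetical text that someone wants to hide from a keyword search.
plain = "copy secret.txt"
obfuscated = base64.b64encode(plain.encode("ascii")).decode("ascii")

# The keyword is no longer visible in the encoded form...
print(contains_keyword(obfuscated, "secret"))

# ...but decoding needs no key, so a defender can reveal it again.
revealed = base64.b64decode(obfuscated).decode("ascii")
print(contains_keyword(revealed, "secret"))
```

This is why security tools often decode common encodings before scanning content.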
Reflection and Wrap-up
In this lesson, you have learned about the importance of encoding. In computing, encoding plays a crucial role in keeping data intact and usable as it moves between systems. It enables the accurate transfer of data across diverse systems and networks by standardizing formats, which is essential for interoperability. For example, encoding schemes like Base64 are used to encode binary data into ASCII characters, making it possible to transmit binary files over text-based protocols such as email and HTTP. Some encoding schemes are designed to compress data, which can significantly reduce storage requirements and improve transmission speed, although Base64 does the opposite and makes data about one third larger.
However, because encoded data is not inherently secure, it must be complemented with encryption for sensitive information that requires confidentiality. Thus, while encoding enhances data usability and compatibility across different computing environments, it must be used judiciously, especially when dealing with sensitive information, to ensure that security is not compromised.
[CC BY-NC-SA 4.0] UNLESS OTHERWISE NOTED | IMAGES: LICENSED AND USED ACCORDING TO TERMS OF SUBSCRIPTION - INTENDED ONLY FOR USE WITHIN LESSON.