Assembly Overview

Overview

Assembly language is a low-level programming language used to write software that runs directly on a computer's hardware. It is considered low-level because each assembly instruction maps closely to the machine code that the CPU executes. High-level languages like C, C++, and Python are closer to human language and need more processing by a compiler or interpreter before the CPU can run them.

CPU architectures differ in how they encode and execute instructions, so each architecture has its own assembly language. Several exist, but arguably the most common, and the one used in this post, is the Intel x86 architecture.

Core Assembly Components

Assembly language has several fundamental components. Below are the ones most commonly encountered in malware analysis:

  1. Instructions: These are the basic building blocks of an assembly program, and they tell the CPU what to do. Each instruction consists of an operation and zero or more operands. Examples of instructions include “mov”, “add”, and “cmp”.
  2. Registers: These are small, fast storage locations inside the CPU that hold data temporarily while a program executes. Different registers serve different purposes, such as storing values, addressing memory, and holding flags.
  3. Operands: Operands are the values or addresses that are used by instructions. They can be registers, memory addresses, or constants. For example, in the instruction “mov eax, 5”, the register “eax” is the destination operand and the constant “5” is the source operand.
  4. Memory: This is where a program’s data and instructions are stored. Programs access memory by address, and each address identifies a specific location in memory.
  5. Labels: These are used to give names to specific locations in a program’s code. Labels can be used to make it easier to read and understand the code, and they can also be used as destinations for jumps and calls. For example, a label might be used to mark the start of a loop or the location of a specific function.
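
To make the five components concrete, here is a minimal Python sketch of a toy interpreter. It is not a real x86 emulator: the `run` function, the tiny instruction subset, and the label name `again` are all assumptions made for illustration. It only shows how instructions, registers, operands, memory, and labels fit together.

```python
# Toy model of the five components. Illustrative only, not real x86 semantics:
# for example, a real DEC also updates ZF, which this sketch ignores.

def run(program, memory=None):
    """Execute a list of assembly-like lines and return (registers, memory)."""
    regs = {"eax": 0, "ebx": 0, "ecx": 0, "zf": 0}  # registers plus one flag
    memory = {} if memory is None else memory       # address -> value
    # Labels name positions in the program so jumps can refer to them.
    labels = {line[:-1]: i for i, line in enumerate(program) if line.endswith(":")}

    def value(operand):
        """Resolve a source operand: register, [address], or constant."""
        if operand in regs:
            return regs[operand]
        if operand.startswith("["):
            return memory.get(int(operand[1:-1], 0), 0)
        return int(operand, 0)

    ip = 0  # index of the next line to execute (plays the role of EIP)
    while ip < len(program):
        line = program[ip]
        ip += 1
        if line.endswith(":"):
            continue  # a label is only a name, nothing to execute
        op, _, rest = line.partition(" ")
        args = [a.strip() for a in rest.split(",")] if rest else []
        if op == "mov":
            regs[args[0]] = value(args[1])
        elif op == "add":
            regs[args[0]] = (regs[args[0]] + value(args[1])) & 0xFFFFFFFF
        elif op == "dec":
            regs[args[0]] = (regs[args[0]] - 1) & 0xFFFFFFFF
        elif op == "cmp":
            regs["zf"] = int(regs[args[0]] == value(args[1]))
        elif op == "jne":
            if not regs["zf"]:
                ip = labels[args[0]]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return regs, memory


program = [
    "mov eax, 0",
    "mov ecx, 3",
    "again:",             # label marking the start of the loop
    "add eax, [0x1000]",  # memory operand
    "dec ecx",
    "cmp ecx, 0",         # constant operand
    "jne again",          # jump back to the label while ecx != 0
]
regs, _ = run(program, {0x1000: 5})
print(regs["eax"])  # 15
```

The loop adds the value stored at address 0x1000 to EAX three times, which is why the label is needed as a jump target.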

Common Assembly

Below is a table of common assembly language instructions:

| Instruction | Description |
|-------------|-------------|
| MOV | Copy data from one location to another |
| ADD | Add two values and store the result |
| SUB | Subtract one value from another and store the result |
| MUL | Multiply two unsigned values and store the result |
| DIV | Divide one unsigned value by another and store the result |
| INC | Increment a value by 1 |
| DEC | Decrement a value by 1 |
| AND | Perform a bitwise AND operation on two values and store the result |
| OR | Perform a bitwise OR operation on two values and store the result |
| XOR | Perform a bitwise XOR operation on two values and store the result |
| NOT | Perform a bitwise NOT operation on a value (inverts all the bits) |
| SHL | Shift the bits of a value left by a specified number of positions |
| SHR | Shift the bits of a value right by a specified number of positions |
| JMP | Jump unconditionally to a specified location in the program |
| JE | Jump if the last comparison resulted in equal |
| JNE | Jump if the last comparison did not result in equal |
| JG | Jump if the last comparison resulted in greater than (signed) |
| JGE | Jump if the last comparison resulted in greater than or equal (signed) |
| JL | Jump if the last comparison resulted in less than (signed) |
| JLE | Jump if the last comparison resulted in less than or equal (signed) |
| CMP | Compare two values by subtracting them, discard the result, and set the status flags |
| PUSH | Push a value onto the stack |
| POP | Pop a value off the stack |
| CALL | Call a subroutine at a specified location |
| RET | Return from a subroutine |
| INT | Generate a software interrupt |
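
The conditional jumps do not look at the compared values themselves. They read the status flags that CMP left behind. The following Python sketch models that relationship for 32-bit operands. The helper names `cmp_flags` and `taken` are hypothetical, and the sketch assumes the signed interpretation used by JG, JGE, JL, and JLE.

```python
MASK = 0xFFFFFFFF  # keep values within 32 bits


def sign(x):
    """Return the most significant bit of a 32-bit value."""
    return (x >> 31) & 1


def cmp_flags(a, b):
    """Return the flags that CMP a, b would set (computed as a - b)."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    return {
        "ZF": int(result == 0),
        "SF": sign(result),
        "CF": int(a < b),  # unsigned borrow
        # Signed overflow: operands differ in sign and the result's sign
        # differs from the first operand's sign.
        "OF": int(sign(a) != sign(b) and sign(result) != sign(a)),
    }


# Each conditional jump is a predicate over the flags.
CONDITIONS = {
    "je":  lambda f: f["ZF"] == 1,
    "jne": lambda f: f["ZF"] == 0,
    "jg":  lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],
    "jge": lambda f: f["SF"] == f["OF"],
    "jl":  lambda f: f["SF"] != f["OF"],
    "jle": lambda f: f["ZF"] == 1 or f["SF"] != f["OF"],
}


def taken(jump, a, b):
    """Would `jump` be taken right after `cmp a, b`?"""
    return CONDITIONS[jump](cmp_flags(a, b))


print(taken("je", 7, 7))    # True
print(taken("jl", -1, 1))   # True: -1 is less than 1 when treated as signed
print(taken("jg", 5, 3))    # True
```

Comparing SF with OF, rather than reading SF alone, is what keeps the signed jumps correct when the subtraction overflows.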

Below is a table of common x86 registers and status flags. The flags are individual bits of the EFLAGS register rather than registers in their own right:

| Register or flag | Description |
|------------------|-------------|
| EAX | The accumulator register, used for arithmetic and data transfer operations; also holds function return values |
| EBX | The base register, used to hold a memory address |
| ECX | The count register, used as a loop counter and for shifts and rotates |
| EDX | The data register, used for arithmetic and data transfer operations |
| ESI | The source index register, used as a pointer to data in memory |
| EDI | The destination index register, used as a pointer to data in memory |
| EBP | The base pointer register, used to point to the base of the current stack frame |
| ESP | The stack pointer register, used to point to the top of the stack |
| EIP | The instruction pointer register, holds the address of the next instruction to be executed |
| CF | The carry flag, set if the last arithmetic operation resulted in a carry or borrow |
| PF | The parity flag, set if the low byte of the last result contains an even number of 1 bits |
| AF | The auxiliary carry flag, used in binary-coded decimal (BCD) arithmetic |
| ZF | The zero flag, set if the last operation resulted in a zero value |
| SF | The sign flag, set if the last operation resulted in a negative value (most significant bit set) |
| OF | The overflow flag, set if the last signed arithmetic operation overflowed |
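
As a rough illustration of how the status flags are derived, the Python sketch below computes them for a 32-bit ADD. The function name `add32` is hypothetical, and the sketch models only the six flags listed above.

```python
def add32(a, b):
    """Add two 32-bit values; return (result, flags) as a 32-bit ADD would set them."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    full = a + b                 # unbounded Python integer
    result = full & 0xFFFFFFFF   # what fits in a 32-bit register
    flags = {
        "CF": int(full > 0xFFFFFFFF),                        # carry out of bit 31
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),   # parity of low byte
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),              # carry out of bit 3
        "ZF": int(result == 0),
        "SF": result >> 31,                                  # copy of the sign bit
        # Signed overflow: both operands have a sign that differs from the result's.
        "OF": ((a ^ result) & (b ^ result)) >> 31,
    }
    return result, flags


# 0xFFFFFFFF + 1 wraps to 0: an unsigned carry, but no signed overflow (-1 + 1 = 0).
print(add32(0xFFFFFFFF, 1))
# 0x7FFFFFFF + 1 flips the sign bit: a signed overflow, but no unsigned carry.
print(add32(0x7FFFFFFF, 1))
```

The two calls show why CF and OF are separate flags: the first matters when the operands are treated as unsigned, the second when they are treated as signed.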

Labels are names chosen by the programmer, so the entries below are common naming conventions rather than fixed keywords:

| Label | Description |
|-------|-------------|
| START | The label for the beginning of the program |
| END | The label for the end of the program |
| LOOP | A label used to mark the beginning of a loop |
| LOOPEND | A label used to mark the end of a loop |
| PROC | A label used to mark the beginning of a subroutine or procedure |
| PROCEND | A label used to mark the end of a subroutine or procedure |
| DATA | A label used to mark the beginning of a data section in the program |
| BSS | A label used to mark the beginning of a block of memory reserved for data storage that has no initial values |

Interpretation of Assembly Language

In the example below, there are some core sections that require a bit of explaining.

0x00401000      push    ebp
0x00401001      mov     ebp, esp
0x00401003      sub     esp, 8
0x00401006      lea     eax, [var_8h]
0x00401009      push    eax
0x0040100a      push    0xf003f    ; '?'
0x0040100f      push    0
0x00401011      push    str.SOFTWARE__Microsoft___XPS ; 0x40c040
0x00401016      push    0x80000002
0x0040101b      call    dword [RegOpenKeyExA] ; 0x40b020
0x00401021      test    eax, eax
0x00401023      je      0x401029
0x00401025      xor     eax, eax
0x00401027      jmp     0x401066

The first two instructions set up a new stack frame. The first instruction, push ebp, saves the caller's EBP value on the stack. The next instruction, mov ebp, esp, copies the value of the ESP register (the stack pointer) into the EBP register (the base pointer), so EBP now marks the base of the current function's frame.

The sub esp, 8 instruction subtracts 8 from the value of the ESP register. Because the stack grows toward lower addresses, this reserves 8 bytes on the stack for local variables. The lea eax, [var_8h] instruction loads the address of the local variable var_8h, not its contents, into the EAX register.
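
The effect of these opening instructions on ESP and EBP can be traced with a short Python sketch. The starting register values are made up for illustration, and the sketch assumes the disassembler's var_8h name refers to the slot at EBP - 8. The `prologue` function is hypothetical.

```python
def prologue(esp, ebp, locals_size=8):
    """Trace push ebp / mov ebp, esp / sub esp, 8 on a downward-growing stack."""
    stack = {}   # address -> 32-bit value stored there
    trace = []   # (instruction, esp, ebp) after each step

    # push ebp: make room for 4 bytes, then store the caller's EBP there.
    esp -= 4
    stack[esp] = ebp
    trace.append(("push ebp", esp, ebp))

    # mov ebp, esp: EBP now marks the base of the new frame.
    ebp = esp
    trace.append(("mov ebp, esp", esp, ebp))

    # sub esp, 8: reserve space for local variables below EBP.
    esp -= locals_size
    trace.append((f"sub esp, {locals_size}", esp, ebp))

    return esp, ebp, stack, trace


# Hypothetical starting values, chosen only for the example.
esp, ebp, stack, trace = prologue(esp=0x0019FF80, ebp=0x0019FF94)
for instruction, e, b in trace:
    print(f"{instruction:<14} esp={e:#010x} ebp={b:#010x}")

var_8h = ebp - 8  # the address that lea eax, [var_8h] would load into EAX
print(f"var_8h is at {var_8h:#010x}")
```

With these starting values, the address of var_8h ends up equal to the new ESP, because the 8 reserved bytes sit directly below the saved EBP.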

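The five push instructions place the arguments for RegOpenKeyExA on the stack in reverse order. The address in EAX, pushed first, is the last argument and receives the opened key handle. It is followed by the access mask 0xf003f (KEY_ALL_ACCESS), the options value 0, and the subkey string. The value 0x80000002 (HKEY_LOCAL_MACHINE), pushed last, is the first argument. The call dword [RegOpenKeyExA] instruction then calls the function through the pointer stored at 0x40b020, and the return value comes back in EAX. The test eax, eax instruction performs a bitwise AND of EAX with itself, discards the result, and sets the flags, so the zero flag is set only when EAX is 0. The je 0x401029 instruction therefore jumps when the call returned 0, which is ERROR_SUCCESS. Otherwise, xor eax, eax clears EAX and jmp 0x401066 skips ahead.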
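
When reading a call like this, it helps to map the pushed values back onto the function's parameters. The Python sketch below does that for the listing above. The parameter names follow the documented RegOpenKeyExA signature, and the constants are the standard Windows values. The helpers `decode_call` and `branch_after_test` are hypothetical, and the subkey is kept as the disassembler's symbol name rather than a guessed string.

```python
# Standard Windows constants relevant to this call.
HKEY_NAMES = {
    0x80000000: "HKEY_CLASSES_ROOT",
    0x80000001: "HKEY_CURRENT_USER",
    0x80000002: "HKEY_LOCAL_MACHINE",
    0x80000003: "HKEY_USERS",
}
KEY_ALL_ACCESS = 0xF003F

# Parameters of RegOpenKeyExA, left to right.
PARAMS = ["hKey", "lpSubKey", "ulOptions", "samDesired", "phkResult"]


def decode_call(pushes):
    """Map values in the order they were pushed onto the parameters.

    Arguments are pushed right to left, so the push order is reversed.
    """
    args = dict(zip(PARAMS, reversed(pushes)))
    args["hKey"] = HKEY_NAMES.get(args["hKey"], hex(args["hKey"]))
    if args["samDesired"] == KEY_ALL_ACCESS:
        args["samDesired"] = "KEY_ALL_ACCESS"
    return args


def branch_after_test(eax):
    """Model test eax, eax / je 0x401029: the jump is taken when EAX is 0."""
    zero_flag = (eax & eax) == 0
    return 0x401029 if zero_flag else 0x401025


# Values in the order the listing pushes them.
pushes = ["&var_8h", 0xF003F, 0, "str.SOFTWARE__Microsoft___XPS", 0x80000002]
for name, value in decode_call(pushes).items():
    print(f"{name:<10} = {value}")

print(hex(branch_after_test(0)))  # call returned ERROR_SUCCESS: jump taken
print(hex(branch_after_test(2)))  # non-zero error code: fall through
```

Decoding constants this way is often the quickest route to understanding what a sample is doing: here, it is trying to open a key under HKEY_LOCAL_MACHINE with full access.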

This post is licensed under CC BY 4.0 by the author.