Introduction to Assembly Language

Prerequisites

Running Assembly Programs on Intel Platform

First, convert the assembly code to machine code:

$ yasm -f macho64 hello-world.asm

Running on ARM Platform

What is Assembly Language

The CPU is fast but simple-minded. At the lowest level, it can only work with the numbers 0 and 1, yet it can perform billions of operations per second. This seemingly contradictory statement will make more sense after reading this article.

The programming languages we commonly write, such as Java, Python, and JavaScript, are considered "high-level languages." Computers (machines) cannot understand this code directly. The situation resembles a conversation in which a sign language interpreter converts spoken language into gestures for people with hearing impairments: something has to translate between the two sides.

These high-level programming languages cannot be executed directly on computer hardware; they must first be translated, either into machine code that the CPU executes directly or into bytecode that a virtual machine runs.

Although high-level programming languages have advantages such as readability, ease of writing, and cross-platform compatibility, assembly language, as a low-level language, still holds significant importance in certain situations. It can:

  • Interact directly with system hardware
  • Consume fewer resources, which makes it useful for optimizing performance

It is important to note that a program, whether written in a high-level or a low-level language, ultimately needs to be converted to machine code to run.

By mastering assembly language, you can gain:

  • A deeper understanding of how computers work internally
  • The ability to optimize the performance of existing high-level language code
  • Skills to write operating systems and low-level drivers
  • The capability to reverse-engineer programs

Common types of assembly languages include:

  • x86
  • ARM
  • 6502

Machine Code

Machine code is the code that the CPU can execute directly. Typically, we do not need to write machine code manually; instead, we use an assembler to convert assembly code, or a compiler to convert high-level language code, into machine code.

Machine code looks something like this:

10111000000000000000011
00000000010100000000110

However, such code is difficult to remember and inconvenient to debug, which is why assembly language was created.
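To make the idea that machine code is "just numbers" concrete, here is a minimal JavaScript sketch. It assumes the x86 instruction mov eax, 3, whose encoding is the opcode byte B8 followed by a 32-bit little-endian immediate. That instruction is an illustrative choice, not necessarily the one shown above, and toBinary is a hypothetical helper name.

```javascript
// Bytes of the x86 instruction `mov eax, 3`: opcode 0xB8 followed by a
// 32-bit little-endian immediate (illustrative choice, not from the article).
const movEax3 = [0xb8, 0x03, 0x00, 0x00, 0x00];

// toBinary is a hypothetical helper: render each byte as 8 binary digits.
function toBinary(bytes) {
  return bytes.map((b) => b.toString(2).padStart(8, "0")).join(" ");
}

console.log(toBinary(movEax3));
```

The same five bytes read far more easily as mov eax, 3, which is exactly the convenience an assembler provides.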

The core of assembly language is assembly instructions.

In assembly language, we can use registers to store variables. For example, we can use the mov instruction to store data in a register and the add instruction to add data from two registers.
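As a rough mental model, the two instructions can be sketched in JavaScript. This is only a toy: the regs object and the mov and add functions are hypothetical stand-ins for hardware registers and instructions, and the 32-bit wraparound is an assumption that matches registers such as eax.

```javascript
// A toy model of two CPU registers; real registers live in hardware.
const regs = { eax: 0, ebx: 0 };

// mov dest, value: copy an immediate value into a register.
function mov(dest, value) {
  regs[dest] = value >>> 0;
}

// add dest, src: add register src into register dest, wrapping at 32 bits.
function add(dest, src) {
  regs[dest] = (regs[dest] + regs[src]) >>> 0;
}

mov("eax", 2);     // mov eax, 2
mov("ebx", 3);     // mov ebx, 3
add("eax", "ebx"); // add eax, ebx
console.log(regs.eax);
```

Note that add overwrites its first operand with the sum, so the original value of eax is lost unless it was copied elsewhere first.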

Functions are an important concept in assembly language. As in high-level languages, a function in assembly language can accept parameters and return a value. We use the call instruction to invoke a function and the ret instruction to return control to the caller; by convention, the result is usually left in a register.
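The mechanics of call and ret can be sketched with a toy interpreter in JavaScript. Everything here is a simplified assumption: the instruction format, the run function, and the halt instruction are hypothetical, and the use of rdi, rsi, and rax for arguments and the result follows the System V AMD64 convention rather than anything stated in this article.

```javascript
// Toy machine: `program` is a list of instructions, `pc` is the index of the
// next instruction, and `stack` holds return addresses. Names are hypothetical.
const program = [
  ["mov", "rdi", 20],     // 0: first argument
  ["mov", "rsi", 22],     // 1: second argument
  ["call", 4],            // 2: push return address 3, jump to index 4
  ["halt"],               // 3: execution resumes here after ret
  ["mov", "rax", 0],      // 4: function body: rax = 0
  ["add", "rax", "rdi"],  // 5: rax += rdi
  ["add", "rax", "rsi"],  // 6: rax += rsi
  ["ret"],                // 7: pop return address and jump back
];

function run(instructions) {
  const regs = { rax: 0, rdi: 0, rsi: 0 };
  const stack = [];
  let pc = 0;
  while (true) {
    const [op, a, b] = instructions[pc];
    pc += 1; // by default, continue with the next instruction
    if (op === "mov") regs[a] = b;
    else if (op === "add") regs[a] += regs[b];
    else if (op === "call") { stack.push(pc); pc = a; }
    else if (op === "ret") pc = stack.pop();
    else if (op === "halt") return regs;
  }
}

console.log(run(program).rax);
```

The key point is that ret itself carries no value: it only restores the saved position, and the caller reads the result from the agreed register.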

Next, we will explain common usages in assembly language.

Variable Operations

Conditional Control

In JavaScript, common conditional control syntax is as follows:

if (condition) {
  // do something
} else {
  // do something else
}

In assembly language, we can use the cmp instruction for comparison and the je and jne instructions for jumping, where:

  • je stands for Jump if Equal
  • jne stands for Jump if Not Equal

    mov eax, 1      ; set eax to 1
    cmp eax, 1      ; compare eax to 1
    je true_label   ; jump to true_label if equal
    ; do something if not equal
    jmp end_label   ; jump to end_label
true_label:
    ; do something if equal
end_label:

The above code implements a simple if/else statement block.
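What connects cmp to the jump is the CPU's flags: cmp subtracts its operands, discards the result, and records whether it was zero in the zero flag, which je and jne then test. A minimal JavaScript sketch of that idea follows; the cmp and branch functions and the zf field are hypothetical names, and only the zero flag is modeled.

```javascript
// cmp a, b: subtract b from a, discard the result, and record whether it
// was zero. Only the zero flag (ZF) is modeled in this sketch.
function cmp(a, b) {
  return { zf: a - b === 0 };
}

function branch(eax) {
  const flags = cmp(eax, 1); // cmp eax, 1
  if (flags.zf) {            // je true_label: taken when ZF is set
    return "equal";          // true_label: do something if equal
  }
  return "not equal";        // fall-through path before jmp end_label
}

console.log(branch(1), branch(2));
```

A jne would simply test the opposite condition, jumping when the zero flag is clear.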

Loop Processing

Assembly language lacks much of the "syntactic sugar" found in higher-level languages, so we often need to build features such as loops by hand.

For example, a loop in JavaScript is written as follows:

for (let i = 0; i < 10; i++) {
  // do something
}

In assembly language, we can use the cmp instruction for comparison, the jge instruction to leave the loop, and the jmp instruction to jump back to the start:

    mov rcx, 0      ; initialize counter
loop_start:
    cmp rcx, 10     ; compare counter to 10
    jge loop_end    ; jump to end if greater than or equal to 10
    ; do something
    inc rcx         ; increment counter
    jmp loop_start  ; jump to loop_start
loop_end:

The above code implements a counter from 0 to 9.
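To see how the assembly version maps back to the for loop, the same control flow can be written in JavaScript with the sugar removed. This is a sketch for comparison only: countTo and visited are hypothetical names, and collecting the counter values stands in for the "do something" step.

```javascript
// The for loop rewritten to mirror the assembly, one statement per instruction.
function countTo(limit) {
  const visited = [];
  let rcx = 0;                 // mov rcx, 0
  while (true) {               // loop_start:
    if (rcx >= limit) break;   // cmp rcx, limit / jge loop_end
    visited.push(rcx);         // do something
    rcx += 1;                  // inc rcx
  }                            // jmp loop_start
  return visited;              // loop_end:
}

console.log(countTo(10));
```

The exit test sits at the top of the loop, so a limit of 0 runs the body zero times, just as the original for loop would.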

Functions

Practical: Writing an Assembly Program