bit, byte, KB, MB, GB, TB, PB

· One min read

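A bit is the smallest unit of data in a computer. One bit corresponds to a single high or low voltage level.

In a computing context, the units convert as follows: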

$$1\ \text{bit} \times 8 = 1\ \text{byte}$$

$$1\ \text{byte} \times 1024 = 1\ \text{KB (kilobyte)}$$

$$1\ \text{KB} \times 1024 = 1\ \text{MB (megabyte)}$$

$$1\ \text{MB} \times 1024 = 1\ \text{GB (gigabyte)}$$

$$1\ \text{GB} \times 1024 = 1\ \text{TB (terabyte)}$$

$$1\ \text{TB} \times 1024 = 1\ \text{PB (petabyte)}$$

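Warning

In decimal usage and in the International System of Units (SI), the conversion factor is 1000 rather than 1024. Pay attention to which convention the context uses.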
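The conversions above can be sketched in JavaScript. This is a minimal illustration, not a standard API: the constant names and the `bytesTo` helper are hypothetical, and the `base` parameter is only there to contrast the 1024 and 1000 conventions.

```javascript
// Binary (1024-based) unit sizes, expressed in bytes.
const BITS_PER_BYTE = 8;
const KB = 1024;
const MB = KB * 1024;
const GB = MB * 1024;
const TB = GB * 1024;
const PB = TB * 1024;

// Hypothetical helper: divide a byte count by base^power.
// power = 1 is KB, 2 is MB, 3 is GB, and so on.
function bytesTo(bytes, power, base = 1024) {
  return bytes / base ** power;
}

console.log(GB);                   // 1073741824
console.log(bytesTo(GB, 3));       // 1 (1024-based GB)
console.log(bytesTo(GB, 3, 1000)); // 1.073741824 (1000-based GB)
```

The same byte count reads as a larger number under the 1000 convention, which is why the two are easy to mix up.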


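In UTF-8 encoding:

  1. A US-ASCII character needs only 1 byte
  2. Latin letters with diacritics, and Greek, Cyrillic, Armenian, Hebrew, Arabic, Syriac, and similar characters need 2 bytes
  3. Most other characters in common use (including Chinese, Japanese, and Korean characters, Southeast Asian scripts, and some Middle Eastern scripts) need 3 bytes
  4. A small number of characters, such as rare CJK characters and emoji, need 4 bytes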
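One way to check these byte counts is with the standard `TextEncoder`, which always encodes to UTF-8. The helper name `utf8Length` is made up for this sketch, and the characters are written as escapes so the example does not depend on how the file itself is saved.

```javascript
const encoder = new TextEncoder(); // TextEncoder always produces UTF-8

// Hypothetical helper: number of bytes a string occupies in UTF-8.
function utf8Length(text) {
  return encoder.encode(text).length;
}

console.log(utf8Length("A"));         // 1 (US-ASCII)
console.log(utf8Length("\u00E9"));    // 2 (Latin e with acute accent)
console.log(utf8Length("\u4E2D"));    // 3 (a common Chinese character)
console.log(utf8Length("\u{1F600}")); // 4 (an emoji)
```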

Swapping Values Without Intermediate Variables

· One min read

Here are three methods to swap values without using intermediate variables.

1. Addition and Subtraction

Assume we have variables:

let a = 2, b = 5

Swap:

a = a + b // a = 7

b = a - b // b = 2

a = a - b // a = 5

Disadvantages:

  • Can only handle numeric values
  • The sum may overflow (or, with JavaScript numbers, silently lose precision) when the values are large
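A small sketch of the large-number problem, assuming ordinary JavaScript numbers (64-bit floats). The function name `swapByAddition` is only for illustration.

```javascript
// Swap two numbers using addition and subtraction only.
function swapByAddition(a, b) {
  a = a + b;
  b = a - b;
  a = a - b;
  return [a, b];
}

console.log(swapByAddition(2, 5)); // [ 5, 2 ]

// 2 ** 53 + 1 cannot be represented exactly, so the sum is rounded
// and the "swapped" value comes back off by one.
const big = 2 ** 53; // 9007199254740992
console.log(swapByAddition(big, 1)); // [ 1, 9007199254740991 ]
```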

2. Multiplication and Division

Since we have addition and subtraction, we can naturally think of multiplication and division:

a = a * b // a = 10

b = a / b // b = 2

a = a / b // a = 5

Disadvantages:

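  • Floating-point multiplication and division can lose precision
  • Neither value can be 0, because a zero product leads to a division by zero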
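The zero case can be seen in a short sketch. The name `swapByMultiplication` is hypothetical. In JavaScript, `0 / 0` evaluates to `NaN` rather than throwing, so the failure is silent.

```javascript
// Swap two numbers using multiplication and division only.
function swapByMultiplication(a, b) {
  a = a * b;
  b = a / b;
  a = a / b;
  return [a, b];
}

console.log(swapByMultiplication(2, 5)); // [ 5, 2 ]

// If either value is 0 the product is 0, and a later step computes 0 / 0.
console.log(swapByMultiplication(0, 5)); // [ NaN, 0 ]
console.log(swapByMultiplication(2, 0)); // [ NaN, NaN ]
```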

3. XOR Method

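XOR (exclusive or) is a logical operation applied bit by bit: different bits yield 1, and equal bits yield 0:

| Input 1 | Input 2 | Result |
| --- | --- | --- |
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |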

XOR works on the binary representation of the values, so let's write them in binary first. a (2) in binary is 010, and b (5) in binary is 101:

a = a ^ b // 010 ^ 101 = 111

b = a ^ b // 111 ^ 101 = 010

a = a ^ b // 111 ^ 010 = 101

Disadvantages:

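  • Only works for integers. In JavaScript, bitwise operators convert their operands to 32-bit integers, so the fractional part of a floating-point value is lost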
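A sketch of the XOR swap and its limitation with fractional values. The function name `swapByXor` is only illustrative.

```javascript
// Swap two integers using XOR only.
function swapByXor(a, b) {
  a = a ^ b;
  b = a ^ b;
  a = a ^ b;
  return [a, b];
}

console.log((2 ^ 5).toString(2)); // "111", matching 010 ^ 101 above
console.log(swapByXor(2, 5));     // [ 5, 2 ]

// Bitwise operators truncate operands to 32-bit integers,
// so 2.5 and 5.5 are treated as 2 and 5.
console.log(swapByXor(2.5, 5.5)); // [ 5, 2 ]
```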

Summary

All three methods are clever tricks, only for learning purposes. It's best not to use them in production environments.
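For comparison, here is the swap most JavaScript code would use in practice: destructuring assignment. This is a different technique from the three above, not one of them. It builds a temporary array behind the scenes, but it works for any type of value and has none of the pitfalls listed earlier.

```javascript
let a = 2, b = 5;

// Destructuring assignment: the right-hand side is evaluated first,
// then its elements are assigned back in swapped order.
[a, b] = [b, a];

console.log(a, b); // 5 2
```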

Simple Distinction Between Concurrency and Parallelism

· 2 min read

When I first started learning about concurrency and parallelism, I always got a bit confused. Now I'm finally taking the time to properly clarify the relationship between the two.

First, take the word "concurrency". The prefix "con-" means "together", and "current" comes from the Latin "currere", meaning "to run", so the overall meaning is "things running together".

Next, take the word "parallelism". "Parallel" means side by side, so the overall meaning is "things happening in parallel", or in other words "things happening simultaneously".

Here's a vivid example mentioned by a Zhihu answerer:

You're eating a meal when the phone rings. You wait until you finish eating before answering. This means you support neither concurrency nor parallelism.

You're eating a meal when the phone rings. You stop to answer the phone, then continue eating after the call. This means you support concurrency.

You're eating a meal when the phone rings. You answer the phone while continuing to eat. This means you support parallelism.

The key to concurrency is having the ability to handle multiple tasks, not necessarily simultaneously. The key to parallelism is having the ability to handle multiple tasks simultaneously. So I think the most crucial point is: whether it's simultaneous.

From this answer we can see that parallelism is a special case of concurrency. Concurrency only requires the ability to handle multiple tasks, while parallelism requires the ability to handle multiple tasks simultaneously.

With this in mind, let's clarify the concurrency and parallelism of multi-core and single-core CPUs.

First, single-core CPUs: a single core can only do one thing at a time, so it cannot provide parallelism. It can still provide concurrency, as long as the system is able to switch between tasks (for example, by giving each task a short time slice in turn).

Second, multi-core CPUs: a multi-core CPU contains several cores, each of which can run a task independently, so handling multiple tasks at the same moment becomes possible. That is parallelism, and having parallelism means having concurrency too, since parallelism is a special case of concurrency.
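The single-core case can be sketched in JavaScript, which itself runs your code on one thread. In this hypothetical example, each task is a generator that pauses after every step, and a made-up round-robin scheduler `runConcurrently` switches between tasks. Only one step ever runs at a time, so this is concurrency without parallelism.

```javascript
// Each task is a generator: every `yield` is a point where it can be paused.
function* task(name, steps, log) {
  for (let i = 1; i <= steps; i++) {
    log.push(`${name}:${i}`);
    yield;
  }
}

// Hypothetical scheduler: run one step of each task in turn on a single thread.
function runConcurrently(tasks) {
  const queue = [...tasks];
  while (queue.length > 0) {
    const current = queue.shift();
    const { done } = current.next();
    if (!done) queue.push(current); // not finished: back of the line
  }
}

const log = [];
runConcurrently([task("eat", 3, log), task("phone", 2, log)]);
console.log(log.join(" ")); // eat:1 phone:1 eat:2 phone:2 eat:3
```

The steps of the two tasks interleave, like stopping a meal to take a call. Running them in parallel would require a second core doing work at the same moment, which this sketch does not attempt.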

A Brief Introduction to Number Systems and the ASCII Table

· 4 min read

1. What is a Number System

  • Number System: A number system, formally called a "positional numeral system", is a method of counting. The most commonly used is the decimal system.
    • Decimal System: The decimal system follows the "carry over at ten" rule, so it uses 10 digits to represent numbers: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

In daily life, besides the decimal system, there are many other common number systems that you might not have noticed. For example, clocks use a base-60 system for seconds. The second hand goes from 0 to 59, then after one more second, it carries over to become one minute.

Computers could in principle use the decimal system as well. So why do they use binary? The reason is simple: binary has only two digits, 0 and 1, which makes it very easy to represent in hardware.

Why is that easier? Because computers are fundamentally hardware, and a circuit that only has to distinguish two states (such as a high and a low voltage) is simpler and more reliable than one that has to distinguish ten. The trade-off is that binary needs more digits to represent the same number.

2. About Binary

Let's first look at how decimal numbers are represented. Suppose we have a decimal number $(3107)_{10}$.

Note

  1. Here the number $(3107)_{10}$ is enclosed in parentheses with a subscript 10. The subscript indicates the number system, in this case base 10. For example, in $(3107)_{8}$ the subscript is 8, so the number is no longer decimal but octal. We usually omit this notation for decimal numbers because decimal is so common that it is assumed by default.
  2. Besides this notation, you might also encounter the suffix-letter method. For example, in 3107H the letter H after the number indicates that it is hexadecimal.
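Programming languages usually mark the base with a prefix rather than a subscript or suffix. As an illustration, JavaScript numeric literals use `0b` for binary, `0o` for octal, and `0x` for hexadecimal, so the same digits 3107 give different values depending on the prefix. The variable names below are only for this sketch.

```javascript
const decimalValue = 3107;   // no prefix: base 10
const octalValue = 0o3107;   // 0o prefix: base 8
const hexValue = 0x3107;     // 0x prefix: base 16
const binaryValue = 0b1101;  // 0b prefix: base 2

console.log(decimalValue); // 3107
console.log(octalValue);   // 1607
console.log(hexValue);     // 12551
console.log(binaryValue);  // 13
```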

This $(3107)_{10}$ can be broken down as:

$$3 \times 10^3 + 1 \times 10^2 + 0 \times 10^1 + 7 \times 10^0$$

Now let's look at a binary number $(1101)_2$, which can also be represented as:

$$1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$$

Can you now understand why they're called decimal and binary systems?

For number system conversion, binary to decimal is simple. You take:

$$1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0$$

Then calculate in decimal:

$$8 + 4 + 0 + 1$$

Which equals $(13)_{10}$, so we can write $(1101)_2 = (13)_{10}$.
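The expansion above translates almost directly into code. The following sketch uses a hypothetical helper `positionalValue` that multiplies each digit by the base raised to the power of its position, and compares the result with the built-in `parseInt`.

```javascript
// Hypothetical helper: evaluate a digit string in the given base by
// summing digit * base^position, exactly like the expansion above.
function positionalValue(digits, base) {
  const n = digits.length;
  let total = 0;
  for (let i = 0; i < n; i++) {
    const digit = Number.parseInt(digits[i], base);
    total += digit * base ** (n - 1 - i); // leftmost digit has the highest power
  }
  return total;
}

console.log(positionalValue("1101", 2));  // 13
console.log(positionalValue("3107", 10)); // 3107
console.log(Number.parseInt("1101", 2));  // 13, the built-in equivalent
console.log((13).toString(2));            // "1101", decimal back to binary
```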

3. What is ASCII Code

  • ASCII (American Standard Code for Information Interchange) - the name is quite long, but let's focus on the word "code"

We know that representing numbers in computers is relatively simple, but how do we represent text? For example, letters like a, b, c, d, punctuation marks, Chinese characters, etc.

Our predecessors came up with an excellent solution: create a table that stores these text symbols

| Text Symbol |
| --- |
| a |
| b |
| c |

Then assign numbers to these text symbols in order:

| Number | Text Symbol |
| --- | --- |
| 0 | a |
| 1 | b |
| 2 | c |

When you need to retrieve them, just use the numbers. Want 'a'? Input 0. Want 'b'? Input 1, and so on (now you might understand the meaning of "code").

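The ASCII table is an early symbol table of exactly this kind. How early? It was first published in the United States in 1963 and revised into roughly its current form in 1967. English uses only 26 letters (in upper and lower case), plus digits, punctuation, and control characters, so 128 characters were sufficient to cover everything.

Why 128? As mentioned earlier, computers represent numbers in binary. Counting from 0 up to 127 gives exactly 128 numbers, which is $2^7$, the number of values that fit in 7 binary digits.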

The smallest number is 0, the largest is 127. If we can represent the largest number, we can represent all smaller numbers too. $(1111111)_2$ equals 127, which we can expand as:

$$1 \times 2^6 + 1 \times 2^5 + 1 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 1 \times 2^1 + 1 \times 2^0$$

Also, this $(1111111)_2$ has 7 bits in total, so we can say it's a 7-bit ASCII code table.
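To see the real table in action, here is a small sketch using JavaScript's built-in string methods. Note that in actual ASCII the letter 'a' has code 97, not 0 as in the simplified table earlier. The variable names are only for illustration.

```javascript
// Look up the code for a character, and the character for a code.
const code = "a".charCodeAt(0);
const symbol = String.fromCharCode(98);

// Show the code as a 7-bit binary number.
const bits = code.toString(2).padStart(7, "0");

console.log(code);      // 97
console.log(symbol);    // b
console.log(bits);      // 1100001
console.log(0b1111111); // 127, the largest 7-bit value
console.log(2 ** 7);    // 128 possible codes, 0 through 127
```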