Two ways to implement online status statistics in Java

1. Scenario analysis: hundreds of millions of online users

Take QQ's online status statistics as an example. Its typical characteristics are a large data volume, high memory usage, and strict real-time requirements. The traditional approach (adding an online status field for each user in the database, set to 1 when the user comes online and 0 when the user goes offline) does not hold up in this scenario, for the following reasons:

Database pressure: frequent online/offline updates sharply increase the I/O load on the database.
Difficulty of real-time statistics: repeatedly running count queries to refresh the total drags down database performance and still struggles to meet real-time requirements.

Therefore, we need to find more efficient and more suitable solutions for large-scale scenarios.

2. Solution

For online status statistics of hundreds of millions of users, common solutions can be divided into two categories:

2.1 Scheme based on a total count

Maintain a single counter of online users: increment it by 1 when a user comes online and decrement it by 1 when a user goes offline. The counter's value is the number of online users.

Advantages

Simple implementation and high efficiency.
Low memory usage.

Disadvantages

It cannot tell you whether a particular user is online at a given moment.
When the application exits abnormally, the offline event for one user may be reported more than once (for example, by both the client and a connection-monitoring mechanism), and a bare counter has no way to recognize and ignore the duplicates.
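
The second drawback is easy to reproduce. The following is a minimal sketch, not a production design: it models the shared counter in memory with an AtomicLong, and the class and method names are hypothetical. It shows how a duplicated offline event pushes the count below the true number of online users.

```java
import java.util.concurrent.atomic.AtomicLong;

// In-memory stand-in for a shared online-user counter (hypothetical names).
final class OnlineCounter {
    private final AtomicLong online = new AtomicLong();

    long userOnline()  { return online.incrementAndGet(); }
    long userOffline() { return online.decrementAndGet(); }
    long count()       { return online.get(); }

    // Two users come online; one of them is then reported offline twice.
    static long countAfterDuplicateOffline() {
        OnlineCounter counter = new OnlineCounter();
        counter.userOnline();   // user A comes online
        counter.userOnline();   // user B comes online
        counter.userOffline();  // user A: explicit logout
        counter.userOffline();  // user A again: a timeout detector reports the same session
        return counter.count(); // 0, although user B is still online
    }

    public static void main(String[] args) {
        System.out.println(countAfterDuplicateOffline()); // prints 0
    }
}
```

The counter has no notion of which user an event belongs to, so it cannot tell the second offline event from a legitimate one.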

2.2 Scheme based on per-user details

Store each user's identifier (such as the QQ number) together with its online status in a set, and compute the statistics with set operations.

Advantages

The statistics are accurate, and you can query whether a particular user is online at a given moment.
When the application exits abnormally, repeated offline events for the same user can be deduplicated accurately.

Disadvantages

Higher memory usage, because every user is tracked individually.
Lower efficiency than a single counter.
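
To make the trade-off concrete, here is a minimal in-memory sketch of this scheme using a concurrent set from the JDK. The class name, method names, and user IDs are hypothetical, and a real deployment would keep the set in a shared store rather than in one JVM.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-in for a per-user online set (hypothetical names).
final class OnlineUserSet {
    private final Set<Long> online = ConcurrentHashMap.newKeySet();

    void userOnline(long userId)  { online.add(userId); }
    void userOffline(long userId) { online.remove(userId); } // no-op if the user is already offline
    boolean isOnline(long userId) { return online.contains(userId); }
    int count()                   { return online.size(); }

    // Two users come online; one of them is then reported offline twice.
    static int countAfterDuplicateOffline() {
        OnlineUserSet users = new OnlineUserSet();
        users.userOnline(10001L);
        users.userOnline(10002L);
        users.userOffline(10001L);
        users.userOffline(10001L); // duplicate event, has no effect
        return users.count();      // 1: user 10002 is still online
    }

    public static void main(String[] args) {
        System.out.println(countAfterDuplicateOffline()); // prints 1
    }
}
```

Removing an absent element is harmless, which is what makes duplicate offline events safe here. The cost is one stored entry per online user, which is where the memory usage comes from.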

3. Specific implementation

The following are the specific implementation methods of the two solutions:

3.1 Scheme based on a total count

Statistics based on total number can be achieved in the following two ways:

3.1.1 Redis-based incr and decr operations

Use Redis's INCR (add 1) and DECR (subtract 1) commands to maintain the online user counter: call INCR when a user comes online and DECR when a user goes offline.

3.1.2 Redis-based HyperLogLog

Redis's HyperLogLog (HLL) is a high-performance probabilistic data structure for cardinality (distinct-count) estimation, suited to deduplicated counting over large data sets. Its advantage is an extremely small footprint: each key uses at most 12 KB and can estimate cardinalities of up to 2^64 (about 1.8 × 10^19) elements. Its drawback is that the result is approximate, with a standard error of about 0.81%. HLL has the following characteristics:

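Elements cannot be removed once added, so HLL cannot reflect users going offline; it fits deduplicated counts such as the number of distinct users seen in a period.
It is suitable for scenarios that can tolerate a small error.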
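
The 12 KB and 0.81% figures follow from the register layout: 16384 registers of 6 bits each, and a standard error of roughly 1.04 / sqrt(16384). The sketch below is a simplified textbook HyperLogLog written only to illustrate these properties. It is not Redis's implementation: the hash function (a SplitMix64-style mixer), the one-byte-per-register storage, and all names are assumptions made for this example.

```java
// Simplified HyperLogLog for illustration only (hypothetical class, not Redis's code).
final class TinyHll {
    static final int PRECISION = 14;              // 2^14 = 16384 registers, as in Redis
    static final int REGISTERS = 1 << PRECISION;
    // One byte per register for simplicity; Redis packs 6 bits per register (12 KB).
    private final byte[] registers = new byte[REGISTERS];

    // Theoretical standard error: 1.04 / sqrt(number of registers).
    static double standardError() { return 1.04 / Math.sqrt(REGISTERS); }

    // SplitMix64-style mixer, used here as an assumed stand-in hash function.
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    void add(long userId) {
        long hash = mix(userId + 0x9E3779B97F4A7C15L);
        int index = (int) (hash >>> (64 - PRECISION)); // top 14 bits select a register
        long rest = hash << PRECISION;                 // remaining 50 bits
        int rank = Math.min(Long.numberOfLeadingZeros(rest) + 1, 64 - PRECISION + 1);
        if (rank > registers[index]) registers[index] = (byte) rank; // keep the maximum only
    }

    long estimate() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1.0 + 1.079 / REGISTERS);
        double raw = alpha * REGISTERS * (double) REGISTERS / sum;
        if (raw <= 2.5 * REGISTERS && zeros > 0) {
            raw = REGISTERS * Math.log((double) REGISTERS / zeros); // small-range correction
        }
        return Math.round(raw);
    }

    public static void main(String[] args) {
        TinyHll hll = new TinyHll();
        for (long id = 1; id <= 1_000_000L; id++) hll.add(id);
        System.out.println("estimate: " + hll.estimate());
        System.out.println("standard error: " + standardError());
    }
}
```

Each register only ever keeps a maximum, so adding the same user twice changes nothing, and there is no way to undo an add. That is why deduplication is free and removal is impossible.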

3.2 Scheme based on user IDs

Based on the user ID (such as the QQ number), you can use a Redis Bitmap (bit array) to implement this scheme. The structure of a Bitmap is as follows:

Each index (offset) represents a specific user number; a value of 1 means online and a value of 0 means offline.
For example, a bit array covering 1 billion numbers occupies 1 billion bits ≈ 0.116 GB (about 119 MB), which is very little space.
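
The size figure can be checked with a few lines of arithmetic. This is a small sketch with hypothetical names; it uses binary units (1 GB = 2^30 bytes), which is how 0.116 is obtained.

```java
// Back-of-the-envelope size of a bitmap with one bit per user number (hypothetical helper).
final class BitmapSize {
    // Bytes needed to hold `bits` bits, rounded up to a whole byte.
    static long bytesFor(long bits) { return (bits + 7) / 8; }

    // Convert bytes to GiB (2^30 bytes).
    static double toGib(long bytes) { return bytes / (1024.0 * 1024.0 * 1024.0); }

    public static void main(String[] args) {
        long bytes = bytesFor(1_000_000_000L);
        System.out.println(bytes);        // prints 125000000
        System.out.println(toGib(bytes)); // roughly 0.1164
    }
}
```

Note that a Redis Bitmap grows to cover the highest offset that has been set, not the number of users, so the size depends on the largest user number in use. A single Redis string is limited to 512 MB, which caps the offset at 2^32 - 1.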

Specific operation commands

User comes online: use the SETBIT command to set the corresponding bit to 1.
User goes offline: use the SETBIT command to set the corresponding bit to 0.
Check whether a user is online: use the GETBIT command.
Count online users: use the BITCOUNT command.
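
The semantics of these commands can be modeled locally with the JDK's java.util.BitSet. This is only an in-memory analogy with hypothetical names, not a Redis client: BitSet is not thread-safe and its indexes are int values, while Redis offsets go up to 2^32 - 1.

```java
import java.util.BitSet;

// In-memory analogy of a Redis Bitmap keyed by user number (hypothetical names).
final class OnlineBitmap {
    private final BitSet bits = new BitSet();

    void setBit(int userId, boolean online) { bits.set(userId, online); } // like SETBIT key userId 1|0
    boolean getBit(int userId)              { return bits.get(userId); }  // like GETBIT key userId
    int bitCount()                          { return bits.cardinality(); } // like BITCOUNT key

    // Three users come online, one goes offline (twice); returns the online count.
    static int demoCount() {
        OnlineBitmap bitmap = new OnlineBitmap();
        bitmap.setBit(10001, true);
        bitmap.setBit(10002, true);
        bitmap.setBit(10003, true);
        bitmap.setBit(10002, false);
        bitmap.setBit(10002, false); // repeated offline event leaves the bit at 0
        return bitmap.bitCount();    // 2
    }

    public static void main(String[] args) {
        System.out.println(demoCount()); // prints 2
    }
}
```

Setting a bit that already has the target value changes nothing, so repeated online or offline events are naturally idempotent.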

3.3 Implementation in Spring Boot

In a Spring Boot project, you can use RedisTemplate to set users online and offline and to count the number of online users. The code is as follows:

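import java.nio.charset.StandardCharsets;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisCallback;
import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.stereotype.Service;

@Service
public class BitmapService {

    // Assumption: Spring Data Redis 2.x or later. StringRedisTemplate (a RedisTemplate
    // with string serializers) stores keys as plain UTF-8 strings, so the raw key
    // bytes used in bitCount() match the keys written by setBit().
    @Autowired
    private StringRedisTemplate redisTemplate;

    /**
      * Set a bit in the Bitmap
      * @param key Redis key of the Bitmap
      * @param offset bit offset (for example, the user ID)
      * @param value true for 1 (online), false for 0 (offline)
      */
    public void setBit(String key, long offset, boolean value) {
        redisTemplate.opsForValue().setBit(key, offset, value);
    }

    /**
      * Get a bit from the Bitmap
      * @param key Redis key of the Bitmap
      * @param offset bit offset
      * @return true if the bit is 1, false if it is 0 or the key does not exist
      */
    public boolean getBit(String key, long offset) {
        // Boolean.TRUE.equals avoids a NullPointerException when the result is null.
        return Boolean.TRUE.equals(redisTemplate.opsForValue().getBit(key, offset));
    }

    /**
      * Count the bits set to 1 in the Bitmap
      * @param key Redis key of the Bitmap
      * @return number of bits with value 1
      */
    public Long bitCount(String key) {
        // ValueOperations has no bitCount method, so BITCOUNT is issued on the connection.
        return redisTemplate.execute((RedisCallback<Long>) connection ->
                connection.stringCommands().bitCount(key.getBytes(StandardCharsets.UTF_8)));
    }
}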

4. Summary

When dealing with online status statistics for hundreds of millions of users, choosing the right solution is crucial. Schemes based on a total count are simple and efficient, but they cannot report the status of individual users and can drift when events are duplicated. Schemes based on user IDs are accurate, but they consume more memory. Depending on actual needs, you can choose as follows:

If you have extremely high real-time and performance requirements and only need the total, choose a count-based scheme: Redis INCR/DECR, or HyperLogLog when a small error (about 0.81%) is acceptable.
If you need to query each user's online status accurately and can accept higher memory usage and lower efficiency, choose a scheme based on user IDs, such as a Redis Bitmap.
