Pre-partitioning method based on Internet of Vehicles Hbase time series data

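A time series data pre-partitioning technique applied in the Internet of Vehicles field. It addresses problems such as heavy partition pressure, degraded node performance, and uneven resource allocation, and achieves balanced resource allocation, lower disk I/O load, and savings in storage space.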

Pending Publication Date: 2021-06-25
XIAMEN YAXON NETWORKS CO LTD

AI Technical Summary

Problems solved by technology

[0004] ① The total number of vehicles in the project must be fixed in advance in order to compute the pre-partition splitKey values, yet the number of vehicles inevitably grows as the project operates;
[0005] ② When new vehicles are added, the pre-partition splitKey values are ordered by vehicle id string, so it is not convenient to store the data of the new batch of vehicles in a newly added partition. Instead, because of string ordering, the new vehicles are scattered across the partitions that have already been created and allocated. Some partitions then come under excessive pressure, resource allocation across the hbase nodes becomes uneven, the performance of some cluster nodes degrades, and normal business operation is affected;
[0006] ③ After new vehicles are added, if the originally planned cluster cannot bear the additional pressure, new storage nodes must be added to share the load, and partitions from heavily loaded nodes are migrated to the new nodes to relieve the pressure. When a large number of partitions are migrated, the data they manage must also be moved to the new storage nodes to preserve hbase data locality. This increases the burden of disk I/O and network communication across the cluster and affects normal business operation;
[0007] ④ Data in a typical Internet of Vehicles project must be retained for at least 2 to 3 years. Under the RowKey generation rule "vehicle id + _ + data reporting timestamp (accurate to milliseconds) + _ + three-digit random number", all data for a given vehicle is stored in one partition. Over time the amount of data in each partition grows, which seriously degrades the query performance of the partition. When expired historical data older than 2 to 3 years is cleaned up, hbase only marks the data as deleted rather than actually removing it; the expired data is removed from disk only when a major compaction is performed on the partition. When a large number of partitions undergo major compaction, disk I/O and network communication across the cluster are affected, as is normal business operation;
[0008] To sum up, although the traditional pre-partitioning method can provide effective storage management for a limited period, long-term operation and management place a serious burden on the cluster.
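
The scattering described in ② follows from the fact that HBase orders row keys lexicographically. The sketch below is only an illustration under assumptions: the function name `legacy_rowkey`, the sample ids, and the millisecond value are hypothetical and do not come from the patent. It builds a key with the traditional rule from ④ and shows that newly added ids sort between existing ids rather than after them.

```python
import random


def legacy_rowkey(vehicle_id: int, report_ms: int) -> str:
    """Traditional rule: vehicle id + _ + ms timestamp + _ + three-digit random number."""
    return f"{vehicle_id}_{report_ms}_{random.randint(0, 999):03d}"


# Existing fleet ids, ordered as strings (the order used for row keys).
existing = sorted(str(i) for i in (1, 2, 10, 20, 100, 200))

# Two newly added vehicles (hypothetical ids 15 and 150).
with_new = sorted(existing + ["15", "150"])

# An arbitrary millisecond timestamp, used only to show the key shape.
sample_key = legacy_rowkey(1, 1572603072000)

print(existing)
print(with_new)
print(sample_key)
```

Because "15" and "150" land between "100" and "2", their data falls into ranges that are already allocated, which is the uneven load the paragraph describes.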


Examples


[0030] ① First, based on the total number of vehicles S planned at the initial stage of the project, determine how many partitions P must be created each month when creating tables in Hbase, and how many vehicles N are assigned to each partition, so that the performance targets of the project plan are met. Take S = 100,000 and N = 200 as an example.
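
As a rough check on the example figures, the number of partitions per month can be worked out from S and N. This is a minimal sketch that assumes each monthly partition holds exactly N vehicles, so that P is S divided by N rounded up; the function name `partitions_per_month` is hypothetical.

```python
def partitions_per_month(total_vehicles: int, vehicles_per_partition: int) -> int:
    """Ceiling division: partitions needed so every vehicle has a slot."""
    return (total_vehicles + vehicles_per_partition - 1) // vehicles_per_partition


S = 100_000  # planned total number of vehicles
N = 200      # vehicles assigned to each partition
P = partitions_per_month(S, N)
print(P)  # 500
```

Under that assumption, the example values give P = 500 partitions for each month of data.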

[0031] ② Using the RowKey generation rule for Hbase time series data defined above, obtain the RowKey values of the 100,000 vehicles at a given point in time. Assuming the project officially went into operation in November 2019, take vehicle id = 1 and data reporting time = "2019-11-01 10:11:12.000" as an example of generating a RowKey value.

[0032] (1) Subtract 1 from the vehicle id and integer-divide by N: for vehicle id = 1, (1 - 1) / 200 = 0. Then convert the value 0 into a 4-digit fixed-length hexadecimal value, 0000;

[0033] (2) Obtain the year 2019 from the timestamp of the reporting time and convert it into a 3-digit hexadecimal ...
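
The two conversions stated so far can be sketched as follows. This covers only step (1) and the year conversion named in step (2); the remaining RowKey fields are truncated in the text above and are not reproduced here. The function names and the choice of lowercase hexadecimal digits are assumptions made for illustration.

```python
def group_prefix(vehicle_id: int, vehicles_per_partition: int = 200) -> str:
    """Step (1): integer-divide (id - 1) by N, then format as 4-digit hex."""
    group = (vehicle_id - 1) // vehicles_per_partition
    return f"{group:04x}"


def year_hex(year: int) -> str:
    """Step (2), year part only: format the year as 3-digit hex."""
    return f"{year:03x}"


print(group_prefix(1))        # 0000
print(group_prefix(200))      # 0000 (same group as vehicle 1)
print(group_prefix(201))      # 0001
print(group_prefix(100_000))  # 01f3
print(year_hex(2019))         # 7e3
```

Vehicles 1 through 200 share the prefix 0000, so numerically adjacent ids stay together, and a later batch of vehicles with higher ids receives prefixes that sort after the existing ones.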


Abstract

The invention relates to a pre-partitioning method based on Internet of Vehicles Hbase time series data. The method comprises the following steps: S1, defining a RowKey generation rule for the Hbase time series data, obtaining for each piece of data reported by a vehicle a RowKey value with a fixed length of 23 characters, and taking the first 8 characters of the RowKey value as the pre-partition splitKey value; S2, creating the pre-partitions according to the splitKey values; and S3, assigning a partition to each newly added vehicle according to its splitKey value, and, when the partition corresponding to that splitKey value does not exist, creating a new partition according to the splitKey value and writing the data of the newly added vehicle into the new partition. The method ensures balanced resource allocation across the storage nodes and also saves data storage space.
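
Steps S1 to S3 can be modelled in memory without touching a real cluster. The sketch below is an assumption-laden illustration, not the patented implementation and not the HBase client API: `PartitionTable` and `route` are hypothetical names, and the sample keys are made-up 23-character placeholders rather than output of the actual RowKey rule.

```python
SPLIT_KEY_LEN = 8  # S1: the first 8 characters of the RowKey form the splitKey


def split_key_of(rowkey: str) -> str:
    """Return the splitKey prefix of a fixed-length RowKey."""
    return rowkey[:SPLIT_KEY_LEN]


class PartitionTable:
    """In-memory stand-in for the set of pre-created partitions."""

    def __init__(self, split_keys):
        # S2: partitions are created up front from known splitKey values.
        self.split_keys = set(split_keys)

    def route(self, rowkey: str):
        """S3: find the partition for a RowKey, creating it if missing.

        Returns (split_key, created), where created is True when a new
        partition had to be added for this splitKey.
        """
        key = split_key_of(rowkey)
        created = key not in self.split_keys
        if created:
            self.split_keys.add(key)
        return key, created


table = PartitionTable(["0000aaaa"])
known = "0000aaaa" + "x" * 15  # placeholder key for an existing vehicle group
fresh = "01f4aaaa" + "x" * 15  # placeholder key for a newly added vehicle group

print(table.route(known))  # ('0000aaaa', False)
print(table.route(fresh))  # ('01f4aaaa', True)
print(table.route(fresh))  # ('01f4aaaa', False)
```

The point of the model is that a new vehicle group maps to its own splitKey, so its data goes to a new partition instead of being spread over existing ones.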

Description

Technical field

[0001] The present invention relates to the field of the Internet of Vehicles, in particular to a pre-partitioning method based on Internet of Vehicles Hbase time series data.

Background technique

[0002] With the rapid development of the Internet of Vehicles industry, the number of vehicles served by Internet of Vehicles service platforms has grown from thousands to tens of thousands, millions, and tens of millions. Using the Hbase time series data storage model to store the historical trajectory data of such platforms is fairly common. Storage capacity for time series data and efficient, convenient, sustainable operation management have become indispensable indicators in the system architecture. As for data storage capacity, storage nodes can be scaled horizontally by means of distributed load balancing at the system architecture level, so as to achieve unlimited im...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F3/06
CPC: G06F3/0608, G06F3/061, G06F3/0644, G06F3/0647, G06F3/067
Inventor: 陈福林, 游锋锋, 杨俊辉, 曾夺
Owner XIAMEN YAXON NETWORKS CO LTD