Data skew processing method and device, terminal equipment and storage medium

A data skew processing technology, applied in the fields of electrical digital data processing, multiprogramming arrangements, program control design, etc. It addresses problems such as additional operations hindering system performance and achieves load balancing.

Inactive Publication Date: 2020-11-27
GUANGDONG POLYTECHNIC NORMAL UNIV +1
7 Cites, 1 Cited by

AI Technical Summary

Problems solved by technology

[0003] The invention provides a data skew processing method that solves the technical problem that existing partition algorithms perform additional operations that hinder system performance, by adding a variable weight to predict the partition size; after sampling the dat

Examples

Example Embodiment

[0064] The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in those embodiments. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the scope of the present invention.

[0065] With the rapid development of informatization and digitalization, the scale of data is increasing exponentially. Gartner characterized big data with a 3V definition: Volume, Variety and Velocity. On this basis, International Data Corporation (IDC) extended it to a 4V definition, holding that big data should also have Value. The processing of big data therefore has great research value.

[0066] Theoretically

Abstract

The invention discloses a data skew processing method and device, terminal equipment and a storage medium. The method comprises: sampling the data based on a preset sampling algorithm to obtain sample data in which each record is selected with the same probability, and obtaining the size of the space occupied by each value through cumulative calculation over the data; dividing the sample data into skewed data and non-skewed data using a data skew detection model; and allocating the non-skewed data to a preset hash partition, and dynamically allocating the skewed data to each storage partition based on a dynamic allocation algorithm so as to balance the Spark load. According to the embodiment of the invention, a variable weight is added to predict the partition size; after the data is sampled, it is classified into skewed data and non-skewed data using the data skew detection model, the size of each Reduce partition is predicted using the non-skewed data, and the skewed data is distributed evenly across the partitions, so that the Spark load is more balanced.
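
The steps in the abstract can be illustrated with a minimal sketch in plain Python (no Spark dependency). Everything in it is an assumption made for illustration and is not taken from the patent: reservoir sampling stands in for the "preset sampling algorithm", a simple threshold rule (`factor * total / num_partitions`) stands in for the "data skew detection model", and a greedy least-loaded placement stands in for the "dynamic allocation algorithm". The names `reservoir_sample`, `split_skewed` and `plan_partitions` are hypothetical, and the variable weight mentioned in the abstract is not modeled.

```python
import random
import zlib
from collections import Counter


def reservoir_sample(records, k, seed=0):
    """Keep k records so that every record has the same chance of being kept."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive bounds
            if j < k:
                sample[j] = rec
    return sample


def split_skewed(key_sizes, num_partitions, factor=1.0):
    """Assumed detection rule: a key is skewed if its size exceeds
    factor * (total size / number of partitions)."""
    threshold = factor * sum(key_sizes.values()) / num_partitions
    skewed = {k: s for k, s in key_sizes.items() if s > threshold}
    normal = {k: s for k, s in key_sizes.items() if s <= threshold}
    return skewed, normal, threshold


def plan_partitions(key_sizes, num_partitions, factor=1.0):
    """Hash-partition non-skewed keys, then spread skewed keys in chunks
    onto whichever partition is currently least loaded."""
    skewed, normal, threshold = split_skewed(key_sizes, num_partitions, factor)
    loads = [0] * num_partitions
    assignment = {}

    # Non-skewed keys: ordinary (deterministic) hash partitioning.
    for key, size in normal.items():
        p = zlib.crc32(str(key).encode("utf-8")) % num_partitions
        assignment[key] = [(p, size)]
        loads[p] += size

    # Skewed keys: split into chunks no larger than the threshold and
    # place each chunk on the least-loaded partition (greedy balancing).
    chunk = max(1, int(threshold))
    for key, size in sorted(skewed.items(), key=lambda kv: -kv[1]):
        assignment[key] = []
        remaining = size
        while remaining > 0:
            take = min(chunk, remaining)
            p = loads.index(min(loads))
            assignment[key].append((p, take))
            loads[p] += take
            remaining -= take
    return assignment, loads


if __name__ == "__main__":
    # Key "a" dominates the data set; the other keys are small.
    keys = ["a"] * 900 + ["b"] * 40 + ["c"] * 30 + ["d"] * 30
    sample = reservoir_sample(keys, 200, seed=42)
    sizes = Counter(sample)  # per-key size estimated by accumulation
    plan, loads = plan_partitions(sizes, num_partitions=4)
    print(loads)
```

In this sketch a skewed key ends up on several partitions, so a real Reduce-side job built this way would need a second aggregation step to merge the partial results for that key. Whether and how the patented method handles that is not stated in the text above.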

Application Information

Owner GUANGDONG POLYTECHNIC NORMAL UNIV